How does ChatGPT work?

How does it reply with such coherent answers?
24 January 2023

Interview with Michael Wooldridge, University of Oxford

It's sensible not to take everything ChatGPT tells you at face value, not least because the software itself admits that you shouldn't treat everything it says as gospel. Oxford University's Mike Wooldridge, who spoke to us when this story was breaking, is with us now to provide the human touch, and hopefully help us understand how all this is possible…

Mike - What's happened is that people have realised that scale matters in artificial intelligence. And for these systems, scale means three things. Firstly, how big your neural networks are - literally, the larger the network, the more elements it has. That matters. Secondly, the amount of training data that you use to train your system - modern artificial intelligence absolutely relies on training data, so that matters. And finally, the amount of compute effort that you are prepared to throw at training these programmes - that matters too. And so there was this move that started around about five years ago that just said, "let's see how far we can take scale. Let's see how big our neural networks can get. Let's see how much data we can throw at these problems, and let's see how much compute resource we are prepared to use." And the first system, the ancestor of ChatGPT, was GPT-2, which appeared in 2019. Famously, it was supposedly so good that they were not prepared to release it to the public because this unprecedented power was too much for us to handle. But what happened with GPT-3, the successor system, is basically it was roughly two orders of magnitude bigger, with far more data and far more compute power. And that's the way things are going. There's been a race for scale, and we're seeing the benefits of that.
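The published parameter counts make that jump in scale concrete (the figures below are the widely reported sizes of the two models; training-data and compute comparisons are rougher estimates):

```python
# Published parameter counts for two GPT generations.
params = {
    "GPT-2 (2019)": 1.5e9,   # 1.5 billion parameters
    "GPT-3 (2020)": 175e9,   # 175 billion parameters
}

ratio = params["GPT-3 (2020)"] / params["GPT-2 (2019)"]
print(f"GPT-3 is roughly {ratio:.0f}x the size of GPT-2")  # ~117x
```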

Chris - I was just pressing James on how quickly it responded because normally you're used to your computer taking a while to load a game or something, and it's generating this output almost instantly as though it were just a human spouting a result back at you. What sort of computing grunt have they got on the back end of that to make that possible?

Mike - Okay, so you've got to distinguish two different things. Firstly, there's building, or training, the model: throwing the data at it so that it learns how to respond. That takes AI supercomputers running for months - computationally, it's one of the heaviest tasks anyone is doing in computing. Now, there's a big concern here about the amount of CO2 that's generated while you're doing that. We believe that GPT-3, which is the technology that underpins ChatGPT, was trained on something like 24,000 GPUs - graphics processing units - high performance AI processors running for a number of months in order to churn through that data. So that's the training part. But once you've got your trained neural network, actually using it - the runtime, as we call it, which is what you were doing when you had your conversation - is much cheaper. You don't need anything like the same scale - no supercomputers - but you still need a lot more than a desktop computer. And the reason is that those neural networks are very, very big. GPT-3 has 175 billion parameters: basically, 175 billion numbers that make up that neural network.
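A back-of-the-envelope calculation shows why even the "cheaper" runtime side outgrows a desktop machine: simply holding 175 billion parameters in memory, at two bytes each (half precision, a common inference format), takes hundreds of gigabytes before any computation happens.

```python
n_params = 175e9        # GPT-3's published parameter count
bytes_per_param = 2     # fp16 half precision, a common inference format

weights_gib = n_params * bytes_per_param / 2**30
print(f"~{weights_gib:.0f} GiB just to hold the weights")  # ~326 GiB
```

A typical desktop has 16-64 GiB of RAM, so the weights alone need to be spread across multiple high-memory accelerators.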

Chris - That's what I wanted to ask you about, because what has it actually learned? What is sitting in that machine that means, when James asks it for its opinion on colours, it says, "well, I don't have one"? How is it doing that?

Mike - There's a long answer and a short answer. The short answer is that we don't exactly know. The long answer is that basically what these things are doing is exactly the same as your smartphone does when it suggests a completion for you. So if you open up your smartphone and you start sending a text message to your partner saying, "I'm going to be...", it might suggest "late" or "in the pub." How is it doing that? Because it's looked at all the text messages you've ever sent, and it's seen that whenever you type "I'm going to be..." the likeliest next thing is "late" or "in the pub." GPT systems are doing the same thing, but on a vastly larger scale. The training data for them is not the text messages you've sent; it's every bit of digital text that they could get their hands on when they wanted to train it. They download the entire internet and train on all of that text to predict the likeliest next thing in the sentence.
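The autocomplete analogy can be sketched in a few lines. This toy bigram model (a deliberately tiny stand-in, not how GPT is actually built) counts which word follows which in its "training data" and predicts the most frequent continuation:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent continuation seen in training, if any."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

corpus = [
    "I'm going to be late",
    "I'm going to be in the pub",
    "I'm going to be late again",
]
model = train_bigrams(corpus)
print(predict_next(model, "be"))  # "late" - seen twice, versus "in" once
```

GPT-3 performs the same kind of next-token prediction, but with a deep neural network that weighs the whole preceding context rather than just the last word, trained on that internet-scale corpus.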

Chris - The problem is, Mike, that the internet is full of rubbish. There's tons and tons of unreliable data out there. So how do you make sure that your system can sort wheat from chaff?

Mike - So you've put your finger on one of the big issues with this technology at the moment. There is so much data that it can't all be checked by humans before it's fed to the machine. Again, the details are sketchy on exactly how it happened in these public systems, but there will be some screening - probably automatic screening looking for toxic content - and that will work to a certain extent. But it won't be reliable. It will get things wrong. It will allow through some things that, ideally, we wouldn't allow, and it will not be able to check the veracity of an awful lot of the material it's fed. What we're getting out of this is some kind of aggregate picture - like an average of what it's seen out there on the internet. But, to be honest, we need to do a ton more work to understand exactly what's going on there and exactly how we can deal with those issues. These are brand new tools that have landed on planet Earth, and we've got a lot more work to do to understand them.
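The details of the real screening pipeline aren't public, but a hypothetical keyword filter illustrates both the idea and why Mike says it "won't be reliable": it catches exact matches and nothing else, and it says nothing about whether the text is true.

```python
# Hypothetical blocklist screen - real pipelines use trained classifiers,
# but the limitation is similar: easy to evade, no veracity checking.
BLOCKLIST = {"badword"}  # placeholder term, not a real list

def passes_screen(text: str) -> bool:
    """Reject text containing an exact blocklisted word."""
    return not (set(text.lower().split()) & BLOCKLIST)

print(passes_screen("a perfectly clean sentence"))   # True
print(passes_screen("contains badword right here"))  # False
print(passes_screen("contains b4dword right here"))  # True - trivially evaded
```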

Chris - What can we expect to see this do next?

Mike - The things they're phenomenally good at are things to do with text. I urge you to try it: go to the BBC News website, cut and paste a story, and ask it to summarise it. In my experience, it usually does a very, very good job of coming out with a summary. Ask for a one-paragraph summary; ask it to extract the top three bullet points from the news story, and it will do that. Take two news stories about the same thing and ask it to find the commonalities between them and the points of difference. In my experience, the technology is very, very good at that too. It's not perfect - you have to check it, and it does come out with falsehoods - but it's very good. Where are you going to see it? You're going to see it in your email system. So instead of showing you every email, it's going to give you the top three bullet points from your email. I think that would be quite a useful thing to be able to do.
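ChatGPT summarises generatively - it writes new text - but a classical extractive baseline makes the task itself concrete: score each sentence by how frequent its words are in the document, and keep the top scorers. (This illustrates the task, not ChatGPT's method.)

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=1):
    """Keep the n sentences whose words are most frequent overall."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freqs = Counter(re.findall(r"[a-z']+", text.lower()))
    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freqs[w] for w in words) / max(len(words), 1)
    return " ".join(sorted(sentences, key=score, reverse=True)[:n_sentences])

story = "Cats purr. Cats sleep. Cats purr loudly. Dogs bark."
print(extractive_summary(story))  # "Cats purr."
```

Unlike this baseline, a GPT-style model can paraphrase, merge two stories, or reformat the result as bullet points on request - which is exactly the flexibility that makes it useful and makes checking its output necessary.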
