Yesterday, the Centre for AI Safety released a landmark statement. Signed by over 350 leading experts and boffins from the world of tech, the succinct statement warns of a genuine extinction risk as a result of AI development.
Signatories include the CEOs of Google DeepMind, OpenAI and Anthropic, as well as Geoffrey Hinton - the man often referred to as ‘the godfather of AI’. The statement reads:
“Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”
Shortly after this statement was published, I sat down with Connor Leahy, the CEO of AI alignment company Conjecture and another of those 350-odd signatories to the statement. He’s the kind of guy who OpenAI CEO Sam Altman follows on Twitter.
For those not keeping up with the lingo, AI alignment is the thing that basically everyone agrees should be a higher priority, whether or not they are in favour of harsher curbs or brakes on AI development. Alignment is the process of ensuring that the behaviour of an artificial intelligence system aligns with human values, goals, and ethics, so that it acts in ways that are beneficial and safe for us.
By contrast, unaligned AI refers to artificial intelligence systems that don't fully understand or follow human values, goals, and ethics, which might lead them to act in ways that could be harmful or undesirable to us.
As AI becomes more and more powerful, it becomes more and more important that we understand how to grow AI aligned with human goals and needs. And as things stand, we really do not understand how to do that.
The boffins behind the biggest AI companies do not understand how their AI works. But more on that in a second.
I sat down with Connor Leahy for a package on GB News. As things go in telly, we could only use a short snippet of this conversation in our three minute package, and of course news demands that platitudes from Rishi Sunak and Wes Streeting get more airtime. But that’s why I like Substack newsletters. We can get far more nerdy here than on broadcast media.
So here in full is the transcript of my conversation with Leahy, clearly setting out what he sees as the dangers of rapid, unregulated AI development, and setting out in impressively clear language how we got to where we are.
—
TH: So first of all, can you just sort of lay out, in layman's terms, what are the existential risks of artificial intelligence?
CL: So it's really quite simple. AI systems are general purpose intelligences: reasoning systems or problem solvers. This is what we're trying to build. The general thing we're trying to build is systems that solve problems, that achieve goals of some kind, and we're getting very good at this. Our systems are getting very powerful, very general, able to do very many different things, and this is happening faster and faster.
It's not taking years for these systems to improve. They can get radically better in months or even weeks, and there's no limit in sight to what we're currently seeing.
And the real problem here is that it's very important to make clear that these systems are not like normal computer programs. This isn’t code written by a human. Normal computer programs are written by humans.
But AI systems are very different. They are more, like, grown. It's more like an organic thing. This is the process called training, where instead of writing the code of what your AI does, you instead give it lots of data. And it learns from this data. It grows from this data. And so the final system, the final AI you get out of this is this black box. It's like full of numbers. And it works. You know, these things do many useful things. We don't know how they work internally. No one does. This is an unsolved scientific problem.
So what this means is that we don't know how they work and we do not know how to control them. We do not actually know how to get these systems to really do what we want them to do. And maybe if this is, you know, a funny chatbot, maybe that's okay. Maybe this is not a huge problem. But when these are extremely intelligent, goal-seeking systems that are as smart as humans or even smarter than humans, that are fundamentally alien…
And these are not humans. They don't have emotions. They're like weird black boxes. And if they're very powerful, very smart and very good at achieving goals, and we don't know how to control their goals - well, who knows what they're capable of? They will be able to do basically anything.
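(A quick aside from me, because this ‘grown, not written’ point is the crux of the whole thing. Below is a minimal toy sketch in Python - entirely my own illustration, nothing to do with how Conjecture or the big labs actually build anything - in which no line of code states the rule the system ends up following. We only show it examples, and what comes out the other end is a pile of numbers that happens to work.)

```python
# A toy "grown, not written" model: nowhere below do we write the rule for XOR.
# We only show examples, and the finished "model" is just arrays of numbers.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # example inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # desired outputs (XOR)

# The "program": randomly initialised numbers, nothing more.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10_000):
    # Forward pass: just arithmetic on the numbers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # "Training": nudge every number so the outputs fit the examples a bit better.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

print(out.round(2))  # should be close to [[0], [1], [1], [0]]: it has "learned" XOR,
                     # but the knowledge lives in W1/W2, not in any line of code above
```

(Scale that up from a couple of dozen numbers and four examples to hundreds of billions of numbers and a large slice of the internet, and you have roughly why, as Leahy says, nobody knows how these systems work internally.)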
TH: What do you say to those who say this sounds like something out of science fiction? Who say that ChatGPT and other systems like it are just predicting the next word in a series or a sentence, that they're not really intelligent, that this isn't really AI?
CL: Well, if you had described ChatGPT to someone four years ago, they would have also said that sounds like sci-fi to me, that seems impossible - because four years ago it was impossible, in the same way that, you know, a year before the first airplane took off, people thought heavier-than-air flight was impossible. So it is true in a narrow technical sense that these are predicting the next token. But this is kind of irrelevant because, you know, your brain is also just inputting neuron signals and outputting neuron signals. So what?
That's not the interesting thing. It's inputting tokens - taking in information, processing that information and outputting the results. They're general purpose symbol predictors. And symbols can represent anything, you know, it doesn't have to be language. It can be math or code or, you know, anything. Humans use language to describe all kinds of things - to make plans, to reason, to develop science, to do whatever.
So similarly, the real shocking thing about these new systems, these large language models, is how general they are: we took these big piles of numbers and we just threw a bunch of random internet text at them, and suddenly they learned these general purpose cognition abilities. They learned to solve problems, to write text, to explain jokes and whatever. And sure, a lot of that is memorization, but not all of it. A lot of it is general purpose algorithms, general purpose cognition that these things just picked up from the general patterns. They're pattern recognizers, in the same way that the brain is just pattern recognition.
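(Another aside from me, since the ‘it's just predicting the next word’ objection comes up constantly. Here is a deliberately crude sketch - again my own toy, with a made-up one-sentence ‘corpus’ - of what next-token prediction looks like. Real language models swap the word-pair counts for an enormous neural network and the single sentence for a large slice of the internet, but the objective is the same: predict what comes next, then feed the prediction back in.)

```python
# "Just predicting the next token": a bigram model built from word-pair counts.
import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the dog sat on the rug".split()

# "Training": count which word tends to follow which.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(token):
    """Sample a next token in proportion to how often it followed `token`."""
    counts = following[token]
    if not counts:                        # dead end (the last word in the corpus)
        return None
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# "Generation": predict the next token, append it, repeat.
token, output = "the", ["the"]
for _ in range(8):
    token = predict_next(token)
    if token is None:
        break
    output.append(token)

print(" ".join(output))  # e.g. "the cat sat on the rug" - plausible text falls out
                         # of nothing more than next-token prediction
```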
TH: So how close do you think we are really to what's known as artificial general intelligence, AGI, which seems like a bit of a step change from where we are right now?
CL: It really depends on what you mean by the word AGI. I really don't like the word AGI, because the word AGI seems to imply, oh, it's like a chatbot, you know, it's like a really smart chatbot or something. Like a robot buddy or something. I’m going to be very clear here. When people in the industry, you know, like OpenAI or DeepMind, talk about AGI, this is not what they mean.
When they say AGI, what they mean is God-like intelligence. They mean systems that are so vastly beyond human intelligence, they're like gods. They can solve all kinds of science. They can trick humans. They can deceive. They could develop new technologies. They could outsmart humans. They could hack any computer system. This is the thing that these people truly think is possible, and that they are building towards.
And the truth is that we don't have any scientific reason to disbelieve this. This seems straightforwardly possible.
Geoffrey Hinton, who is really the godfather of this field, one of the greatest living scientists in the field of AI, spent a large part of his career thinking that the algorithms in the brain are so much more advanced than what we have in our computers that it would take decades before our computers would be able to catch up with the brain. Recently, he's gone on the record to say that he no longer believes this.
He actually thinks that the algorithms in computers are more powerful than those in our brains. They are more efficient, they are more powerful. They're different. And there are still many problems to be solved, but there are no fundamental limits that we know about. You know, maybe we will find one. But so far, we put more data into these systems, we improve our algorithms, and they just get smarter and smarter and smarter and smarter.
Every six to twelve months, the top companies like DeepMind, Anthropic and OpenAI release a new system that is 10 to 40 times more powerful. This is an insane rate of progress. It's really an exponential. And, you know, as we learned with Covid, there are only two times you can react to an exponential: too early or too late.
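(One more aside from me: it's worth doing the arithmetic on that claim, because exponentials are easy to nod along to and hard to feel. Taking the conservative end of Leahy's figure - call it a 10x jump every twelve months, which is my own simplification, since ‘power’ isn't really a single number - the compounding looks like this.)

```python
# Compounding the (assumed) conservative end of the claim: 10x per year.
factor_per_year = 10
for years in range(1, 6):
    print(f"after {years} year(s): {factor_per_year ** years:,}x more powerful")
# after 1 year(s): 10x ... after 5 year(s): 100,000x - hence "too early or too late".
```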
TH: So when we're looking at God-like AI, an AI that is just so much more intelligent, unfathomably more intelligent than human beings, generally the risk gets put into two categories. Number one, that they're going to be unaligned with human aims and that somehow we will be crushed under some sort of giant boot. Or, on the other hand, that bad actors, bad human actors, will use this sort of AI to create bioweapons or stoke division, break the Internet or whatever else it is. How likely would you rank those two possibilities?
CL: I think they're both extremely likely. The only reason I expect misuse to be less likely is that I expect accidents to kill us first. If we keep building these systems at this pace - which, again, is exponential; we're not talking about adding, you know, one point every year, we're talking about doubling or quadrupling the power of these systems, or even more, every single year - if this insane improvement continues, we're going to jump from sort-of-smart systems to extremely smart systems extremely quickly, which is what we've already seen.
Five years ago our systems couldn't talk. Nowadays you can talk to a chatbot and it's not perfect, but it can hold a conversation and it's quite good. This was impossible five years ago, and now we're here. This happened very quickly, and it's only speeding up from here.
So accidents and misuse are both extremely relevant risks. The accident risk is that if we get to superhuman systems and we cannot control them, then by default they will do things that we don't like, and that will be dangerous. Because no matter what their goals are, they will want to gain power. They will want to gain resources: money, energy, influence, whatever.
When humans build a hydroelectric dam and there's an anthill there, well, too bad for the ants. And that's exactly how AI will feel about us as well. It's not going to hate us. This won't be Terminators trying to do evil things; it will just want to build factories or solar panels or whatever, and humans will happen to be in the way, and so it will just have to get rid of them.
Misuse is, in a sense, even worse, in that even if we manage to solve this problem - which is a huge unsolved scientific problem; again, Geoffrey Hinton, in a recent lecture, basically admitted that he has no idea how to solve it - even if we could control these systems, could actually give them goals that they actually follow, we're not out of the woods, because then we still have the problem: what if dangerous actors get access to this technology?
There are people who have very bad intentions, who want to harm other people, whether it's for political reasons or mental illness or whatever. And if such people have access to this technology... As an example, I once talked to a very senior national security person, and I gave them the hypothetical: what would happen if your adversaries had access to 100 Einsteins who never slept, never got tired, were perfectly loyal and completely, perfectly sociopathic, had no emotions, and were trying to attack your nation? How would you defend against that? And he said we couldn't. There would be no defense against such a system, and this is the minimum that AGI will be capable of doing.
TH: What do you say to those who say that, sure, there's a risk to AI, but it's nowhere near 100%; it's maybe a 2% to 3% likelihood that an AGI will be nefarious in its means, and that the bigger risk might come from over-regulation; that AI is doing things like solving diseases and allowing paraplegics to walk, and that to slow down that progress could harm humanity more?
CL: So it's very important to be clear that what I'm talking about is a very, very narrow subset of the large field of AI. Most AI - for example, the work you're citing about allowing paraplegics to walk and doing medicine - is using extremely narrow, extremely small AI models. It's like comparing the engine of your scooter to a hypersonic jet engine. These are simply not the same kind of thing. Sure, they're both engines, kind of, but we regulate them very differently. It's fine for you to own a scooter, but we generally don't allow anybody to own a hypersonic fighter jet, and this is, I think, very reasonable. I think this is exactly what we should be doing.
We should be regulating this forward-running, general purpose, AGI, superintelligence-type research the same way we regulate fighter jets or nuclear weapons or stuff like this. I'm not saying no research should happen - maybe in government-controlled facilities, or an international CERN-like thing, we could focus on very careful, safe research - while still reaping the benefits of much less dangerous, non-superintelligence research in the wider public.
Personally, if I can give my own opinion here: a 2% risk of killing literally everybody? No, I'm not taking that risk. Like, who are you to… like, have you asked every single person on the planet whether they consent to a 2% chance of them and their family being killed? No, I'm not consenting to that. No way.
—
Well, it’s certainly food for thought. I am particularly interested in the distinction Leahy draws between extremely small AI models and ‘hypersonic fighter jet AI’, and how regulation should be treating these two things very differently. I think this is a distinction that is yet to puncture the nerd bubble and make its way into media discussions.
But what is exciting is that the general issue has punctured through the nerd bubble this week. It’s on the front pages of four British newspapers today. This is the start of something incredibly exciting and also concerning.
While Rishi Sunak is beyond out of his depth on housing, on AI risks he is making sensible noises. The question is no longer whether or not AI is regulated, but how it is regulated. And the British PM bringing this up at the G7 earlier this month really matters. Setting ‘guardrails’ for the most powerful AI systems - whilst not smothering the little companies - really matters too.
As ChatGPT told me yesterday, “the race isn't just to build the most advanced AI, but to ensure its rise doesn't precipitate our own fall.”
Bonus: Here’s me discussing all of this on GB News this morning.