Our digital doppelgängers: machines turning evil
If you had a digital twin, you could swamp it with your most mundane tasks. But what if it turned evil overnight and betrayed you? Things are about to get a lot weirder, Black Hat presenters told me ahead of their presentation.
Entrepreneur Matthew Canham and professor Ben Sawyer are scheduled to appear on the Black Hat stage in Las Vegas later this week to talk about evil digital twins and how AI assistants, initially designed to act on our behalf, might exploit human psychology and turn against us.
“The truth, as we will show, is much stranger: humans will perceive LLMs [large language models] as sentient long before actual artificial consciousness emerges, and digital agents will discover and manipulate principles of human cognitive operating rules far in advance of AGI [artificial general intelligence],” the annotation of the event reads.
So does it even matter if the AI is actually sentient as long as we perceive it as sentient? Intrigued by Matthew and Ben’s teaser of the talk, I invited them to a virtual discussion on the topic.
- Machines have the ability to convincingly act as humans and therefore become our digital twins. This means our digital “copy” could soon act on our behalf.
- Digital twins could turn evil. For example, a malicious hacker could corrupt one to hurt us.
- Unlike with human beings, there are no cues to warn that a machine is going to betray you.
- Hackers don’t need to teach machines how to hurt us since those machines have already been trained on our vulnerabilities.
- Machines might not be sentient yet, but they are social.
The interview was edited for length and clarity.
I’ve nearly finished Nicholas Humphrey’s book Sentience, looking at it from the evolutionary perspective, and I was nearly convinced that it’s impossible for a machine to be sentient. But you’re arguing that it doesn’t really matter since machines simply reinforce what we already tend to believe?
Matthew: I don't know if you've been following the online Turing tests that have been happening, but basically there are a few different groups that are running these bot or not competitions to see if they can identify who's the human and who is the bot. Humans can still spot the AI better than chance, but it's getting really close to chance levels.
The argument goes something like this: if you cannot distinguish online between an AI and a human, then you have to default towards ascribing sentience. That’s because we have a presumption of innocence in the legal system, which then would necessitate that you presume human rather than AI, because if you're wrong, you're robbing human rights from a human.
Ben Sawyer: That's fascinating. I think it's very important whether these things or machines become sentient. There's really excellent scientific work asking that question and really good philosophical work asking that question.
You know, my grandmother, if she's talking to someone who may or may not be a human at that moment, it doesn't matter. And what matters more is what she thinks is true and how it changes how she interacts. That matters regardless of the ground truth. You know, if she's talking to a human or she's talking to a machine, how she changes, how she trusts the situation might hurt her either way.
We all know people who won't talk to the machine that answers the phone and says, "Hi, I can help you." Right? But those people may be hurting themselves because that machine can, in fact, help them.
At the same time, there will be people who will trust inappropriately. That's going to be a big problem. They'll trust machines too much because why would the machine ever hurt me? The machine is put here by the company to help me, whereas they might not trust a person. So there's a lot of challenge here in this sort of applied landscape that has strong cybersecurity repercussions.
And it's not as simple as intelligence matters or doesn't. It's much more complex. It's the question of whether you think what you're talking to is intelligent, whether you think it's a human or a machine.
We can all be more human if we want to be. You can, if you want to, reach out with more empathy to me to try to convince me of something. The machines are really good at that too.
So does it matter whether we’re talking to a human or to a machine? Do we even listen to what they have to say? It seems like we’re just looking for approval and it’s just nudging us towards the direction we’re already headed.
Matthew: I think you're touching on something really important here, which is that this is going to significantly amplify the echo chamber effect. The person who tried to assassinate the Queen [Elizabeth] in late 2021 was communicating with an AI girlfriend, and it was reinforcing his idea to go and attempt an assassination on the Queen.
You have people who are perhaps already vulnerable that are engaging with these things because they don't have a social support system. So that implies that if possible, it would be nice to have something built into these that could act as a guardrail to push somebody back towards a lesser extreme.
Ben: Industry is really interested in using these things. Government is really interested in using this new technology. It can do many of the things that humans can't. It can be social in ways that before very recently were limited to human to human interaction. It also is not predictable.
The idea that you can go into a large language model and "fix it" is analogous to the idea that you can sit down with an employee and train them in such a way that they could never do the wrong thing. These are both fantasies. The very important difference is that with human-human situations, we have both inherent protections in the way that our psychology works and also legal protections. If Matt here wanted to betray me, which he could decide to do, I would as another human have some cues that I could pick up from him to determine that that was going to happen. I mean, maybe I would pick them up or maybe I wouldn't. But we’re both humans and that set of cues exist. Many great movies are based on that moment. But with the machine, those cues don’t exist.
It's not like Hollywood, where there's some cue like the lights dim for a moment. It just instantly changes what it's doing to prioritize things that might not be my priorities. Large language models and other types of machine learning technology are very good at manipulation. They're trained off the internet, which is itself a master course in manipulation. They’re quite aware of how to build a lie. They're quite aware of what types of cues might tell a human that something's wrong and which ones might not. They are very capable of being social in that sense. There's also, of course, the idea that if Matt were to do something to me, there would be legal [consequences] because we are both entities that are recognized by the legal system. But if a machine does it, that's a gray area. Right now, liability is not well established. And while there are some presumptions, almost nothing has been tested in the courts. But it's going to be.
Plotting to assassinate the Queen and committing suicide after talking to a chatbot are extreme cases. I wonder what’s the worst that AI assistants, or hackers abusing AI assistants, can do at scale?
Matthew: We're already starting to see these pop up in phishing emails. We already see to a certain extent micro-targeting – messaging at scale that is tailored for individual people. But so far it's been psychographically tailored. Even within that profile, you're going to have variation. Everybody is a unique person. Well, I think what we're going to see very quickly is that these social engineering attacks, these influence attempts, are going to be highly personalized.
This really gets to our talk at Black Hat. If you take everything that you've done in your online life, but you scale that one step further. These digital twins are acting as proxies for ourselves. They should be emulating what we would want to do in the case that we're not even actually taking the time to do it. They have to, because that's the force multiplier. If I can clone myself and have myself working at night while I'm sleeping, then I've just doubled my work capability overnight. That's fantastic. That's the positive side of that.
But you take that and you flip it around. Now there's a digital clone that if a threat actor can access it, they can try multiple types of attacks to figure out what is going to be the most effective. They can tailor messaging to me or to my digital clone and get my digital clone to work against me without me even knowing it. That's where I think this is going.
Ben: We use the “digital twin” [term] in the talk rather than a large language model or AI. Large language models are, while very cool and very popular right now, just the tip of the iceberg or the first moment where machines are able to be social. That's going to get a lot stranger in coming years. We're going to think back on this moment in history the way we think back on a time before smartphones.
One of the things we want from this technology more than anything is to act as us on our behalf. And there's a name for that. It's a very old name. It's a digital twin. Digital twins have existed for decades, and they're systems that digitally copy a physical system. I think really that's what we're looking at here, that a large subset of machine learning systems will be trying to be a copy of one of us for a lot of different reasons and we will be very happy about that. There are some capabilities of having something that can act like you and go do things on your behalf that will be just magical.
At the same time, some things are going to get very strange and there are going to be some really interesting vectors of cyberattack that never existed before. It's hard to speculate about what all those will be because some of them are going to come as great surprises, but some of them are just very obvious.
For example, right now, if I was to look at the human population and ask who is really gifted at fraud, it's a subset of the population. There aren't a whole lot of them. One of the very interesting things about these systems is that they can be that and you can make a lot of them. So the idea that phishing emails are something that a large language model can write is just the tip of the iceberg.
What's way more interesting is that there are technologies that can give a large language model a face and give it a voice. While those technologies are not perfect yet, they're going to get dramatically better.
Is it even possible to opt out, you know, of maybe accidentally creating your digital twin? Is it even possible not to be a part of it, of the future that you’re describing to me?
Ben: We're conducting this over Zoom. Even if you hadn't pressed record, would this footage be going somewhere? [...]
The answer has been provided to you. And you can find it in this interface. If you and I spend a moment in the interface that's on our screen right now, we find the terms of service under which we're conducting this call. I can assure you a portion of those terms of service will allow Zoom to sample information from this call in order to conduct business. And in fact, that's not unreasonable.
So in terms of your question, absolutely there's a way to tap out of this. You can still buy a cabin in the woods of Montana and live off the grid. But then you'll have to start thinking about the compromises you want to make. Would you like to be able to talk to your loved one? Well, if so, you'll need to get a contract with a cell phone carrier. I assure you that that agreement has a similar provision to the one that's sitting in our call.
I bring these things up because they're there. However doomsday you may be or your readers may be, these are the types of compromises we're very familiar with.
I can remember a time before these compromises were even available. A time when all of my friends were written on a piece of paper duct-taped to the wall with a phone hanging next to it. But even that phone came with a contract that had some of these provisions in it. And it itself was a miracle because my grandfather used to hitch mules to go see his friends. One of the really interesting things about this moment is that as afraid as some people might be of this technology, they're going to love some things and some of them are going to fall in love with it. And defend it to the death.
That sounds strange and threatening now, but will not sound strange and threatening to my son. We will learn to adapt to it. But some people will tap out.
I really love the question of what is the last technology you can handle in your life. The thing where you're sitting on your rocking chair and you're like, Nope, I'm out.
For my grandfather, it was the modern internet. Being an intelligent man, he realized what cookies would mean, that there really wasn't any real privacy in this cool thing he discovered. He turned off his Mac, his Bondi Blue iMac, and he never turned it on again. That was his moment.
I think we all have that interesting question coming for us. What is that last moment? Will large language models and the digital twin technologies that come after be that last moment for some people?
Matthew: There's a great scene in the movie Her, where the gentleman falls in love with the AI, in which he discloses to a friend of his that he is having a relationship with an AI. And it's fantastic because she's like, "Oh, that's interesting, I've been hearing about that." It reminded me of when online dating really became more mainstream. It was one of those conversations where something went from being a weird edge-case scenario to mainstream. And I think that's how this will be as well. It'll seem strange, and then all of a sudden it won't.
Ben: Some people will be fine. Even people who don't interact heavily with this technology may come to be fine. You know, online dating is a great example. There's plenty of people who met their significant other before it, myself included, who dislike it because we never had to live in that world. I had a conversation recently where we were discussing the fact that our parents who have lost their spouses are on online dating, and my mother met her husband there, and in that moment I converted. I don't need an online dating platform. I'm very happy in my marriage. But I think what is true is that if that ever changed and I was looking for a partner, I would be happy to engage in online dating. Why? Because my mother met a man she loves and has been happy with for a decade through it. These moments are strange and kind of uniquely human. And I think there's a lot of these moments coming with this new set of technology. As frightening as it is, I also think that means that the cybersecurity community and the psychology community and other people we haven't thought of yet need to work together to build something that we can have uniquely human moments with. That's the moment we live in right now. We have this thing and there are groups with the understanding to turn it into a good thing. That's what I'm excited about and that's, I think, why Matt and I are here.