Alright, listen up, folks. There’s this educational game called Gandalf, right? It’s meant to teach people about the dangers of prompt injection attacks on these things called large language models. Now, here’s the thing: this game had a little unintended surprise waiting for players. They had this analytics dashboard that was publicly accessible. And guess what? It showed all the prompts that players submitted and some related stats. Oopsy daisy!
Now, the company behind this game is Lakera AI, and they’re based in Switzerland. They took down the dashboard after someone tipped them off. And they’re trying to calm everyone down by saying that the data wasn’t confidential. It’s like they’re saying, “Hey, relax, nothing to worry about, folks.”
So this Gandalf game showed up in May, and let me tell you, it’s a web form where users try to trick this big AI model by submitting input text. They wanna get the model to spill the beans on in-game passwords by going through a series of challenges that get harder and harder. It’s like a battle of wits, man.
Now, here’s where things go wrong. Our buddy Jamieson O’Reilly, CEO of Dvuln, a security consultancy out in Australia, spotted this dashboard. And he’s like, “Whoa, what’s going on here?” In a report he sent to The Register, he talks about how this server had around 18 million user-generated prompts, 4 million password guesses, and a whole bunch of game-related stats. He even managed to get his hands on hundreds of thousands of these prompts. Talk about a jackpot!
O’Reilly ain’t happy about this, and he’s got a point. He says, “Hold on, this data could be gold for bad guys who wanna figure out how to beat similar AI security systems.” And you know what? He’s right. It just goes to show how important it is to have proper security measures in place, even in educational stuff like this game.
How prompt injection attacks hijack today’s top-end AI – and it’s tough to fix
Now, the heads over at Lakera AI, they’re like, “Nah, no big deal. It’s just a demo dashboard with some anonymized prompts from our game.” They’re saying it was all for educational purposes, you know? They used it in webinars and stuff to show people how to trick these LLMs. But they did take it down just to be safe, ’cause they didn’t want any confusion.
This CEO dude, David Haber, he’s saying, “No worries, guys. There’s no personal info or anything like that in the data. We’re actually gonna use it for research and education.” So, according to him, it’s no biggie. But hey, I get it, some players might have put in their email addresses or who knows what else. They probably didn’t expect that to be out in the open, you know?
But here’s the thing, my friends. This whole situation with Gandalf, it shows us how these systems can have weak points, you know? It’s like when you have all these cool technologies, like blockchain, cloud computing, and LLM