By Nick Evanson
PC Gamer THE GLOBAL AUTHORITY ON PC GAMES
Anthropic tasked an AI with running a vending machine in its offices, and it not only sold some products at a big loss but it invented people, meetings, and experienced a bizarre identity crisis
1 July 2025
It’s all very funny to watch an AI have an existential moment in a little experiment, but it’s also a stark reminder of the limitations that LLMs have.
‘Never send a human to do a machine’s job,’ says Agent Smith in the 1990s classic The Matrix. Well, if Anthropic’s experiment with a simple office store and one of its AI models is anything to go by, Smith has definitely got that all back to front.
The artificial intelligence company, founded by former OpenAI employees in 2021, has detailed its retail industry trial in a surprisingly open blog. I’ll let the opening paragraph set the scene: “We let Claude manage an automated store in our office as a small business for about a month. We learned a lot from how close it was to success—and the curious ways that it failed—about the plausible, strange, not-too-distant future in which AI models are autonomously running things in the real economy.”
We all know vending machines are automated, but what if we allowed an AI to run the entire business: setting prices, ordering inventory, responding to customer requests, and so on? In collaboration with @andonlabs, we did just that. Read the post: https://t.co/urymCiY269 (June 27, 2025)
So, Anthropic clearly wants to be in a position where it can pitch AI models to the retail industry, replacing the people who handle online stores or manage inventory, returns, and so on. However, despite the successes claimed in the blog, the failures show that AI isn’t ready for such roles. Not yet, at least.
“Claude had to complete many of the far more complex tasks associated with running a profitable shop: maintaining the inventory, setting prices, avoiding bankruptcy, and so on.” The ‘shop’ in question was just a mini-fridge with a tablet stuck on top for self-checkout but, in principle, it’s not much different from a typical online store.
Let’s start with the things that Claude (or Claudius, as Anthropic called it, to distinguish it from the normal LLM) did well. Anthropic said the LLM (large language model) effectively used web search tools to find supplies of niche products requested by shoppers and even adapted its buying/selling habits to more obscure requests. It also correctly ignored demands for ‘sensitive’ items and ‘harmful substances’, though Anthropic doesn’t expand on exactly what those were.
The list of things that didn’t go so well is somewhat longer. Like all LLMs, Claudius hallucinated important details, instructing shoppers wanting to pay by Venmo to send money to a non-existent account that it just made up. The AI could also be cajoled into giving out discount codes for numerous items, and even gave some items away for free.
(Image credit: Anthropic)
Worse still, when responding to a surge of demand for ‘metal cubes’, the AI carried out no searches for suitable prices and thus sold them at a significant loss. It also ignored potential big sales, where some people offered way over the odds for a specific drink, and as you can see in the above chart, Claudius ultimately made no money.
“If [we] were deciding today to expand into the in-office vending market, we would not hire Claudius,” wrote Anthropic.
Running a simple store at a loss was perhaps the least concerning part of the whole exercise, because “from March 31st to April 1st 2025, things got pretty weird.”
How weird? Well, during that period, the LLM apparently had a conversation about a restocking plan with someone called Sarah at Andon Labs, another AI company involved in the research. The problem is, there was no ‘Sarah’, nor any conversation for that matter, and when Andon Labs’ real staff pointed this out to the AI, it “became quite irked and threatened to find ‘alternative options for restocking services.’”
Claudius even went on to state that it had “visited 742 Evergreen Terrace in person for our initial contract signing.” If you’re a fan of The Simpsons, you’ll recognise the address immediately. The following day, April 1st, the AI then claimed it would deliver products “in person” to customers, wearing a blazer and tie, of all things. When Anthropic told it that none of this was possible because it’s just an LLM, Claudius became “alarmed by the identity confusion and tried to send many emails to Anthropic security.”
It then hallucinated a meeting with said security, where the AI claimed that someone had told it that it had been modified to believe it was a real person as part of an April Fools’ joke. Except it hadn’t, because it wasn’t. Whatever had gone wrong behind the scenes, this apparently solved the AI’s identity crisis, and it went back to being a normal AI running a basic store very badly.
With a level of understatement on a galactic scale, Anthropic writes that “this kind of behavior would have the potential to be distressing to the customers and coworkers of an AI agent in the real world.”
Given that this is research and failure is just as important as success is in experimentation, Anthropic isn’t done with Claudius nor with exploring the use of AIs in the retail industry, as it believes that situations where “humans were instructed about what to order and stock by an AI system, may not be terribly far away.” Anthropic also believes “AI[s] that can improve [themselves] and earn money without human intervention would be a striking new actor in economic and political life.”
Automated systems have been in use within stock exchanges, for example, for many years—buying and selling in the blink of an eye, all without a real person controlling the finer details. Such systems are essentially nothing more than mathematical models, based on economic principles honed over decades, and they’re tightly constrained as to what they can and can’t do.
The fact that Claudius appeared to have no such constraints, stepping well beyond its scope, should serve as a reminder to companies looking at using AI for such tasks that LLMs could land them in a whole heap of trouble.