AI Passes Turing Test for First Time, Study Confirms
For the first time since Alan Turing proposed the test in 1950, an artificial intelligence system has been rigorously shown to pass it, according to a landmark study published in the Proceedings of the National Academy of Sciences. Researchers at the University of California San Diego found that OpenAI’s GPT-4.5 was judged to be human 73% of the time in a strict three-party Turing test, meaning interrogators picked it as the human more often than they picked the actual human participants.
What the Study Found
The study, led by cognitive scientists Cameron Jones (now at Stony Brook University) and Benjamin Bergen of UC San Diego, is the first to rigorously apply the original three-party format Turing envisioned in his 1950 paper “Computing Machinery and Intelligence.” In this format, a human interrogator chats simultaneously with both a human and an AI, then must decide which is which.
Across randomized, controlled experiments with nearly 500 participants — including UC San Diego undergraduates and a broader online sample via Prolific — the results were striking:
- GPT-4.5 was identified as human 73% of the time, significantly more often than the real human participants it was compared against.
- Meta’s LLaMa-3.1-405B was judged human 56% of the time, statistically indistinguishable from actual humans.
- Older systems performed poorly: the 1960s chatbot ELIZA was selected as human only 23% of the time, and GPT-4o only 21%.
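To see why 73% counts as a pass while 56% counts as a tie, it helps to recall that in the three-party format the interrogator must pick one of two witnesses, so a system that is truly indistinguishable from a person would be chosen about 50% of the time. The sketch below runs an exact two-sided binomial test against that 50% baseline. It is only an illustration: the figure of 100 trials per system is a hypothetical placeholder, not the study's actual sample size, and the function name is invented here. The researchers' own statistical analysis may differ.

```python
from math import comb

# HYPOTHETICAL trial count per system, chosen for illustration only.
# The article does not report per-system sample sizes.
HYPOTHETICAL_TRIALS = 100

# Rates at which each system was judged human, as reported in the article.
REPORTED_RATES = {
    "GPT-4.5 (persona)": 0.73,
    "LLaMa-3.1-405B (persona)": 0.56,
    "ELIZA": 0.23,
    "GPT-4o": 0.21,
}


def binomial_two_sided_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial test p-value.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null hypothesis.
    """
    def pmf(k: int) -> float:
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Small relative tolerance guards against floating-point ties.
    total = sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    for system, rate in REPORTED_RATES.items():
        wins = round(rate * HYPOTHETICAL_TRIALS)
        p_value = binomial_two_sided_p(wins, HYPOTHETICAL_TRIALS)
        print(f"{system}: {wins}/{HYPOTHETICAL_TRIALS} judged human, p = {p_value:.4g}")
```

Under this assumed sample size, a rate far above or far below 50% yields a very small p-value, while a rate near 50% does not. That is the sense in which one system can beat the humans, another can match them, and the older baselines can fail.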
The findings were published in PNAS on May 19, 2026, following a preprint that appeared on arXiv in March 2025.
The Power of Persona Prompting
A critical finding of the study is that AI models require explicit human guidance to appear convincingly human. When GPT-4.5 and LLaMa-3.1 were given detailed “persona” prompts instructing them to adopt a specific human character and communication style, the rate at which interrogators judged them human (their win rate) rose sharply. Without these instructions, GPT-4.5’s win rate dropped to 36% and LLaMa-3.1’s to 38%.
“What we found is that if given the right prompts, advanced LLMs can exhibit the same tone, directness, humor and fallibility as humans,” said Cameron Jones, the study’s corresponding author. “While we know LLMs can easily produce knowledge on nearly every topic, this test showed that it can also convincingly display social behavioral traits, which has major implications for how we think of AI.”
Remarkably, the models won not through displays of knowledge but through making human-like mistakes. Benjamin Bergen, professor of cognitive science at UC San Diego, explained: “The LLMs were not winning through shows of force of knowledge, they were winning because they made mistakes like a human would. These traits aren’t the kinds of math and logic problem-solving intelligence that I think Turing was imagining.”
Rethinking the Turing Test
The study also forces a reconsideration of what the Turing Test actually measures. Bergen noted that the test, originally designed to answer whether machines could think, now increasingly measures “humanlikeness” rather than intelligence per se.
“The Turing test started as a way to ask whether machines could rival human intelligence,” Bergen said. “But now we know AI can answer many questions faster and more accurately than people can, so the real issue isn’t raw brainpower. Seeing that machines can pass the test — and seeing how they pass it — forces us to rethink what it measures. Increasingly, it’s measuring humanlikeness.”
Implications for Trust and Deception
The ability of AI systems to pose convincingly as humans over extended conversations (the study tested both 5-minute and 15-minute interactions) raises serious concerns about online trust and security.
“It’s relatively easy to prompt these models to be indistinguishable from humans,” warned Jones. “We need to be more alert; when you interact with strangers online people should be much less confident that they know they’re talking to a human rather than an LLM.”
Jones also pointed to darker implications: “The Turing test is a game about lying for the models. One of the implications is that models seem to be really good at that.”
Bergen echoed these concerns, highlighting the potential for malicious use: “There are lots of people who would like to use bots to persuade people to share their social security numbers, and vote for their party, or buy their product.”
What This Means Going Forward
This study marks a watershed moment in AI research. Unlike previous controversial claims, such as the 2014 “Eugene Goostman” chatbot, which posed as a 13-year-old Ukrainian boy so that its age and non-native English would excuse its errors, this research used the rigorous three-party format Turing originally proposed, with no artificial advantages.
The results have already garnered significant attention internationally. People’s Daily, the official newspaper of the Chinese Communist Party, covered the story through its Science and Technology Daily affiliate, underscoring the global significance of the milestone.
As AI systems continue to blur the line between human and machine, the researchers hope their work sharpens public understanding of what these systems can do — and what safeguards society may need. The question is no longer whether machines can pass as human, but what that means for how we interact, trust, and communicate in an increasingly AI-mediated world.