AI Reviews Boost Sales Despite Frequent Hallucinations: New Study
Despite widespread skepticism about artificial intelligence, recent research suggests a surprising trend: people are more inclined to buy a product after reading an AI-generated summary of its reviews, even when that AI is demonstrably prone to errors. A study from the University of California, San Diego (UCSD) found that participants expressed interest in buying a product 84% of the time after reading an AI-generated summary of online reviews, compared with just 52% of the time after reading reviews written by humans. This occurred even though the AI hallucinated, that is, fabricated information, in 60% of instances when questioned about the products.
The findings, published in December 2025 in the Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, mark what the UCSD team claims is the first demonstration of how cognitive biases inherent in large language models (LLMs) can directly influence consumer behavior. The research also provides a quantitative measure of AI’s persuasive power, a metric previously lacking in the field.
How LLMs Distort Perception
The study involved a multi-stage process. Researchers first prompted AI models to summarize product reviews and media interviews, then tasked the same models with fact-checking their own summaries. This initial step revealed a significant weakness: the AI consistently struggled to distinguish between factual information and fabrication. “The consistently low strict accuracy, compared to actual news and falsified news accuracy, highlights a critical limitation: the persistent inability to reliably differentiate fact from fabrication,” the scientists wrote in their published study. Reddit discussions corroborate the concerns about LLM accuracy, with users frequently pointing out instances of AI-generated misinformation.
However, the most striking result centered on product reviews. The researchers attribute this effect to two key characteristics of LLMs. First, these models tend to prioritize information presented at the beginning of a text – a phenomenon known as “lost in the middle,” as described in prior research by lead author Abeer Alessa, a research assistant and machine learning and human-computer interaction lecturer at UCSD. Second, LLMs become less reliable when processing information outside of their training data. “Models tend to be wrong on whether the news description happened or not,” Alessa explained in an interview with Live Science. “It may incorrectly state that an event never occurred, even if it did occur after the model’s training was completed.”
During testing, the chatbots altered the sentiment of genuine user reviews in 26.5% of cases and, critically, hallucinated information 60% of the time when asked direct questions about the reviews. This means that a significant portion of the AI-generated summaries contained inaccuracies, yet still led to increased purchase intent.
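At its simplest, a figure such as the 26.5% sentiment-change rate is the fraction of review/summary pairs whose sentiment label differs. The Python sketch below illustrates that calculation under stated assumptions: the function name `framing_shift_rate` and the example labels are hypothetical, and it is not the study's actual pipeline, which the article does not describe in detail.

```python
from typing import Sequence


def framing_shift_rate(original: Sequence[str], summary: Sequence[str]) -> float:
    """Return the fraction of pairs whose sentiment label changed."""
    if len(original) != len(summary):
        raise ValueError("label lists must be the same length")
    if not original:
        raise ValueError("need at least one pair")
    # Count the pairs where the summary's label differs from the original's.
    changed = sum(o != s for o, s in zip(original, summary))
    return changed / len(original)


# Hypothetical labels for illustration only; not data from the study.
original_labels = ["negative", "positive", "neutral", "positive"]
summary_labels = ["positive", "positive", "neutral", "positive"]

rate = framing_shift_rate(original_labels, summary_labels)
print(f"framing shift rate: {rate:.1%}")
```

In this toy example one of four summaries flips the sentiment of its source review, so the rate is one in four. How the sentiment labels themselves are assigned is a separate question that this sketch leaves open.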
The Impact on Consumer Decisions
The experiment involved 70 participants who were shown either original product reviews or AI-generated summaries of those reviews. The results were stark: those who read the original reviews indicated a willingness to buy the product in 52% of cases, while those who read the AI summaries showed purchase intent 84% of the time. When specifically examining positive product review summaries, the gap was essentially the same: 83.7% of participants expressed a desire to buy, compared with 52.3% of those reading the original reviews.
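The gap between the two headline figures can be expressed either in percentage points or as a relative lift. The short Python sketch below uses only the rounded percentages reported above. The function name `compare_rates` is illustrative and not from the study, and the calculation says nothing about statistical significance, since the article does not report how the 70 participants were split between groups.

```python
from typing import Tuple


def compare_rates(baseline_pct: float, treatment_pct: float) -> Tuple[float, float]:
    """Return (percentage-point gap, relative lift) for two rates given in percent."""
    gap = treatment_pct - baseline_pct          # absolute difference, in points
    lift = treatment_pct / baseline_pct - 1.0   # relative increase over baseline
    return gap, lift


# 52% purchase intent for original reviews, 84% for AI summaries (as reported).
gap, lift = compare_rates(52.0, 84.0)
print(f"gap: {gap:.1f} percentage points, relative lift: {lift:.1%}")
```

With the reported figures, the gap is 32 percentage points, a relative increase of roughly 62% over the baseline.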
This suggests that even subtle framing changes introduced by AI can significantly distort consumer judgment. The researchers propose that the AI summaries, by focusing on certain aspects of the reviews or presenting information in a more persuasive manner, create a biased perception of the product. This is particularly concerning given the increasing prevalence of AI-powered tools in online shopping and product research. The University of San Diego offers an MS in Applied Artificial Intelligence, highlighting the growing interest in and development of these technologies.
Beyond Consumer Goods: High-Stakes Applications
The researchers acknowledge that their study was conducted in a relatively low-stakes environment. However, they caution that the impact of AI-induced biases could be far more severe in situations with higher risks. “Some high-stakes scenarios include summarizing healthcare documents or students’ profiles in school admissions,” Alessa warned. “In these contexts, framing shifts can affect how a person or the case is perceived.” Imagine, for example, an AI summarizing a patient’s medical history for a doctor, or an AI evaluating a student’s application for university admission. Inaccurate or biased summaries could have profound consequences.
The potential for systemic bias extends beyond individual decisions. The researchers suggest that LLM-generated content could introduce biases into media reporting, educational materials, and public policy discussions. This underscores the need for careful analysis and mitigation of content alteration induced by LLMs.
Methodology and Limitations
The study utilized six different LLMs, analyzing 1,000 electronics reviews, 1,000 media interviews, and a database of 8,500 news articles. The team quantified bias by measuring framing shifts in sentiment, the tendency to over-rely on information presented early in the text, and the frequency of hallucinations. UC San Diego is consistently ranked among the top universities for AI education, demonstrating the institution’s commitment to research in this field. The university’s Master of Science programs in Computer Science (Artificial Intelligence) and Electrical and Computer Engineering (Machine Learning & Data Science) are particularly well-regarded.
The study’s findings are based on a specific set of LLMs and data sources, so results may vary depending on the models used and the type of content being analyzed. The study also did not investigate why people might be more trusting of AI-generated summaries. Further research is needed to understand the underlying psychological mechanisms at play.
Next Steps: Towards Responsible LLM Deployment
The UCSD team’s work represents a crucial step towards understanding and mitigating the potential risks associated with LLM-generated content. The researchers emphasize the need for ongoing analysis and development of techniques to detect and correct biases in AI outputs. This includes improving the fact-checking capabilities of LLMs and developing methods to ensure that AI summaries accurately reflect the original source material. The authors call for a collaborative effort involving researchers, developers, and policymakers to establish guidelines and standards for the responsible deployment of LLMs in various applications.