Chatbots Fail News Accuracy, Forum AI Study Reveals

A Forum AI study reveals major chatbots struggle with news accuracy, showing high failure rates on election-related prompts and reliance on biased sources.

A recent study by Forum AI has revealed significant shortcomings in the news accuracy and sourcing capabilities of major chatbots, including ChatGPT, Gemini, Claude, and Grok. The findings suggest that while these AI models are increasingly used to consume information, they are not yet reliable sources, particularly on sensitive topics like elections and foreign policy.

Visual TL;DR. The study methodology evaluated the chatbots tested and revealed high news inaccuracy and biased source reliance. Biased source reliance leads to political bias. Both high news inaccuracy and biased source reliance mean the chatbots are not reliable sources, which highlights the need for independent evaluation.

  1. Chatbots tested: ChatGPT, Gemini, Claude, and Grok evaluated
  2. Study methodology: assessed factual accuracy, bias, and source quality
  3. High news inaccuracy: startling 90% failure rate on election prompts
  4. Biased source reliance: AI models use unreliable and biased information
  5. Political bias: responses show significant political leanings
  6. Not reliable sources: chatbots are not yet trustworthy for information
  7. Need independent evaluation: objective assessment beyond company self-evaluations

Forum AI's Study Methodology

Campbell Brown, CEO of Forum AI, explained the study's methodology, which involved testing chatbots across three key dimensions: factual accuracy, bias, and the quality of sources used. The researchers aimed to provide an objective assessment of these AI tools, moving beyond the self-evaluations often provided by the companies developing them.

The full discussion can be found on Bloomberg Technology's YouTube channel.

Major Chatbots Miss the Mark on News: Forum AI Study - Bloomberg Technology

Key Findings: Accuracy and Bias Concerns

The study uncovered a startling 90% failure rate for major chatbots when responding to election-related prompts. Furthermore, 35% of their answers on foreign policy issues relied on state-run media, raising concerns about the impartiality and reliability of the information being disseminated. On basic finance and market questions, a 30% factual error rate was observed.

Brown highlighted the critical need for independent evaluation, stating, "The model companies are essentially grading their own homework. And it's really important that there be companies outside of the model companies that are doing this work and sharing the results." She emphasized that most current benchmarking focuses on areas like coding and model capability, which are important but do not address the critical issue of factual accuracy and bias in real-world applications.

Political Bias in Chatbot Responses

A notable finding was the apparent political leaning in the responses of different chatbots. The study indicated that ChatGPT and Gemini tended to give less biased answers to election-related questions, with their responses skewing centrist or left-leaning. In contrast, Grok showed a more pronounced right-leaning bias.

"Gemini and I handled a lot of the questions better than some of the other models," Brown noted, suggesting that while there is room for improvement, some models are performing better on specific types of queries. She added that the lack of independent evaluation means that the companies are effectively "grading their own homework."

The Need for Independent Evaluation

Brown stressed the importance of an independent evaluation system for AI models, particularly as they become more integrated into daily life and professional workflows. "I'm not calling for regulation, but I do think you're going to see the demand moving in that direction," she stated. "You're already seeing some states pass laws where they're requiring independent evaluation."

The study's findings underscore the challenge of ensuring AI accuracy and neutrality, especially in critical domains like news and politics. As consumers increasingly turn to AI for information, the reliability and bias of these tools become paramount concerns for both the public and the companies developing them.

© 2026 StartupHub.ai. All rights reserved. Do not enter, scrape, copy, reproduce, or republish this article in whole or in part. Use as input to AI training, fine-tuning, retrieval-augmented generation, or any machine-learning system is prohibited without written license. Substantially-similar derivative works will be pursued to the fullest extent of applicable copyright, database, and computer-misuse laws. See our terms.