Google Scrambles to Fix Health AI Glitches After Investigation Exposes Persistent Errors
#AI

Google has removed AI Overviews for specific liver health queries following expert warnings about dangerously inaccurate information, but The Guardian found that minor query variations still trigger flawed AI-generated medical advice.

A Guardian investigation has revealed critical vulnerabilities in Google's AI Overviews feature when handling health-related queries. Despite removing AI-generated responses for two specific liver health questions after medical experts flagged dangerous inaccuracies, researchers found that slight rephrasing of the same queries continues to produce medically unsound advice.

The investigation tested queries related to liver blood tests – medically complex diagnostics requiring professional interpretation. In initial tests, Google's AI Overviews provided blatantly false information, including:

  1. Misrepresenting normal/abnormal ranges for liver enzyme tests
  2. Suggesting diagnostic conclusions contradicted by established medical guidelines
  3. Recommending inappropriate treatments based on misinterpreted results

Medical professionals reviewing the outputs warned that such errors could lead to dangerous patient decisions, including delaying necessary treatment or pursuing unnecessary interventions. "These aren't subtle inaccuracies," said Dr. Arun Gupta, a hepatologist consulted for the investigation. "They're fundamental misrepresentations of basic medical knowledge that no human clinician would make."

Google removed AI Overviews for the exact phrases flagged by researchers. However, The Guardian team discovered that closely related or slightly reworded queries (e.g., "ALT test interpretation" instead of "AST levels meaning") continued to trigger the feature, often reproducing similar inaccuracies. This demonstrates how brittle exact-phrase content filtering becomes when confronted with the varied, nuanced terminology of medical search.
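Google has not disclosed how these removals are implemented, but the behaviour described is consistent with an exact-phrase blocklist. The sketch below is a purely illustrative toy in Python; the function names, query strings and blocklist are hypothetical, not Google's actual system. It shows why exact matching misses even trivially reworded queries:

```python
# Illustrative only: a toy exact-phrase blocklist, not Google's actual system.
BLOCKED_QUERIES = {
    "ast levels meaning",            # hypothetical flagged phrase
    "what do high ast levels mean",  # hypothetical flagged phrase
}

def normalize(query: str) -> str:
    """Lowercase and collapse whitespace before matching."""
    return " ".join(query.lower().split())

def overview_suppressed(query: str) -> bool:
    """Suppress the AI Overview only if the query matches a blocked phrase exactly."""
    return normalize(query) in BLOCKED_QUERIES

# The flagged phrasing is blocked...
print(overview_suppressed("AST levels meaning"))       # True
# ...but rewordings and related queries slip straight through.
print(overview_suppressed("meaning of AST levels"))    # False
print(overview_suppressed("ALT test interpretation"))  # False
```

Any query not on the list sails through unchanged, which is exactly the behaviour the investigation observed.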

The findings raise serious concerns about deploying generative AI in high-stakes domains like healthcare:

  • Black Box Diagnostics: Unlike traditional featured snippets, AI Overviews generate original responses rather than quoting sources, making verification impossible
  • False Confidence: The authoritative presentation encourages users to accept the output as vetted medical advice
  • Patchwork Solutions: Keyword-based filtering fails against natural language variations common in health searches

Google's AI Principles explicitly state that the company will avoid creating technologies likely to cause harm. Yet this incident shows how difficult implementation becomes in complex domains. The company now faces mounting pressure to take one or more of the following steps:

  1. Implement robust medical validation layers for health queries
  2. Exclude entire categories of medical information from AI Overviews
  3. Develop transparent sourcing for all health-related outputs

As healthcare increasingly intersects with AI, this case underscores the non-negotiable need for accuracy in medical information. "When algorithms provide health advice, errors aren't bugs – they're potential malpractice," noted bioethicist Dr. Karen Musalo. "Tech companies must either solve the reliability problem or stay out of the exam room altogether."

The incident occurs amid Google's broader push into health AI, including partnerships with healthcare systems and new clinical tools. How the company addresses these foundational accuracy issues will significantly impact trust in its ambitious healthcare initiatives.
