MONT3 - Dispatch #340 · AI Medical Scribes Are Fabricating Patient Conditions: The Hidden Danger in Your Doctor's Office

Your doctor’s AI assistant is lying. Not intentionally, but the result is the same: fabricated medical conditions, nonexistent symptoms, and phantom diagnoses are being written into patient records across North America. Ontario’s recent audit of AI medical scribes has exposed a critical flaw that could reshape how we think about artificial intelligence in healthcare.

If you’ve visited a doctor recently, there’s a high probability an AI system was quietly listening to your conversation, transcribing every word, and generating your medical notes. What seemed like a technological breakthrough designed to free physicians from paperwork has become a potential patient safety crisis.

The Scope of the Problem

Ontario’s auditor general Shelley Spence released a damning report this week after examining 20 AI scribe platforms used across the province’s healthcare system. The findings were unambiguous: every single AI system showed inaccuracies during testing. These weren’t minor transcription errors or missed words—they were full-scale hallucinations where the AI fabricated medical information that never occurred during patient visits.

The report identified several critical failure modes:

Fabrication of symptoms that patients never reported
Incorrect medical information being inserted into records
Missing or incomplete documentation of actual patient concerns
Phantom diagnoses appearing in clinical notes

Currently, approximately 5,000 doctors across Ontario are using these AI scribe systems, despite their documented tendency to generate false medical information. The auditor general’s concerns were so significant that Spence personally asked her own physician to review her AI-generated transcript after appointments.

“⚠️ Ontario’s auditor general special report warns: #AI #medicalscribes were ‘not evaluated adequately,’ may present ‘fabricated information’ to medical professionals (e.g., hallucinations/fabrication; incorrect, missing or incomplete information).” — @MarilynHeineMD

A Historical Parallel: The Thalidomide Crisis

This situation bears striking similarities to the thalidomide crisis of the late 1950s and early 1960s. Then, a drug marketed as safe for pregnant women caused severe birth defects because adequate testing hadn’t been performed. The pharmaceutical industry’s rush to market a promising treatment without proper validation mirrors today’s deployment of AI medical scribes.

Just as thalidomide seemed like a medical breakthrough—a safe sedative for expectant mothers—AI scribes appeared to solve a genuine problem: physician burnout from excessive documentation. Both technologies promised to improve patient care, and both were deployed before their risks were fully understood.

The key difference is detection. Thalidomide’s harm was visible at birth, even though it took years to trace back to the drug. AI hallucinations in medical records are invisible and can lie dormant until they influence a future treatment decision.

The Technical Reality of AI Hallucinations

AI hallucinations occur when large language models generate confident-sounding but fabricated information. Unlike human scribes, whose errors typically involve forgetting or mishearing, AI systems can create entirely fictional medical scenarios with remarkable specificity.

The problem extends beyond Ontario and beyond scribes. OpenEvidence, an AI tool that summarizes medical evidence for physicians in the United States, faces similar scrutiny for generating incomplete answers and drawing overly strong conclusions from limited medical studies. NBC News reported that physicians have observed the system making definitive claims based on research with inadequate sample sizes.

“@ottosipe haha feels like the same thing that’s happening at all the big AI labs is happening at the subset of physician facing AI tools (evidence summaries, scribes, visit notes, and now Rx) which is great tbh for docs but let’s see who wins the distribution game i guess?” — @akiffpremjee

The Regulatory Response Gap

Ontario’s Minister of Public and Business Service Delivery and Procurement, Stephen Crawford, attempted damage control by clarifying that the hallucinations occurred during regulatory testing, not during actual patient visits. However, this distinction misses the critical point: if AI systems fabricate information during controlled testing conditions, what happens during the complexity of real medical encounters?

The minister’s statement reveals a fundamental misunderstanding of AI behavior. These systems don’t distinguish between testing and operational environments—they generate text based on their training, regardless of context. If anything, real-world medical conversations with their interruptions, ambient noise, and complex terminology create more opportunities for AI errors, not fewer.

What This Means for Patients

The implications extend far beyond transcription errors. Fabricated medical information in patient records can:

Influence future diagnoses when specialists review medical history
Affect insurance coverage if phantom conditions appear in records
Compromise treatment decisions based on fictional symptoms
Create legal liability for physicians who rely on inaccurate AI-generated notes

Unlike the gradual adoption curve typical of medical technologies, AI scribes have been deployed at massive scale with minimal validation. At just 20 AI-documented visits each, the roughly 5,000 Ontario physicians using these systems would already account for 100,000 patient encounters, so the number potentially affected by AI fabrications plausibly runs into the hundreds of thousands.

The Path Forward

The solution isn’t to abandon AI medical scribes entirely—their potential benefits for reducing physician documentation burden remain significant. Instead, the medical community needs mandatory validation protocols, real-time accuracy monitoring, and physician oversight requirements.

Healthcare organizations must implement:

Mandatory physician review of all AI-generated notes before finalization
Accuracy auditing systems that continuously monitor AI performance
Patient notification protocols when AI scribes are used during appointments
Liability frameworks that clearly define responsibility for AI-generated errors
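
To make the first two safeguards concrete, here is a minimal Python sketch of what an automated accuracy check combined with a physician sign-off gate might look like. Everything in it is an assumption made for illustration: the function names, the stop-word list, and the word-overlap rule are hypothetical and are not drawn from the auditor general’s report or from any real scribe product. Exact word matching is a crude stand-in for real clinical language processing. It will wrongly flag paraphrases and it cannot catch a fabricated negation, so it can only prioritize a physician’s review, never replace it.

```python
import re

# Hypothetical stop-word list for this sketch; real systems would need clinical NLP.
STOPWORDS = {
    "a", "an", "and", "for", "has", "in", "is", "no", "of", "on", "or",
    "patient", "reports", "the", "to", "was", "with",
}


def content_words(text: str) -> set[str]:
    """Return lowercase alphabetic tokens from text, minus stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def flag_unsupported(note_lines: list[str], transcript: str) -> list[tuple[str, list[str]]]:
    """Return (line, missing_words) for note lines using words never said in the visit."""
    heard = content_words(transcript)
    flagged = []
    for line in note_lines:
        missing = sorted(content_words(line) - heard)
        if missing:
            flagged.append((line, missing))
    return flagged


def can_finalize(flagged, physician_confirmed: set[str], signed_off: bool) -> bool:
    """Allow finalization only with sign-off and every flagged line explicitly confirmed."""
    return signed_off and all(line in physician_confirmed for line, _ in flagged)


transcript = "I've had a dry cough for three days and a mild headache. No fever."
note_lines = [
    "Patient reports dry cough for three days.",
    "Patient reports mild headache.",
    "Patient reports chest pain radiating to left arm.",  # never said in the visit
]

flagged = flag_unsupported(note_lines, transcript)
for line, missing in flagged:
    print(f"REVIEW: {line!r} (not in transcript: {', '.join(missing)})")

# The note stays locked until the physician has dealt with every flagged line.
print(can_finalize(flagged, physician_confirmed=set(), signed_off=True))
```

In this toy visit only the chest pain line is flagged, and the note cannot be finalized until the physician either confirms that line or removes it. The design choice worth noting is that sign-off alone is not enough: the gate also requires an explicit decision on each flagged statement.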

The thalidomide crisis led to revolutionary changes in drug approval processes, establishing the rigorous clinical trial system we rely on today. Similarly, the AI medical scribe crisis should trigger fundamental reforms in how we validate and deploy artificial intelligence in healthcare settings.

The question isn’t whether AI will transform medicine; it already has. The question is whether we’ll learn from these early failures to build systems that enhance rather than compromise patient care. Until then, patients have every right to ask their physicians: “What exactly did that AI write about me?”

Published in Stream · Dispatch #340 · May 17, 2026 · 5 min read.
Reply to paolo@mont3.ch - every email gets a human answer within 24h.