A patient arrives for a visit, holding out a phone screen as evidence: an artificial intelligence (AI) tool has labeled their symptoms "high risk" and recommended a next step. The language is polished, the recommendation is specific, and the patient's anxiety is real. The tool isn't "making a diagnosis" -- technically, it is "providing information." But it has already done what these tools inevitably do: it has shaped the clinical decision before a clinician ever enters the room.
That is the context in which the FDA's new posture toward digital health should be evaluated. At the Consumer Electronics Show, FDA Commissioner Marty Makary, MD, MPH, framed the agency's priorities in Silicon Valley terms -- speed, investors, and rapid iteration -- and announced "sweeping changes" to FDA's approach to clinical decision support software and certain wearables.
Some deregulation is reasonable. Not every step counter or meditation app needs the same scrutiny as a diagnostic test. FDA resources are finite, and prioritizing oversight toward higher-risk tools is defensible.
The question is where the agency now draws the boundary -- and whether the new guidance makes it easier for products that behave like medicine to avoid being treated like medicine.
"General Wellness" Is Not the Same As "Not Medical"
The updated FDA guidance on "General Wellness" reiterates an enforcement approach that many in digital health have wanted for years: the Center for Devices and Radiological Health "does not intend to examine" low-risk general wellness products to determine whether they are devices or whether they comply with pre-market and post-market requirements.
What is new -- and more consequential -- is how far "general wellness" can stretch into clinically familiar territory. The guidance says FDA may consider products that use non-invasive sensing (including optical sensing) to estimate or output physiologic parameters such as blood pressure, oxygen saturation, blood glucose, and heart rate variability to be general wellness products. The condition is that the outputs are intended solely for wellness uses and meet guardrails: non-invasive, no disease intent, not a substitute for an FDA-authorized device, not prompting clinical action, and not mimicking clinical values unless validated.
In one illustrative example, a wearable that estimates pulse rate and blood pressure via photoplethysmography can still be treated as low-risk general wellness -- provided the values are validated and the claims don't imply medical use.
Those caveats matter. But they also collide with how medicine actually works. Numbers like blood pressure and glucose do not stay in a "wellness" lane. Patients bring them to clinic visits and portal messages. Health systems increasingly ingest consumer-device data into electronic records and remote monitoring workflows. A tool can be labeled "not for medical use" and still become an input to medical decision-making, simply because the output looks and feels clinical.
The FDA's boundary-setting reflects this tension. The same guidance emphasizes that products are not general wellness when intended to measure or report physiologic values for medical or clinical purposes, including screening, diagnosis, monitoring, alerting, or management of disease. In other words, context and use determine risk -- not marketing language alone.
So, when FDA says it won't "examine" low-risk wellness products, we should ask: low-risk for whom, and in what context? A product can be non-invasive and still be harmful if it systematically misleads -- especially when it presents clinical-looking values with the authority of measurement.
A "Single Recommendation" Can be the Most Coercive Recommendation
The FDA also updated its guidance on Clinical Decision Support (CDS) software. The guidance clarifies how some CDS functions can be "non-device" under the 21st Century Cures Act criteria. But it also introduces a critical policy shift: when only one option is clinically appropriate, and the CDS otherwise meets the statutory criteria, the FDA "intends to exercise enforcement discretion" -- meaning it does not intend to enforce Federal Food, Drug, and Cosmetic Act requirements for those functions.
That matters because a single recommendation is not a neutral nudge. It's a directive dressed up as advice.
In the real world, clinicians are busy, inboxes are full, and decisions are made under time pressure. In that setting, software tools that output one "best" choice can quietly reshape clinical judgment. This is a known pattern in decision support: automation bias, the tendency to accept a machine's recommendation even when it is wrong or when contradictory information is available.
Now layer in modern generative AI. These systems can be fluent and persuasive even when they are incorrect, and they can fail in ways that are difficult for end-users to detect in the moment. This could open the door to more physicians defaulting to AI software recommendations -- even when they're flawed.
A regulatory posture that expands "don't enforce" categories for clinical recommendations should therefore be paired with stronger expectations for transparency, monitoring, and evidence -- not fewer.
Withdrawing Clinical Evaluation Guidance Sends the Wrong Signal
On the same day these updates were announced, FDA withdrew its "Software as a Medical Device (SaMD): Clinical Evaluation" guidance from the list of documents representing the agency's current thinking.
Guidance is not law, and frameworks should evolve. But the symbolism matters. At a moment when AI is entering clinical workflows faster than evaluation standards can keep up, withdrawing a shared clinical evaluation framework risks signaling that speed is the priority -- even when credibility and trust are the true bottlenecks to adoption.
A Better Trade: Faster Evidence, Stronger Guardrails
The standard framing -- innovation versus safety -- is a false dichotomy. In medicine, safety is what makes innovation adoptable. If patients and clinicians cannot trust these tools, adoption will stall and backlash will grow.
A constructive alternative is available:
- Regulate by context-of-use, not marketing labels. If a consumer device outputs clinically meaningful physiologic values, FDA should require clear, standardized disclosure about validation, intended use, and limitations -- and treat predictable clinical use as a risk factor even when products are branded as "wellness."
- Make enforcement discretion a structured fast lane, not a dark lane. When the agency chooses not to enforce pre-market requirements for a category, it should still expect basic transparency: public-facing summaries of what the tool does, what data it was evaluated on, and what it should not be used for, paired with meaningful post-market reporting.
- Accelerate evidence generation instead of simply loosening oversight. The FDA has already described models that pair access with structured learning in real-life settings, including the TEMPO pilot aimed at understanding how digital health technologies perform in practice while safeguarding patients. In this spirit, faster pathways should come with clearer expectations that real-world performance will be measured -- and that products that drift or fail will be corrected quickly.
The goal should not be "less FDA." The goal should be better evidence at the pace of software because AI-generated recommendations and wearable-derived numbers are already showing up in clinic visits, shaping decisions, and changing what patients expect from care. When these tools are wrong, the cost is not absorbed by investors or product teams. It is absorbed by patients, clinicians, and a health system that cannot afford another wave of technology it has to unlearn.
Henry Bair, MD, and Mak Djulbegovic, MD, MSc, are resident physicians at Wills Eye Hospital whose research focuses on the intersection of artificial intelligence and clinical decision-making.