
Thursday, March 26, 2026

GenAI-CDS and Clinical Reasoning Development in Pre-Clinical Medical Education


Description

This project houses a research program investigating how generative AI clinical decision support (GenAI-CDS) affects clinical reasoning development in pre-clinical medical students. The program is grounded in Cognitive Load Theory (CLT), the CHART+M behavioral assessment framework, and the Siden input typology extended to pre-clinical learners. The central theoretical contribution is the "never-skilling" hypothesis: that AI access during formative training may prevent foundational schema formation entirely, a mechanism distinct from the deskilling documented in experienced clinicians.

The program spans three domains. First, a randomized simulation study testing whether GenAI-CDS access during acute care simulation supports or undermines transfer performance on AI-absent key-feature examinations (funded by NBOME). Second, SAFE-CRS (Scaffolded AI Framework for Evidence-Based Clinical Reasoning in Simulation), a multi-site implementation targeting academic medical partnerships in low- and middle-income countries. Third, dissemination through workshops, peer-reviewed publications, and public scholarship translating findings for medical educators and policymakers.

The key-feature examination and licensure board examinations (COMLEX-USA, COMAT) function as structural equalizers in this program. Both are AI-absent, standardized, and consequential. They operationalize the never-skilling hypothesis at the individual study level and at population scale, respectively.

Overview

Research questions or hypotheses

Research Question: Does generative AI clinical decision support (GenAI-CDS) access during formative simulation-based learning support or undermine the independent clinical reasoning that licensure examinations measure? Specifically, does unrestricted AI access during an acute stroke simulation impair schema formation in pre-clinical osteopathic medical students, as measured by transfer performance on an AI-absent key-feature examination?

Hypotheses:

H1 (Task Performance): CDS-access teams will demonstrate faster time-to-critical-action and equivalent or better diagnostic accuracy during simulation compared to CDS-absent teams. Rationale: GenAI-CDS offloads information search, reducing extraneous cognitive load during the task itself.

H2 (Cognitive Load): CDS-access teams will report lower perceived workload on NASA-TLX mental demand and temporal demand subscales than CDS-absent teams. This finding alone is ambiguous. Lower workload may reflect reduced extraneous load (beneficial) or reduced germane engagement (harmful). The Paas mental effort scale administered at three intraprocedural timepoints (post-recognition, post-imaging, post-thrombolytic decision) provides temporal resolution to distinguish these interpretations.

H3 (Transfer, PRIMARY): On the AI-absent key-feature examination administered one week post-simulation, CDS-absent and CDS-delayed teams will score equivalently or higher than CDS-access teams on items requiring discriminative clinical reasoning. This is the primary hypothesis. It tests the never-skilling prediction derived from Cognitive Load Theory: that unrestricted AI access during formative learning bypasses germane processing, impairing schema formation measurable as reduced transfer performance when AI is removed. A non-significant result will be evaluated using TOST equivalence testing to distinguish true equivalence from insufficient power.
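The TOST procedure named in H3 can be sketched as two one-sided independent-samples t-tests against a prespecified equivalence interval. This is a minimal illustration only: the arm labels, sample values, and the symmetric equivalence bounds below are hypothetical, not values from the registered plan.

```python
# Two One-Sided Tests (TOST) for equivalence of key-feature exam scores
# between two study arms. Illustrative sketch; bounds and data are
# hypothetical assumptions, not the registered analysis parameters.
import numpy as np
from scipy import stats

def tost_ind(x1, x2, low, upp):
    """Independent-samples TOST with pooled variance.

    Returns the larger of the two one-sided p-values; equivalence is
    declared when that value falls below alpha (both tests reject).
    """
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    n1, n2 = len(x1), len(x2)
    diff = x1.mean() - x2.mean()
    # Pooled variance and standard error of the mean difference
    sp2 = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
    se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
    df = n1 + n2 - 2
    # Test 1 -- H0: diff <= low, rejected when diff is credibly above the lower bound
    p_lower = stats.t.sf((diff - low) / se, df)
    # Test 2 -- H0: diff >= upp, rejected when diff is credibly below the upper bound
    p_upper = stats.t.cdf((diff - upp) / se, df)
    return max(p_lower, p_upper)

# Hypothetical exam scores (0-100 scale) for two arms
cds_absent = [70, 72, 71, 69, 70, 71, 73, 70, 72, 71, 68, 74, 70, 71, 72]
cds_access = [71, 70, 72, 70, 69, 71, 72, 70, 71, 73, 69, 70, 72, 71, 70]
p = tost_ind(cds_absent, cds_access, low=-2.0, upp=2.0)
```

A small returned p-value supports equivalence within the chosen bounds; a non-significant TOST alongside a non-significant standard t-test flags insufficient power rather than demonstrated equivalence.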
H4 (Verification Behaviors): Within the CDS-access and CDS-delayed arms, learners who demonstrate verification behaviors (counter-evidence articulation, explicit reasoning stated before accepting CDS output, cross-checking CDS recommendations against patient data or pathway references) will score higher on the key-feature examination than learners who demonstrate uncritical adoption of CDS output. This tests the copilot/autopilot distinction at the individual behavioral level, using CHART+M coded indicators as predictors.
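The H4 analysis treats CHART+M-coded verification behaviors as individual-level predictors of exam performance. A minimal sketch of that predictor-outcome relationship, as an ordinary least squares regression of exam score on per-learner verification-behavior counts, is below; the variable names and sample values are hypothetical illustrations, not study data or the registered model specification.

```python
# Sketch of the H4 predictor analysis: does the count of coded
# verification behaviors (counter-evidence articulation, explicit
# reasoning, cross-checking) predict key-feature exam score?
# All data shown are hypothetical placeholders.
import numpy as np

def fit_score_on_verification(verification_counts, exam_scores):
    """OLS fit of exam_score = b0 + b1 * verification_count.

    Returns (intercept, slope); a positive slope would be consistent
    with the copilot/autopilot distinction H4 predicts.
    """
    counts = np.asarray(verification_counts, dtype=float)
    scores = np.asarray(exam_scores, dtype=float)
    X = np.column_stack([np.ones(len(counts)), counts])  # design matrix
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
    return coef[0], coef[1]

# Hypothetical learners: verification-behavior counts and exam scores
counts = [0, 1, 2, 3, 4, 0, 2, 4]
scores = [61, 66, 69, 76, 81, 59, 71, 79]
intercept, slope = fit_score_on_verification(counts, scores)
```

In the actual study this would likely need to account for team clustering and arm membership; the sketch shows only the core predictor-outcome structure.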

Foreknowledge of data or evidence

The data do not yet exist. No part of the data that will be used in this analysis plan exists, and none will be generated until after the plan is registered.


https://osf.io/8bytc/overview
