Large Language Models in Surgery: Promise, Pitfalls, and Practical Use

Denham, Danette T; Wang, Colin Y; Maric, Emil; Hinton, Lucy R; Heniford, B Todd

doi:10.3389/jaws.2026.16349

MINI REVIEW

J. Abdom. Wall Surg.

Danette T Denham
Colin Y Wang
Emil Maric
Lucy R Hinton
B Todd Heniford

Endeavor Health, Evanston, United States

The final, formatted version of the article will be published soon.

Abstract

Background: Large language models (LLMs) represent a transformative advance in artificial intelligence (AI), with rapidly expanding applications in medicine. Although AI-related medical publications increased 36-fold between 2000 and 2022, practical guidance for surgeons remains limited. This mini-review delineates pragmatic applications of LLMs in surgical practice while addressing key limitations and implementation and ethical considerations.

Methods: We reviewed contemporary LLM platforms and their integration into clinical workflows, patient communication, and academic research and writing, evaluating the benefits, constraints, and risk mitigation relevant to practicing surgeons.

Findings: LLMs demonstrate significant utility across multiple domains. In clinical workflows, ambient documentation may reduce documentation burden and support rapid synthesis of complex patient data. In patient communication, these tools can simplify complex medical information, tailor or translate instructions to appropriate reading levels or languages, and generate empathetic responses. In research, LLMs assist with literature summarization, study design optimization, and risk-of-bias assessment of randomized controlled trials (RCTs), allowing surgeons to focus on higher-level scientific reasoning. Despite these promising applications, several constraints demand attention. Effective prompting requires specific techniques, including clear clinical objectives and iterative refinement. LLM outputs require verification to catch "hallucinations" (fabricated or inaccurate information). Protected health information (PHI) must never be entered into public platforms, in order to maintain HIPAA compliance. Liability frameworks for AI-generated errors remain ambiguous, with responsibility not clearly allocated among providers, institutions, and developers.

Conclusion: LLMs offer surgeons valuable tools for enhancing workflow efficiency and patient communication when deployed with appropriate oversight. Success requires understanding prompt engineering, maintaining rigorous fact-checking protocols, protecting patient privacy, and recognizing that human judgment remains irreplaceable in clinical decision-making.

Keywords

academic research, artificial intelligence (AI), large language models, patient outcomes, surgery

Received: 03 February 2026

Accepted: 16 March 2026

© 2026 Denham, Wang, Maric, Hinton and Heniford. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: B Todd Heniford

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
