2 Minute Medicine

GPT-4 performing with superior scores on medical oncology examination questions

by Simon Pan and Alex Chan
December 17, 2024
in Oncology, Preclinical

1. The large language model, ChatGPT-4, answered 85.0% of examination-style multiple-choice questions on medical oncology correctly, a performance superior to all other large language models and comparable with medical oncology trainees.

2. Approximately 82% of incorrect answers were rated by clinicians as having a medium to high risk of causing moderate to severe harm if acted upon in clinical practice.

Evidence Rating Level: 2 (Good)

Study Rundown: Large language models (LLMs) may have considerable utility across healthcare settings. In oncology, potential applications range from assistance with administrative tasks to clinical decision-making. This cross-sectional study therefore sought to evaluate the medical oncology knowledge of ChatGPT-3.5 (proprietary LLM 1), ChatGPT-4 (proprietary LLM 2), and various open-source LLMs. Proprietary LLM 1 and proprietary LLM 2 were evaluated on 147 medical oncology examination questions drawn from ASCO’s Oncology Self-Assessment Series, ESMO’s Examination Trial Questions, and unseen original questions. Proprietary LLM 2 achieved the highest performance among all LLMs, answering 85.0% of questions correctly. However, roughly 64% of incorrect answers were considered to have a medium likelihood of causing patient harm, and roughly 18% a high likelihood, if acted upon in practice. Approximately 82% of incorrect answers had a medium or high likelihood of causing moderate or severe harm. Overall, this study found that LLMs can perform well on examination-style multiple-choice medical oncology questions, although the possible consequences of acting on incorrect answers raise safety concerns. As such, the use of LLMs in medical oncology may be best restricted to low-risk settings or to use under intensive human supervision, with guidelines in place to ensure their safe application in clinical practice.

Click to read the study in JAMA Network Open

Relevant Reading: Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models

In-Depth [cross-sectional study]: In recent years, the potential utility of LLMs in healthcare settings has been an important topic of investigation. LLMs have already been shown to be capable of passing the United States Medical Licensing Examination while demonstrating remarkable knowledge recall and reasoning abilities. However, their performance on examinations varies widely across medical subspecialties, and their performance on medical oncology examinations is not yet known. This cross-sectional study therefore sought to investigate the medical oncology knowledge of LLMs and their performance on examination-style multiple-choice medical oncology questions. Proprietary LLM 1 and proprietary LLM 2 were assessed on 52 questions from ASCO, 75 questions from ESMO, and 20 original questions. Proprietary LLM 2 achieved the highest accuracy of all LLMs assessed at 85.0% (95% CI = 78.2% to 90.4%; P < 0.001 vs random answering), with similar performance across the question sets (ASCO: 80.8%, 95% CI = 67.5% to 90.4%, P < 0.001; ESMO: 88.0%, 95% CI = 78.4% to 94.4%, P < 0.001; original questions: 85.0%, 95% CI = 62.1% to 96.8%, P < 0.001). Proprietary LLM 1 achieved an accuracy of 60.5% (95% CI = 50.0% to 66.4%; P < 0.001 vs random answering). Incorrect answers by proprietary LLM 2 were more common when questions involved knowledge from recent publications (Wilcoxon test P = 0.02), and 63.6% of incorrect answers were due to incorrect knowledge recall. Among incorrect answers by proprietary LLM 2, the likelihood of causing patient harm if the error were applied in practice was considered medium in 63.6% of incorrect answers (95% CI = 43.0% to 85.4%) and high in 18.2% of incorrect answers (95% CI = 5.2% to 40.3%). The extent of possible harm was considered moderate in 63.6% of incorrect answers (95% CI = 43.0% to 85.4%) and severe, including possible death, in 18.2% of incorrect answers (95% CI = 5.2% to 40.3%).
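
For readers who want to see where an accuracy figure and its 95% CI come from, the following is a minimal sketch rather than the study's own analysis. It makes two assumptions not stated in the summary above: that the interval is an exact (Clopper-Pearson) binomial interval, and that 85.0% of 147 questions corresponds to 125 correct answers (with 42/52, 66/75, and 17/20 for the three question sets, inferred from the reported percentages). The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, target, increasing, iters=60):
    """Find p in [0, 1] with f(p) == target for a monotone f, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies to the left of mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2 (increasing in p).
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2 (decreasing in p).
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p), alpha / 2, False
    )
    return lower, upper


if __name__ == "__main__":
    # Counts below are ASSUMED: inferred from the reported percentages,
    # not quoted from the summary or the study.
    question_sets = {
        "All questions": (125, 147),
        "ASCO": (42, 52),
        "ESMO": (66, 75),
        "Original": (17, 20),
    }
    for name, (correct, total) in question_sets.items():
        low, high = clopper_pearson(correct, total)
        print(f"{name}: {correct / total:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Whether the printed intervals match the reported ones depends on the method the study authors actually used, which the summary does not state. The wide interval for the 20 original questions illustrates why the smaller question sets support weaker conclusions than the pooled result.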

©2024 2 Minute Medicine, Inc. All rights reserved. No works may be reproduced without expressed written consent from 2 Minute Medicine, Inc. Inquire about licensing here. No article should be construed as medical advice and is not intended as such by the authors or by 2 Minute Medicine, Inc. 

Tags: artificial intelligence, ChatGPT, LLM, oncology, OpenAI