2 Minute Medicine

Large language models show potential to provide feedback on research papers on a large scale

by Cheng En Xi and Deepti Shroff
June 18, 2025
in 2 Minute Medicine

1. Liang and colleagues retrospectively compared comments on scientific papers generated by Generative Pre-trained Transformer 4 (GPT-4) with those of human peer reviewers.

2. GPT-4’s comments overlapped substantially with those of human reviewers, and more than half of surveyed researchers found the GPT-4-generated feedback helpful.

Evidence Rating Level: 2 (Good)

Study Rundown: Effective feedback from peer reviewers is crucial for rigorous scientific research, but peer review is a time- and resource-intensive process. Large language models (LLMs) have the potential to automate feedback generation. Liang and colleagues developed a GPT-4-based scientific feedback generator and retrospectively evaluated its feedback against that of human reviewers. Research papers and their reviewer comments were collected, and the LLM generated structured feedback from the PDFs of the papers. Extractive text summarization was applied to both the LLM- and human-generated feedback, and semantic text matching was used to identify overlapping comments. Additionally, the authors surveyed 308 researchers who had received LLM-generated feedback to evaluate its utility. For papers from the Nature family of journals, more than half of the comments made by GPT-4 were also made by at least one human reviewer. In the survey, 50.3% of respondents found the LLM-generated feedback helpful, and 20.1% considered it similarly helpful to human feedback. This study demonstrated the potential for LLMs to provide useful and timely comments to researchers when human expert feedback is unavailable.
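To make the matching step concrete, here is a minimal Python sketch of pairing LLM comments with human comments. It is an illustration only: the study used semantic text matching, whereas this sketch swaps in a much simpler token-overlap (Jaccard) similarity as a stand-in, and the function names, the example comments, and the 0.3 threshold are all hypothetical choices rather than details from the paper.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in a comment."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a crude stand-in for semantic matching."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_comments(llm_comments, human_comments, threshold=0.3):
    """Return (llm_index, human_index) pairs whose similarity meets the threshold.

    The threshold is an arbitrary illustrative value, not one from the study.
    """
    return [
        (i, j)
        for i, llm_comment in enumerate(llm_comments)
        for j, human_comment in enumerate(human_comments)
        if jaccard(llm_comment, human_comment) >= threshold
    ]


# Hypothetical example comments, invented for illustration.
llm = [
    "The sample size is too small to support the main claim",
    "Add an ablation study for the new loss",
]
human = [
    "Sample size too small to support the claim",
    "The writing is unclear in section 3",
]

print(match_comments(llm, human))
```

A real pipeline would likely replace `jaccard` with an embedding- or LLM-based similarity judgment, since two reviewers can raise the same concern in very different words.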

Click here to read the study in NEJM AI

Relevant Reading: Leveraging artificial intelligence to enhance systematic reviews in health research: advanced tools and challenges


In-Depth [retrospective cohort]: Two datasets were assembled: one consisting of 3096 scientific papers and 8745 comments from 15 Nature family journals, and another consisting of 1709 papers and 6505 comments from the International Conference on Learning Representations (ICLR). The research papers were given to the LLM to generate structured feedback, and both the LLM-generated feedback and the human reviewers’ comments underwent extractive text summarization and semantic text matching. Further, 308 researchers from 110 institutions who had received LLM-generated feedback on their papers were asked to evaluate the LLM’s utility and performance. For the Nature dataset, 57.55% of the LLM-generated comments overlapped with those of at least one human reviewer. Overlap between the LLM and an individual reviewer was 30.85%, similar to the overlap between two human reviewers (28.58%). For the ICLR dataset, 77.18% of the LLM-generated comments overlapped with those of at least one human reviewer, and the overlap between the LLM and individual reviewers was again similar to that between two human reviewers. Additionally, the LLM commented on research implications 7.27 times more frequently than human reviewers did, while focusing less on novelty. In the prospective user survey, 50.3% of the respondents found the feedback helpful, and 7.1% found it very helpful. In addition, 50.5% were willing to reuse the system, and the respondents were optimistic about the LLM’s continued use. The study’s limitations included a lack of fine-tuning of the GPT-4 model and the restriction to English-language studies. In summary, this study provided promising evidence that LLMs can provide useful feedback on research papers.
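The two kinds of overlap figures reported above (overlap with at least one reviewer versus overlap with an individual reviewer) can be illustrated with a short Python sketch. It assumes that matching has already produced a set of (LLM comment index, reviewer ID) pairs. The function names and the toy data are hypothetical, and the paper's exact metric definition may differ from this simplified version.

```python
def overlap_with_any(n_llm_comments, matches):
    """Fraction of LLM comments matched by at least one human reviewer."""
    matched = {llm_idx for llm_idx, _reviewer in matches}
    return len(matched) / n_llm_comments


def overlap_with_individual(n_llm_comments, matches, reviewers):
    """Mean fraction of LLM comments matched by each single reviewer."""
    per_reviewer = []
    for reviewer in reviewers:
        matched = {llm_idx for llm_idx, r in matches if r == reviewer}
        per_reviewer.append(len(matched) / n_llm_comments)
    return sum(per_reviewer) / len(per_reviewer)


# Toy data (invented): 4 LLM comments on one paper, two reviewers "A" and "B".
# Each pair means "this LLM comment matched a comment from this reviewer".
matches = {(0, "A"), (0, "B"), (2, "B")}

print(overlap_with_any(4, matches))                     # 0.5
print(overlap_with_individual(4, matches, ["A", "B"]))  # 0.375
```

Overlap with at least one reviewer is always at least as large as the per-reviewer average, which is consistent with the Nature figures above (57.55% versus 30.85%).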


©2025 2 Minute Medicine, Inc. All rights reserved. No works may be reproduced without expressed written consent from 2 Minute Medicine, Inc. Inquire about licensing here. No article should be construed as medical advice and is not intended as such by the authors or by 2 Minute Medicine, Inc.

Tags: artificial intelligence, large language models, machine learning, research
© 2025 2 Minute Medicine, Inc. - Physician-written medical news.
