1. Adnan and colleagues evaluated machine learning models’ ability to screen for Parkinson’s disease using self-recorded smile videos.
2. The models achieved high sensitivity and specificity in patient cohorts from the United States (US) and Bangladesh, though the positive predictive value was lower in the Bangladesh cohort.
Evidence Rating Level: 2 (Good)
Study Rundown: Parkinson’s disease (PD) is the fastest-growing neurodegenerative disorder. However, access to clinical diagnosis is challenging due to a shortage of trained neurologists, especially in underprivileged areas. Adnan and colleagues evaluated machine learning models’ ability to screen for PD using self-recorded videos. The videos showed participants making three facial expressions: smile, disgust, and surprise. The data were split into validation and test sets. Eighteen models were trained on facial expressions, and the best-performing model was evaluated on two external test data sets. Performance metrics were accuracy, Area Under the Receiver Operating Characteristic Curve (AUROC), sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). On the test data consisting of participants from the US, the best-performing model achieved an accuracy of 80.3 ± 1.6%, an AUROC of 83.3%, a sensitivity of 80.0%, a specificity of 80.5%, a PPV of 71.1%, and an NPV of 87.1%. The model achieved similar performance on data from Bangladeshi participants, except for a lower PPV. This study demonstrated that machine learning models can offer an accessible and cost-efficient way to screen for PD in resource-limited areas.
Click here to read the study in NEJM AI
Relevant Reading: Explainable artificial intelligence to diagnose early Parkinson’s disease via voice analysis
In-Depth [retrospective cohort]: Facial-expression videos were collected from four settings: participants’ homes worldwide (Home–Global cohort); a US-based clinic (Clinic cohort); a US-based PD wellness center (PD Care Facility cohort); and participants’ homes in Bangladesh (Home–BD cohort). A total of 1452 participants (391 with PD and 1061 controls) were included. A 10-fold cross-validation was conducted on the data from the Home–Global and PD Care Facility cohorts, and the best-performing model then underwent generalizability testing on the Clinic and Home–BD data sets. Model performance metrics included accuracy, AUROC, sensitivity, specificity, PPV, and NPV at all testing stages. The study extracted 126 facial features from the videos, of which 43 differed significantly between the PD and control groups. These features were predominantly drawn from the smile expression, and smile-only models outperformed models using other expression combinations in cross-validation. In generalizability testing, the best-performing model achieved an accuracy of 80.3 ± 1.6%, an AUROC of 83.3 ± 1.4%, a sensitivity of 80.0 ± 2.5%, a specificity of 80.5 ± 2.0%, a PPV of 71.1 ± 2.2%, and an NPV of 87.1 ± 1.5% on the Clinic cohort. On the Home–BD cohort, it achieved an accuracy of 85.3 ± 1.4%, an AUROC of 81.5 ± 1.8%, a sensitivity of 71.4 ± 2.2%, a specificity of 86.8 ± 3.0%, a PPV of 35.7 ± 4.8%, and an NPV of 96.7 ± 1.2%. These metrics compared favorably with clinician accuracy (69.7–92.6%). Overall, this study offered evidence that machine learning models trained on smile videos can accurately detect PD.
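For readers unfamiliar with how these screening metrics relate to one another, the sketch below shows how all six summary statistics reported above follow from a binary confusion matrix. It is a generic illustration with made-up counts, not the study’s data or code; note that PPV, unlike sensitivity and specificity, depends on disease prevalence in the tested population, which is one standard explanation for a lower PPV in a cohort with fewer PD cases.

```python
# Hypothetical sketch: deriving screening metrics from confusion-matrix counts.
# The counts below are illustrative only, not from the study.

def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-screening metrics from raw counts.

    tp / fn: PD participants correctly flagged / missed by the model
    tn / fp: controls correctly cleared / wrongly flagged by the model
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv": tp / (tp + fp),          # shrinks as prevalence falls
        "npv": tn / (tn + fn),
    }

# Illustrative counts: 10 PD participants, 20 controls.
m = screening_metrics(tp=8, fp=4, tn=16, fn=2)
print(m)  # sensitivity and specificity are both 0.80 here
```

With these toy counts the model flags 8 of 10 PD participants (sensitivity 0.80) and clears 16 of 20 controls (specificity 0.80), yet PPV is only 8/12 ≈ 0.67 because a third of positive calls are false alarms, mirroring how a low-prevalence cohort can depress PPV even when sensitivity and specificity hold up.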
Image: PD
©2025 2 Minute Medicine, Inc. All rights reserved. No works may be reproduced without expressed written consent from 2 Minute Medicine, Inc. Inquire about licensing here. No article should be construed as medical advice and is not intended as such by the authors or by 2 Minute Medicine, Inc.