Low statistical power leads to false reassurance when examining surgical outcomes

1. The ability to identify surgeons with high mortality rates is poor in surgical specialties with low overall procedure numbers. 

2. Solutions to the reporting bias created by low case numbers include: using data from care units rather than individual surgeons in statistical analysis, identifying common negative postoperative outcomes, and avoiding the assumption that a lack of poor outcomes indicates good performance. 

Study Rundown: Complication and mortality statistics have become increasingly important in an era of pay-for-performance medicine. Outcomes data drive not only reimbursements by insurance companies but also care-related recommendations published by specialized governing bodies and medical decisions made by patients and their physicians. The authors of this study sought to identify potential pitfalls in the reporting of outcomes for individual surgeons across several sub-specialty fields. Hip fracture surgery, esophagectomy/gastrectomy, bowel cancer resection, and cardiac surgery were selected for study because they are procedures where case numbers may affect outcomes. The authors noted that in specialties where the total number of procedures is low, identifying a surgeon with poor performance is difficult due to statistical constraints. Furthermore, they found that few surgeons performed the studied procedures often enough to provide adequate power in analyses of their surgical skill. A key weakness of this study is that it examined only four surgical sub-specialties. Therefore, its applicability to other fields, such as head and neck surgery or neurosurgery, is limited.

Study author Dr. Jenny Neuburger, BSc MSc PhD, of The London School of Hygiene and Tropical Medicine, talks to 2 Minute Medicine:

“Our study reveals that although mortality rates may reflect the performance of individual surgeons for some procedures like cardiac surgeries which are performed more frequently, they may be far less effective for other procedures such as bowel cancer resection which is done less commonly. The danger is that low numbers will mean that chance factors overwhelm the influence of surgeon performance on the number of deaths.  This could mask poor performance and lead to false complacency.”

In-Depth [retrospective cohort study]: The authors of this study used individual surgeon outcome information published in June 2013 by the English National Health Service for their statistical analysis. To calculate statistical power in the four specialty areas, the authors looked at four variables: 1) the national overall mortality rate, 2) the mortality rate at which performance is deemed to be poor [defined as double the national overall mortality rate], 3) the statistical threshold used to test the individual surgeon’s rate against the overall national mortality rate [5% significance level], and 4) the number of procedures performed by the surgeon. Using this method, the authors found that the median number of bowel surgeries performed is roughly one-tenth the number necessary to achieve 60% power, and the median number of esophagectomy/gastrectomy procedures is one-tenth of that needed to reach 70% power. Meanwhile, the median number of hip fracture and cardiac surgeries approaches half that needed to achieve 70% power. Additionally, the authors noted that reporting case numbers over a longer period could improve the power of outcome measures: the figures reported for 70% power rose from 0-1% across all four specialties with a 1-year timeframe to 17-79% with a 5-year reporting period. This increased power, however, comes at the expense of the timeliness of the data being reported.
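
The four variables listed above are enough to sketch a power calculation. The Python example below is a minimal illustration only: it assumes a one-sided exact binomial test, which may not be the method the study authors used, and the function names, the 5% national mortality rate, and the case volumes are hypothetical values chosen for illustration, not figures from the study.

```python
from math import comb


def binom_tail(n, p, k):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def power_to_detect_poor_performance(n, national_rate, alpha=0.05, ratio=2.0):
    """Power to flag a surgeon whose true mortality is `ratio` times the
    national rate, given n procedures.

    Assumption: a one-sided exact binomial test at significance level alpha.
    Suitable for case volumes up to a few hundred.
    """
    poor_rate = min(1.0, ratio * national_rate)
    # Find the smallest death count k whose probability under the national
    # rate is at most alpha (the threshold for flagging the surgeon).
    for k in range(n + 2):
        if binom_tail(n, national_rate, k) <= alpha:
            # Power: chance a truly poor performer reaches that threshold.
            return binom_tail(n, poor_rate, k)
    return 0.0


if __name__ == "__main__":
    # Hypothetical national mortality rate of 5%, illustration only.
    for cases in (20, 50, 100, 200):
        power = power_to_detect_poor_performance(cases, 0.05)
        print(f"{cases:>4} procedures: power = {power:.2f}")
```

With a low case volume, the death count needed to reach significance is rarely observed even when the surgeon's true mortality rate is double the national rate, which is why small numbers can mask poor performance.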

By Devin Miller and Mimmie Kwong

