Researchers at Vanderbilt University Medical Center have built and refined a machine learning-based model for lung cancer prediction to support lung specialists in diagnosing and evaluating indeterminate pulmonary nodules (IPNs).
The team developed the model for more accurate disease prediction in higher-risk populations evaluated in pulmonology and surgical specialty clinics. The team’s data and findings, recently published in CHEST, demonstrate the model’s usefulness in these often difficult cases.
“According to current guidelines, the preferred approach for estimating cancer risk in IPNs is using clinical expertise or a validated prediction model,” said senior author Eric Grogan, M.D., an associate professor of thoracic surgery at Vanderbilt.
In collaboration with Stephen Deppen, Ph.D., an epidemiologist and an associate professor of thoracic surgery at Vanderbilt, Grogan and his team designed the Thoracic Research Evaluation and Treatment (TREAT) version 2.0 model as a robust yet flexible tool for use by pulmonologists and thoracic surgeons.
Designing the 2.0 Model
According to Grogan and Deppen, many of the existing models were developed using single-site cohorts of only a few hundred patients, limiting generalizability.
Additionally, patients examined in specialty clinics typically have more information available from past workup, which researchers and clinicians can tap into but which conventional tools, such as the Mayo Clinic, Herder, and Brock models, are not designed to use.
Outside of specialty clinics, patients may lack information for one or more variables used in the complex analysis, potentially “compromising the predictive accuracy of these models,” Grogan said.
To overcome this challenge, the TREAT 2.0 model incorporates a novel method for handling missing data in predictive modeling: the pattern submodel approach.
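As a rough illustration of the general idea behind pattern submodels, the sketch below groups patients by which variables were actually recorded and fits a separate small model for each group, so that no missing value has to be filled in. It is not the TREAT 2.0 implementation: the feature names (`size_cm`, `pet_avid`), the toy data, and the choice of a plain logistic regression are all assumptions made for the example.

```python
import math
from collections import defaultdict

def fit_logistic(rows, labels, steps=2000, lr=0.1):
    """Plain gradient-descent logistic regression; returns [bias, w1, ...]."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def pattern_of(record):
    """Missing-data pattern: sorted tuple of the variables that are observed."""
    return tuple(sorted(k for k, v in record.items() if v is not None))

def fit_pattern_submodels(records, labels):
    """Fit one submodel per missing-data pattern, using only observed variables."""
    groups = defaultdict(list)
    for rec, y in zip(records, labels):
        groups[pattern_of(rec)].append((rec, y))
    models = {}
    for pattern, members in groups.items():
        rows = [[rec[k] for k in pattern] for rec, _ in members]
        models[pattern] = fit_logistic(rows, [y for _, y in members])
    return models

def predict_risk(models, record):
    """Score a patient with the submodel that matches their own pattern."""
    pattern = pattern_of(record)
    weights = models[pattern]  # raises KeyError for a pattern never seen in training
    z = weights[0] + sum(w * record[k] for w, k in zip(weights[1:], pattern))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: None marks a variable that was never measured.
records = [
    {"size_cm": 0.6, "pet_avid": None}, {"size_cm": 0.9, "pet_avid": None},
    {"size_cm": 2.4, "pet_avid": None}, {"size_cm": 2.9, "pet_avid": None},
    {"size_cm": 1.0, "pet_avid": 0.0}, {"size_cm": 1.4, "pet_avid": 0.0},
    {"size_cm": 1.6, "pet_avid": 1.0}, {"size_cm": 2.2, "pet_avid": 1.0},
]
labels = [0, 0, 1, 1, 0, 0, 1, 1]  # 1 = malignant, 0 = benign (made up)

models = fit_pattern_submodels(records, labels)
risk_small = predict_risk(models, {"size_cm": 0.7, "pet_avid": None})
risk_large = predict_risk(models, {"size_cm": 2.8, "pet_avid": None})
```

In this sketch a patient without a PET result is scored by the submodel trained only on patients who also lacked one, which is the property that lets the approach work when a variable is unavailable.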
Superior to Existing Models
The researchers used data from clinical populations with high cancer prevalence that had been referred for specialty evaluation. The study included 1,401 IPNs.
Team member Caroline Godfrey, M.D., a thoracic surgery resident at Vanderbilt, updated and expanded their original TREAT version 1.0 model to begin the work.
With the pattern submodel approach, the TREAT 2.0 model achieved a mean area under the receiver operating characteristic curve (AUC) of 0.85, compared with 0.80 for the original TREAT model, 0.72 for the Mayo Clinic model, 0.73 for the Herder model, and 0.68 for the Brock model. It also showed improved calibration.
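For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen malignant nodule receives a higher predicted risk than a randomly chosen benign one, with ties counted as half. The short sketch below computes it that way on made-up scores; the numbers are illustrative only and are not drawn from the study.

```python
def auc(scores, labels):
    """AUC as the share of (malignant, benign) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks for three malignant (1) and three benign (0) nodules.
example_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(example_scores, example_labels)  # 8 of 9 pairs ranked correctly
```

A value of 0.5 corresponds to ranking no better than chance, and 1.0 to ranking every malignant nodule above every benign one.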
“The 2.0 model that Dr. Godfrey refined is more accurate and better calibrated for predicting lung cancer in a variety of high-risk IPN evaluation environments compared to the other existing models,” Deppen said.
While these findings are encouraging, Grogan and Deppen caution that their team’s tool is not yet ready for widespread use. Further testing is needed in populations with lower cancer prevalence, such as a general lung-cancer screening population, to refine the model and to determine how the scoring system could be applied in non-specialty clinical settings.
“The inclusion of other parameters, such as blood-based cancer and fungal biomarkers and imaging-based radiomic biomarkers, may refine the model further,” added Grogan.
Next Steps
The TREAT 2.0 model, like most prediction models, is not easily calculated manually. To facilitate widespread adoption of the model by clinicians, the research team is developing a web-based calculator, which the researchers plan to release for public use.
Funding for the project came from several sources, including the Agency for Healthcare Research and Quality.
“Our ultimate goal is to have an easy-to-use calculator that reduces unnecessary invasive procedures and quickens the time to diagnosis for cancer,” Grogan said.