Data from synthesized patients enables complex simulation studies to assess new treatment and device safety.

A team of investigators at Vanderbilt University Medical Center has shown that data from synthesized patients can help clinicians understand how a new device or procedure may contribute to poor patient outcomes.

Synthetic data are datasets created artificially by computer algorithms rather than collected from real-world events. In health care, such data may be derived from EHR datasets, with substitutions made to avoid revealing patient identities or other sensitive information.

“We conducted this study due to the need for evidence-based information to support device safety and innovation,” said Sharon Davis, Ph.D., a biomedical informatician who led the study along with Michael Matheny, M.D.

Their study was published in BMC Medical Research Methodology.

The FDA, the Brookings Institution and major cardiovascular societies have all called for new approaches to generate timely, evidence-based information on device safety and innovation.

“Complex simulation studies are required to develop and test new algorithms that disentangle safety signals for medical treatments from the effects of experiential learning,” Davis said.

Economic and Patient Impacts

The study’s authors noted that mistakes can be costly: Medicare spent at least $1.5 billion over 10 years to replace seven recalled models of faulty cardiovascular devices.

In addition, experiential learning – as providers and health care institutions master new technologies and treatments – confers additional risk that diminishes over time, complicating efforts to detect safety signals of new devices.

Davis noted that medical devices are approved based on testing in a small, controlled population.

“Once a device is approved and used in patients of varying health, age and sex, real world post-market safety surveillance data is generated,” she said. “But when issues develop, it is not always clear whether it is due to an inherent problem with the new device or procedure or whether it is due instead to unfamiliarity and a need for clinical teams to master new techniques. This information is critical and is a priority research area by FDA.”

Devices used in a variety of specialties are being evaluated for safety issues, according to Davis.

“As new models of medical devices are made or updated, it is important to know their safety in long-term use. Data from synthesized patients allow us to test novel methods for assessing safety in the context that real world evidence studies must navigate,” she said.

Study Design and Key Findings

The study team designed a multi-step, hierarchical data-generating process with customizable options at each step. Across the simulated synthetic datasets, the number of patients ranged from 559 to 14,690.

The process generated synthetic patient features from either defined distributions or EHR-based data cubes reflecting the complexity of data distributions and correlations. Each patient was assigned to an institution, a provider and a time of treatment.

The patient data were then sorted into treatment-specific case series, in which each case could carry additional risk of an adverse outcome depending on the provider’s or institution’s depth of experience. Outcomes were assigned based on patient risk profiles, treatment assignments and provider or institutional learning effects. To further reflect real-world complexity in clinical data, the authors included options to randomly omit variables and insert noise.
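As an illustration only, the generating steps described above can be sketched in a few lines of Python. This is not the authors’ code. The function name `simulate_case_series`, the two patient features, every coefficient, and the exponential shape of the learning effect are assumptions chosen to keep the example small, and learning is tracked per provider only, although the study also allows institution-level effects.

```python
import math
import random


def simulate_case_series(n_patients=600, n_institutions=5,
                         providers_per_institution=4,
                         treatment_log_odds=0.3,   # assumed device effect
                         learning_log_odds=1.0,    # assumed extra risk on a first case
                         learning_rate=0.1,        # assumed speed of learning
                         missing_prob=0.05, noise_sd=2.0, seed=0):
    """Hypothetical sketch of a hierarchical synthetic-patient generator."""
    rng = random.Random(seed)

    # Step 1: draw patient features from defined distributions and assign
    # each patient to an institution, a provider and a time of treatment.
    patients = []
    for pid in range(n_patients):
        institution = rng.randrange(n_institutions)
        provider = (institution, rng.randrange(providers_per_institution))
        patients.append({
            "id": pid,
            "age": rng.gauss(65, 10),
            "diabetes": rng.random() < 0.3,
            "institution": institution,
            "provider": provider,
            "time": rng.random(),
            "new_treatment": rng.random() < 0.5,
        })

    # Step 2: order patients by time so each provider's treated cases
    # form a case series (first case, second case, and so on).
    patients.sort(key=lambda p: p["time"])
    prior_cases = {}

    # Step 3: assign outcomes from patient risk, treatment and learning.
    for p in patients:
        logit = -3.0 + 0.03 * (p["age"] - 65) + 0.5 * p["diabetes"]
        p["case_number"] = None
        if p["new_treatment"]:
            k = prior_cases.get(p["provider"], 0)
            p["case_number"] = k + 1
            prior_cases[p["provider"]] = k + 1
            # Learning effect: added risk that decays with experience.
            logit += treatment_log_odds + learning_log_odds * math.exp(-learning_rate * k)
        p["true_risk"] = 1.0 / (1.0 + math.exp(-logit))
        p["outcome"] = rng.random() < p["true_risk"]

    # Step 4: add measurement noise and random omissions to what is observed.
    for p in patients:
        if rng.random() < missing_prob:
            p["age_observed"] = None
        else:
            p["age_observed"] = p["age"] + rng.gauss(0, noise_sd)
    return patients
```

Because `true_risk` is stored next to the noisy, partly missing observed values, an analysis method run on such a dataset can be checked against the pattern that was deliberately built in.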

“This highly customizable framework can aid in the development and testing of algorithms by allowing analysis teams to evaluate findings against known true patterns in the data while also challenging methods with a context that reflects the real world,” Davis said.

“Our team is using these data to evaluate novel algorithms for distinguishing learning and treatment effects, thereby helping to identify training opportunities and hasten treatment improvements to promote both patient safety and access to novel treatments.”
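To make the idea of checking findings against known true patterns concrete, the hedged Python sketch below simulates treated cases with a learning effect that is set in advance, then compares observed adverse-event rates for early and late cases with the true values. The function names, the parameter values and the exponential learning curve are illustrative assumptions, not the algorithms the team is evaluating.

```python
import math
import random
from statistics import mean


def true_risk(case_number, base_log_odds=-2.0,
              learning_log_odds=1.5, learning_rate=0.2):
    """Known (assumed) adverse-event risk for a provider's n-th case."""
    experience = case_number - 1
    logit = base_log_odds + learning_log_odds * math.exp(-learning_rate * experience)
    return 1.0 / (1.0 + math.exp(-logit))


def observed_rates(n_providers=2000, cases_per_provider=50, seed=0):
    """Simulate outcomes and return the observed event rate per case number."""
    rng = random.Random(seed)
    events = [0] * cases_per_provider
    for _ in range(n_providers):
        for case_number in range(1, cases_per_provider + 1):
            if rng.random() < true_risk(case_number):
                events[case_number - 1] += 1
    return [count / n_providers for count in events]


rates = observed_rates()

# Compare what the data show with what was built into the simulation.
early_observed = mean(rates[:5])
late_observed = mean(rates[-5:])
early_true = mean([true_risk(c) for c in range(1, 6)])
late_true = mean([true_risk(c) for c in range(46, 51)])

print(f"first 5 cases: observed {early_observed:.3f}, true {early_true:.3f}")
print(f"last 5 cases:  observed {late_observed:.3f}, true {late_true:.3f}")
```

In this toy setting the gap between early and late rates is entirely due to learning, because the device effect is held constant. A real evaluation would ask whether a candidate algorithm can separate the two when both are present.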

Other researchers on the study from Vanderbilt University Medical Center include Dax Westerman, M.S., and Theodore Speroff, Ph.D. The Vanderbilt researchers were joined by researchers from seven other institutions. The research was supported by the National Heart, Lung, and Blood Institute.

About the Expert

Sharon Davis, Ph.D.

Sharon Davis, Ph.D., M.S., is a research assistant professor of biomedical informatics at Vanderbilt University Medical Center. Her research focuses on the development and maintenance of predictive models to support clinical prediction tools.