Researchers are exploring how to mine EHRs for genetic and other data to develop better risk phenotypes for addiction.

Opioid addiction and other substance use disorders are poorly documented in the EHR: only about one in five people with a substance use disorder has an ICD code indicating the condition.

This gap limits geneticists and other researchers in the field, who routinely rely on EHR data to conduct studies.

To tackle this issue, investigators at Vanderbilt University Medical Center are participating in a five-year grant from the National Institute on Drug Abuse to study the genetics and epigenetics of substance use disorders. The goal is to uncover markers in the EHR that will help researchers develop better phenotypes for these conditions and, ultimately, better predict who is at higher risk.

“Genetic studies currently depend on billing codes, as conducting chart reviews at the scale required to see who has substance abuse and who doesn’t is extremely costly,” said Alvin Jeffery, Ph.D., an informatics researcher at Vanderbilt. “Our work is not to definitively conclude that a patient has or doesn’t have substance use disorder, but rather that the probability falls along a continuum.” 

Jeffery, a pediatric critical care nurse turned biomedical informatician, applies his expertise in back-end computational methods to problems across health care.

“A lot of my research has been centered around finding nursing-sensitive outcomes, which are not always well-described in the structured data of the EHR. Over time I’ve developed data-science methods and informatics approaches that allow me to play with data in unique ways.”

Tracking Genetic Markers

Jeffery’s previous work focused on predicting risk of in-hospital patient deterioration.

“What got me interested in substance abuse is a study we did to understand the genetic contribution of opioid-induced respiratory depression in hospital patients,” he said. In this retrospective study, which is in press, Jeffery and colleagues used machine learning to identify individuals with this hard-to-find outcome from EHR data.

Hearing of this work, a group of geneticists approached Jeffery and asked for his help.

“In their own research, they had found that even a simple natural language processing [NLP] model was better than using ICD codes,” he said.

Together, they hope to create a learning framework that allows providers or researchers in any organization to develop their own probable substance use disorder phenotypes, and then to leverage data collected by multiple organizations.

“The big question is: How do we share phenotype information? That’s why we’ve used ICD codes for so long – everyone has ICD codes,” Jeffery explained. “We need models that are highly attuned to providers’ notes and other information in the record.”

Vanderbilt is part of the eMERGE network, which combines DNA biorepositories with EHR systems from multiple organizations for large-scale, high-throughput genetic research. Having this access allows Jeffery’s team to test nearly any hypothesis and to conduct research that would be impossible using just one institution’s data.

“If we could identify one or two genetic variations through this research, they might be key markers we could add into blood work for a better prediction model.”

Better Decision Support Tools

A second grant, from the Betty Irene Moore Fellowship for Nurse Leaders and Innovators, will fund Jeffery’s work to develop better clinical decision support information for providers.

“It seems like these grants are for different things, but actually they’re connected,” Jeffery said. “Both will result in better decision support tools, help to advance predictive modeling, and improve how we take complex information and embed it in clinician workflows.”

There is a great deal of quantitative information bombarding today’s clinician, Jeffery says, and some advanced predictive tools haven’t helped patients because they’re not delivered in a manner conducive to the health care workflow.

“In displaying information related to risk, providers may need to see information in different ways, but right now, we treat everyone the same,” Jeffery noted. “With this grant, we’re developing a method that allows us to test new methods for displaying information and hopefully, to converge on solutions that are more meaningful.”

Bios

Alvin D. Jeffery, Ph.D.

Alvin D. Jeffery, Ph.D., R.N., is an assistant professor at Vanderbilt University School of Nursing and in the Department of Biomedical Informatics at Vanderbilt University Medical Center. Jeffery’s research leverages machine learning and data science to develop prediction models. He also uses qualitative methods to fit clinical decision support tools into nurses’ cognitive and physical workflows.