Welcome to Impact Factor, your weekly dose of commentary on a new medical study. I’m Dr F. Perry Wilson from the Yale School of Medicine.
I was really struggling to think of a good analogy to explain the glaring problem of polygenic risk scores (PRS) this week. But I think I have it now. Go with me on this.
An alien spaceship parks itself, Independence Day style, above a local office building.
But unlike the aliens that gave such a hard time to Will Smith and Brent Spiner, these are benevolent, technologically superior guys. They shine a mysterious green light down on the building and then announce, maybe via telepathy, that 6% of the people in that building will have a heart attack in the next year.
They move on to the next building. “Five percent will have a heart attack in the next year.” And the next, 7%. And the next, 2%.
Let’s assume the aliens are entirely accurate. What do you do with this information?
Most of us would suggest that you find out who was in the buildings with the higher percentages. You check their cholesterol levels, get them to exercise more, do some stress tests, and so on.
But that said, you’d still be spending a lot of money on a bunch of people who were not going to have heart attacks. So, a crack team of spies (in my mind, definitely led by a grizzled Ian McShane) infiltrates the alien ship, steals this predictive ray gun, and starts pointing it, not at buildings but at people.
In this scenario, one person could have a 10% chance of having a heart attack in the next year. Another person has a 50% chance. The aliens, seeing this, leave us one final message before flying into the great beyond: “No, you guys are doing it wrong.”
This week: how people and companies are using an advanced predictive technology, the PRS, the wrong way, and a study that shows just how problematic this is.
We all know that genes play a significant role in our health outcomes. Some diseases (Huntington disease, cystic fibrosis, sickle cell disease, hemochromatosis, and Duchenne muscular dystrophy, for example) are entirely driven by genetic mutations.
The vast majority of chronic diseases we face are not driven by genetics, but they may be enhanced by genetics. Coronary heart disease (CHD) is a prime example. There are clearly environmental risk factors, like smoking, that dramatically increase risk. But there are also genetic underpinnings; about half the risk for CHD comes from genetic variation, according to one study.
But in the case of those common diseases, it’s not one gene that leads to increased risk; it’s the aggregate effect of multiple risk genes, each contributing a small amount of risk to the final total.
The promise of PRS was based on this fact. Take the genome of an individual, identify all the risk genes, and integrate them into some final number that represents your genetic risk of developing CHD.
The way you derive a PRS is to take a big group of people and sequence their genomes. Then, you see who develops the disease of interest, in this case CHD. If the people who develop CHD are more likely to have a particular mutation, that mutation goes in the risk score. Risk scores can integrate tens, hundreds, even thousands of individual mutations to create that final score.
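For readers who want to see the mechanics, here is a minimal sketch of that last step, assuming the risk variants and their weights have already been estimated in a discovery cohort; every variant name, weight, and genotype below is hypothetical, purely for illustration.

```python
# Minimal sketch of computing a polygenic risk score as a weighted sum.
# All variant names, weights, and genotypes here are hypothetical.

# Effect sizes (e.g., log odds ratios) estimated in a large discovery cohort
weights = {"variant_1": 0.12, "variant_2": 0.08, "variant_3": -0.05}

# One person's genotype: count of risk alleles (0, 1, or 2) at each variant
genotype = {"variant_1": 2, "variant_2": 0, "variant_3": 1}

# The score is simply the weighted sum of risk-allele counts
prs = sum(weights[v] * genotype[v] for v in weights)
print(f"PRS: {prs:.2f}")  # 0.12*2 + 0.08*0 + (-0.05)*1 = 0.19
```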
There are literally dozens of PRS for CHD. And there are companies that will calculate yours right now for a reasonable fee.
The accuracy of these scores is assessed at the population level. It’s the alien ray gun thing. Researchers apply the PRS to a big group of people and say 20% of them should develop CHD. If indeed 20% develop CHD, they say the score is accurate. And that’s true.
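To make "accurate at the population level" concrete, here is a toy check with made-up numbers: the score is judged by whether the predicted event rate for the group matches the observed rate, not by whether it singles out the right individuals.

```python
# Toy population-level check (all numbers are made up for illustration).
predicted_risk = [0.05, 0.20, 0.10, 0.40, 0.25]  # predicted 1-year risk per person
had_event      = [0,    0,    0,    1,    0]     # observed outcomes (1 = heart attack)

expected = sum(predicted_risk)  # 1.0 event expected in this small group
observed = sum(had_event)       # 1 event actually observed

print(f"Expected events: {expected:.1f}, observed events: {observed}")
# The group-level prediction checks out, even though the score never said
# which individual would be the one to have the event.
```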
But what happens next is the problem. Companies and even doctors have been marketing PRS to individuals. And honestly, it sounds amazing. “We’ll use sophisticated techniques to analyze your genetic code and integrate the information to give you your personal risk for CHD.” Or dementia. Or other diseases. A lot of people would want to know this information.
It turns out, though, that this is where the system breaks down. And it is nicely illustrated by this study, appearing this week in JAMA.
The authors wanted to see how PRS, which are developed to predict disease in a group of people, work when applied to an individual.
They identified 48 previously published PRS for CHD. They applied those scores to more than 170,000 individuals across multiple genetic databases. And, by and large, the scores worked as advertised, at least across the entire group. The weighted accuracy of all 48 scores was around 78%. They aren’t perfect, of course. We wouldn’t expect them to be, since CHD is not entirely driven by genetics. But 78% accurate isn’t too bad.
But that accuracy is at the population level. At the level of the office building. At the individual level, it was a vastly different story.
This is best illustrated by this plot, which shows the score from 48 different PRS for CHD within the same person. A note here: It is arranged by the publication date of the risk score, but these were all assessed on a single blood sample at a single point in time in this study participant.
The individual scores are all over the map. Using one risk score gives an individual a risk that is near the 99th percentile — a ticking time bomb of CHD. Another score indicates a level of risk at the very bottom of the spectrum — highly reassuring. A bunch of scores fall somewhere in between. In other words, as a doctor, the risk I will discuss with this patient is more strongly determined by which PRS I happen to choose than by his actual genetic risk, whatever that is.
This may seem counterintuitive. All these risk scores were similarly accurate within a population; how can they all give different results to an individual? The answer is simpler than you may think. As long as each wrong prediction a score makes is offset by a correct prediction somewhere else, its overall accuracy doesn’t change, so two scores can be equally accurate while disagreeing about which individuals they get right.
Let’s imagine we have a population of 40 people.
Risk score model 1 correctly classifies 30 of them, for 75% accuracy. Great.
Risk score model 2 also correctly classifies 30 of our 40 individuals, for 75% accuracy. It’s just a different 30.
Risk score model 3 also correctly classifies 30 of 40, but yet another different 30.
I’ve colored this to show you all the different overlaps. What you can see is that although each score has similar accuracy, the individual people have a bunch of different colors, indicating that some scores worked for them and some didn’t. That’s a real problem.
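Here is a tiny simulation of that idea, with entirely hypothetical data: three models that are each 75% accurate on the same 40 people can still disagree about which individuals they classify correctly.

```python
import random

random.seed(0)
N = 40

def make_model():
    """Pick the 30 of 40 people this model happens to classify correctly (75% accuracy)."""
    correct = set(random.sample(range(N), 30))
    return [i in correct for i in range(N)]

model_1, model_2, model_3 = make_model(), make_model(), make_model()

for name, model in [("1", model_1), ("2", model_2), ("3", model_3)]:
    print(f"Model {name} accuracy: {sum(model) / N:.0%}")  # 75% for each

# Equal accuracy, but the models disagree about individuals:
all_three_right = sum(a and b and c for a, b, c in zip(model_1, model_2, model_3))
print(f"People all three models get right: {all_three_right} of {N}")
```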
This has not stopped companies from advertising PRS for all sorts of diseases. Companies are even using PRS to decide which embryos to transfer during IVF therapy, which is a particularly egregious misuse of this technology that I have written about before.
How do you fix this? Our aliens tried to warn us. This is not how you are supposed to use this ray gun. You are supposed to use it to identify groups of people at higher risk to direct more resources to that group. That’s really all you can do.
It’s also possible that we need to do a better job of matching the risk score to the individual. Some of the score-to-score disagreement is likely driven by the fact that risk scores tend to work best in the populations in which they were developed, and many of them were developed in people of largely European ancestry.
It is worth noting that if a PRS had perfect accuracy at the population level, it would also necessarily have perfect accuracy at the individual level. But there aren’t any scores like that. It’s possible that combining various scores may increase the individual accuracy, but that hasn’t been demonstrated yet either.
Look, genetics plays and will continue to play a major role in healthcare. At the same time, sequencing entire genomes is a technology that is ripe for hype and thus misuse. Or even abuse. Fundamentally, this JAMA study reminds us that accuracy in a population and accuracy in an individual are not the same. But more deeply, it reminds us that just because a technology is new or cool or expensive doesn’t mean it will work in the clinic.
F. Perry Wilson, MD, MSCE, is an associate professor of medicine and public health and director of Yale’s Clinical and Translational Research Accelerator. His science communication work can be found in the Huffington Post, on NPR, and here on Medscape. He posts at @fperrywilson and his book, How Medicine Works and When It Doesn’t, is available now.