We Haven’t Kicked Our Pandemic Drinking Habit
This transcript has been edited for clarity.
You’re stuck in your house. Work is closed or you’re working remotely. Your kids’ school is closed or is offering an hour or two a day of Zoom-based instruction. You have a bit of cabin fever which, you suppose, is better than the actual fever that comes with COVID infections, which are running rampant during the height of the pandemic. But still — it’s stressful. What do you do?
We all coped in our own way. We baked sourdough bread. We built that tree house we’d been meaning to build. We started podcasts. And ... we drank. Quite a bit, actually.
During the first year of the pandemic, alcohol sales increased 3%, the largest year-on-year increase in more than 50 years. There was also an increase in drunkenness across the board, though it was most pronounced in those who were already at risk from alcohol use disorder.
Alcohol-associated deaths increased by around 10% from 2019 to 2020. Obviously, this is a small percentage of COVID-associated deaths, but it is nothing to sneeze at.
But look, we were anxious. And say what you will about alcohol as a risk factor for liver disease, heart disease, and cancer — not to mention traffic accidents — it is an anxiolytic, at least in the short term.
But as the pandemic waned, as society reopened, as we got back to work and reintegrated into our social circles and escaped the confines of our houses and apartments, our drinking habits went back to normal, right?
Americans’ love affair with alcohol has been a torrid one, as you can see in this graph of gallons of ethanol consumed per capita over time.
What you see is a steady increase in alcohol consumption from the end of Prohibition in 1933 to its peak in the heady days of the early 1980s, followed by a steady decline until the mid-1990s. Since then, there has been another increase, with a notable uptick during the early part of the COVID pandemic.
What came across my desk this week was updated data, appearing in a research letter in Annals of Internal Medicine, that compared alcohol consumption in 2020 — the first year of the COVID pandemic — with that in 2022 (the latest available data). And it looks like not much has changed.
This was a population-based survey study leveraging the National Health Interview Survey, including around 80,000 respondents from 2018, 2020, and 2022.
They created two main categories of drinking: drinking any alcohol at all and heavy drinking.
In 2018, 66% of Americans reported drinking any alcohol. That had risen to 69% by 2020, and it stayed at that level even after the lockdown had ended, as you can see here. This may seem like a small increase, but this was a highly significant result. Translating into absolute numbers, it suggests that we have added between 3,328,000 and 10,660,000 net additional drinkers to the population over this time period.
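To make that translation concrete, here is a rough sketch of the point-estimate arithmetic. The adult population figure is a round number I am assuming for illustration; it does not come from the study, and the function name is hypothetical. The study's reported range reflects statistical uncertainty that this simple calculation does not reproduce.

```python
def added_drinkers(pct_before: int, pct_after: int, adult_population: int) -> int:
    """Point estimate of net additional drinkers from a change in prevalence.

    Percentages are whole numbers (66 means 66%), so the arithmetic stays exact.
    """
    return (pct_after - pct_before) * adult_population // 100


# ASSUMPTION: a round figure for the US adult population, not taken from the study.
ASSUMED_ADULTS = 260_000_000

estimate = added_drinkers(66, 69, ASSUMED_ADULTS)
print(f"{estimate:,}")  # 7,800,000
```

Under that assumed population, a 3-point rise works out to about 7.8 million people, which sits inside the 3,328,000 to 10,660,000 range quoted above.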
This trend was seen across basically every demographic group, with some notably larger increases among Black and Hispanic individuals, and marginally higher rates among people under age 30.
But far be it from me to deny someone a tot of brandy on a cold winter’s night. More interesting is the rate of heavy alcohol use reported in the study. For context, the definitions of heavy alcohol use appear here. For men, it’s any one day with five or more drinks or 15 or more drinks per week. For women it’s four or more drinks on a given day or eight drinks or more per week.
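Those thresholds are easy to write down as a rule. Here is a minimal sketch, assuming the cutoffs exactly as stated above; the function name and the "male"/"female" labels are my own hypothetical choices, not the survey's coding.

```python
# Cutoffs as described in the text: (drinks on any one day, drinks per week).
HEAVY_THRESHOLDS = {
    "male": (5, 15),
    "female": (4, 8),
}


def is_heavy_drinking(sex: str, max_drinks_in_a_day: int, drinks_per_week: int) -> bool:
    """Return True if either the single-day or the weekly cutoff is met."""
    daily_cutoff, weekly_cutoff = HEAVY_THRESHOLDS[sex]
    return max_drinks_in_a_day >= daily_cutoff or drinks_per_week >= weekly_cutoff


print(is_heavy_drinking("male", 4, 14))    # False: below both cutoffs
print(is_heavy_drinking("female", 3, 8))   # True: weekly cutoff met
```

Note that meeting either condition is enough, so a single five-drink day counts for a man even if the rest of his week is dry.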
The overall rate of heavy drinking was about 5.1% in 2018, before the start of the pandemic. That rose to more than 6% in 2020, and it rose a bit more into 2022. The net change here, on a population level, is between 1,430,000 and 3,926,000 new heavy drinkers. That’s a number that rises to the level of an actual public health issue.
Again, this trend was fairly broad across demographic groups, although in this case the changes were a bit larger among White people and those in the 40- to 49-year age group. This is my cohort, I guess. Cheers.
The information we have from this study is purely descriptive. It tells us that people are drinking more since the pandemic. It doesn’t tell us why, or the impact that this excess drinking will have on subsequent health outcomes, although other studies would suggest that it will contribute to certain chronic conditions, both physical and mental.
Maybe more important is that it reminds us that habits are sticky. Once we become accustomed to something — that glass of wine or two with dinner, and before bed — it has a tendency to stay with us. There’s an upside to that phenomenon as well, of course; it means that we can train good habits too. And those, once they become ingrained, can be just as hard to break. We just need to be mindful of the habits we pick. New Year 2025 is just around the corner. Start brainstorming those resolutions now.
Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
Aliens, Ian McShane, and Heart Disease Risk
This transcript has been edited for clarity.
I was really struggling to think of a good analogy to explain the glaring problem of polygenic risk scores (PRS) this week. But I think I have it now. Go with me on this.
An alien spaceship parks itself, Independence Day style, above a local office building.
But unlike the aliens that gave such a hard time to Will Smith and Brent Spiner, these are benevolent, technologically superior guys. They shine a mysterious green light down on the building and then announce, maybe via telepathy, that 6% of the people in that building will have a heart attack in the next year.
They move on to the next building. “Five percent will have a heart attack in the next year.” And the next, 7%. And the next, 2%.
Let’s assume the aliens are entirely accurate. What do you do with this information?
Most of us would suggest that you find out who was in the buildings with the higher percentages. You check their cholesterol levels, get them to exercise more, do some stress tests, and so on.
But that said, you’d still be spending a lot of money on a bunch of people who were not going to have heart attacks. So, a crack team of spies — in my mind, this is definitely led by a grizzled Ian McShane — infiltrate the alien ship, steal this predictive ray gun, and start pointing it, not at buildings but at people.
In this scenario, one person could have a 10% chance of having a heart attack in the next year. Another person has a 50% chance. The aliens, seeing this, leave us one final message before flying into the great beyond: “No, you guys are doing it wrong.”
This week: the people and companies using an advanced predictive technology, PRS, the wrong way, and a study that shows just how problematic this is.
We all know that genes play a significant role in our health outcomes. Some diseases (Huntington disease, cystic fibrosis, sickle cell disease, hemochromatosis, and Duchenne muscular dystrophy, for example) are entirely driven by genetic mutations.
The vast majority of chronic diseases we face are not driven by genetics, but they may be enhanced by genetics. Coronary heart disease (CHD) is a prime example. There are clearly environmental risk factors, like smoking, that dramatically increase risk. But there are also genetic underpinnings; about half the risk for CHD comes from genetic variation, according to one study.
But in the case of those common diseases, it’s not one gene that leads to increased risk; it’s the aggregate effect of multiple risk genes, each contributing a small amount of risk to the final total.
The promise of PRS was based on this fact. Take the genome of an individual, identify all the risk genes, and integrate them into some final number that represents your genetic risk of developing CHD.
The way you derive a PRS is to take a big group of people and sequence their genomes. Then, you see who develops the disease of interest (in this case, CHD). If the people who develop CHD are more likely to have a particular mutation, that mutation goes in the risk score. Risk scores can integrate tens, hundreds, even thousands of individual mutations to create that final score.
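For a sense of what "integrating" mutations into one number can look like, here is a minimal sketch. I am assuming the score is a weighted sum of risk-allele counts, which is a common construction but not something this transcript specifies. The variant names and weights are hypothetical.

```python
def polygenic_risk_score(allele_counts: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted sum of risk-allele counts (0, 1, or 2 copies per variant).

    Variants missing from allele_counts are treated as 0 copies.
    """
    return sum(weight * allele_counts.get(variant, 0) for variant, weight in weights.items())


# HYPOTHETICAL weights: in practice these come from the study that built the score.
weights = {"variant_a": 0.5, "variant_b": 0.25, "variant_c": 1.0}

# HYPOTHETICAL person: copies of each risk allele they carry.
person = {"variant_a": 2, "variant_b": 0, "variant_c": 1}

print(polygenic_risk_score(person, weights))  # 2.0
```

Two published scores can use different variants and different weights, so the same person's genome can yield quite different numbers depending on which score is applied.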
There are literally dozens of PRS for CHD. And there are companies that will calculate yours right now for a reasonable fee.
The accuracy of these scores is assessed at the population level. It’s the alien ray gun thing. Researchers apply the PRS to a big group of people and say 20% of them should develop CHD. If indeed 20% develop CHD, they say the score is accurate. And that’s true.
But what happens next is the problem. Companies and even doctors have been marketing PRS to individuals. And honestly, it sounds amazing. “We’ll use sophisticated techniques to analyze your genetic code and integrate the information to give you your personal risk for CHD.” Or dementia. Or other diseases. A lot of people would want to know this information.
It turns out, though, that this is where the system breaks down. And it is nicely illustrated by this study, appearing November 16 in JAMA.
The authors wanted to see how PRS, which are developed to predict disease in a group of people, work when applied to an individual.
They identified 48 previously published PRS for CHD. They applied those scores to more than 170,000 individuals across multiple genetic databases. And, by and large, the scores worked as advertised, at least across the entire group. The weighted accuracy of all 48 scores was around 78%. They aren’t perfect, of course. We wouldn’t expect them to be, since CHD is not entirely driven by genetics. But 78% accurate isn’t too bad.
But that accuracy is at the population level. At the level of the office building. At the individual level, it was a vastly different story.
This is best illustrated by this plot, which shows the score from 48 different PRS for CHD within the same person. A note here: It is arranged by the publication date of the risk score, but these were all assessed on a single blood sample at a single point in time in this study participant.
The individual scores are all over the map. Using one risk score gives an individual a risk that is near the 99th percentile — a ticking time bomb of CHD. Another score indicates a level of risk at the very bottom of the spectrum — highly reassuring. A bunch of scores fall somewhere in between. In other words, as a doctor, the risk I will discuss with this patient is more strongly determined by which PRS I happen to choose than by his actual genetic risk, whatever that is.
This may seem counterintuitive. All these risk scores were similarly accurate within a population; how can they all give different results to an individual? The answer is simpler than you may think. As long as a given score makes one extra good prediction for each extra bad prediction, its accuracy is not changed.
Let’s imagine we have a population of 40 people.
Risk score model 1 correctly classified 30 of them for 75% accuracy. Great.
Risk score model 2 also correctly classified 30 of our 40 individuals, for 75% accuracy. It’s just a different 30.
Risk score model 3 also correctly classified 30 of 40, but another different 30.
I’ve colored this to show you all the different overlaps. What you can see is that although each score has similar accuracy, the individual people have a bunch of different colors, indicating that some scores worked for them and some didn’t. That’s a real problem.
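The 40-person thought experiment can be sketched in a few lines. The layout of who gets misclassified by which model is hypothetical, chosen only so that each model misses a different 10 people.

```python
PEOPLE = range(40)

# HYPOTHETICAL layout: each model misclassifies a different block of 10 people.
missed_by = {
    "model_1": set(range(0, 10)),
    "model_2": set(range(10, 20)),
    "model_3": set(range(20, 30)),
}


def accuracy(missed: set[int]) -> float:
    """Fraction of the population a model classified correctly."""
    return (len(PEOPLE) - len(missed)) / len(PEOPLE)


for name, missed in missed_by.items():
    print(name, accuracy(missed))  # each model prints 0.75

# People for whom every model was right, versus those failed by at least one.
all_right = [p for p in PEOPLE if all(p not in missed for missed in missed_by.values())]
print(len(all_right))                 # 10
print(len(PEOPLE) - len(all_right))   # 30
```

In this arrangement all three models report the same 75% accuracy, yet only 10 of the 40 people got a correct answer from every model. The population-level number hides the disagreement at the individual level.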
This has not stopped companies from advertising PRS for all sorts of diseases. Companies are even using PRS to decide which fetuses to implant during IVF therapy, which is a particularly egregiously wrong use of this technology that I have written about before.
How do you fix this? Our aliens tried to warn us. This is not how you are supposed to use this ray gun. You are supposed to use it to identify groups of people at higher risk to direct more resources to that group. That’s really all you can do.
It’s also possible that we need to match the risk score to the individual in a better way. The disagreement between scores is likely driven, at least in part, by the fact that risk scores tend to work best in the populations in which they were developed, and many of them were developed in people of largely European ancestry.
It is worth noting that if a PRS had perfect accuracy at the population level, it would also necessarily have perfect accuracy at the individual level. But there aren’t any scores like that. It’s possible that combining various scores may increase the individual accuracy, but that hasn’t been demonstrated yet either.
Look, genetics plays, and will continue to play, a major role in healthcare. At the same time, sequencing entire genomes is a technology that is ripe for hype and thus misuse. Or even abuse. Fundamentally, this JAMA study reminds us that accuracy in a population and accuracy in an individual are not the same. But more deeply, it reminds us that just because a technology is new or cool or expensive doesn’t mean it will work in the clinic.
Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Connecticut. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
This transcript has been edited for clarity.
I was really struggling to think of a good analogy to explain the glaring problem of polygenic risk scores (PRS) this week. But I think I have it now. Go with me on this.
An alien spaceship parks itself, Independence Day style, above a local office building.
But unlike the aliens that gave such a hard time to Will Smith and Brent Spiner, these are benevolent, technologically superior guys. They shine a mysterious green light down on the building and then announce, maybe via telepathy, that 6% of the people in that building will have a heart attack in the next year.
They move on to the next building. “Five percent will have a heart attack in the next year.” And the next, 7%. And the next, 2%.
Let’s assume the aliens are entirely accurate. What do you do with this information?
Most of us would suggest that you find out who was in the buildings with the higher percentages. You check their cholesterol levels, get them to exercise more, do some stress tests, and so on.
But that said, you’d still be spending a lot of money on a bunch of people who were not going to have heart attacks. So, a crack team of spies — in my mind, this is definitely led by a grizzled Ian McShane — infiltrate the alien ship, steal this predictive ray gun, and start pointing it, not at buildings but at people.
In this scenario, one person could have a 10% chance of having a heart attack in the next year. Another person has a 50% chance. The aliens, seeing this, leave us one final message before flying into the great beyond: “No, you guys are doing it wrong.”
This week: The people and companies using an advanced predictive technology, PRS , wrong — and a study that shows just how problematic this is.
We all know that genes play a significant role in our health outcomes. Some diseases (Huntington disease, cystic fibrosis, sickle cell disease, hemochromatosis, and Duchenne muscular dystrophy, for example) are entirely driven by genetic mutations.
The vast majority of chronic diseases we face are not driven by genetics, but they may be enhanced by genetics. Coronary heart disease (CHD) is a prime example. There are clearly environmental risk factors, like smoking, that dramatically increase risk. But there are also genetic underpinnings; about half the risk for CHD comes from genetic variation, according to one study.
But in the case of those common diseases, it’s not one gene that leads to increased risk; it’s the aggregate effect of multiple risk genes, each contributing a small amount of risk to the final total.
The promise of PRS was based on this fact. Take the genome of an individual, identify all the risk genes, and integrate them into some final number that represents your genetic risk of developing CHD.
The way you derive a PRS is take a big group of people and sequence their genomes. Then, you see who develops the disease of interest — in this case, CHD. If the people who develop CHD are more likely to have a particular mutation, that mutation goes in the risk score. Risk scores can integrate tens, hundreds, even thousands of individual mutations to create that final score.
There are literally dozens of PRS for CHD. And there are companies that will calculate yours right now for a reasonable fee.
The accuracy of these scores is assessed at the population level. It’s the alien ray gun thing. Researchers apply the PRS to a big group of people and say 20% of them should develop CHD. If indeed 20% develop CHD, they say the score is accurate. And that’s true.
But what happens next is the problem. Companies and even doctors have been marketing PRS to individuals. And honestly, it sounds amazing. “We’ll use sophisticated techniques to analyze your genetic code and integrate the information to give you your personal risk for CHD.” Or dementia. Or other diseases. A lot of people would want to know this information.
It turns out, though, that this is where the system breaks down. And it is nicely illustrated by this study, appearing November 16 in JAMA.
The authors wanted to see how PRS, which are developed to predict disease in a group of people, work when applied to an individual.
They identified 48 previously published PRS for CHD. They applied those scores to more than 170,000 individuals across multiple genetic databases. And, by and large, the scores worked as advertised, at least across the entire group. The weighted accuracy of all 48 scores was around 78%. They aren’t perfect, of course. We wouldn’t expect them to be, since CHD is not entirely driven by genetics. But 78% accurate isn’t too bad.
But that accuracy is at the population level. At the level of the office building. At the individual level, it was a vastly different story.
This is best illustrated by this plot, which shows the score from 48 different PRS for CHD within the same person. A note here: It is arranged by the publication date of the risk score, but these were all assessed on a single blood sample at a single point in time in this study participant.
The individual scores are all over the map. Using one risk score gives an individual a risk that is near the 99th percentile — a ticking time bomb of CHD. Another score indicates a level of risk at the very bottom of the spectrum — highly reassuring. A bunch of scores fall somewhere in between. In other words, as a doctor, the risk I will discuss with this patient is more strongly determined by which PRS I happen to choose than by his actual genetic risk, whatever that is.
This may seem counterintuitive. All these risk scores were similarly accurate within a population; how can they all give different results to an individual? The answer is simpler than you may think. As long as a given score makes one extra good prediction for each extra bad prediction, its accuracy is not changed.
Let’s imagine we have a population of 40 people.
Risk score model 1 correctly classified 30 of them for 75% accuracy. Great.
Risk score model 2 also correctly classified 30 of our 40 individuals, for 75% accuracy. It’s just a different 30.
Risk score model 3 also correctly classified 30 of 40, but another different 30.
I’ve colored this to show you all the different overlaps. What you can see is that although each score has similar accuracy, the individual people have a bunch of different colors, indicating that some scores worked for them and some didn’t. That’s a real problem.
This has not stopped companies from advertising PRS for all sorts of diseases. Companies are even using PRS to decide which fetuses to implant during IVF therapy, which is a particularly egregiously wrong use of this technology that I have written about before.
How do you fix this? Our aliens tried to warn us. This is not how you are supposed to use this ray gun. You are supposed to use it to identify groups of people at higher risk to direct more resources to that group. That’s really all you can do.
It’s also possible that we need to match the risk score to the individual in a better way. This is likely driven by the fact that risk scores tend to work best in the populations in which they were developed, and many of them were developed in people of largely European ancestry.
It is worth noting that if a PRS had perfect accuracy at the population level, it would also necessarily have perfect accuracy at the individual level. But there aren’t any scores like that. It’s possible that combining various scores may increase the individual accuracy, but that hasn’t been demonstrated yet either.
Look, genetics is and will continue to play a major role in healthcare. At the same time, sequencing entire genomes is a technology that is ripe for hype and thus misuse. Or even abuse. Fundamentally, this JAMA study reminds us that accuracy in a population and accuracy in an individual are not the same. But more deeply, it reminds us that just because a technology is new or cool or expensive doesn’t mean it will work in the clinic.
Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Connecticut. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
Is Being ‘Manly’ a Threat to a Man’s Health?
When my normally adorable cat Biscuit bit my ankle in a playful stalking exercise gone wrong, I washed it with soap and some rubbing alcohol, slapped on a Band-Aid, and went about my day.
The next morning, when it was swollen, I told myself it was probably just a hematoma and went about my day.
The next day, when the swelling had increased and red lines started creeping up my leg, I called my doctor. Long story short, I ended up hospitalized for intravenous antibiotics.
This is all to say that, yes, I’m sort of an idiot, but also to introduce the idea that maybe I minimized my very obvious lymphangitis because I am a man.
This week, we have empirical evidence that men downplay their medical symptoms — and that manlier men downplay them even more.
I’m going to talk about a study that links manliness (or, scientifically speaking, “male gender expressivity”) to medical diagnoses that are based on hard evidence and medical diagnoses that are based on self-report. You see where this is going, but I want to walk you through the methods because they are fairly interesting.
This study used data from the US National Longitudinal Study of Adolescent to Adult Health, which enrolled 20,000 adolescents who were in grades 7-12 in the 1994-1995 school year and has been following them ever since, about 30 years so far.
The authors wanted to link early gender roles to long-term outcomes, so they cut that 20,000 number down to the 4230 males in the group who had complete follow-up.
Now comes the first interesting question. How do you quantify the “male gender expressivity” of boys in 7th-12th grade? There was no survey item that asked them how masculine or manly they felt. What the authors did was look at the surveys that were administered and identify the questions on those surveys where boys and girls gave the most disparate answers. I have some examples here.
Some of these questions make sense when it comes to gender expressivity: “How often do you cry?” for example, has a lot of validity for the social construct that is gender. But some questions where boys and girls gave very different answers — like “How often do you exercise?” — don’t quite fit that mold. Regardless, this structure allowed the researchers to take individual kids’ responses to these questions and combine them into what amounts to a manliness score — how much their answers aligned with the typical male answer.
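The general idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the item names, the yes/no answers, and the simple "fraction of male-typical answers" scoring rule are made up, and the authors' actual statistical procedure is almost certainly more sophisticated.

```python
from statistics import mean

# Hypothetical yes/no survey answers (1 = yes, 0 = no); item names are invented.
boys = [
    {"cries_often": 0, "exercises_often": 1, "likes_math": 1},
    {"cries_often": 0, "exercises_often": 1, "likes_math": 0},
    {"cries_often": 1, "exercises_often": 1, "likes_math": 1},
    {"cries_often": 0, "exercises_often": 0, "likes_math": 0},
]
girls = [
    {"cries_often": 1, "exercises_often": 0, "likes_math": 1},
    {"cries_often": 1, "exercises_often": 1, "likes_math": 0},
    {"cries_often": 1, "exercises_often": 0, "likes_math": 1},
    {"cries_often": 0, "exercises_often": 0, "likes_math": 0},
]


def item_gaps(boys, girls):
    """Difference in the share of 'yes' answers (boys minus girls) for each item."""
    return {
        item: mean(b[item] for b in boys) - mean(g[item] for g in girls)
        for item in boys[0]
    }


def select_items(gaps, k):
    """Keep the k items where boys and girls answered most differently."""
    return sorted(gaps, key=lambda item: abs(gaps[item]), reverse=True)[:k]


def expressivity_score(person, selected, gaps):
    """Fraction of selected items answered the way boys typically answered."""
    matches = [person[item] == (1 if gaps[item] > 0 else 0) for item in selected]
    return sum(matches) / len(matches)


gaps = item_gaps(boys, girls)
selected = select_items(gaps, k=2)
scores = [expressivity_score(b, selected, gaps) for b in boys]

print(selected)  # the two items with the largest boy/girl gap
print(scores)    # one score per boy, between 0 and 1
```

Note that the selection step is purely empirical: an item like exercise frequency gets in because the answers differ by gender in the data, not because anyone decided it measures masculinity.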
The score was established in adolescence. That is a limitation, because I’m sure some of this stuff changes over time, but it is also notable, because adolescence is when many gender roles develop.
Now we can fast-forward 30 years and see how these manliness scores link to various outcomes. The authors were interested in fairly common diseases: diabetes, hypertension, and hyperlipidemia.
Let’s start simply. Are males with higher gender expressivity in adolescence more or less likely to have these diseases in the future?
Not really. Those above the median in male gender expressivity had rates of hypertension and hyperlipidemia similar to those below the median. They were actually a bit less likely to have diabetes.
But that’s not what’s really interesting here.
I told you that there was no difference in the rate of hypertension among those with high vs low male gender expressivity. But there was a significant difference in their answer to the question “Do you have hypertension?” The same was seen for hyperlipidemia. In other words, those with higher manliness scores are less likely to admit (or perhaps know) that they have a particular disease.
You can see the relationship across the manliness spectrum here in a series of adjusted models. The x-axis is the male gender expressivity score, and the y-axis is the percentage of people who report having the disease that we know they have based on the actual laboratory tests or vital sign measurements. As manliness increases, the self-report of a given disease decreases.
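The quantity on that y-axis is simple to compute. Here is a small Python sketch using invented records (they are not data from the study) in which the measured rate of hypertension is identical in the low and high expressivity groups, but self-report among those who actually have the disease is not.

```python
from statistics import median

# Hypothetical records, not study data:
# (expressivity score, hypertension on measurement, self-reports hypertension)
records = [
    (0.2, True, True), (0.3, True, True), (0.4, True, False), (0.3, False, False),
    (0.7, True, True), (0.8, True, False), (0.9, True, False), (0.8, False, False),
]


def rate(flags):
    """Share of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)


cut = median(score for score, _, _ in records)
low = [r for r in records if r[0] < cut]
high = [r for r in records if r[0] >= cut]

# Measured prevalence: who actually has hypertension.
prevalence = {
    "low": rate(measured for _, measured, _ in low),
    "high": rate(measured for _, measured, _ in high),
}

# Awareness: among those with measured hypertension, who reports having it.
awareness = {
    "low": rate(reported for _, measured, reported in low if measured),
    "high": rate(reported for _, measured, reported in high if measured),
}

print(prevalence)  # same measured rate in both groups
print(awareness)   # lower self-report in the high-score group
```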
There are some important consequences of this systematic denial. Specifically, men with the diseases of interest who have higher male gender expressivity are less likely to get treatment. And, as we all know, the lack of treatment of something like hypertension puts people at risk for bad downstream outcomes.
Putting this all together, I’m not that surprised. Society trains boys from a young age to behave in certain ways: to hide emotions, to eschew vulnerability, to not complain when we are hurt. And those lessons can persist into later life. Whether the disease that strikes is hypertension or Pasteurella multocida from a slightly psychotic house cat, men are more likely to ignore it, to their detriment.
So, gents, be brave. Get your blood tests and check your blood pressure. If there’s something wrong, admit it, and fix it. After all, fixing problems — that’s a manly thing, right?
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
Why Cardiac Biomarkers Don’t Help Predict Heart Disease
This transcript has been edited for clarity.
It’s the counterintuitive stuff in epidemiology that always really interests me. One intuition many of us have is that if a risk factor is significantly associated with an outcome, knowledge of that risk factor would help to predict that outcome. Makes sense. Feels right.
But it’s not right. Not always.
Here’s a fake example to illustrate my point. Let’s say we have 10,000 individuals who we follow for 10 years and 2000 of them die. (It’s been a rough decade.) At baseline, I measured a novel biomarker, the Perry Factor, in everyone. To keep it simple, the Perry Factor has only two values: 0 or 1.
I then do a standard associational analysis and find that individuals who are positive for the Perry Factor have 40-fold higher odds of death than those who are negative for it. I am beginning to reconsider ascribing my good name to this biomarker. This is a highly statistically significant result, with a P value <.001.
Clearly, knowledge of the Perry Factor should help me predict who will die in the cohort. I evaluate predictive power using a metric called the area under the receiver operating characteristic curve (AUC, referred to as the C-statistic in time-to-event studies). It tells you, given two people — one who dies and one who doesn’t — how frequently you “pick” the right person, given the knowledge of their Perry Factor.
A C-statistic of 0.5, or 50%, would mean the Perry Factor gives you no better results than a coin flip; it’s chance. A C-statistic of 1 is perfect prediction. So, what will the C-statistic be, given the incredibly strong association of the Perry Factor with outcomes? 0.9? 0.95?
0.5024. Almost useless.
Let’s figure out why strength of association and usefulness for prediction are not always the same thing.
I constructed my fake Perry Factor dataset quite carefully to illustrate this point. Let me show you what happened. What you see here is a breakdown of the patients in my fake study. You can see that just 11 of them were Perry Factor positive, but 10 of those 11 ended up dying.
That’s quite unlikely by chance alone. It really does appear that if you have Perry Factor, your risk for death is much higher. But the reason that Perry Factor is a bad predictor is that it is so rare in the population. Sure, you can use it to correctly predict the outcome of 10 of the 11 people who have it, but the vast majority of people don’t have Perry Factor. It’s useless for distinguishing who will die from who will live among everyone else.
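You can check both numbers from the counts given in the example: 10,000 people, 2000 deaths, 11 Perry Factor positives, 10 of whom died. The sketch below rebuilds the 2×2 table from those counts and computes the odds ratio and the C-statistic by counting pairs; the variable names are mine, and ties (pairs where both people have the same Perry Factor value) are scored as a coin flip, which is the usual convention.

```python
# Counts taken from the fake Perry Factor example in the text.
TOTAL, DEATHS = 10_000, 2_000
POSITIVE, POSITIVE_DEATHS = 11, 10

# Fill in the 2x2 table.
pos_died = POSITIVE_DEATHS
pos_lived = POSITIVE - POSITIVE_DEATHS
neg_died = DEATHS - POSITIVE_DEATHS
neg_lived = (TOTAL - DEATHS) - pos_lived

# Strength of association: odds of death if positive vs odds of death if negative.
odds_ratio = (pos_died / pos_lived) / (neg_died / neg_lived)

# C-statistic: over every (person who died, person who lived) pair, how often
# does the Perry Factor pick the one who died? Ties count as half.
pairs = DEATHS * (TOTAL - DEATHS)
concordant = pos_died * neg_lived   # the one who died is positive, the survivor negative
discordant = neg_died * pos_lived   # the survivor is positive, the one who died negative
tied = pairs - concordant - discordant
c_statistic = (concordant + 0.5 * tied) / pairs

print(round(odds_ratio, 1), round(c_statistic, 4))  # prints 40.2 0.5024
```

More than 99% of the pairs are ties, because both people are Perry Factor negative. That is where the predictive power goes.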
Why have I spent so much time trying to reverse our intuition that strength of association and strength of predictive power must be related? Because it helps to explain this paper, “Prognostic Value of Cardiovascular Biomarkers in the Population,” appearing in JAMA, which is a very nice piece of work trying to help us better predict cardiovascular disease.
I don’t need to tell you that cardiovascular disease is the number-one killer in this country and most of the world. I don’t need to tell you that we have really good preventive therapies and lifestyle interventions that can reduce the risk. But it would be nice to know in whom, specifically, we should use those interventions.
Cardiovascular risk scores, to date, are pretty simple. The most common one in use in the United States, the pooled cohort risk equation, has nine variables, two of which require a cholesterol panel and one a blood pressure test. It’s easy and it’s pretty accurate.
Using the score from the pooled cohort risk calculator, you get a C-statistic as high as 0.82 when applied to Black women, a low of 0.71 when applied to Black men. Non-Black individuals are in the middle. Not bad. But, clearly, not perfect.
And aren’t we in the era of big data, the era of personalized medicine? We have dozens, maybe hundreds, of quantifiable biomarkers that are associated with subsequent heart disease. Surely, by adding these biomarkers into the risk equation, we can improve prediction. Right?
The JAMA study includes 164,054 patients pooled from 28 cohort studies from 12 countries. All the studies measured various key biomarkers at baseline and followed their participants for cardiovascular events like heart attack, stroke, coronary revascularization, and so on.
The biomarkers in question are really the big guns in this space: troponin, a marker of stress on the heart muscle; NT-proBNP, a marker of stretch on the heart muscle; and C-reactive protein, a marker of inflammation. In every case, higher levels of these markers at baseline were associated with a higher risk for cardiovascular disease in the future.
Troponin T, shown here, has a basically linear relationship with the risk for subsequent cardiovascular disease.
BNP seems to demonstrate more of a threshold effect, where levels above 60 start to associate with problems.
And CRP does a similar thing, with levels above 1.
All of these findings were statistically significant. If you have higher levels of one or more of these biomarkers, you are more likely to have cardiovascular disease in the future.
Of course, our old friend the pooled cohort risk equation is still here — in the background — requiring just that one blood test and measurement of blood pressure. Let’s talk about predictive power.
The pooled cohort risk equation score, in this study, had a C-statistic of 0.812.
By adding troponin, BNP, and CRP to the equation, the new C-statistic is 0.819. Barely any change.
Now, the authors looked at different types of prediction here. The greatest improvement in the AUC was seen when they tried to predict heart failure within 1 year of measurement; there the AUC improved by 0.04. But the presence of BNP as a biomarker and the short time window of 1 year make me wonder whether this is really prediction at all or whether they were essentially just diagnosing people with existing heart failure.
Why does this happen? Why do these promising biomarkers, clearly associated with bad outcomes, fail to improve our ability to predict the future? I already gave one example, which has to do with how the markers are distributed in the population. But even more relevant here is that the new markers will only improve prediction insofar as they are not already represented in the old predictive model.
Of course, BNP, for example, wasn’t in the old model. But smoking was. Diabetes was. Blood pressure was. All of that data might actually tell you something about the patient’s BNP through their mutual correlation. And improvement in prediction requires new information.
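Here is a toy Python sketch of that redundancy. The cohort is entirely hypothetical (it is not the JAMA data): one binary risk factor that stands in for the old model, and one binary new marker that is strongly correlated with it. The helper names are my own. The marker alone is strongly associated with the outcome (roughly fivefold odds) and has a respectable AUC by itself, but adding it to the old model barely moves the needle.

```python
def auc_from_groups(groups):
    """AUC from grouped data: a list of (predicted_risk, n_events, n_nonevents).

    Counts every (event, non-event) pair; ties in predicted risk count as half.
    """
    total_events = sum(e for _, e, _ in groups)
    total_nonevents = sum(n for _, _, n in groups)
    score = 0.0
    for risk_e, events, _ in groups:
        for risk_n, _, nonevents in groups:
            if risk_e > risk_n:
                score += events * nonevents
            elif risk_e == risk_n:
                score += 0.5 * events * nonevents
    return score / (total_events * total_nonevents)


# Hypothetical cohort: (old risk factor, new marker) -> (events, non-events).
# The marker agrees with the old risk factor in 90% of people.
cells = {
    (0, 0): (81, 819),
    (0, 1): (19, 81),
    (1, 0): (30, 70),
    (1, 1): (370, 530),
}


def risk_by(key):
    """Observed event rate within each level of key(cell)."""
    totals = {}
    for cell, (events, nonevents) in cells.items():
        ev, tot = totals.get(key(cell), (0, 0))
        totals[key(cell)] = (ev + events, tot + events + nonevents)
    return {level: ev / tot for level, (ev, tot) in totals.items()}


def auc_for(key):
    """AUC of a model that predicts the observed event rate for each level of key."""
    risks = risk_by(key)
    return auc_from_groups([(risks[key(c)], e, n) for c, (e, n) in cells.items()])


auc_old = auc_for(lambda c: c[0])     # old risk factor only
auc_marker = auc_for(lambda c: c[1])  # new marker only
auc_both = auc_for(lambda c: c)       # old risk factor plus new marker

print(round(auc_old, 3), round(auc_marker, 3), round(auc_both, 3))
```

In this made-up cohort the old model has an AUC of 0.700, the marker alone about 0.685, and the combined model about 0.713. Most of what the marker knows, the old model already knew.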
This is actually why I consider this a really successful study. We need to do studies like this to help us find what those new sources of information might be.
We will never get to a C-statistic of 1. Perfect prediction is the domain of palm readers and astrophysicists. But better prediction is always possible through data. The big question, of course, is which data?
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
This transcript has been edited for clarity.
It’s the counterintuitive stuff in epidemiology that always really interests me. One intuition many of us have is that if a risk factor is significantly associated with an outcome, knowledge of that risk factor would help to predict that outcome. Makes sense. Feels right.
But it’s not right. Not always.
Here’s a fake example to illustrate my point. Let’s say we have 10,000 individuals who we follow for 10 years and 2000 of them die. (It’s been a rough decade.) At baseline, I measured a novel biomarker, the Perry Factor, in everyone. To keep it simple, the Perry Factor has only two values: 0 or 1.
I then do a standard associational analysis and find that individuals who are positive for the Perry Factor have a 40-fold higher odds of death than those who are negative for it. I am beginning to reconsider ascribing my good name to this biomarker. This is a highly statistically significant result — a P value <.001.
Clearly, knowledge of the Perry Factor should help me predict who will die in the cohort. I evaluate predictive power using a metric called the area under the receiver operating characteristic curve (AUC, referred to as the C-statistic in time-to-event studies). It tells you, given two people — one who dies and one who doesn’t — how frequently you “pick” the right person, given the knowledge of their Perry Factor.
A C-statistic of 0.5, or 50%, would mean the Perry Factor gives you no better results than a coin flip; it’s chance. A C-statistic of 1 is perfect prediction. So, what will the C-statistic be, given the incredibly strong association of the Perry Factor with outcomes? 0.9? 0.95?
0.5024. Almost useless.
Let’s figure out why strength of association and usefulness for prediction are not always the same thing.
I constructed my fake Perry Factor dataset quite carefully to illustrate this point. Let me show you what happened. What you see here is a breakdown of the patients in my fake study. You can see that just 11 of them were Perry Factor positive, but 10 of those 11 ended up dying.
That’s quite unlikely by chance alone. It really does appear that if you have Perry Factor, your risk for death is much higher. But Perry Factor is a bad predictor because it is so rare in the population. Sure, you can use it to correctly predict the outcome of 10 of the 11 people who have it, but the vast majority of people don’t have Perry Factor. It’s useless for distinguishing who will die from who will live in that population.
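The arithmetic behind the fake example can be checked in a few lines. This is a minimal sketch, assuming the 2×2 breakdown implied by the numbers quoted above (10,000 people, 2000 deaths, 11 Perry Factor positive, 10 of those 11 dying); the variable names are illustrative only.

```python
# 2x2 table reconstructed from the numbers quoted in the text (an assumption:
# 10,000 people, 2,000 deaths, 11 Perry Factor positive, 10 of those 11 died).
total, deaths = 10_000, 2_000
pos_total, pos_died = 11, 10

pos_survived = pos_total - pos_died              # 1
neg_died = deaths - pos_died                     # 1,990
neg_survived = (total - deaths) - pos_survived   # 7,999

# Odds ratio: odds of death if positive divided by odds of death if negative.
odds_ratio = (pos_died / pos_survived) / (neg_died / neg_survived)

# C-statistic for a binary marker: consider every (died, survived) pair.
# A pair counts 1 if only the person who died is positive, and 0.5
# (a coin flip) if both people have the same marker value.
pairs = deaths * (total - deaths)
correct = pos_died * neg_survived
ties = pos_died * pos_survived + neg_died * neg_survived
c_statistic = (correct + 0.5 * ties) / pairs

print(f"odds ratio  = {odds_ratio:.1f}")
print(f"C-statistic = {c_statistic:.4f}")
```

Under these assumptions the script prints an odds ratio of 40.2 and a C-statistic of 0.5024. Nearly all of the 16 million pairs are ties in which neither person is Perry Factor positive, so the pick is a coin flip almost every time.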
Why have I spent so much time trying to reverse our intuition that strength of association and strength of predictive power must be related? Because it helps to explain this paper, “Prognostic Value of Cardiovascular Biomarkers in the Population,” appearing in JAMA, which is a very nice piece of work trying to help us better predict cardiovascular disease.
I don’t need to tell you that cardiovascular disease is the number-one killer in this country and most of the world. I don’t need to tell you that we have really good preventive therapies and lifestyle interventions that can reduce the risk. But it would be nice to know in whom, specifically, we should use those interventions.
Cardiovascular risk scores, to date, are pretty simple. The most common one in use in the United States, the pooled cohort risk equation, has nine variables, two of which require a cholesterol panel and one a blood pressure test. It’s easy and it’s pretty accurate.
Using the score from the pooled cohort risk calculator, you get a C-statistic as high as 0.82 when applied to Black women, a low of 0.71 when applied to Black men. Non-Black individuals are in the middle. Not bad. But, clearly, not perfect.
And aren’t we in the era of big data, the era of personalized medicine? We have dozens, maybe hundreds, of quantifiable biomarkers that are associated with subsequent heart disease. Surely, by adding these biomarkers into the risk equation, we can improve prediction. Right?
The JAMA study includes 164,054 patients pooled from 28 cohort studies from 12 countries. All the studies measured various key biomarkers at baseline and followed their participants for cardiovascular events like heart attack, stroke, coronary revascularization, and so on.
The biomarkers in question are really the big guns in this space: troponin, a marker of stress on the heart muscle; NT-proBNP, a marker of stretch on the heart muscle; and C-reactive protein, a marker of inflammation. In every case, higher levels of these markers at baseline were associated with a higher risk for cardiovascular disease in the future.
Troponin T, shown here, has a basically linear risk with subsequent cardiovascular disease.
BNP seems to demonstrate more of a threshold effect, where levels above 60 start to associate with problems.
And CRP does a similar thing, with levels above 1.
All of these findings were statistically significant. If you have higher levels of one or more of these biomarkers, you are more likely to have cardiovascular disease in the future.
Of course, our old friend the pooled cohort risk equation is still here — in the background — requiring just that one blood test and measurement of blood pressure. Let’s talk about predictive power.
The pooled cohort risk equation score, in this study, had a C-statistic of 0.812.
By adding troponin, BNP, and CRP to the equation, the new C-statistic is 0.819. Barely any change.
Now, the authors looked at different types of prediction here. The greatest improvement in the AUC was seen when they tried to predict heart failure within 1 year of measurement; there the AUC improved by 0.04. But the presence of BNP as a biomarker and the short time window of 1 year makes me wonder whether this is really prediction at all or whether they were essentially just diagnosing people with existing heart failure.
Why does this happen? Why do these promising biomarkers, clearly associated with bad outcomes, fail to improve our ability to predict the future? I already gave one example, which has to do with how the markers are distributed in the population. But even more relevant here is that the new markers will only improve prediction insofar as they are not already represented in the old predictive model.
Of course, BNP, for example, wasn’t in the old model. But smoking was. Diabetes was. Blood pressure was. All of that data might actually tell you something about the patient’s BNP through their mutual correlation. And improvement in prediction requires new information.
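The "new information" point can be illustrated with a toy simulation. This is only a sketch on synthetic data, not the JAMA analysis: every coefficient, the event threshold, and the names (`old_score`, `marker`, `combined`) are hypothetical choices made so that the marker mostly tracks what the old score already captures.

```python
import random

def auc(scores, outcomes):
    """Rank-based C-statistic: chance that a random case outscores a random non-case."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if outcomes[i])
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = random.Random(0)  # fixed seed so the run is repeatable
old_score, marker, combined, event = [], [], [], []
for _ in range(5000):
    a = rng.gauss(0, 1)   # risk already captured by the traditional factors
    b = rng.gauss(0, 1)   # biology that only the new marker can see
    risk = a + 0.2 * b    # hypothetical: true risk is driven mostly by "a"
    old_score.append(a)
    marker.append(0.9 * a + 0.44 * b)   # marker largely overlaps the old factors
    combined.append(risk)               # best case: a model that recovers "b" fully
    event.append(risk + rng.gauss(0, 1) > 1.2)  # noisy outcome, a minority have events

auc_old = auc(old_score, event)
auc_marker = auc(marker, event)
auc_combined = auc(combined, event)
print(f"old score alone : {auc_old:.3f}")
print(f"marker alone    : {auc_marker:.3f}")
print(f"old + marker    : {auc_combined:.3f}")
```

In this setup the marker on its own is clearly associated with the outcome, yet adding it to the old score should move the C-statistic only slightly, because most of what the marker "knows" is already in the old score.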
This is actually why I consider this a really successful study. We need to do studies like this to help us find what those new sources of information might be.
We will never get to a C-statistic of 1. Perfect prediction is the domain of palm readers and astrophysicists. But better prediction is always possible through data. The big question, of course, is which data?
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
It Would Be Nice if Olive Oil Really Did Prevent Dementia
This transcript has been edited for clarity.
As you all know by now, I’m always looking out for lifestyle changes that are both pleasurable and healthy. They are hard to find, especially when it comes to diet. My kids complain about this all the time: “When you say ‘healthy food,’ you just mean yucky food.” And yes, French fries are amazing, and no, we can’t have them three times a day.
So, when I saw an article claiming that olive oil reduces the risk for dementia, I was interested. I love olive oil; I cook with it all the time. But as is always the case in the world of nutritional epidemiology, we need to be careful. There are a lot of reasons to doubt the results of this study — and one reason to believe it’s true.
The study I’m talking about is “Consumption of Olive Oil and Diet Quality and Risk of Dementia-Related Death,” appearing in JAMA Network Open and following a well-trod formula in the nutritional epidemiology space.
Nearly 100,000 participants, all healthcare workers, filled out a food frequency questionnaire every 4 years with 130 questions touching on all aspects of diet: How often do you eat bananas, bacon, olive oil? Participants were followed for more than 20 years, and if they died, the cause of death was flagged as being dementia-related or not. Over that time frame there were around 38,000 deaths, of which 4751 were due to dementia.
The rest is just statistics. The authors show that those who reported consuming more olive oil were less likely to die from dementia — about 50% less likely, if you compare those who reported eating more than 7 grams of olive oil a day with those who reported eating none.
Is It What You Eat, or What You Don’t Eat?
And we could stop there if we wanted to; I’m sure big olive oil would be happy with that. Is there such a thing as “big olive oil”? But no, we need to dig deeper here because this study has the same problems as all nutritional epidemiology studies. Number one, no one is sitting around drinking small cups of olive oil. They consume it with other foods. And it was clear from the food frequency questionnaire that people who consumed more olive oil also consumed less red meat, more fruits and vegetables, more whole grains, more butter, and less margarine. And those are just the findings reported in the paper. I suspect that people who eat more olive oil also eat more tomatoes, for example, though data this granular aren’t shown. So, it can be really hard, in studies like this, to know for sure that it’s actually the olive oil that is helpful rather than some other constituent in the diet.
The flip side of that coin presents another issue. The food you eat is also a marker of the food you don’t eat. People who ate olive oil consumed less margarine, for example. At the time of this study, margarine was still adulterated with trans-fats, which a pretty solid evidence base suggests are really bad for your vascular system. So perhaps it’s not that olive oil is particularly good for you but that something else is bad for you. In other words, simply adding olive oil to your diet without changing anything else may not do anything.
The other major problem with studies of this sort is that people don’t consume food at random. The type of person who eats a lot of olive oil is simply different from the type of person who doesn’t. For one thing, olive oil is expensive. A 25-ounce bottle of olive oil is on sale at my local supermarket right now for $11.00. A similar-sized bottle of vegetable oil goes for $4.00.
Isn’t it interesting that food that costs more money tends to be associated with better health outcomes? (I’m looking at you, red wine.) Perhaps it’s not the food; perhaps it’s the money. We aren’t provided data on household income in this study, but we can see that the heavy olive oil users were less likely to be current smokers and they got more physical activity.
Now, the authors are aware of these limitations and do their best to account for them. In multivariable models, they adjust for other stuff in the diet, and even for income (sort of; they use census tract as a proxy for income, which is really a broad brush), and still find a significant though weakened association showing a protective effect of olive oil on dementia-related death. But still — adjustment is never perfect, and the small effect size here could definitely be due to residual confounding.
Evidence More Convincing
Now, I did tell you that there is one reason to believe that this study is true, but it’s not really from this study.
It’s from the PREDIMED randomized trial.
This is nutritional epidemiology I can get behind. In this trial, published in 2018, investigators in Spain randomized around 7500 participants to receive a liter of olive oil once a week vs mixed nuts vs small nonfood gifts, the idea here being that if you have olive oil around, you’ll use it more. And people who were randomly assigned to get the olive oil had a 30% lower rate of cardiovascular events. A secondary analysis of that study found that the rate of development of mild cognitive impairment was 65% lower in those who were randomly assigned to olive oil. That’s an impressive result.
So, there might be something to this olive oil thing, but I’m not quite ready to add it to my “pleasurable things that are still good for you” list just yet. Though it does make me wonder: Can we make French fries in the stuff?
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
Intermittent Fasting + HIIT: Fitness Fad or Fix?
Let’s be honest: Although as physicians we have the power of the prescription pad, so much of health, in the end, comes down to lifestyle. Of course, taking a pill is often way easier than changing your longstanding habits. And what’s worse, doesn’t it always seem like the lifestyle stuff that is good for your health is unpleasant?
Two recent lifestyle interventions that I have tried and find really not enjoyable are time-restricted eating (also known as intermittent fasting) and high-intensity interval training, or HIIT. The former leaves me hangry for half the day; the latter is, well, it’s just really hard compared with my usual jog.
But given the rule of unpleasant lifestyle changes, I knew as soon as I saw this recent study what the result would be. What if we combined time-restricted eating with high-intensity interval training?
I’m referring to this study, appearing in PLOS ONE from Ranya Ameur and colleagues, which is a small trial that enrolled otherwise healthy women with a BMI > 30 and randomized them to one of three conditions.
First was time-restricted eating. Women in this group could eat whatever they wanted, but only from 8 a.m. to 4 p.m. daily.
Second: high-intensity functional training. This is a variant of high-intensity interval training which focuses a bit more on resistance exercise than on pure cardiovascular stuff but has the same vibe of doing brief bursts of intensive activity followed by a cool-down period.
Third: a combination of the two. Sounds rough to me.
The study was otherwise straightforward. At baseline, researchers collected data on body composition and dietary intake, and measured blood pressure, glucose, insulin, and lipid biomarkers. A 12-week intervention period followed, after which all of this stuff was measured again.
Now, you may have noticed that there is no control group in this study. We’ll come back to that — a few times.
Let me walk you through some of the outcomes here.
First off, body composition metrics. All three groups lost weight — on average, around 10% of body weight which, for a 12-week intervention, is fairly impressive. BMI and waist circumference went down as well, and, interestingly, much of the weight loss here was in fat mass, not fat-free mass.
Most interventions that lead to weight loss — and I’m including some of the newer drugs here — lead to both fat and muscle loss. That might not be as bad as it sounds; the truth is that muscle mass increases as fat increases because of the simple fact that if you’re carrying more weight when you walk around, your leg muscles get bigger. But to preserve muscle mass in the face of fat loss is sort of a Goldilocks finding, and, based on these results, there’s a suggestion that the high-intensity functional training helps to do just that.
The dietary intake findings were really surprising to me. Across the board, caloric intake decreased. It’s no surprise that time-restricted eating reduces calorie intake. That has been shown many times before and is probably the main reason it induces weight loss — less time to eat means you eat less.
But why would high-intensity functional training lead to lower caloric intake? Most people, myself included, get hungry after they exercise. In fact, one of the reasons it’s hard to lose weight with exercise alone is that we end up eating more calories to make up for what we lost during the exercise. This calorie reduction could be a unique effect of this type of exercise, but honestly this could also be something called the Hawthorne effect. Women in the study kept a food diary to track their intake, and the act of doing that might actually make you eat less. It makes it a little more annoying to snack a bit if you know you have to write it down. This is a situation where I would kill for a control group.
The lipid findings are also pretty striking, with around a 40% reduction in LDL across the board, and evidence of synergistic effects of combined time-restricted eating (TRE) and high-intensity training on total cholesterol and triglycerides. This is like a statin level of effect, which is pretty impressive. Again, my heart pines for a control group, though.
Same story with glucose and insulin measures: an impressive reduction in fasting glucose and good evidence that the combination of time-restricted eating and high-intensity functional training reduces insulin levels and HOMA-IR as well.
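For anyone who hasn't calculated HOMA-IR in a while, here is a minimal sketch of the standard formula: fasting glucose (mg/dL) times fasting insulin (µU/mL), divided by 405. The function name and the before-and-after values are made up for illustration; they are not numbers from the study.

```python
def homa_ir(fasting_glucose_mg_dl: float, fasting_insulin_uU_ml: float) -> float:
    """HOMA-IR = glucose (mg/dL) x insulin (microU/mL) / 405."""
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0


# Hypothetical participant, NOT data from the trial.
before = homa_ir(100.0, 16.2)  # baseline
after = homa_ir(90.0, 9.0)     # after a 12-week intervention

print(f"HOMA-IR before: {before:.2f}, after: {after:.2f}")
```

Because the index is a product, a modest drop in glucose combined with a larger drop in insulin moves it a lot, which is why the insulin finding matters as much as the glucose one.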
Really the only thing that wasn’t very impressive was the change in blood pressure, with only modest decreases across the board.
Okay, so let’s take a breath after this high-intensity cerebral workout and put this all together. This was a small study, lacking a control group, but with large effect sizes in very relevant clinical areas. It confirms what we know about time-restricted eating (that it makes you eat fewer calories) and introduces the potential that vigorous exercise can not only magnify the benefits of time-restricted eating but maybe even mitigate some of the risks, like the risk for muscle loss. And of course, it comports with my central hypothesis, which is that the more unpleasant a lifestyle intervention is, the better it is for you. No pain, no gain, right?
Of course, I am being overly dogmatic. There are plenty of caveats. Wrestling bears is quite unpleasant and almost certainly bad for you. And there are even some pleasant things that are pretty good for you — like coffee and sex. And there are even people who find time-restricted eating and high-intensity training pleasurable. They are called masochists.
I’m joking. The truth is that, with time, these changes tend to become enjoyable. Or, at least, much less painful. The trick is getting over the hump of change. If only there were a pill for that.
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Connecticut. He has disclosed no relevant financial relationships. This transcript has been edited for clarity.
A version of this article appeared on Medscape.com.
Are Women Better Doctors Than Men?
This transcript has been edited for clarity.
It’s a battle of the sexes today as we dive into a paper that makes you say, “Wow, what an interesting study” and also “Boy, am I glad I didn’t do that study.” That’s because studies like this are always somewhat fraught; they say something about medicine but also something about society — and that makes this a bit precarious. But that’s never stopped us before. So, let’s go ahead and try to answer the question: Do women make better doctors than men?
On the surface, this question seems nearly impossible to answer. It’s too broad; what does it mean to be a “better” doctor? At first blush it seems that there are just too many variables to control for here: the type of doctor, the type of patient, the clinical scenario, and so on.
But this study, “Comparison of hospital mortality and readmission rates by physician and patient sex,” which appears in Annals of Internal Medicine, uses a fairly ingenious method to cut through all the bias by leveraging two simple facts: First, hospital medicine is largely conducted by hospitalists these days; second, due to the shift-based nature of hospitalist work, the hospitalist you get when you are admitted to the hospital is pretty much random.
In other words, if you are admitted to the hospital for an acute illness and get a hospitalist as your attending, you have no control over whether it is a man or a woman. Is this a randomized trial? No, but it’s not bad.
Researchers used Medicare claims data to identify adults over age 65 who had nonelective hospital admissions throughout the United States. The claims revealed the sex of the patient and the name of the attending physician. By linking to a medical provider database, they could determine the sex of the provider.
The goal was to look at outcomes across four dyads:
- Male patient – male doctor
- Male patient – female doctor
- Female patient – male doctor
- Female patient – female doctor
The primary outcome was 30-day mortality.
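To make the four-dyad comparison concrete, here is a rough sketch of how you might tabulate 30-day mortality by patient sex and physician sex. The record layout and the handful of rows are entirely made up; the real analysis used Medicare claims and adjusted statistical models, which this does not attempt.

```python
from collections import defaultdict

# Hypothetical toy rows: (patient_sex, physician_sex, died_within_30_days).
# These are NOT study data; the rates they produce mean nothing.
records = [
    ("M", "M", True), ("M", "M", False),
    ("M", "F", False), ("M", "F", True),
    ("F", "M", False), ("F", "M", False),
    ("F", "F", True), ("F", "F", False),
]


def mortality_by_dyad(rows):
    """Return crude 30-day mortality for each (patient sex, physician sex) pair."""
    deaths = defaultdict(int)
    totals = defaultdict(int)
    for patient_sex, physician_sex, died in rows:
        key = (patient_sex, physician_sex)
        totals[key] += 1
        deaths[key] += int(died)
    return {key: deaths[key] / totals[key] for key in totals}


rates = mortality_by_dyad(records)
for (patient_sex, physician_sex), rate in sorted(rates.items()):
    print(f"patient {patient_sex} / doctor {physician_sex}: {rate:.1%}")
```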
I told you that focusing on hospitalists produces some pseudorandomization, but let’s look at the data to be sure. Just under a million patients were treated by approximately 50,000 physicians, 30% of whom were female. And, though female patients and male patients differed, they did not differ with respect to the sex of their hospitalist. So, by physician sex, patients were similar in mean age, race, ethnicity, household income, eligibility for Medicaid, and comorbid conditions. The authors even created a “predicted mortality” score which was similar across the groups as well.
Now, the female physicians were a bit different from the male physicians. The female hospitalists were slightly more likely to have an osteopathic degree, had slightly fewer admissions per year, and were a bit younger.
So, we have broadly similar patients regardless of who their hospitalist was, but hospitalists differ by factors other than their sex. Fine.
I’ve graphed the results here.
This is a relatively small effect, to be sure, but if you multiply it across the millions of hospitalist admissions per year, you can start to put up some real numbers.
So, what is going on here? I see four broad buckets of possibilities.
Let’s start with the obvious explanation: Women, on average, are better doctors than men. I am married to a woman doctor, and based on my personal experience, this explanation is undoubtedly true. But why would that be?
The authors cite data that suggest that female physicians are less likely than male physicians to dismiss patient concerns — and in particular, the concerns of female patients — perhaps leading to fewer missed diagnoses. But this is impossible to measure with administrative data, so this study can no more tell us whether these female hospitalists are more attentive than their male counterparts than it can suggest that the benefit is mediated by the shorter average height of female physicians. Perhaps the key is being closer to the patient?
The second possibility here is that this has nothing to do with the sex of the physician at all; it has to do with those other things that associate with the sex of the physician. We know, for example, that the female physicians saw fewer patients per year than the male physicians, but the study authors adjusted for this in the statistical models. Still, other unmeasured factors (confounders) could be present. By the way, confounders wouldn’t necessarily change the primary finding — you are better off being cared for by female physicians. It’s just not because they are female; it’s a convenient marker for some other quality, such as age.
The third possibility is that the study represents a phenomenon called collider bias. The idea here is that physicians only get into the study if they are hospitalists, and the quality of physicians who choose to become a hospitalist may differ by sex. When deciding on a specialty, a talented resident considering certain lifestyle issues may find hospital medicine particularly attractive — and that draw toward a more lifestyle-friendly specialty may differ by sex, as some prior studies have shown. If true, the pool of women hospitalists may be better than their male counterparts because male physicians of that caliber don’t become hospitalists.
Okay, don’t write in. I’m just trying to cite examples of how to think about collider bias. I can’t prove that this is the case, and in fact the authors do a sensitivity analysis of all physicians, not just hospitalists, and show the same thing. So this is probably not true, but epidemiology is fun, right?
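If you want to see collider bias in action, here is a small simulation sketch. Everything in it is an assumption for illustration: physician "quality" is drawn from the same distribution for both sexes, and the selection rule (high-quality men are less likely to choose hospital medicine) is invented, not something the study measured.

```python
import random


def simulate(n=200_000, seed=0):
    rng = random.Random(seed)
    everyone = {"F": [], "M": []}
    hospitalists = {"F": [], "M": []}
    for _ in range(n):
        sex = rng.choice("FM")
        quality = rng.gauss(0.0, 1.0)  # identical distribution for both sexes
        everyone[sex].append(quality)
        # Hypothetical selection rule: talented men tend to pick other fields.
        if sex == "M" and quality >= 0.5:
            p_hospitalist = 0.2
        elif sex == "M":
            p_hospitalist = 0.5
        else:
            p_hospitalist = 0.4
        if rng.random() < p_hospitalist:
            hospitalists[sex].append(quality)

    def mean(values):
        return sum(values) / len(values)

    overall_gap = mean(everyone["F"]) - mean(everyone["M"])
    selected_gap = mean(hospitalists["F"]) - mean(hospitalists["M"])
    return overall_gap, selected_gap


overall_gap, selected_gap = simulate()
print(f"Female minus male quality, all physicians: {overall_gap:+.3f}")
print(f"Female minus male quality, hospitalists only: {selected_gap:+.3f}")
```

Among all simulated physicians the gap sits near zero, but once you look only at hospitalists a gap favoring women appears, purely because of who was selected into the group.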
And the fourth possibility: This is nothing but statistical noise. The effect size is incredibly small and just on the border of statistical significance. Especially when you’re working with very large datasets like this, you’ve got to be really careful about overinterpreting statistically significant findings that are nevertheless of small magnitude.
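Here is a quick sketch of why sample size matters so much for this point. It runs a simple two-proportion z-test on the same small difference in mortality at two different sample sizes. The rates (8.4% vs 8.2%) and group sizes are hypothetical, chosen only to illustrate the idea; they are not the study's figures.

```python
import math


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value for a difference in proportions, equal group sizes."""
    pooled = (p1 + p2) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical mortality rates, identical in both scenarios.
p_small = two_proportion_p_value(0.084, 0.082, 4_000)
p_large = two_proportion_p_value(0.084, 0.082, 400_000)

print(f"n = 4,000 per group:   p = {p_small:.3f}")
print(f"n = 400,000 per group: p = {p_large:.4f}")
```

The difference is the same 0.2 percentage points in both runs; only the larger sample pushes it under the conventional 0.05 threshold. Statistical significance tells you about precision, not about whether the effect is big enough to matter.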
Regardless, it’s an interesting study, one that made me think and, of course, worry a bit about how I would present it. Forgive me if I’ve been indelicate in handling the complex issues of sex, gender, and society here. But I’m not sure what you expect; after all, I’m only a male doctor.
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
Okay, don’t write in. I’m just trying to cite examples of how to think about collider bias. I can’t prove that this is the case, and in fact the authors do a sensitivity analysis of all physicians, not just hospitalists, and show the same thing. So this is probably not true, but epidemiology is fun, right?
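For anyone who wants to see how selection alone could produce a gap like this, here is a minimal toy simulation. Everything in it is invented for illustration: the `simulate` function, the selection probabilities, and the "skill" score are assumptions, not anything measured in the study. Skill is drawn from the same distribution for both sexes, and only the chance of becoming a hospitalist differs.

```python
import random


def simulate(n=200_000, seed=0):
    """Toy selection simulation; every number here is hypothetical."""
    rng = random.Random(seed)
    population = {"F": [], "M": []}
    hospitalists = {"F": [], "M": []}
    for _ in range(n):
        sex = "F" if rng.random() < 0.5 else "M"
        skill = rng.gauss(0.0, 1.0)  # identical skill distribution for both sexes
        population[sex].append(skill)
        # Assumed selection rule: higher-skill women pick hospital medicine
        # more often, higher-skill men pick it less often.
        if sex == "F":
            p_select = 0.30 if skill > 0 else 0.10
        else:
            p_select = 0.10 if skill > 0 else 0.30
        if rng.random() < p_select:
            hospitalists[sex].append(skill)

    def mean(values):
        return sum(values) / len(values)

    return {
        "population_gap": mean(population["F"]) - mean(population["M"]),
        "hospitalist_gap": mean(hospitalists["F"]) - mean(hospitalists["M"]),
    }


result = simulate()
print(result)
```

In the full population the gap is essentially zero, but among the selected hospitalists a sizable gap appears, even though sex has no effect on skill in this toy world.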
And the fourth possibility: This is nothing but statistical noise. The effect size is incredibly small and just on the border of statistical significance. Especially when you’re working with very large datasets like this, you’ve got to be really careful about overinterpreting statistically significant findings that are nevertheless of small magnitude.
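To make that caution concrete, here is a rough sketch using a standard two-proportion z-test. The mortality rates (10.2% vs 10.0%) and the sample sizes are hypothetical numbers chosen for illustration; they are not the rates reported in the study.

```python
import math


def two_proportion_p_value(p1, p2, n1, n2):
    """Two-sided p-value for a difference in proportions (pooled z-test)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal
    return math.erfc(abs(z) / math.sqrt(2))


# Same hypothetical 0.2 percentage point difference, two sample sizes per group
p_small_sample = two_proportion_p_value(0.102, 0.100, 5_000, 5_000)
p_large_sample = two_proportion_p_value(0.102, 0.100, 500_000, 500_000)
print(p_small_sample, p_large_sample)
```

The identical difference is nowhere near significant with 5,000 patients per group and comfortably significant with 500,000 per group. The effect did not get bigger; the dataset did.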
Regardless, it’s an interesting study, one that made me think and, of course, worry a bit about how I would present it. Forgive me if I’ve been indelicate in handling the complex issues of sex, gender, and society here. But I’m not sure what you expect; after all, I’m only a male doctor.
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
‘Difficult Patient’: Stigmatizing Words and Medical Error
This transcript has been edited for clarity.
When I was doing my nephrology training, I had an attending who would write notes that were, well, kind of funny. I remember one time we were seeing a patient whose first name was “Lucky.” He dryly opened his section of the consult note as follows: “This is a 56-year-old woman with an ironic name who presents with acute renal failure.”
As an exhausted renal fellow, I appreciated the bit of color amid the ongoing series of tragedies that was the consult service. But let’s be clear — writing like this in the medical record is not a good idea. It wasn’t a good idea then, when any record might end up disclosed during a malpractice suit, and it’s really not a good idea now, when patients have ready and automated access to all the notes we write about them.
And yet, worse language than that of my attending appears in hospital notes all the time; there is research about this. Specifically, I’m talking about language that does not have high clinical utility but telegraphs the biases of the person writing the note. This is known as “stigmatizing language” and it can be overt or subtle.
For example, a physician wrote “I listed several fictitious medication names and she reported she was taking them.”
This casts suspicions about the patient’s credibility, as does the more subtle statement, “he claims nicotine patches don’t work for him.” Stigmatizing language may cast the patient in a difficult light, like this note: “she persevered on the fact that ... ‘you wouldn’t understand.’ ”
Stay with me.
We are going to start by defining a very sick patient population: those admitted to the hospital and who, within 48 hours, have either been transferred to the intensive care unit or died. Because of the severity of illness in this population we’ve just defined, figuring out whether a diagnostic or other error was made would be extremely high yield; these can mean the difference between life and death.
In a letter appearing in JAMA Internal Medicine, researchers examined a group of more than 2300 patients just like this from 29 hospitals, scouring the medical records for evidence of these types of errors.
Nearly one in four (23.2%) had at least one diagnostic error, which could include a missed physical exam finding, failure to ask a key question on history taking, inadequate testing, and so on.
Understanding why we make these errors is clearly critical to improving care for these patients. The researchers hypothesized that stigmatizing language might lead to errors like this. For example, by demonstrating that you don’t find a patient credible, you may ignore statements that would help make a better diagnosis.
Just over 5% of these patients had evidence of stigmatizing language in their medical notes. As in earlier studies, this language was more common if the patient was Black or had unstable housing.
Critically, stigmatizing language was more likely to be found among those who had diagnostic errors — a rate of 8.2% vs 4.1%. After adjustment for factors like race, the presence of stigmatizing language was associated with roughly a doubling of the risk for diagnostic errors.
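As a back-of-the-envelope check, the unadjusted odds ratio can be computed from those two percentages. This sketch assumes 8.2% is the share of patients with a diagnostic error who had stigmatizing language and 4.1% is the share among those without an error; the function names are my own, and the result is a crude figure, not the adjusted estimate the researchers reported.

```python
def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)


def odds_ratio(p_in_error_group, p_in_no_error_group):
    """Crude odds ratio comparing two proportions."""
    return odds(p_in_error_group) / odds(p_in_no_error_group)


crude_or = odds_ratio(0.082, 0.041)
print(round(crude_or, 2))  # prints 2.09
```

A crude odds ratio of about 2.1 sits close to the "roughly a doubling" seen after adjustment, which suggests the adjustment did not move the estimate much.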
Now, I’m all for eliminating stigmatizing language from our medical notes. And, given the increased transparency of all medical notes these days, I expect that we’ll see less of this over time. But of course, the fact that a physician doesn’t write something that disparages the patient does not necessarily mean that they don’t retain that bias. That said, those comments have an effect on all the other team members who care for that patient as well; it sets a tone and can entrench an individual’s bias more broadly. We should strive to eliminate our biases when it comes to caring for patients. But perhaps the second best thing is to work to keep those biases to ourselves.
Dr. Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
A Banned Chemical That Is Still Causing Cancer
This transcript has been edited for clarity.
These types of stories usually end with a call for regulation — to ban said chemical or substance, or to regulate it — but in this case, that has already happened. This new carcinogen I’m telling you about is actually an old chemical. And it has not been manufactured or legally imported in the US since 2013.
So, why bother? Because in this case, the chemicals, a group called polybrominated diphenyl ethers (PBDEs), are still around: in our soil, in our food, and in our blood.
PBDEs are a group of compounds that confer flame-retardant properties to plastics, and they were used extensively in the latter part of the 20th century in electronic enclosures, business equipment, and foam cushioning in upholstery.
But there was a problem. They don’t chemically bond to plastics; they are just sort of mixed in, which means they can leach out. They are hydrophobic, meaning they don’t get washed out of soil, and, when ingested or inhaled by humans, they dissolve in our fat stores, making it difficult for our normal excretory systems to excrete them.
PBDEs biomagnify. Small animals can take them up from contaminated soil or water, and those animals are eaten by larger animals, which accumulate higher concentrations of the chemicals. This bioaccumulation increases as you move up the food web until you get to an apex predator — like you and me.
This is true of lots of chemicals, of course. The concern arises when these chemicals are toxic. To date, the toxicity data for PBDEs were pretty limited. There were some animal studies where rats were exposed to extremely high doses and they developed liver lesions — but I am always very wary of extrapolating high-dose rat toxicity studies to humans. There was also some suggestion that the chemicals could be endocrine disruptors, affecting breast and thyroid tissue.
What about cancer? In 2016, the International Agency for Research on Cancer concluded there was “inadequate evidence in humans for the carcinogenicity of” PBDEs.
In the same report, though, they suggested PBDEs are “probably carcinogenic to humans” based on mechanistic studies.
In other words, we can’t prove they’re cancerous — but come on, they probably are.
Finally, we have some evidence that really pushes us toward the carcinogenic conclusion, in the form of this study, appearing in JAMA Network Open. It’s a nice bit of epidemiology leveraging the population-based National Health and Nutrition Examination Survey (NHANES).
Researchers measured PBDE levels in blood samples from 1100 people enrolled in NHANES in 2003 and 2004 and linked them to death records collected over the next 20 years or so.
The first thing to note is that the researchers were able to measure PBDEs in the blood samples. They were in there. They were detectable. And they were variable. Dividing the 1100 participants into low, medium, and high PBDE tertiles, you can see a nearly 10-fold difference across the population.
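For readers unfamiliar with tertiles, the split can be sketched as follows. The blood levels below are made-up values in arbitrary units, and `assign_tertiles` is a hypothetical helper; the researchers' actual cut points are not given here.

```python
import statistics


def assign_tertiles(levels):
    """Label each value low, medium, or high using the two tertile cut points."""
    cut_low, cut_high = statistics.quantiles(levels, n=3)
    labels = []
    for value in levels:
        if value <= cut_low:
            labels.append("low")
        elif value <= cut_high:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


# Hypothetical PBDE levels (arbitrary units), not data from the study
example_levels = [2, 3, 5, 8, 13, 21, 34, 55, 89]
example_labels = assign_tertiles(example_levels)
print(list(zip(example_levels, example_labels)))
```

Each group ends up with a third of the participants, so comparing the top and bottom tertiles contrasts the most and least exposed thirds of the sample.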
Importantly, not many baseline variables correlated with PBDE levels. People in the highest group were a bit younger but had a fairly similar sex distribution, race, ethnicity, education, income, physical activity, smoking status, and body mass index.
This is not a randomized trial, of course — but at least based on these data, exposure levels do seem fairly random, which is what you would expect from an environmental toxin that percolates up through the food chain. They are often somewhat indiscriminate.
This similarity in baseline characteristics between people with low or high blood levels of PBDE also allows us to make some stronger inferences about the observed outcomes. Let’s take a look at them.
After adjustment for baseline factors, individuals in the highest PBDE group had a 43% higher rate of death from any cause over the follow-up period. This was not enough to achieve statistical significance, but it was close.
But the key finding is deaths due to cancer. After adjustment, cancer deaths occurred four times as frequently among those in the high PBDE group, and that is a statistically significant difference.
To be fair, cancer deaths were rare in this cohort. The vast majority of people did not die of anything during the follow-up period regardless of PBDE level. But the data are strongly suggestive of the carcinogenicity of these chemicals.
I should also point out that the researchers are linking the PBDE level at a single time point to all these future events. If PBDE levels remain relatively stable within an individual over time, that’s fine, but if they tend to vary with intake of different foods, for example, this would not be captured and would actually lead to an underestimation of the cancer risk.
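Why would a noisy single measurement bias the estimate toward zero rather than in a random direction? A small simulation can show the idea. This is a simplified linear sketch under invented assumptions (a true slope of 1.0, equal amounts of real variation and measurement noise); it is not a model of the study's survival analysis.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate_attenuation(n=50_000, true_slope=1.0, seed=1):
    """Compare the slope from long-term exposure with one noisy snapshot."""
    rng = random.Random(seed)
    long_term = [rng.gauss(0, 1) for _ in range(n)]           # true average exposure
    single_draw = [x + rng.gauss(0, 1) for x in long_term]    # one noisy measurement
    outcome = [true_slope * x + rng.gauss(0, 1) for x in long_term]
    return slope(long_term, outcome), slope(single_draw, outcome)


slope_true_exposure, slope_single_measure = simulate_attenuation()
print(slope_true_exposure, slope_single_measure)
```

With the true exposure, the estimated slope lands near 1.0. With the single noisy snapshot, it shrinks to about half that, which is the sense in which a one-time blood draw would tend to understate the real association.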
The researchers also didn’t have granular enough data to determine the type of cancer, but they do show that rates are similar between men and women, which might point away from the more sex-specific cancer etiologies. Clearly, some more work is needed.
Of course, I started this piece by telling you that these chemicals are already pretty much banned in the United States. What are we supposed to do about these findings? Studies have examined the primary ongoing sources of PBDE in our environment and it seems like most of our exposure will be coming from the food we eat due to that biomagnification thing: high-fat fish, meat and dairy products, and fish oil supplements. It may be worth some investigation into the relative adulteration of these products with this new old carcinogen.
Dr. F. Perry Wilson is associate professor of medicine and public health and director of the Clinical and Translational Research Accelerator at Yale University, New Haven, Conn. He has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
Vitamin D Supplements May Be a Double-Edged Sword
This transcript has been edited for clarity.
Welcome to Impact Factor, your weekly dose of commentary on a new medical study. I’m Dr F. Perry Wilson of the Yale School of Medicine.
Imagine, if you will, the great Cathedral of Our Lady of Correlation. You walk through the majestic oak doors depicting the link between ice cream sales and shark attacks, past the rose window depicting the cardiovascular benefits of red wine, and down the aisles frescoed in dramatic images showing how Facebook usage is associated with less life satisfaction. And then you reach the altar, the holy of holies where, emblazoned in shimmering pyrite, you see the patron saint of this church: vitamin D.
Yes, if you’ve watched this space, then you know that I have little truck with the wildly popular supplement. In all of clinical research, I believe that there is no molecule with stronger data for correlation and weaker data for causation.
Low serum vitamin D levels have been linked to higher risks for heart disease, cancer, falls, COVID, dementia, C diff, and others. And yet, when we do randomized trials of vitamin D supplementation — the thing that can prove that the low level was causally linked to the outcome of interest — we get negative results.
Trials aren’t perfect, of course, and we’ll talk in a moment about a big one that had some issues. But we are at a point where we need to either be vitamin D apologists, saying, “Forget what those lying RCTs tell you and buy this supplement” — an $800 million-a-year industry, by the way — or conclude that vitamin D levels are a convenient marker of various lifestyle factors that are associated with better outcomes: markers of exercise, getting outside, eating a varied diet.
Or perhaps vitamin D supplements have real effects. It’s just that the beneficial effects are matched by the harmful ones. Stay tuned.
The Women’s Health Initiative remains among the largest randomized trials of vitamin D and calcium supplementation ever conducted — and a major contributor to the negative outcomes of vitamin D trials.
But if you dig into the inclusion and exclusion criteria for this trial, you’ll find that individuals were allowed to continue taking vitamins and supplements while they were in the trial, regardless of their randomization status. In fact, the majority took supplements at baseline, and more took supplements over time.
That means, of course, that people in the placebo group, who were getting sugar pills instead of vitamin D and calcium, may have been taking vitamin D and calcium on the side. That would certainly bias the results of the trial toward the null, which is what the primary analyses showed. To wit, the original analysis of the Women’s Health Initiative trial showed no effect of randomization to vitamin D supplementation on improving cancer or cardiovascular outcomes.
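A back-of-the-envelope sketch shows how that bias toward the null works. It makes strong simplifying assumptions that are not from the trial: outside supplements deliver the same full effect as the study pills, everyone in the active arm gets that effect, and taking both adds nothing extra. The function name and the example numbers are hypothetical.

```python
def observed_risk_ratio(true_rr: float, placebo_contamination: float) -> float:
    """Risk ratio an intention-to-treat comparison would show when a
    fraction of the placebo arm takes the supplement on its own.

    true_rr: hypothetical risk ratio for supplemented vs unsupplemented people.
    placebo_contamination: fraction of the placebo arm supplementing (0 to 1).
    """
    if true_rr <= 0:
        raise ValueError("true_rr must be positive")
    if not 0.0 <= placebo_contamination <= 1.0:
        raise ValueError("placebo_contamination must be between 0 and 1")
    # Placebo-arm risk, relative to a fully unsupplemented group, is a
    # mixture of unsupplemented (risk 1.0) and supplemented (risk true_rr).
    placebo_relative_risk = (1.0 - placebo_contamination) + placebo_contamination * true_rr
    # The active arm is assumed to be fully supplemented.
    return true_rr / placebo_relative_risk


# Hypothetical true effect: a 20% risk reduction (risk ratio 0.80).
for contamination in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(contamination, round(observed_risk_ratio(0.80, contamination), 3))
```

With a hypothetical true risk ratio of 0.80 and half of the placebo arm supplementing on the side, the trial would observe a risk ratio of about 0.89. If everyone in the placebo arm supplemented, the observed ratio would be exactly 1.0, no matter how large the real effect.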
But the Women’s Health Initiative trial started 30 years ago. Today, with the benefit of decades of follow-up, we can re-investigate — and perhaps re-litigate — those findings, courtesy of this study, “Long-Term Effect of Randomization to Calcium and Vitamin D Supplementation on Health in Older Women” appearing in Annals of Internal Medicine.
Dr Cynthia Thomson, of the Mel and Enid Zuckerman College of Public Health at the University of Arizona, and colleagues led this updated analysis focused on two findings that had been hinted at, but not statistically confirmed, in other vitamin D studies: a potential for the supplement to reduce the risk for cancer, and a potential for it to increase the risk for heart disease.
The randomized trial itself only lasted 7 years. What we are seeing in this analysis of 36,282 women is outcomes that happened at any time from randomization to the end of 2023 — around 20 years after the randomization to supplementation stopped. But, the researchers would argue, that’s probably okay. Cancer and heart disease take time to develop; we see lung cancer long after people stop smoking. So a history of consistent vitamin D supplementation may indeed be protective — or harmful.
Here are the top-line results. Those randomized to vitamin D and calcium supplementation had a 7% reduction in the rate of death from cancer, driven primarily by a reduction in colorectal cancer. This was statistically significant. Also statistically significant? Those randomized to supplementation had a 6% increase in the rate of death from cardiovascular disease. Put those findings together and what do you get? Stone-cold nothing, in terms of overall mortality.
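The arithmetic behind that cancellation can be sketched as follows. The 7% and 6% figures come from the results above, but the shares of deaths attributed to each cause are made-up placeholders, since the breakdown is not given here. The sketch also treats the relative change in all-cause mortality as a share-weighted sum of the cause-specific changes, which holds for rates but is only an approximation for hazard ratios.

```python
def net_mortality_change(causes: dict[str, tuple[float, float]]) -> float:
    """Relative change in all-cause mortality implied by cause-specific changes.

    causes maps a cause of death to (share_of_deaths, relative_change),
    where shares are fractions of all deaths in the comparison group
    and must sum to 1.
    """
    total_share = sum(share for share, _ in causes.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares of deaths must sum to 1")
    return sum(share * change for share, change in causes.values())


# Shares are hypothetical placeholders, NOT data from the trial.
example = {
    "cancer": (0.30, -0.07),          # 7% reduction reported above
    "cardiovascular": (0.30, +0.06),  # 6% increase reported above
    "other": (0.40, 0.00),            # assumed unaffected
}
print(net_mortality_change(example))
```

With these placeholder shares the two effects nearly offset, leaving a net change of a fraction of a percent. Different shares would shift the result slightly, but not far from zero.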
Okay, you say, but what about all that supplementation that was happening outside of the context of the trial, biasing our results toward the null?
The researchers finally clue us in.
First of all, I’ll tell you that, yes, people who were supplementing outside of the trial had higher baseline vitamin D levels — a median of 54.5 nmol/L vs 32.8 nmol/L. This may be because they were supplementing with vitamin D, but it could also be because people who take supplements tend to do other healthy things — another correlation to add to the great cathedral.
To get a better view of the real effects of randomization, the authors restricted the analysis to just those who did not use outside supplements. If vitamin D supplements help, then these are the people they should help. This group had about an 11% reduction in the incidence of cancer, which was statistically significant, and a 7% reduction in cancer mortality that did not meet the bar for statistical significance.
There was no increase in cardiovascular disease among this group. But this small effect on cancer was nowhere near enough to significantly reduce the rate of all-cause mortality.
Among those using supplements, vitamin D supplementation didn’t really move the needle on any outcome.
I know what you’re thinking: How many of these women were vitamin D deficient when we got started? These results may simply be telling us that people who have normal vitamin D levels are fine to go without supplementation.
Nearly three fourths of women who were not taking supplements entered the trial with vitamin D levels below the 50 nmol/L cutoff that the authors suggest would qualify for deficiency. Around half of those who used supplements were deficient. And yet, frustratingly, I could not find data on the effect of randomization to supplementation stratified by baseline vitamin D level. I even reached out to Dr Thomson to ask about this. She replied, “We did not stratify on baseline values because the numbers are too small statistically to test this.” Sorry.
In the meantime, I can tell you that for your “average woman,” vitamin D supplementation likely has no effect on mortality. It might modestly reduce the risk for certain cancers while increasing the risk for heart disease (probably through coronary calcification). So, there might be some room for personalization here. Perhaps women with a strong family history of cancer or other risk factors would do better with supplements, and those with a high risk for heart disease would do worse. Seems like a strategy that could be tested in a clinical trial. But maybe we could ask the participants to give up their extracurricular supplement use before they enter the trial.
F. Perry Wilson, MD, MSCE, has disclosed no relevant financial relationships.
A version of this article appeared on Medscape.com.
F. Perry Wilson, MD, MSCE, is an associate professor of medicine and public health and director of Yale’s Clinical and Translational Research Accelerator. His science communication work can be found in the Huffington Post, on NPR, and here on Medscape. He tweets @fperrywilson and his book, How Medicine Works and When It Doesn’t, is available now.