First, a review of the main trial, then on to the new paper in the Journal of the American College of Cardiology.
SELECT Trial
The SELECT trial was the first to show that the glucagon-like-peptide-1 receptor agonist (GLP1) semaglutide could actually modify cardiovascular disease. The discovery of another disease-modifying agent for heart disease is a breakthrough.
SELECT randomized more than 17,000 patients with cardiovascular disease and obesity but without diabetes to semaglutide or placebo. Semaglutide led to a 20% reduction in a composite primary endpoint of CV death, myocardial infarction or stroke. The hazard ratio was 0.80, with a 95% confidence interval of 0.72 to 0.90. The p-value was very low.
We can say that this was a clinically important and statistically robust finding.
Now let’s notice a few things about the results.
First is the number of patients in the study. The authors did not enroll a few hundred or a few thousand. They enrolled 17,604 patients.
Next point: look at the number of primary outcome events. There were 569 events in the semaglutide arm vs 701 events in the placebo arm. That is a lot of events. The authors knew that it would take that many events to sort out signal from noise.
Third point: the authors knew that people would be interested in other outcomes besides the composite of CV death, MI and stroke. The problem is that any single outcome will surely have fewer events than the three combined. Fewer events mean a greater chance of detecting noise rather than signal.
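To make the events-versus-noise point concrete, here is a rough sketch in Python. It approximates a 95% confidence interval for a ratio of event counts using a log-scale standard error of sqrt(1/a + 1/b). That assumes the two arms were about the same size and ignores follow-up time, so it is a crude stand-in for the trial's hazard ratio, not a reproduction of it. The function name and the "quarter of the events" scenario are hypothetical illustrations, not anything from the paper.

```python
import math

def count_ratio_ci(events_treated, events_control, z=1.96):
    """Approximate ratio of event counts with a 95% CI.

    Assumption: equal-sized arms, SE(log ratio) ~ sqrt(1/a + 1/b).
    """
    ratio = events_treated / events_control
    se = math.sqrt(1 / events_treated + 1 / events_control)
    low = math.exp(math.log(ratio) - z * se)
    high = math.exp(math.log(ratio) + z * se)
    return ratio, low, high

# Primary composite endpoint counts reported in SELECT.
primary = count_ratio_ci(569, 701)

# Hypothetical outcome: same ratio, but only one-fourth of the events.
quarter = count_ratio_ci(569 / 4, 701 / 4)

for label, (ratio, low, high) in [("primary", primary),
                                  ("quarter of events", quarter)]:
    print(f"{label}: ratio {ratio:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

With the full event count the interval stays below 1.0. With a quarter of the events, the same ratio produces an interval that reaches past 1.0.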
To guard against mistaking noise for signal, they used a rule: if the rate of death from cardiovascular causes was not statistically significant, then no other outcomes could be assessed. See the image:
The lower rate of cardiovascular death did not reach statistical significance. The upper bound of the 95% confidence interval is more than 1.0 and the p-value is more than 0.05. All other outcomes below that (MI, stroke, death) were deemed “NA”.
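The gatekeeping rule described here can be sketched as a fixed-sequence test: outcomes are checked in a prespecified order, and the first one that misses significance closes the gate on everything after it. This is a minimal illustration, not the trial's actual statistical code. The p-values below are hypothetical placeholders (the text only says the primary p-value was very low and the CV death p-value was above 0.05), and the 0.05 threshold is an assumption.

```python
def fixed_sequence(tests, alpha=0.05):
    """Walk an ordered list of (name, p_value) pairs.

    Once one outcome fails to reach significance, every later
    outcome is marked "NA" and is not tested at all.
    """
    results = []
    gate_open = True
    for name, p_value in tests:
        if not gate_open:
            results.append((name, "NA"))
        elif p_value < alpha:
            results.append((name, "significant"))
        else:
            results.append((name, "not significant"))
            gate_open = False
    return results

# Hypothetical p-values, for illustration only.
ordered_outcomes = [
    ("primary composite", 0.001),
    ("CV death", 0.07),
    ("MI", 0.01),
    ("stroke", 0.30),
    ("death", 0.02),
]

results = fixed_sequence(ordered_outcomes)
for name, verdict in results:
    print(f"{name}: {verdict}")
```

Note that an outcome after the gate gets "NA" even if its own p-value would have looked small in isolation. That is the whole point of the rule.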
The Dubious Substudy
For some reason, the authors of SELECT decided to break this rule and look at outcomes below CV death.
The outcome that made the title was COVID-19 related deaths. Sadly, the prominent journal JACC published the study.
They first separated out CV related and non-CV related deaths. Then they told us the most common cause of non-CV death was infections—62 vs 87, semaglutide vs placebo, respectively.
Then they went further. The trial was conducted during the pandemic and the authors collected data on infections and complications of infections.
Compared with placebo, semaglutide did not reduce the number of patients with a reported case of COVID-19 (2,108 vs 2,150 events; P = 0.46).
However, among patients who reported a diagnosis of COVID-19, fewer patients treated with semaglutide had serious COVID-19–related adverse events (232 [2.6%] vs 277 [3.1%]; P = 0.04).
They add this last sentence in the abstract:
High rates of infectious deaths occurred during the COVID-19 pandemic, with less infectious death in the semaglutide arm, and resulted in fewer participants in the placebo group being at risk for CV death.
Then in the conclusion there are these two sentences:
The lower rate of non-CV death with semaglutide was predominantly because of fewer infectious deaths. These findings highlight the effect of semaglutide on mortality across a broad population of patients with CV disease and obesity.
Comments
“Fewer infectious deaths” equals 62 vs 87. In a trial with 17,000 patients.
I hope that you can see how bad this is. In the main paper, to sort out the drug’s signal they recruited 17,000 patients, which then resulted in ≈ 1200 events. The authors then set out a rule that would prevent them from detecting noise. The rule was that CV death had to be significant to look further. It wasn’t, so they made no other claims about MI or stroke or all-cause death in the main paper.
Now they come out with a separate paper on infections: 62 vs 87 events. And on COVID-19 related adverse events, which they don’t describe, but which number far fewer than the total number of major adverse cardiac events.
All we get is one sentence in the limitations about how testing subgroups may lead to underpowered analyses and chance findings. One sentence.
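For a sense of how fragile a 62 vs 87 split is, here is a back-of-the-envelope check in Python. It assumes the two arms were the same size, so that under the null each of the 149 infectious deaths was equally likely to land in either arm, and it ignores follow-up time. It is an exact two-sided binomial test on that assumption, not the analysis the authors ran, and the function name is my own.

```python
from math import comb

def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k of n events falling in one arm,
    assuming each event is equally likely to occur in either arm."""
    lower = min(k, n - k)
    tail = sum(comb(n, i) for i in range(lower + 1)) / 2 ** n
    return min(1.0, 2 * tail)

semaglutide_deaths = 62
placebo_deaths = 87

p = two_sided_binomial_p(semaglutide_deaths,
                         semaglutide_deaths + placebo_deaths)
print(f"nominal two-sided p = {p:.3f}")
```

Under these assumptions the nominal p-value comes out close to 0.05, and that is before any adjustment for the number of outcomes that were examined.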
I highlight this study because it should never have happened. The authors broke their own rule about looking at low-frequency outcomes beyond the primary endpoint.
I am not sure I would go so far as saying this was p-hacking, but looking at data after you know the results and publishing positive findings is really problematic.
That the positive outcome had COVID-19 in it may have played a role in the study’s publication. It surely had a role in it being covered by 44 news outlets and Tweeted by hundreds of people—including some prominent MD-influencers.
Papers like this reduce trust in medical journals and scientists alike. I am not sure why medical scientists and journal editors can’t resist the urge to fly too close to the sun.
The SELECT trial was a huge win for cardiology. That should be enough.
John Mandrola--Heart rhythm doc, writer/podcaster for @Medscape, learner, cyclist, married to an #HPM doctor. #MedicalConservative. The more you see, the harder medicine gets
It is widely recognized that most children with gender dysphoria (GD) will come to terms with their sex and not live as transgender adults. Transition advocates contend, however, that administering irreversible endocrine and surgical interventions to adolescents is not a problem because, unlike childhood-onset GD, adolescent GD almost never remits.
This view is encapsulated in a quote from Stephen Rosenthal, a notable U.S. gender physician, in an article for Nature Reviews Endocrinology, one of the highest-ranked peer-reviewed medical journals: “Longitudinal studies have indicated that the emergence or worsening of gender dysphoria with pubertal onset is associated with a very high likelihood of being a transgender adult. This observation is central to the rationale for medical intervention in eligible transgender adolescents” (emphasis added).
Like many assertions in youth gender medicine, the claim about the near-permanence of adolescent gender dysphoria (GD) has never been properly tested. (How these studies are designed makes them incapable of answering this question, which is probably why Rosenthal uses the vague word “indicate[s].”) So we decided to test it ourselves. Our findings, from an ongoing Manhattan Institute analysis of an all-payer, all-claims national insurance database, challenge this “central” belief underpinning youth gender medicine. In fact, the rate of persistence of the gender dysphoria diagnosis for youth over seven years is 42.2 percent to 49.9 percent, with the trend line suggesting likely future declines.
Our findings are highly significant for the debate over youth gender medicine. Treatments with permanent effects, and that include negative impacts on health and functioning, should not be offered to patients—especially not minors—with a diagnosis likely to disappear after a few years.
Like our prior analysis of the number of mastectomies performed on minors, this analysis is based on a comprehensive database of insurance health claims in the United States containing health-care encounter data for about 85 percent of the insured U.S. population. Since American insurance rates are high (about 90 percent of the U.S. population overall, and 95 percent of children, are insured), this is probably one of the most comprehensive resources for health care-related inquiries.
In the first part of our analysis, we estimated the number of U.S. minors (age 17.5 and younger) who have had a gender-related diagnosis between 2017 and 2023. Our data show between 272,181 and 342,476 such cases. The smaller number in this range comes from only using the International Classification of Diseases (ICD) diagnostic category F64, which captures the diagnoses of “gender identity disorders” (see Table 1a below). F64 is also used to capture “gender dysphoria” and “gender incongruence.” For simplicity, we will refer to this group as the “GD” group. The bigger number comes from adding two more ICD diagnoses commonly used to signal gender-related concerns: E34.9 (“endocrine disorder, unspecified”) and Z87.890 (“personal history of sex reassignment”). For simplicity, we will refer to this group as “GD+”. Further accounting for the estimated 15 percent missing claims in our database, we get a range of roughly 320,000 to 400,000 minors who were diagnosed with GD/GD+ at some point between 2017–2023.
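The adjustment for missing claims is simple division, sketched below in Python. It assumes the roughly 15 percent of claims absent from the database contain GD diagnoses at the same rate as the claims that are present. The variable names are illustrative.

```python
# Share of insured-population claims captured by the database (from the text).
coverage = 0.85

observed = {
    "GD (F64 only)": 272_181,
    "GD+ (F64 plus related codes)": 342_476,
}

# Scale each observed count up to an estimate for full coverage.
adjusted = {label: round(count / coverage) for label, count in observed.items()}

for label, estimate in adjusted.items():
    print(f"{label}: {observed[label]:,} observed, about {estimate:,} adjusted")
```

The adjusted figures land at roughly 320,000 and 400,000, matching the range quoted above.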
Table 1b: F64 “gender identity disorder” and related diagnoses (GD+) for youth (<18 years)

| Diagnosis | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023* | Total*** |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| F640 - Transsexualism | 16,740 | 21,112 | 27,902 | 32,479 | 42,771 | 47,160 | 39,318 | 143,905 |
| F649 - Gender identity disorder, unspecified | 8,166 | 13,143 | 21,038 | 26,595 | 43,541 | 49,330 | 43,913 | 138,627 |
| E349 - Endocrine disorder, unspecified | 16,326 | 15,192 | 16,323 | 16,037 | 19,587 | 20,920 | 16,639 | 95,069 |
| F642 - Gender identity disorder of childhood | 7,303 | 8,621 | 11,095 | 13,259 | 21,708 | 23,694 | 18,183 | 68,079 |
| F641 - Dual role transvestism | 7,343 | 6,363 | 6,911 | 6,528 | 7,005 | 6,887 | 4,454 | 31,942 |
| F648 - Other gender identity disorders | 1,141 | 1,477 | 2,055 | 2,193 | 4,349 | 5,205 | 4,156 | 15,844 |
| Z87890 - Personal history of sex reassignment | 703 | 714 | 888 | 828 | 1,068 | 1,134 | 787 | 5,316 |
| F64 - MISSING DESC | 19 | 14 | 25 | 427 | 306 | 108 | 41 | 850 |
| Total | 44,930 | 50,700 | 63,672 | 72,601 | 104,190 | 115,270 | 99,005** | 342,476 |
*The 2023 data contain around 90 percent of total expected claim volume for that year due to the known issue of “claim runout”: claims for services incurred at the end of the calendar year are not always submitted in a timely manner, leading to an undercounting of such claims. However, since patients with GD have, on average, four to five diagnoses per year, while 2023 may represent a slight undercount, it is likely to be much less than 10 percent, as most patients would have already presented with the diagnoses earlier in the year and would have been captured in our data.
**It appears that the number of GD-related diagnoses in 2023 has dropped substantially. We are undertaking a separate analysis of this preliminary finding. Our current analysis suggests that though states that imposed age limits on medical transition had the highest drops in the diagnosed prevalence of GD, all the states, including those that became “sanctuary” states for minor transition, seem to have experienced notable declines in 2023.
*** The numbers in the year columns represent the diagnostic prevalence (unique count of patients with the diagnosis) for that year. The number in the “Total” column is the total number of unique patients for each diagnosis for the years 2017 to 2023. The number in the “Total” column is thus less than the sum of the individual columns.
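The gap between the yearly columns and the “Total” column can be checked with a few lines of Python, using the Total row of Table 1b. Summing the yearly unique-patient counts gives patient-years with a diagnosis, and dividing by the number of unique patients gives the average number of calendar years in which a patient appears. Treat this as a rough descriptive figure only: it mixes patients first diagnosed early and late in the window, and 2023 is incomplete.

```python
# Unique patients with a GD+ diagnosis in each year (Total row of Table 1b).
yearly_unique = {
    2017: 44_930,
    2018: 50_700,
    2019: 63_672,
    2020: 72_601,
    2021: 104_190,
    2022: 115_270,
    2023: 99_005,
}

# Unique patients across the whole 2017-2023 window ("Total" column).
total_unique = 342_476

# A patient diagnosed in several years is counted once per year here.
patient_years = sum(yearly_unique.values())
avg_years_with_diagnosis = patient_years / total_unique

print(f"sum of yearly counts: {patient_years:,}")
print(f"unique patients:      {total_unique:,}")
print(f"average years with a diagnosis: {avg_years_with_diagnosis:.2f}")
```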
Having established the size of the population of youth with GD and GD+ in our dataset to be between 272,181 (320,000) and 342,476 (400,000) cases, we focused next on the key question: What is the evidence that gender dysphoria in adolescents is so persistent as to be regarded in clinical settings as permanent? A high rate of persistence would suggest that adolescents with GD are, in fact, “transgender adolescents,” meaning they will go on to live their lives as adults who feel severe discomfort with their sex.
To estimate the diagnostic persistence rate of GD, we created a baseline cohort of minors who had the diagnosis of GD (“F64”) in 2017 and who were continuously present in the dataset for the entire seven years through 2023, as evidenced by medical professionals billing for any health-care service for each of these patients, in every year. We then estimated the persistence of the diagnosis using various scenarios in order to test the robustness of our findings.
*GD consists of all F64 codes
**Related diagnoses (GD+) consist of all F64 (gender identity disorders) codes, as well as F651 (transvestic fetishism); E34.9 (endocrine disorder, unspecified); Z87.890 (personal history of sex reassignment); and Z90.79 (acquired absence of other genital organ(s))
Table 2a: persistence of GD and GD+ in minors over 7 years, cohort-based analysis, unique patients

| Cohort | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 7.5–17.5-year-olds, GD at baseline, GD+ at follow-up | 9,144 | 6,315 | 5,703 | 5,324 | 5,049 | 4,726 | 4,066 |
| 7.5–17.5-year-olds, GD at baseline, GD only at follow-up | 9,144 | 6,192 | 5,541 | 5,139 | 4,855 | 4,491 | 3,856 |
| 12.5–17.5-year-olds, GD at baseline, GD+ at follow-up | 6,616 | 4,690 | 4,264 | 3,997 | 3,774 | 3,537 | 3,058 |
| 12.5–17.5-year-olds, GD at baseline, GD only at follow-up | 6,616 | 4,585 | 4,126 | 3,836 | 3,606 | 3,348 | 2,891 |
| 12.5–17.5-year-olds, 2 diagnoses of GD (GD and GD+ at baseline in 180 days), GD+ at follow-up | 4,800 | 3,915 | 3,449 | 3,149 | 2,975 | 2,759 | 2,395 |
Table 2b: persistence of GD and GD+ in minors over 7 years, cohort-based analysis, percent

| Cohort | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 7.5–17.5-year-olds, GD at baseline, GD+ at follow-up | 100 | 69.1 | 62.4 | 58.2 | 55.2 | 51.7 | 44.5 |
| 7.5–17.5-year-olds, GD at baseline, GD only at follow-up | 100 | 67.7 | 60.6 | 56.2 | 53.1 | 49.1 | 42.2 |
| 12.5–17.5-year-olds, GD at baseline, GD+ at follow-up | 100 | 70.9 | 64.4 | 60.4 | 57.0 | 53.5 | 46.2 |
| 12.5–17.5-year-olds, GD at baseline, GD only at follow-up | 100 | 69.3 | 62.4 | 58.0 | 54.5 | 50.6 | 43.7 |
| 12.5–17.5-year-olds, 2 diagnoses of GD (GD and GD+ at baseline in 180 days), GD+ at follow-up | 100 | 81.6 | 71.9 | 65.6 | 62.0 | 57.5 | 49.9 |
For the lower end of the estimate, we coded individuals as diagnostically persistent only if they had an F64 (“gender identity disorders”) claim in the follow-up period (in addition to baseline). We recorded this as “GD.” For the upper range of the estimate of persistence, we allowed for any GD-related claim (F64, F651, Z87.890, Z90.79, E34.9) to be counted as “persistence” (but only F64 at baseline). We recorded this as “GD+.”
For HIPAA reasons, the data use agreement allowed us to conduct analysis based on five-year brackets, which limited our analysis for youth to 7.5–12.5-year-olds and 12.5–17.5-year-olds. We analyzed the entire 7.5–17.5 cohort of youth, but we also ran a separate analysis for the 12.5–17.5 subgroup. Our rationale was that while eight- and nine-year-olds could be candidates for hormonal suppression under the current protocols, the 12.5+ group is much more likely to be treated medically, and they may also have a different diagnostic-persistence rate. Our criteria produced 9,144 unique patients at baseline for the 7.5-to-17.5-year-old cohort and 6,616 for the 12.5-to-17.5-year-old cohort. Both denominators are sufficiently large to analyze diagnostic persistence.
Our analysis found that in the 12.5–17.5 age category, 43.7 percent–46.2 percent of those who had a GD diagnosis in 2017 retained a gender-related diagnosis by 2023. In the combined 7.5–17.5 age groups category, the diagnostic persistence rate was slightly lower, at 42.2 percent–44.5 percent.
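The persistence percentages follow directly from the cohort counts: each year's unique-patient count divided by the 2017 baseline. The short Python sketch below recomputes them from the Table 2a figures. The shortened row labels are mine, not the table's.

```python
YEARS = [2017, 2018, 2019, 2020, 2021, 2022, 2023]

# Unique patients per year, copied from Table 2a.
COUNTS = {
    "7.5-17.5, GD+ at follow-up": [9144, 6315, 5703, 5324, 5049, 4726, 4066],
    "7.5-17.5, GD only at follow-up": [9144, 6192, 5541, 5139, 4855, 4491, 3856],
    "12.5-17.5, GD+ at follow-up": [6616, 4690, 4264, 3997, 3774, 3537, 3058],
    "12.5-17.5, GD only at follow-up": [6616, 4585, 4126, 3836, 3606, 3348, 2891],
    "12.5-17.5, 2 diagnoses at baseline": [4800, 3915, 3449, 3149, 2975, 2759, 2395],
}

def persistence(counts):
    """Percent of the baseline cohort still carrying a diagnosis each year."""
    baseline = counts[0]
    return [round(100 * count / baseline, 1) for count in counts]

rates = {label: persistence(counts) for label, counts in COUNTS.items()}

for label, yearly_rates in rates.items():
    print(f"{label}: {yearly_rates[-1]} percent in {YEARS[-1]}")
```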
As the diagnostic-persistence chart shows, across all age groups, there was a steeper drop-off between the first and second years (2017 to 2018) as compared with subsequent years. We considered different explanations for this. We ran several other analyses, starting our cohorts at other years, and continued to observe the effect of having a sharper drop-off after year one, with a flatter but ongoing reduction in subsequent years.
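One way to see the first-year drop-off in numbers is to compute year-over-year retention: the share of the previous year's patients who still carry a diagnosis in the next year. The sketch below does this for a single row of Table 2a (7.5–17.5-year-olds, GD at baseline, GD+ at follow-up). It is an illustration of the pattern, not a replacement for the other cohort analyses mentioned above.

```python
years = [2017, 2018, 2019, 2020, 2021, 2022, 2023]

# 7.5-17.5-year-olds, GD at baseline, GD+ at follow-up (Table 2a).
counts = [9144, 6315, 5703, 5324, 5049, 4726, 4066]

# Retention from each year to the next, as a percent of the prior year.
retention = [
    round(100 * later / earlier, 1)
    for earlier, later in zip(counts, counts[1:])
]

for start, end, pct in zip(years, years[1:], retention):
    print(f"{start} to {end}: {pct} percent retained")
```

In this row the 2017-to-2018 step retains the smallest share of the prior year's patients.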
To account for the possibility of false positive diagnoses in the baseline year, we ran another analysis, this time requiring that the baseline cohort have at least two gender-related claims (one F64, and the other F64 or other GD-related codes) within a six-month period. Applying this approach to the older adolescent group of 12.5-to-17.5-year-olds, we saw that the diagnostic persistence rate over seven years rose only slightly, to 49.9 percent.
Finally, we ran a sub-analysis which ensured that the patients in the initial cohort were diagnosed with GD for the first time in 2018 and had no prior GD diagnosis in 2017. This truncated our follow-up period to six years (2018–2023) and resulted in even lower persistence rates (around 40 percent) in just six rather than seven years. Incidentally, this sub-analysis also resulted in a drop-off after the first year that was less sharp than in the main analysis, but still sharper than in subsequent years. These interesting findings may reflect a more robust way to analyze the data and are worth exploring further.
So, what is the takeaway from this analysis? The single biggest observation is that, contrary to what has been asserted by advocates of youth transition, most adolescents with a GD diagnosis will not have this diagnosis within as few as seven years, during the period of rapid identity development. The single most important implication is that there is no empirical basis for assuming that most adolescents presenting with GD are destined to live as gender-transitioned adults. This further suggests that the GD diagnosis presents a dubious basis for offering teens life-altering interventions with permanent impacts on health and functioning.
One should consider alternative interpretations of our findings, which are preliminary and conservative, and for which we welcome feedback. First, perhaps non-accepting parents are not allowing young people to seek medical services related to their gender distress after an initial health-care encounter, and these minors delay transition until adulthood as a result. The problem with this explanation is that, by the end of our analysis, more than half of the original cohort—5,962 out of 9,144 individuals—were nearly 18 or older, with the oldest participants approaching 25.
Another alternative explanation is that young people are getting their gender-related treatments without insurance (e.g., buying hormones off the street or paying out of pocket). This is possible, especially if an individual’s insurance carrier does not cover transition-related procedures. However, it is unlikely to explain the full extent of the drop, especially since insurers tend to cover gender-transition treatments and online providers tend not to serve patients under 18.
Further, even when age-restriction laws were enacted in some states in 2023, services related to GD treatment such as blood work or psychological care remained legal in these states and would presumably still be covered by insurance. Since these services would likely be billed with the GD diagnosis, the diagnosis would have shown up in the data. From an insurance perspective, the absence of a GD-related diagnosis on insurance claims is a reliable (if not perfect) proxy for non-pursuit of medical interventions related to GD, including medical gender transition.
A third possibility (though not technically an alternative explanation) is that some continue to identify as transgender but stop the pursuit of medical interventions of any kind, including therapy, related to their identification. This is indeed possible: young gender dysphoric people may not pursue medicalization for several reasons, including shifting identities and shifting “embodiment goals.” Notably, however, this explanation does not help those making the case for using an adolescent GD diagnosis as a basis for medical interventions with lifelong impacts on health and functioning.
Our data analysis has several limitations. First, given when we acquired our database, the 2023 data have about 90 percent of the total expected claims for the year. However, we think this has a limited impact on our analysis of diagnostic persistence, since most patients with claims related to GD have four to five or more such claims per year, according to our data. It is possible that with more complete data, the 2023 numbers would increase. Yet even if we inflate our current 2023 patient count by 10 percent, 2023 is still on track to show a decline in diagnostic prevalence, relative to 2022.
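The 10 percent inflation check is easy to reproduce with the Total row of Table 1b. The sketch below also computes the break-even point, meaning how large the 2023 undercount would have to be for 2023 to match 2022. Variable names are illustrative.

```python
count_2022 = 115_270   # unique GD+ patients in 2022 (Table 1b, Total row)
count_2023 = 99_005    # unique GD+ patients in 2023, incomplete year

# Inflate 2023 by 10 percent to allow for claim runout.
inflated_2023 = count_2023 * 1.10
still_lower = inflated_2023 < count_2022

# Undercount needed for 2023 to equal 2022.
breakeven = count_2022 / count_2023 - 1

print(f"2023 inflated by 10 percent: {inflated_2023:,.0f}")
print(f"still below 2022: {still_lower}")
print(f"inflation needed to match 2022: {breakeven:.1%}")
```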
Another limitation is the inability to account for data from Kaiser Permanente, which are absent from our database because Kaiser’s is a closed billing system. Kaiser is a Top Five insurer, with a market share of about 7 percent. It may thus capture around 7 percent of transitioning youth. If the persistence rates of Kaiser patients are different, we are unable to account for it.
To summarize our key findings, the number of young people who have received a GD diagnosis in recent years is much higher than previously reported. By our conservative estimate, over 300,000 minors in the U.S. had a GD diagnosis between 2017 and 2023, which means that the condition is not rare. Even more important is that among adolescents with a GD diagnosis in 2017, over half lost their gender-related diagnoses by 2023, with future ongoing declines likely, as suggested by the trend. There is also some evidence of a sharper than usual 2023 decline, though future data would need to confirm this trend.
We are not the first to present findings that challenge the conventional wisdom among gender clinicians on the persistence of adolescent GD. A recent study from the Netherlands on “gender non-contentedness” (“unhappiness with being the gender aligned with one’s sex”) found that unhappiness with gender plummeted from 11 percent among young adolescents to 4 percent 14 years later. A German study published earlier this year and using national insurance data reported that over 60 percent of young people diagnosed with GD no longer had that diagnosis five years later. Almost three-quarters of adolescent girls aged 15 to 19—the prime demographic of rapid-onset gender dysphoria—lost their diagnosis. According to the German researchers, this means that gender dysphoria has “low diagnostic persistence.” Another data analysis, combining U.S. and other countries’ data, showed similar trends, concluding that “GD is not a permanent diagnosis.” A landmark 2022 U.S. study of military health-care records found that one quarter of adolescents who started on hormones discontinued their treatment at the four-year mark.
Doubts about the predictive value of a GD diagnosis even following comprehensive assessment also inform the Cass Review. “Although a diagnosis of gender dysphoria has been seen as necessary for initiating medical treatment,” physician Hilary Cass writes in her report to the National Health Service of England, “it is not reliably predictive of whether that young person will have longstanding gender incongruence in the future, or whether medical intervention will be the best option for them.” The Cass Review’s conclusions were informed by seven new systematic reviews of evidence, including one on care pathways.
In sum, while our analysis is the first comprehensive effort to track diagnostic persistence of GD in the U.S., our findings add to a growing international body of evidence that adolescent GD is not a permanent condition and that, given the stakes, it is irresponsible to view adolescents with GD as “transgender adolescents.”
Leor Sapir is a fellow at the Manhattan Institute.