The Relative Risk Trick
How a single statistical framing turns small absolute effects into headlines, billion dollar markets, and confident clinical advice.
On the morning of September 28, 2022, the Japanese pharmaceutical company Eisai and the American biotech Biogen issued a joint press release that briefly moved markets. The companies announced that their Phase 3 trial of lecanemab, an experimental antibody targeting amyloid protofibrils in the brains of patients with early Alzheimer’s disease, had hit its primary endpoint. The treatment, the press release said, had reduced cognitive decline by twenty seven percent compared with placebo over eighteen months. The trial, called Clarity AD, had enrolled 1,795 patients across multiple sites. The result was statistically significant. The implication, written across the headlines that followed over the next twenty four hours and into the weekend coverage, was that the field had finally produced a treatment that worked against the underlying biology of Alzheimer’s disease, and that the long search for a disease modifying therapy had reached a meaningful milestone.
Two months later, on November 29, 2022, the full results were presented at the Clinical Trials on Alzheimer’s Disease conference in San Francisco and published in The New England Journal of Medicine. The headline number held. Lecanemab had, in fact, slowed cognitive decline by twenty seven percent compared with placebo. The detailed results, reported in the trial paper by Christopher van Dyck and colleagues, also showed something the press releases had not emphasized. The cognitive decline was measured on a scale called the Clinical Dementia Rating Sum of Boxes, an eighteen point instrument on which higher numbers correspond to worse cognitive and functional performance. Patients in the trial started, on average, at a score of 3.2, in the range of mild cognitive impairment. Over eighteen months, the placebo group’s score worsened by 1.66 points, ending at 4.86. The lecanemab group’s score worsened by 1.21 points, ending at 4.41. The difference between the two groups, on an eighteen point scale, was 0.45 points.
This is the same finding, expressed two ways. Twenty seven percent slower decline is the relative framing. A 0.45 point absolute reduction in worsening on an eighteen point scale is the absolute framing. The numbers are mathematically equivalent. They convey very different impressions to readers who are not trained to convert between them. Whether 0.45 points on the CDR Sum of Boxes is a clinically meaningful difference for a patient with early Alzheimer’s disease was, when the trial appeared, and remains today, a matter of substantial debate among neurologists. The minimum clinically important difference for the CDR-SB is generally considered to be in the range of one to two points. A 0.45 point improvement falls below that threshold by most published estimates. The drug also carried real harms. Amyloid related imaging abnormalities, the technical term for brain swelling or microhemorrhages, occurred in roughly twenty one percent of treated patients, with symptomatic events in a smaller but real fraction. The treatment costs approximately twenty six thousand five hundred dollars per year before infusion and monitoring costs are added.
By the time the FDA granted lecanemab traditional approval in July 2023, the public conversation about the drug had largely settled around the twenty seven percent figure. The 0.45 point figure appeared in technical analyses, in editorials in The Lancet and other journals, and in the cautionary commentary of a minority of neurologists who pointed out that an effect below the usual threshold of clinical importance, paired with a substantial side effect profile and a substantial price tag, is an unusual combination to introduce into routine medical practice. The twenty seven percent figure remained the headline. It still does.
This piece is about the gap between those two numbers. Leading with the relative figure is one of the most common statistical moves in the entire literature of medicine, healthcare AI, and longevity science, and one of the most consistently misunderstood by patients, journalists, and a striking fraction of clinicians. The reader who internalizes the move, and who learns to convert relative framings into absolute ones automatically, will read the field’s claims with substantially better calibration than the reader who does not.
The math, briefly
A relative risk reduction expresses the proportional change in an outcome between a treated group and a comparison group. If a placebo group has a five percent rate of an event and a treated group has a four percent rate, the relative risk reduction is twenty percent, because the absolute rate fell from five percent to four percent, a proportional decrease of one fifth.
An absolute risk reduction expresses the same finding as the simple difference between the two rates. In the same example, the absolute risk reduction is one percentage point, the number you get by subtracting four from five.
The number needed to treat is the reciprocal of the absolute risk reduction expressed as a proportion. If the absolute risk reduction is one percentage point (0.01), the number needed to treat is one hundred, meaning that one hundred patients must receive the treatment for one of them to avoid the event.
These three numbers describe the same underlying clinical finding from three different angles. None of them is wrong. None of them is, by itself, sufficient. A claim reported in only one of the three formats has, by the omission of the other two, told a strictly partial story, and the format chosen for emphasis is almost always the format that makes the finding sound largest.
A relative risk reduction of twenty percent sounds substantial. It is the kind of number that produces headlines, drives clinical adoption, and supports premium pricing. A one percentage point absolute risk reduction sounds modest. A number needed to treat of one hundred sounds like extra work for an effect of unclear magnitude. The same data point produces three different rhetorical effects. The reader who has not learned to demand the absolute number, or who cannot construct it from the available information, is reading a claim that has been formatted, intentionally or not, to maximize its perceived size.
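For readers who prefer to see the arithmetic run, the three definitions above can be sketched in a few lines of Python. This is an illustrative sketch, not a clinical tool. The function name risk_measures is a label chosen here, and the rates are the five percent and four percent figures from the example in the text.

```python
def risk_measures(control_rate: float, treated_rate: float) -> dict:
    """Return RRR, ARR, and NNT for event rates given as proportions (0 to 1)."""
    if not (0 < control_rate <= 1 and 0 <= treated_rate <= 1):
        raise ValueError("rates must be proportions, with a nonzero control rate")
    arr = control_rate - treated_rate   # absolute risk reduction
    rrr = arr / control_rate            # relative risk reduction
    # With no benefit (arr <= 0) there is no finite number needed to treat.
    nnt = 1 / arr if arr > 0 else float("inf")
    return {"rrr": rrr, "arr": arr, "nnt": nnt}


# The example from the text: 5% of placebo patients and 4% of treated patients have the event.
example = risk_measures(control_rate=0.05, treated_rate=0.04)
print(f"Relative risk reduction: {example['rrr']:.0%}")
print(f"Absolute risk reduction: {example['arr'] * 100:.1f} percentage points")
print(f"Number needed to treat:  {example['nnt']:.0f}")
```

The same two input rates produce all three outputs, which is the point: a report that gives only one of them has withheld nothing that could not have been printed alongside it.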
The mammography case
The textbook example of the relative trick is breast cancer screening. The mammography literature has, over five decades, produced randomized controlled trial evidence that screening reduces breast cancer mortality by roughly twenty to twenty five percent in the populations where it has been studied. This figure has been the foundational claim for global screening programs, the basis for most public health messaging about mammography, and a stable feature of the published literature. It is, on the available evidence, approximately correct.
It is also a relative number, and the absolute number underneath it is much smaller than most patients and many clinicians assume. The baseline absolute risk of dying from breast cancer in the screened populations is, depending on the cohort and follow up period, somewhere between four and five per thousand women over a roughly ten year period. A twenty five percent relative reduction on that base produces an absolute reduction of approximately one per thousand women over the same period. The number needed to screen, the inverse, is roughly one thousand women screened for ten years to prevent one breast cancer death.
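The conversion in the paragraph above can be checked directly. The sketch below uses only the ranges quoted in the text (a baseline of four to five deaths per thousand women over roughly ten years, and a relative reduction of twenty to twenty five percent). The function name number_needed_to_screen is a label chosen here for illustration, and the output is a rough range, not an estimate for any particular screening program.

```python
def number_needed_to_screen(baseline_per_1000: float, relative_reduction: float) -> float:
    """Women screened over the period for one breast cancer death to be avoided."""
    deaths_avoided_per_1000 = baseline_per_1000 * relative_reduction
    return 1000 / deaths_avoided_per_1000


# Baselines and relative reductions are the ranges quoted in the text.
results = {}
for baseline in (4.0, 5.0):
    for relative_reduction in (0.20, 0.25):
        avoided = baseline * relative_reduction
        nns = number_needed_to_screen(baseline, relative_reduction)
        results[(baseline, relative_reduction)] = nns
        print(
            f"baseline {baseline:.0f} per 1000, relative reduction {relative_reduction:.0%}: "
            f"{avoided:.2f} deaths avoided per 1000, number needed to screen about {nns:.0f}"
        )
```

Across the quoted ranges the absolute benefit stays close to one death avoided per thousand women screened, which is why the round figure of one thousand is a fair summary.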
This is not a critique of mammography. The work of weighing the absolute benefit against the absolute harms of screening, including false positives, overdiagnosis of cancers that would never have caused symptoms, the procedures and anxiety triggered by abnormal findings, and the small but real cost of the screening itself, is the substance of the long running international debate over how to design and target screening programs. The point of the example is what it shows about communication. The German psychologist Gerd Gigerenzer, whose research at the Max Planck Institute has documented this gap in unusual detail, has shown repeatedly that the public and the clinical workforce dramatically overestimate the benefit of mammography when it is presented in relative terms. In a representative survey across nine European countries, ninety two percent of women either overestimated the reduction in breast cancer mortality from screening by a factor of ten to two hundred, or did not know. In one of Gigerenzer’s studies of one hundred fifty gynecologists, roughly one third did not correctly understand what a twenty five percent relative risk reduction meant in terms of women out of one thousand. Most of the rest were not certain.
The pattern is not specific to mammography. Gigerenzer and his collaborators have documented similar miscalibration for PSA screening, statin therapy in primary prevention, and a wide range of other interventions where the relative reduction sounds substantial and the absolute reduction is, in most populations, small. The pattern is also not, in most cases, the product of bad faith. Researchers report relative numbers because relative numbers are how clinical trials are designed and powered. Journalists repeat relative numbers because the trial papers and press releases emphasize them. Marketing materials repeat them because they sound large. Clinicians communicate them because their training and continuing education repeat them. The reader at the end of the chain receives a number that, while technically accurate, has been formatted to maximize its perceived magnitude through every link in the communication.
Why the trick persists
The persistence of the relative framing in healthcare communication has several structural causes worth understanding, because they explain why the pattern is unlikely to resolve itself and why the reader has to do the work of conversion.
The first is that clinical trials are designed and analyzed in relative terms. A pharmaceutical or device company designs a trial to detect a particular relative reduction in a primary endpoint with a particular statistical power. The trial is sized for the relative effect. The reporting follows the analysis. The CONSORT statement, the international guideline for randomized trial reporting that has been the standard for over two decades, recommends that effect sizes be reported in both absolute and relative terms, with confidence intervals. Studies of the published literature consistently find that this recommendation is observed in a minority of cases. A structured review of papers published in 2003 and 2004 in the New England Journal of Medicine, JAMA, The Lancet, BMJ, and the Journal of the National Cancer Institute, among the most influential medical journals in the world, found that roughly two thirds of papers reporting a ratio measure did not give the underlying absolute risks in the abstract, and about one third did not give them anywhere in the article. Other analyses have found similar patterns persisting through the present day. The relative framing is the default. The absolute framing has to be specifically pulled out by the reader.
The second cause is the publication and press release amplification chain that this publication has written about in earlier pieces. A relative number that sounds substantial moves through the chain more easily than an absolute number that sounds modest. Press releases use the relative figure. News articles repeat it. Marketing materials adopt it. By the time the claim reaches the patient or the consumer, the absolute number has, in many cases, been left behind entirely.
The third cause is the asymmetry of the rhetorical situation. The party making the claim has an interest in maximizing its perceived size. The patient, journalist, or general reader has no equivalent counterparty whose interest is in providing the absolute number. Regulators, in some cases, require both. Their requirements often apply only to formal product labeling, not to the marketing materials, press releases, conference presentations, or news coverage that shape the public understanding of a finding. The relative framing wins, in the long run, because the structural pressures all point in that direction.
The fourth cause, and the one that is hardest to fix, is statistical illiteracy. The conversion between relative and absolute risk requires the reader to know the baseline rate. In many published claims, the baseline rate is not stated, is buried in the methods section, or is qualified in ways that make the conversion difficult without specialized training. A reader who lacks the baseline cannot do the math even when willing to try. The result is that even motivated and educated readers, including practicing physicians, can be misled by relative framings when the baseline is not made transparent.
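To see why the baseline is indispensable, consider one relative figure applied to several baselines. The sketch below is illustrative only. The twenty percent relative reduction and the three baseline rates are hypothetical values chosen for this example, not figures from any trial, and absolute_effect is a name invented here.

```python
RELATIVE_REDUCTION = 0.20                     # hypothetical headline figure
HYPOTHETICAL_BASELINES = [0.001, 0.01, 0.10]  # hypothetical event rates: 0.1%, 1%, 10%


def absolute_effect(baseline: float, relative_reduction: float) -> tuple[float, float]:
    """Return (absolute risk reduction, number needed to treat) for one baseline."""
    arr = baseline * relative_reduction
    return arr, 1 / arr


rows = [(b, *absolute_effect(b, RELATIVE_REDUCTION)) for b in HYPOTHETICAL_BASELINES]
for baseline, arr, nnt in rows:
    print(
        f"baseline {baseline:.1%}: absolute reduction {arr * 100:.2f} percentage points, "
        f"number needed to treat about {nnt:.0f}"
    )
```

The headline is identical in all three rows. The number of patients who must be treated for one to benefit differs by a factor of one hundred, and nothing in the relative figure tells the reader which row applies.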
The healthcare AI version
The relative trick appears in healthcare AI in two recognizable forms. The first is the version familiar from pharmaceutical reporting. An AI tool that improves a detection rate, reduces a missed diagnosis rate, or shifts a clinical decision pattern reports its effect as a relative change. The press release announces a thirty percent improvement in detection of a particular condition. The reader, without further information, has no way to convert that figure into the number of additional cases caught per thousand patients screened. If the baseline rate of the condition is low, the absolute number of additional cases is small. If the false positive rate associated with the higher detection also increased, the trade off between additional true positives and additional false positives is invisible in the relative framing entirely.
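A hypothetical worked example makes the first form concrete. Every number in the sketch below is an assumption chosen for illustration, not a figure from any real product: a condition present in five of every thousand patients screened, a tool whose sensitivity rises from sixty to seventy eight percent (the "thirty percent improvement"), and a specificity that slips from ninety five to ninety three percent.

```python
def per_1000_screened(prevalence: float, sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (true positives, false positives) per 1,000 patients screened."""
    true_positives = 1000 * prevalence * sensitivity
    false_positives = 1000 * (1 - prevalence) * (1 - specificity)
    return true_positives, false_positives


PREVALENCE = 0.005  # hypothetical: 5 cases per 1,000 screened

# Hypothetical performance before and after the tool is introduced.
tp_before, fp_before = per_1000_screened(PREVALENCE, sensitivity=0.60, specificity=0.95)
tp_after, fp_after = per_1000_screened(PREVALENCE, sensitivity=0.78, specificity=0.93)

relative_gain = tp_after / tp_before - 1
extra_true_positives = tp_after - tp_before
extra_false_positives = fp_after - fp_before

print(f"Relative improvement in detection: {relative_gain:.0%}")
print(f"Additional cases caught per 1,000 screened: {extra_true_positives:.1f}")
print(f"Additional false positives per 1,000 screened: {extra_false_positives:.1f}")
```

Under these assumed inputs, the thirty percent headline corresponds to fewer than one additional case caught per thousand patients, alongside roughly twenty additional false alarms. Different inputs would give a different balance, which is exactly why the inputs need to be reported.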
The second form is more specific to diagnostic AI and reflects a deeper statistical subtlety. The performance of a diagnostic algorithm is typically reported in terms of sensitivity and specificity, the conditional probabilities of a positive test given disease and a negative test given no disease. These are intrinsic properties of the algorithm. The clinically relevant numbers, positive predictive value and negative predictive value, depend on the prevalence of the disease in the population being tested. A tool with ninety percent sensitivity and ninety percent specificity, applied to a population with one percent disease prevalence, produces a positive predictive value of about eight percent. Most patients flagged positive by the tool do not, in fact, have the disease. The reader who sees only the sensitivity and specificity numbers, without prevalence, is reading a partial claim. The relative trick, in this form, is not about risk reduction but about diagnostic accuracy. The shape of the misreading is the same. The headline number is impressive. The clinically meaningful number, after population prevalence is factored in, is smaller and harder to interpret.
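The eight percent figure follows from Bayes' rule and can be reproduced in a few lines. The sketch below uses the ninety percent sensitivity, ninety percent specificity, and one percent prevalence from the text. The two higher prevalences are hypothetical additions, included only to show how strongly the result moves, and predictive_values is a name chosen here.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> tuple[float, float]:
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# 1% is the prevalence used in the text; 10% and 30% are hypothetical comparisons.
for prevalence in (0.01, 0.10, 0.30):
    ppv, npv = predictive_values(sensitivity=0.90, specificity=0.90, prevalence=prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The algorithm is the same in every row. Only the population changes, and with it the answer to the question a flagged patient actually cares about.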
This pattern recurs across the healthcare AI marketing landscape in ways that the reader who has internalized the conversion will see immediately. A symptom checker that increases the probability of a correct first guess by a particular percentage may, in absolute terms, be improving the diagnosis of a small number of patients per thousand. A continuous glucose monitor reporting a particular percentage reduction in hypoglycemic events may, against a low baseline, be preventing a small absolute number of events. A longevity supplement reporting a particular percentage improvement in a biomarker is, in absolute terms, often producing a change that is small relative to the natural variability of the biomarker itself. The pattern is consistent enough that, in this publication’s view, the relative-only framing should be treated as a signal in itself. A claim that reports its effect only in relative terms, without the absolute number that would let the reader judge clinical relevance, has chosen the framing that makes the finding sound largest. The reader’s correct response is not to discount the claim entirely. It is to recognize that the absolute size of the effect is unknown until the conversion is done, and to do the conversion before drawing any inference about clinical or personal relevance.
The reader’s method
A short working method, drawn from the Gigerenzer corpus and adapted for healthcare AI specifically, follows from the foregoing.
When you encounter a claim of percentage improvement, reduction, or change in any healthcare or longevity context, ask first whether the figure is relative or absolute. If the framing is unclear, assume it is relative until the reporting establishes otherwise. The relative framing is the default in marketing communication.
When the figure is relative, find the baseline. The baseline is the rate or score in the comparison group, in the population, or in the untreated condition. The baseline is what the relative figure is a percentage of. The relative figure without the baseline is a number that cannot be converted into clinical meaning.
When you have the baseline, do the conversion. The absolute change is the baseline multiplied by the relative change expressed as a proportion. A twenty seven percent reduction on a 1.66 point decline is an absolute reduction of 1.66 times 0.27, or approximately 0.45 points. The arithmetic is straightforward. The discipline of doing it routinely is what produces calibration.
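The lecanemab conversion can be run in both directions as a check. The sketch below uses only the trial figures quoted earlier in this piece (declines of 1.66 and 1.21 points on the eighteen point CDR Sum of Boxes). The variable names are labels chosen here for illustration.

```python
# Figures quoted earlier in this piece from the Clarity AD trial report.
PLACEBO_DECLINE = 1.66    # CDR-SB points of worsening over 18 months
TREATED_DECLINE = 1.21    # CDR-SB points of worsening over 18 months
SCALE_RANGE = 18.0        # the CDR-SB runs from 0 to 18
RELATIVE_SLOWING = 0.27   # the headline figure

# Forward: headline relative figure times the baseline decline.
absolute_from_relative = PLACEBO_DECLINE * RELATIVE_SLOWING

# Backward: recover both framings from the two arms directly.
absolute_from_arms = PLACEBO_DECLINE - TREATED_DECLINE
relative_from_arms = absolute_from_arms / PLACEBO_DECLINE

# Context: how large is the difference against the full range of the scale?
share_of_scale = absolute_from_arms / SCALE_RANGE

print(f"Absolute difference from the headline: {absolute_from_relative:.2f} points")
print(f"Absolute difference from the two arms: {absolute_from_arms:.2f} points")
print(f"Relative slowing from the two arms:    {relative_from_arms:.0%}")
print(f"Difference as a share of the scale:    {share_of_scale:.1%}")
```

Both routes land on the same 0.45 points. The final line adds the piece of context the relative figure cannot carry on its own: the size of the difference against the range of the instrument.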
When the absolute change is small relative to the scale on which it is measured, treat the headline relative number with appropriate skepticism. A small absolute change on a large scale is, in most clinical contexts, a small effect, regardless of how the relative framing sounds.
When the claim is about a diagnostic algorithm or screening tool, look for both sensitivity and specificity and the underlying prevalence of the condition in the population. If prevalence is not provided, the positive predictive value cannot be calculated, and the clinical meaning of the sensitivity and specificity claims is, in any honest reading, undetermined.
When the source of the claim is a press release or a marketing communication rather than a primary research paper, expect the relative framing to be amplified. Walk back up the chain to the original paper. The original paper, in most cases, will contain the absolute numbers somewhere, even when the press release does not.
Back to lecanemab
The lecanemab story, three years on from its initial press release, has become a useful case study for the broader pattern. The drug is approved. The drug is being used. Some patients have benefited. Some patients have experienced serious adverse events. The question of whether the twenty seven percent figure is sufficient to justify the cost and the risk profile is being answered in real time by clinicians, patients, payers, and regulators around the world. There is no single right answer. There are better and worse informed readings of the underlying evidence.
The reader who, in 2022 or 2023, knew to convert the twenty seven percent figure into the 0.45 point figure on the CDR Sum of Boxes was operating with substantially better calibration than the reader who did not. The reader who, in 2026, encounters any healthcare AI or longevity claim presented in purely relative terms is in a position to ask the same conversion question. The discipline of asking it produces a different reading of the field. It is also one of the most reliable filters available to anyone reading healthcare claims in any volume. The marketing departments, the press releases, the news articles, and the social media amplification chain will continue to default to the relative framing, because the relative framing is what travels. The reader’s job is to convert, every time, and to weight the underlying claim on the absolute basis the data actually supports.
This is what the verification intelligence we have been describing in this publication looks like at the level of a single statistical move. The next pillar in this series will examine a related pattern, the use of surrogate endpoints that stand in for clinical outcomes the trial did not, or could not, measure directly. The patterns compound. The reader who has absorbed both moves will, in combination, see a meaningfully different field than the field’s marketing has any incentive to project.
Sources and further reading
van Dyck CH, Swanson CJ, Aisen P, et al. Lecanemab in early Alzheimer’s disease. New England Journal of Medicine. 2023;388(1):9 to 21.
Walsh S, Merrick R, Milne R, Brayne C. Lecanemab for Alzheimer’s disease: tempering hype and hope. The Lancet. 2022;400(10367):1899.
Howard R, Liu KY. Questions EMERGE as Biogen claims aducanumab turnaround. Nature Reviews Neurology. 2020;16(2):63 to 64. See also subsequent commentary on lecanemab and donanemab.
Gigerenzer G, Gaissmaier W, Kurz-Milcke E, Schwartz LM, Woloshin S. Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest. 2007;8(2):53 to 96.
Gigerenzer G. Making sense of health statistics. Bulletin of the World Health Organization. 2009;87:567.
Wegwarth O, Gigerenzer G. The barrier to informed choice in cancer screening: statistical illiteracy in physicians and patients. Recent Results in Cancer Research. 2018;210:207 to 221.
Gøtzsche PC, Nielsen M. Screening for breast cancer with mammography. Cochrane Database of Systematic Reviews. 2011, Issue 1, CD001877.
Schwartz LM, Woloshin S, Welch HG. Misunderstandings about the effects of race and sex on physicians’ referrals for cardiac catheterization. New England Journal of Medicine. 1999;341(4):279 to 283. See also subsequent Schwartz and Woloshin work on risk communication.
CONSORT Group. CONSORT 2010 Statement: Updated guidelines for reporting parallel group randomized trials. British Medical Journal. 2010;340:c332.
Schwartz LM, Woloshin S, Dvorin EL, Welch HG. Ratio measures in leading medical journals: structured review of accessibility of underlying absolute risks. British Medical Journal. 2006;333(7581):1248.
For broader treatment of the relative-absolute distinction in healthcare communication, see Gigerenzer G. Risk Savvy: How to Make Good Decisions. Viking, 2014.
Verification Intelligence for Healthcare AI
This article is part of the Verification Intelligence for Healthcare AI series. For the practical capstone, read How to Read a Healthcare AI Press Release. For the proxy-measure problem that often hides underneath headline effects, continue to Surrogate Endpoints and the Long Wait for Truth.
