How Cochrane Reviews Rate the Evidence for Exercise

Cochrane Reviews use a standardized system called the GRADE framework to evaluate how confident we should be in the evidence supporting exercise benefits. GRADE rates evidence certainty into four levels—high, moderate, low, and very low—based on a rigorous assessment of study quality, consistency, and relevance. When a Cochrane review concludes that exercise reduces falls in older adults by 23 percent, for example, the certainty rating tells you whether that conclusion rests on solid evidence from well-designed trials or on smaller, less reliable studies.

This certainty rating is just as important as the finding itself, because it shapes how confidently doctors, coaches, and runners can act on the recommendation. Since the mid-2010s, GRADE evaluation has been mandatory in all Cochrane systematic reviews, appearing in tables called “Summary of findings.” This means whether you’re reading a review about running for heart health, strength training for bone density, or aerobic exercise for depression, the researchers must explicitly state how much confidence they have in their conclusions. A 2025 analysis of over 5,100 Cochrane reviews found that certainty ratings vary widely across exercise topics—some show high confidence, but many show only low or very low certainty, reflecting the reality that good evidence for exercise is harder to find than you might expect.

How Cochrane Reviews Assess Exercise Evidence Certainty

The GRADE framework evaluates evidence certainty by examining five key domains: risk of bias, inconsistency, indirectness, imprecision, and publication bias. Risk of bias concerns whether the included studies were well designed: did they randomize participants fairly, did they measure outcomes objectively, and were the people assessing outcomes kept unaware of who had been assigned to exercise? Inconsistency looks at whether different studies reached similar conclusions or whether results varied unpredictably. Indirectness asks whether the studies tested the right population, intervention, and outcome. If you want to know about running’s benefits for 50-year-old recreational athletes, but the evidence comes from sedentary elderly people in supervised programs, that evidence is indirect and certainty is lowered.

Imprecision examines whether studies included enough participants to detect true effects or if the confidence intervals around the results are so wide that the benefit could be trivial or nonexistent. Publication bias accounts for the reality that studies showing positive effects are more likely to be published than neutral or negative ones. A review might find eight published studies showing exercise reduces knee pain, but if three unpublished negative studies exist in a filing cabinet somewhere, the published evidence overstates the true benefit. Cochrane reviewers often cannot see unpublished data, so they mark this as a potential limitation and lower certainty accordingly.
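To make imprecision concrete, here is a minimal Python sketch that computes a risk ratio and its 95 percent confidence interval using the standard log-normal approximation. The function name and all participant counts are hypothetical, chosen only so that a small and a large trial share the same point estimate; they are not taken from any Cochrane review.

```python
import math


def risk_ratio_ci(events_ex, n_ex, events_ctrl, n_ctrl, z=1.96):
    """Return (risk ratio, lower, upper) via the log-normal approximation."""
    rr = (events_ex / n_ex) / (events_ctrl / n_ctrl)
    # Standard error of log(RR) for two independent groups.
    se_log = math.sqrt(1 / events_ex - 1 / n_ex + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: same observed risks, very different sample sizes.
small_trial = risk_ratio_ci(10, 40, 13, 40)
large_trial = risk_ratio_ci(1000, 4000, 1300, 4000)

for label, (rr, lower, upper) in (("small", small_trial), ("large", large_trial)):
    crosses_null = lower < 1 < upper
    print(f"{label}: RR={rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, "
          f"includes no effect: {crosses_null}")
```

Both trials observe the same relative reduction, but only the large one has an interval that excludes a risk ratio of 1 (no effect). An interval like the small trial's is the kind of result reviewers would treat as imprecise.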

The Five Domains That Determine What You Can Trust

Understanding these five domains reveals why many exercise studies earn low certainty ratings despite seeming straightforward. Take fall prevention in older adults—Cochrane found high certainty evidence that exercise reduces falls by 23 percent, based on 116 studies involving over 25,000 participants. This high certainty rating was justified because many large, well-designed randomized trials tested similar interventions in similar populations, results were consistent across studies, and the evidence directly answered the question asked. But this is the exception rather than the rule in exercise science.
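The way the five domains combine into a single rating can be sketched in a few lines of Python. This is a simplified illustration, not Cochrane's actual procedure: it assumes the common GRADE convention that randomized trials start at high certainty and observational studies start at low, and that each domain can subtract one or two levels. It ignores the criteria that can raise certainty for observational evidence. The function name and example inputs are hypothetical.

```python
LEVELS = ["very low", "low", "moderate", "high"]
DOMAINS = (
    "risk of bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
)


def grade_certainty(randomized, downgrades):
    """Map per-domain downgrades (0, 1, or 2 levels each) to a certainty level."""
    unknown = set(downgrades) - set(DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")
    if any(levels not in (0, 1, 2) for levels in downgrades.values()):
        raise ValueError("each downgrade must be 0, 1, or 2 levels")
    start = 3 if randomized else 1  # index into LEVELS
    total = sum(downgrades.values())
    return LEVELS[max(0, start - total)]  # cannot drop below "very low"


# Randomized trials with no serious concerns in any domain.
print(grade_certainty(True, {}))
# Hypothetical trials with unblinded outcomes and wide confidence intervals.
print(grade_certainty(True, {"risk of bias": 1, "imprecision": 1}))
# Hypothetical observational evidence with one serious concern.
print(grade_certainty(False, {"risk of bias": 1}))
```

Under these assumptions, two one-level concerns are enough to take randomized evidence from high to low, which helps explain why so few exercise outcomes keep a high rating.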

A critical warning: many exercise interventions show only low or very low certainty evidence. Recent data from Cochrane reviews in physical therapy found that very low certainty outcomes increased from 14 percent in studies published from 2001 to 2005 to 34 percent in studies from 2016 to 2020, meaning recent reviews are actually finding less confidence in evidence, not more. Across all outcomes examined, low certainty comprised 55 percent, moderate certainty 22 percent, and high certainty only 2 percent. This skewed distribution exists partly because exercise is harder to study than pharmaceuticals: you cannot blind participants to whether they exercised, dropout rates run high, and measuring complex outcomes like “improved quality of life” involves subjective judgments. Indirectness is especially problematic. Many exercise studies use small samples of college students or highly motivated volunteers, and the resulting recommendations are then applied to sedentary older adults or people with chronic diseases, populations the original studies never tested.
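A quick arithmetic check shows why the four percentages quoted here should not be read as one distribution. The sketch below assumes, purely for illustration, that high, moderate, low, and very low are the only categories and that the 2, 22, and 55 percent shares cover all outcomes in the period. The implied overall very low share is a derived figure under that assumption, not a number reported by the source.

```python
# Shares of all outcomes examined, as quoted in the text (percent).
overall_shares = {"high": 2, "moderate": 22, "low": 55}

# Very low share quoted for the most recent period only (2016 to 2020).
very_low_latest_period = 34

# Adding the period-specific figure to the overall shares overshoots 100.
naive_total = sum(overall_shares.values()) + very_low_latest_period

# Assumption: the four levels are exhaustive, so the remainder is very low.
implied_very_low_overall = 100 - sum(overall_shares.values())

print(f"naive total: {naive_total}%")
print(f"implied overall very low share: {implied_very_low_overall}%")
```

The naive total exceeds 100 percent because the 34 percent figure describes only the latest period, while the other three shares describe the whole sample.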

[Figure: Distribution of evidence certainty levels in Cochrane reviews of exercise and ph… (title truncated in source). High certainty: 2%; moderate certainty: 22%; low certainty: 55%; very low certainty: 34%. Source: Cochrane meta-research analysis and physical therapy systematic review data (2001-2020).]

What Real Exercise Evidence Quality Looks Like

The 2025 meta-research study examining 5,116 Cochrane reviews containing 64,849 research questions revealed the actual landscape of exercise evidence quality. When researchers updated these reviews over time, the set of included studies remained identical to the original version in 63 percent of cases, grew in 33 percent, and shrank in only 4 percent. This suggests that the foundational evidence supporting exercise recommendations is relatively stable: new studies refine and expand our knowledge rather than overturn earlier findings. However, stable does not mean certain, and the number of studies included tells only part of the story. A concrete example: Cochrane’s overview of exercise and health outcomes found that physical activity reduced mortality by 13 percent across studies involving over 27,000 participants.

This mortality reduction is substantial and appeared across various types of exercise—running, walking, swimming, strength training. Yet even this dramatic public health finding carries limitations in certainty. The studies tracked participants’ self-reported exercise levels rather than assigning some people to exercise and others to sedentary control groups, so confounding factors (healthier people exercise more) may partly explain the association. People who exercise regularly also tend to eat better, manage stress more effectively, and avoid smoking—separating exercise’s unique benefit from these lifestyle factors is nearly impossible in observational studies. Cochrane reviewers account for these limitations when rating certainty, which is why even a mortality benefit may not earn the highest certainty rating.

Interpreting Certainty Levels for Your Own Training

When reading a Cochrane review about exercise, let the certainty rating shape how you interpret the findings. High certainty evidence means the finding is likely true and reliable across populations, so you can confidently build your training around it. The fall-prevention evidence qualifies: if you are an older adult concerned about balance, the Cochrane evidence strongly supports regular exercise as a fall-prevention strategy. Moderate certainty means the finding probably reflects reality, but further research might refine or change it somewhat; it is reasonable to act on moderate certainty evidence while maintaining some skepticism.

Low certainty evidence suggests the benefit is plausible but uncertain: the research base is thin, studies are small, or designs are flawed. Very low certainty evidence is essentially exploratory; it might inspire a personal experiment, but it does not support confident recommendations. One tradeoff is worth noting. The reviews with the highest certainty evidence are often about outcomes that are easiest to measure objectively, such as falls, mortality, muscle strength, and maximum oxygen uptake. The reviews with the lowest certainty often address outcomes runners actually care about: does running improve mood, boost energy levels, enhance sleep quality, or reduce anxiety? These subjective, complex outcomes are harder to measure consistently across studies, they are reported by participants who know whether they exercised, and effect sizes are often small. This creates a frustrating gap: Cochrane can confidently tell you that exercise prevents falls, but it can only tentatively suggest that running might improve your mental health, even though the mental health benefit may feel more personally relevant than fall prevention.
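As a rough illustration of how these four levels might guide reading, the following Python sketch pairs each level with the interpretation given above and sorts a list of findings so the best-supported ones come first. The guidance strings paraphrase this article. The fall and mortality ratings follow the article, while the rating attached to the mood finding is a hypothetical placeholder, not a rating from a specific review.

```python
# Certainty levels ordered from strongest to weakest.
ORDER = ["high", "moderate", "low", "very low"]

GUIDANCE = {
    "high": "likely true and reliable; reasonable to build training around",
    "moderate": "probably reflects reality; act on it but stay skeptical",
    "low": "plausible but uncertain; treat as a possibility",
    "very low": "exploratory; at most a personal experiment",
}

findings = [
    ("running improves mood", "low"),  # hypothetical rating for illustration
    ("physical activity reduces mortality", "moderate"),
    ("exercise reduces falls in older adults", "high"),
]


def rank_findings(items):
    """Sort (claim, certainty) pairs from strongest to weakest certainty."""
    return sorted(items, key=lambda item: ORDER.index(item[1]))


for claim, certainty in rank_findings(findings):
    print(f"{certainty:>9}: {claim} ({GUIDANCE[certainty]})")
```

Keeping the claim and its certainty together in one record makes it harder to quote a finding without the rating that qualifies it.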

The Growing Certainty Crisis in Exercise Evidence

One troubling trend documented across physical therapy research is the rising proportion of very low certainty evidence—jumping from 14 percent to 34 percent over two decades. This does not mean exercise actually became less effective; rather, as Cochrane reviews became more rigorous and systematic, researchers better understood the methodological limitations in exercise studies. Older reviews conducted before GRADE was mandatory sometimes overlooked these limitations, inadvertently suggesting higher certainty than the evidence warranted. Modern Cochrane reviews catch these problems—selective outcome reporting, inadequate blinding, small sample sizes, funding bias—and downgrade certainty accordingly. In a sense, certainty is dropping because standards are rising.

This trend carries an important warning for readers. If you see headlines about “exercise for [condition]” supported by a Cochrane review, you should expect that the full Cochrane text will include substantial caveats. Very low certainty means the authors essentially cannot rule out that the intervention does nothing or even causes harm. Moderate and high certainty findings are rare enough that they are worth noting and acting on. Low certainty findings should inspire further research rather than confident clinical practice. Certainty is also rated separately for each outcome, not once per review. A review might show very low certainty for mortality but low certainty for fitness gains from the same intervention if the mortality studies were smaller, less rigorous, or enrolled different populations than the fitness studies.

Fall Prevention and Mortality: Where Exercise Evidence Shines

The fall prevention evidence deserves particular attention because it represents the gold standard for exercise evidence in Cochrane reviews. The 116 trials involving over 25,000 older adults tested various exercise programs (balance training, strength training, and mixed programs) and showed a consistent 23 percent reduction in fall rates. The certainty was high because the studies enrolled older adults living in the community (not in nursing homes or hospitals), measured a clear, objective outcome (falls are documented events), and showed consistent results across diverse programs and populations. Someone reading this high certainty evidence can confidently incorporate balance and strength work into their training or exercise prescription, knowing the review authors judged the underlying trials to be robust. Mortality reduction offers a second example of substantial evidence, though certainty is somewhat lower.

A 13 percent mortality reduction across 27,000 participants represents a genuinely important public health finding. However, the certainty is tempered by the observational nature of most studies: researchers tracked people’s exercise habits rather than randomly assigning people to exercise or sedentary control groups. This observational design raises risk-of-bias and indirectness concerns that lower certainty. The mortality data also suffer from publication bias concerns: studies showing exercise’s mortality benefits are more likely to be published than null studies, which might overstate the true effect. Despite these limitations, the convergence of evidence across many large studies, the biological plausibility of exercise reducing cardiovascular mortality and cancer risk, and the consistency of findings across different age groups and exercise types support moderate certainty in mortality reduction for regular physical activity.
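Relative reductions such as 23 percent and 13 percent are easier to interpret once they are translated into absolute numbers. The Python sketch below does that translation. The baseline rates are hypothetical round numbers chosen for easy arithmetic, not figures from the reviews, and the helper function is illustrative only.

```python
def events_avoided_per_1000(baseline_per_1000, relative_reduction):
    """Absolute events avoided per 1,000, given a baseline and a relative reduction."""
    return baseline_per_1000 * relative_reduction


# Hypothetical baseline: 1,000 falls per 1,000 person-years without exercise.
falls_avoided = events_avoided_per_1000(1000, 0.23)

# Hypothetical baseline: 100 deaths per 1,000 people over the follow-up period.
deaths_avoided = events_avoided_per_1000(100, 0.13)

print(f"falls avoided per 1,000 person-years: about {falls_avoided:.0f}")
print(f"deaths avoided per 1,000 people: about {deaths_avoided:.0f}")
```

The same relative reduction yields a larger absolute benefit when the baseline risk is higher, so a person's own starting risk matters as much as the headline percentage.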

The Future of Exercise Evidence and Training

Cochrane’s commitment to rigorous evidence assessment means that exercise research will continue facing scrutiny. The organization recently scheduled a Risk of Bias 2 workshop for August 2026 in George Town, Malaysia, specifically training Cochrane authors to better evaluate each source of bias in exercise trials. This effort signals that Cochrane recognizes the challenge—exercise science produces thousands of studies annually, but many lack the design rigor that high certainty demands. As training improves and more researchers learn to design larger, longer, better-controlled exercise trials, certainty ratings may shift.

Some currently low-certainty findings might gain confidence, while others might be downgraded further as methodological flaws become apparent. For runners and fitness enthusiasts, the practical implication is straightforward: use Cochrane’s certainty ratings as a guide, prioritize the high and moderate certainty findings for your core training decisions, and treat low certainty recommendations as interesting possibilities worth experimenting with but not anchoring your program to. The fact that exercise reduces falls with high certainty and mortality with moderate certainty provides a strong foundation for lifelong physical activity. The reality that psychological benefits, injury prevention, and performance enhancement often carry only low certainty evidence should not discourage training in these areas—it should simply temper expectations and encourage individual experimentation to discover what works for your body.

Conclusion

Cochrane reviews rate exercise evidence using the GRADE framework, which assigns one of four certainty levels (high, moderate, low, or very low) based on rigorous assessment of five domains: risk of bias, inconsistency, indirectness, imprecision, and publication bias. This system has been mandatory since the mid-2010s and provides a transparent, standardized way to communicate how much confidence we should place in findings. The data show a sobering picture: in physical therapy reviews, very low certainty evidence grew from 14 percent of outcomes in 2001 to 2005 to 34 percent in 2016 to 2020, while across all outcomes examined low certainty comprised 55 percent, moderate certainty 22 percent, and high certainty only 2 percent. The takeaway is not that exercise is unproven. The high certainty evidence for fall prevention and the moderate certainty evidence for mortality reduction are genuinely important.

Rather, most specific exercise claims require a dose of skepticism. When you read that “running improves depression” or “strength training boosts confidence,” check whether a Cochrane review supports that claim and what certainty rating it carries. If you cannot find a Cochrane review, the evidence is likely even thinner. By using certainty ratings as your guide, you can confidently act on the strongest evidence while remaining appropriately cautious about preliminary findings that future research may refine or overturn.

