Module 3.3 Validity

A valid piece of research truly measures what it sets out to measure.
To give valid results, a piece of research needs to:
- have a clear and focused research question
- include methodology to minimise bias
- draw the appropriate conclusions from the data
These questions can be asked of all levels of evidence and all types of research question, and are equally important for systematic reviews and guidelines.
Does the introduction clearly describe the research question?
- Did the researchers know what they were trying to study from the outset?
- Is the focus of the research in keeping with the research question?
Has the risk of bias been minimised in the study methodology, study reporting, and patient selection?
Bias minimisation: Study Methodology
Randomisation
Randomisation eliminates investigator bias in allocating patients to different treatment arms.
The most effective randomisation is done centrally by a computer. Non-randomised or inadequately randomised trials tend to overestimate treatment effects [1]. Randomisation methods should be clearly described in the ‘methods’ section of a clinical trial.
Many systematic reviews and meta-analyses include only randomised trials, because including methodologically poor studies risks introducing bias.
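As an illustration of computer-generated allocation, the following is a minimal Python sketch of permuted-block randomisation. Permuted blocks are only one common method among several, and the function name, arm labels, block size, and seed are hypothetical choices for this example rather than details of any particular trial. A real trial would rely on a validated central system with concealed allocation.

```python
import random


def permuted_block_allocation(n_patients, arms=("A", "B"), block_size=4, seed=2024):
    """Return a list of arm labels built from shuffled, balanced blocks.

    All parameter values are illustrative assumptions.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")

    rng = random.Random(seed)  # fixed seed so the list can be reproduced and audited
    per_arm = block_size // len(arms)
    allocations = []
    while len(allocations) < n_patients:
        block = list(arms) * per_arm  # equal numbers of each arm within a block
        rng.shuffle(block)            # random order within the block
        allocations.extend(block)
    return allocations[:n_patients]


schedule = permuted_block_allocation(20)
print(schedule.count("A"), schedule.count("B"))  # 10 10
```

Because each block contains equal numbers of each arm, the arms stay close to balanced throughout recruitment, and the investigator has no say in which arm the next patient enters.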
Blinding
Blinding can avoid bias due to patient or observer knowledge of treatment allocation. Blinding can be achieved by using a placebo or another intervention that resembles the active treatment. In a single-blind study, either the patient or the observer, but not both, is unaware of the treatment allocation.
A double-blind study design is preferred, as neither the patient nor the observer or investigator measuring treatment effects, or interpreting a diagnostic investigation, is aware of participants’ treatment allocation. This helps prevent any bias, whether conscious or unconscious, that could influence the study results.
Some studies are unable to be blinded, or blinding may be unethical. For example, in a clinical trial of chemotherapy versus best supportive care in a population with a previously untreatable cancer, it will be impossible to blind patients and doctors to the obvious side effects of the active treatment.
Adequate sample size
An adequate sample size decreases the likelihood that the findings are due to the play of chance. Results of smaller clinical trials are likely to deviate further from a true treatment effect [2].
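The effect of sample size on chance variation can be sketched with the usual normal-approximation 95% confidence interval for a proportion. The 50% response rate and the two trial sizes below are assumed values chosen only for illustration.

```python
import math


def ci_half_width(proportion, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(proportion * (1 - proportion) / n)


# Hypothetical trials observing a 50% response rate
print(round(ci_half_width(0.5, 25), 3))    # 0.196  (about +/- 20 percentage points)
print(round(ci_half_width(0.5, 2500), 4))  # 0.0196 (about +/- 2 percentage points)
```

Under these assumptions, a hundredfold increase in sample size narrows the interval tenfold, which is why small trials scatter more widely around the true effect.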
In meta-analyses, a funnel plot (see Glossary) uses the greater scatter among small trials to suggest whether small, negative trials are likely to have been withheld from publication. The plot helps to detect asymmetry, which may indicate bias or variation in results across studies.
Published reports should include a paragraph on the sample size calculation that states the significance level (α), the type II error rate (β, where power is 1 − β), the anticipated effect of treatment on the primary endpoint, and the proposed sample size.
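To show how these four quantities fit together, here is a rough sketch of one textbook calculation: the normal-approximation sample size for comparing two means in equal-sized arms. The function name and the input values (α = 0.05, β = 0.20, an anticipated difference of 0.5 with a standard deviation of 1.0) are assumptions for illustration; a real trial would choose the formula to suit its own design and endpoint.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(alpha, beta, effect, sd):
    """Approximate patients per arm for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)                            # round up to whole patients


print(sample_size_per_arm(alpha=0.05, beta=0.20, effect=0.5, sd=1.0))  # 63
```

Halving the anticipated effect to 0.25 roughly quadruples the requirement, which is why an over-optimistic effect estimate leaves a trial underpowered.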
Validated outcome measures
Validated outcome measures are important to avoid bias due to incorrect interpretation of change in an endpoint.
Outcomes such as survival or the rate of myocardial infarction may seem easy to define. However, check whether survival included all-cause mortality, or only deaths due to the disease in question. Check that clinical endpoints such as myocardial infarction have been defined in the study by a set of agreed criteria.
Patient-rated outcome measures such as quality of life questionnaires should be statistically validated, and the validation referenced, unless the study has included a statistical validation as part of the design.
Correct statistical analysis
A detailed understanding of biostatistics is beyond the scope of this site; however, reporting of clinical trial results should concentrate on the endpoints defined before the study began. Post-hoc, exploratory analyses should be regarded with caution, because they increase the risk of false-positive findings and may not be as reliable as pre-planned analyses.
Remember that, when there is no true effect, about 1 in 20 statistical analyses will be significant at p=0.05 through the play of chance alone. A more stringent p value of 0.01 or less may be set for significance where the researchers have looked at multiple comparisons.
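The arithmetic behind this warning can be sketched as follows, assuming the comparisons are independent and that no true effect exists. The Bonferroni correction shown is one simple way of choosing a stricter threshold; it is used here as an illustration and is not necessarily the method a given study applied. The function names are hypothetical.

```python
def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n independent null tests is 'significant'."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-comparison threshold that keeps the overall error rate near alpha."""
    return alpha / n_tests


print(round(chance_of_any_false_positive(20), 2))  # 0.64
print(round(bonferroni_threshold(5), 4))           # 0.01
```

With 20 independent comparisons there is roughly a 64% chance of at least one spurious "significant" result, and five comparisons at an overall 0.05 level correspond to the 0.01 threshold mentioned above.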
Intention to treat analysis
‘Intention to treat’ means that all patients have been analysed for efficacy endpoints in the arms to which they were allocated. This is ‘best practice’ for randomised clinical trials. However, side effects (toxicities) should be analysed according to the treatment actually received by the patient.
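The distinction can be sketched with a small, entirely hypothetical data set in which two patients did not receive the treatment they were allocated. Efficacy is grouped by allocated arm and toxicity by treatment actually received; the field names and values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical patient records (not from any real trial)
patients = [
    {"allocated": "drug",    "received": "drug",    "responded": True,  "toxicity": True},
    {"allocated": "drug",    "received": "drug",    "responded": True,  "toxicity": False},
    {"allocated": "drug",    "received": "control", "responded": False, "toxicity": False},
    {"allocated": "control", "received": "control", "responded": False, "toxicity": False},
    {"allocated": "control", "received": "control", "responded": True,  "toxicity": False},
    {"allocated": "control", "received": "drug",    "responded": False, "toxicity": True},
]


def rate_by(records, group_key, outcome_key):
    """Proportion of patients with the outcome, grouped by the chosen key."""
    totals = defaultdict(int)
    events = defaultdict(int)
    for record in records:
        totals[record[group_key]] += 1
        events[record[group_key]] += int(record[outcome_key])
    return {group: events[group] / totals[group] for group in totals}


efficacy_itt = rate_by(patients, "allocated", "responded")  # intention to treat
toxicity_as_treated = rate_by(patients, "received", "toxicity")

print(efficacy_itt)         # drug 2/3, control 1/3
print(toxicity_as_treated)  # drug 2/3, control 0
```

Grouping efficacy by allocation preserves the balance created by randomisation, while grouping toxicity by treatment received attributes side effects to the treatment that could actually have caused them.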
The primary endpoint
There should always be an important, relevant, and valid primary endpoint. The sample size is usually calculated to identify clinically and statistically significant changes in the primary endpoint.
A surrogate endpoint
If a surrogate endpoint is used, it should be strongly and consistently associated with the important clinical endpoint.
The most appropriate control arm
Ensure that the most appropriate control arm or gold standard test has been used.
References
1. Turner L, Boutron I, Hróbjartsson A, Altman DG, Moher D. The evolution of assessing bias in Cochrane systematic reviews of interventions: celebrating methodological contributions of the Cochrane Collaboration. Systematic Reviews. 2013;2(1).
2. Moore RA, Gavaghan D, Tramèr MR, Collins SL, McQuay HJ. Size is everything – large amounts of information are needed to overcome random effects in estimating direction and magnitude of treatment effects. Pain. 1998;78(3):217-220.
Bias minimisation: Study Reporting
Reporting of follow-up, dropouts, and crossover is important because there may be a systematic reason for loss to follow-up or dropout from one arm of a study, or for patients not undergoing one of two diagnostic tests under comparison. The study report should transparently account for all patients. The ICMJE now requires reports of randomised controlled trials to include a CONSORT diagram.
CONSORT diagrams, represented as a series of boxes containing patient numbers, describe how many patients withdrew, crossed over, or were lost to follow-up, often with reasons for these events. Follow-up should be complete enough, and long enough, to detect the effect being looked for.

Figure 1: CONSORT diagram from an RCT of an alternative process of liver preservation prior to transplant. It clearly shows where patients died or transplants did not proceed for any reason. Reprinted from Schlegel A, Mueller M, Muller X, Eden J, Panconesi R, von Felten S, et al. A multicenter randomized-controlled trial of hypothermic oxygenated perfusion (HOPE) for human liver grafts before transplantation. Journal of Hepatology. 2023;78(4):783–93. Reused under a Creative Commons CC BY licence.
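When reading a CONSORT diagram, two quick checks are whether every randomised patient is accounted for and whether attrition differs between arms. The sketch below uses invented numbers and hypothetical field names; "analysed" is taken here to mean patients with complete outcome data.

```python
def summarise_arm(flow):
    """Check that an arm's numbers add up and compute its attrition rate."""
    dropped = flow["lost_to_follow_up"] + flow["withdrew"]
    return {
        "unaccounted": flow["randomised"] - (flow["analysed"] + dropped),
        "attrition_rate": dropped / flow["randomised"],
    }


# Hypothetical patient flow, not taken from the trial in Figure 1
trial_flow = {
    "intervention": {"randomised": 100, "analysed": 80, "lost_to_follow_up": 12, "withdrew": 8},
    "control":      {"randomised": 100, "analysed": 95, "lost_to_follow_up": 3,  "withdrew": 2},
}

for arm, flow in trial_flow.items():
    print(arm, summarise_arm(flow))
# intervention: unaccounted 0, attrition_rate 0.2
# control:      unaccounted 0, attrition_rate 0.05
```

In this invented example all patients are accounted for, but attrition is four times higher in the intervention arm, which would prompt a closer look at the reasons given for dropout.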
Bias minimisation: Patient selection and distribution between arms
Were the right patients included in the trial?
Patient selection may be important to enable a study to detect a difference in the primary endpoint between study arms.
For example, a randomised clinical trial of an antidepressant for fatigue in patients with advanced cancer required patients to score 4 or more out of 10 for fatigue before entering the trial. This is important because the problem needed to be present for the intervention to have any chance of improving it. Look at the study inclusion and exclusion criteria.
Were all patients treated consistently across both groups, except for the specific intervention under investigation?
Ideally, other supportive care measures and contact with study staff should be the same in all arms of an intervention study, to minimise the risk of introducing bias through factors other than the intervention.
Were patient groups balanced for important prognostic factors?
The distribution of important prognostic factors between patients in study arms is shown in a table of patient characteristics, often ‘Table 1’. Prognostic factors may be unevenly distributed by chance alone. Assess whether imbalance may explain the direction of study results.
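One way to put a number on baseline imbalance, offered here as an illustration and not as a method this module prescribes, is the standardised mean difference for a continuous prognostic factor. The ages below are invented, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def standardised_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Hypothetical ages (years) in two small study arms
ages_treatment = [60, 62, 64, 66, 68]
ages_control = [64, 66, 68, 70, 72]

print(round(standardised_mean_difference(ages_treatment, ages_control), 2))  # -1.26
```

Values near zero suggest the arms are similar for that factor. A large value, as in this invented example where the treatment arm is noticeably younger, would raise the question of whether the imbalance, and not the intervention, explains the result.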
Do the authors draw conclusions that are supported by the data?
Conclusions in the abstract and discussion may be biased by the authors’ preconceived ideas about the results.
You should ensure that the results and the conclusions are concordant. Have the authors discussed and critically analysed other work that both supports and contradicts their findings, or have they focused on one aspect only?