Introduction
Missing data is a critical issue in clinical trials, particularly in confirmatory studies where the accuracy of results can directly influence regulatory and clinical decisions. No single method for handling missing data is universally applicable, and the choice of approach can affect the conclusions of a trial. Because missing data can bias the results, efforts should be made to minimize it. For instance, if patients who respond poorly to the experimental treatment are more likely to drop out, an analysis of the remaining patients may overstate the treatment effect.
It is crucial to investigate the robustness of trial results by employing appropriate sensitivity analyses, which incorporate different assumptions about the nature of the missing data. Patients may withdraw from a trial for many reasons, including adverse events, treatment success or failure, or other unrelated circumstances like relocation. When data is missing because patients withdraw from a trial, it poses a significant risk of bias in estimating the treatment effect.
Handling of Missing Data
Missing data mechanisms are generally classified into three types:
- Missing Completely at Random (MCAR): The probability of missing data is unrelated to any observed or unobserved data. A typical example is a patient moving to another city for non-health reasons. Patients who drop out of a study for this reason could be considered a random and representative sample from the total study population.
- Missing at Random (MAR): The probability of missing data depends on observed data but, given those observed data, not on the unobserved data. For instance, when a patient drops out due to lack of efficacy that is reflected in a series of observed poor efficacy outcomes, it would be appropriate to impute or model poor efficacy outcomes for this patient's subsequent visits.
- Missing Not at Random (MNAR): The probability of missing data depends on the unobserved outcomes, even after accounting for the observed data, which makes it more challenging to handle. For example, a patient may drop out due to lack of efficacy after a series of visits with good outcomes, so the observed record gives no indication of the deterioration that led to the dropout.
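The three mechanisms above can be illustrated with a small simulation. This is a minimal sketch under invented assumptions: the pain-score scale, the distributions, the dropout probabilities, and the function names `logistic` and `observed_week12` are all hypothetical and chosen only for illustration. The true mean week-12 score is 50, and higher scores mean worse pain.

```python
import math
import random
import statistics


def logistic(x):
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def observed_week12(mechanism, n=20000, seed=0):
    """Simulate pain scores and return the week-12 values that remain observed.

    Hypothetical setup: the week-4 score is always observed, while the
    week-12 score may be missing according to the chosen mechanism.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        week4 = rng.gauss(50, 10)
        week12 = week4 + rng.gauss(0, 5)
        if mechanism == "MCAR":
            # Missingness is unrelated to any outcome.
            p_missing = 0.3
        elif mechanism == "MAR":
            # Missingness depends only on the observed week-4 score.
            p_missing = logistic((week4 - 50) / 5)
        elif mechanism == "MNAR":
            # Missingness depends on the week-12 score that goes unobserved.
            p_missing = logistic((week12 - 50) / 5)
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        if rng.random() >= p_missing:
            kept.append(week12)
    return kept


# Mean of the week-12 scores that remain observed under each mechanism.
means = {m: statistics.fmean(observed_week12(m)) for m in ("MCAR", "MAR", "MNAR")}
```

In this sketch the observed mean stays close to 50 under MCAR, while under MAR and MNAR it is pulled below 50 because patients with worse pain are more likely to drop out. Under MAR the distortion can in principle be corrected using the observed week-4 scores; under MNAR the observed data alone are not sufficient.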
Methods of Handling Missing Data
In confirmatory trials, the primary analysis should be chosen so that it is unlikely to be biased in favor of the experimental treatment. Various imputation techniques can be used to handle missing data, each resting on different assumptions.
Single imputation methods, like Last Observation Carried Forward (LOCF), use the last recorded value for a patient to fill in subsequent missing data. Although simple, LOCF assumes that the outcome stays constant after withdrawal, which can lead to biased estimates, particularly if patients withdraw early due to adverse events. Similarly, Baseline Observation Carried Forward (BOCF) assumes that a patient’s condition reverts to baseline after withdrawal; this can be plausible in scenarios like chronic pain trials, where symptoms may return to baseline after discontinuation.
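As a minimal sketch of the two carry-forward rules, assume each patient's record is a list of visit values ordered in time, with the baseline first and `None` marking a missing visit. The function names `locf` and `bocf` and the example values are hypothetical.

```python
def locf(visits):
    """Last Observation Carried Forward: fill each missing value (None)
    with the most recent observed value before it."""
    filled = []
    last_seen = None
    for value in visits:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


def bocf(visits):
    """Baseline Observation Carried Forward: fill each missing value (None)
    with the baseline value, taken here to be the first element."""
    baseline = visits[0]
    return [baseline if value is None else value for value in visits]


# Hypothetical pain scores at baseline and weeks 4, 8, and 12;
# the patient withdrew after week 4.
pain = [7.0, 5.0, None, None]
pain_locf = locf(pain)  # [7.0, 5.0, 5.0, 5.0]
pain_bocf = bocf(pain)  # [7.0, 5.0, 7.0, 7.0]
```

For this patient LOCF keeps the week-4 improvement through week 12, while BOCF discards it, which shows how strongly the choice of rule can drive the imputed outcome.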
Other methods, like best/worst case imputation, replace missing data with either the best or worst possible outcomes, depending on the reason for withdrawal. This method, however, is often used in sensitivity analyses rather than as the primary approach.
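One way to use best/worst case imputation in a sensitivity analysis is to bound the treatment effect for a binary responder outcome. The sketch below assigns outcomes by treatment arm rather than by reason for withdrawal, which is a simplifying assumption of this example; the data and the name `response_rate` are invented for illustration.

```python
def response_rate(outcomes, fill):
    """Proportion of responders after replacing missing outcomes (None)
    with the given fill value (1 = responder, 0 = non-responder)."""
    filled = [fill if outcome is None else outcome for outcome in outcomes]
    return sum(filled) / len(filled)


# Hypothetical responder outcomes; None marks a patient who withdrew.
drug = [1, 1, 0, 1, None, None, 1, 0, 1, 1]
placebo = [0, 1, 0, None, 0, 1, 0, 0, None, 1]

# Scenario least favorable to the drug: its dropouts count as failures,
# placebo dropouts count as successes.
worst_for_drug = response_rate(drug, 0) - response_rate(placebo, 1)

# Scenario most favorable to the drug: the reverse assignment.
best_for_drug = response_rate(drug, 1) - response_rate(placebo, 0)
```

With these invented data the difference in response rates ranges from about 0.1 to about 0.5. If the conclusion holds even in the least favorable scenario, it is unlikely to be an artifact of the missing data.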
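Multiple imputation (MI) replaces each missing value with several plausible values, producing multiple completed datasets. Each dataset is analyzed separately, and the results are combined into a single estimate. MI is preferred over single imputation in many cases because it incorporates the uncertainty around the missing values, so standard errors are not understated as they tend to be when imputed values are treated as if they had been observed.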
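The combining step is commonly carried out with Rubin's rules. The sketch below covers only that pooling step and assumes the per-dataset estimates and their variances have already been obtained; the function name `pool_rubin` and the numbers are hypothetical.

```python
import math
import statistics


def pool_rubin(estimates, variances):
    """Pool estimates from m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)      # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical treatment-effect estimates (drug minus placebo) and their
# squared standard errors from five imputed datasets.
estimates = [-1.8, -2.1, -2.0, -1.7, -1.9]
variances = [0.25, 0.24, 0.26, 0.25, 0.25]

pooled_estimate, total_variance = pool_rubin(estimates, variances)
pooled_se = math.sqrt(total_variance)
```

The total variance exceeds the average within-imputation variance because of the between-imputation term. That extra term is how MI carries the uncertainty about the missing values into the final standard error.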
Mixed Models and Generalized Estimating Equations
When analyzing repeated measures data, methods like Mixed-Effects Models for Repeated Measures (MMRM) and generalized estimating equations (GEE) are commonly used. Both account for the correlation between repeated measurements on the same patient over time, but they rest on different missing-data assumptions: likelihood-based mixed models give valid inference under MAR, whereas standard (unweighted) GEE generally requires MCAR.
MMRM is frequently employed in clinical trials, especially when dealing with outcomes measured at multiple time points. It handles missing data without explicitly imputing values: all observed measurements contribute to the likelihood, and parameters are estimated by maximum likelihood or restricted maximum likelihood.
For example, consider a clinical trial comparing a drug to a placebo for patients with chronic pain, where pain levels are measured at baseline and at weeks 4, 8, and 12, and some patients miss visits. A simple mixed model for this setting is:
Y_ij = β0 + β1 (Drug vs. Placebo)_i + β2 (Time)_j + β3 (Drug × Time)_ij + b_i + ε_ij
Here Y_ij is the outcome for patient i at time j. The fixed effects (β1, β2, β3) represent the treatment, time, and their interaction; the random effect b_i captures differences between patients; and ε_ij is the residual error. In practice, MMRM analyses often treat time as a categorical visit effect and model the within-patient covariance directly (for example, with an unstructured covariance matrix) rather than through a random intercept.
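To make the role of the random effect concrete, the following sketch derives the within-patient correlation implied by the model above. It assumes, beyond what is stated in the text, that the random effect and the residual errors are independent, with b_i ~ N(0, σ_b²) and ε_ij ~ N(0, σ²); the symbols σ_b² and σ² are introduced here only for illustration.

```latex
\begin{align*}
\operatorname{Var}(Y_{ij}) &= \sigma_b^2 + \sigma^2, \\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \sigma_b^2 \quad (j \neq k), \\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.
\end{align*}
```

Under these assumptions every pair of visits for the same patient has the same correlation (compound symmetry). This shared correlation is what allows a patient's observed visits to inform the estimates for visits that the patient missed.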
MMRM incorporates all available data and the correlation between measurements over time, which can give a more accurate assessment of treatment effects than complete-case analysis when data is missing. For instance, if a patient misses a follow-up visit due to worsening symptoms, MMRM still uses the data from their previous assessments, so their experience contributes to the estimate of the drug's effectiveness. This way, even with incomplete data, it is possible to gain insights that simpler analysis methods might overlook. The validity of these estimates rests on the MAR assumption, so sensitivity analyses under other assumptions remain necessary.
European Medicines Agency. (2024). Guideline on missing data in confirmatory clinical trials. https://www.ema.europa.eu/en/documents/scientific-guideline/ich-e9-r1-addendum-Missing