What is a sensitivity analysis?

Randomized clinical trials are a tool to generate high-quality evidence of efficacy and safety for new interventions. The statistical analysis plan (SAP) of a trial is generally pre-specified and documented prior to seeing outcome data, and it is encouraged that researchers follow the pre-specified analysis plan. The process of pre-specification of the primary analysis involves making assumptions about methods, models, and data that may not be supported by the final trial data. Sensitivity analysis examines the robustness of the result by conducting the analyses under a range of plausible assumptions about the methods, models, or data that differ from the assumptions used in the pre-specified primary analysis. If the results of the sensitivity analyses are consistent with the primary results, researchers can be confident that the assumptions made for the primary analysis have had little impact on the results, giving strength to the trial findings. Recent guidance documents for statistical principles have emphasized the importance of sensitivity analysis in clinical trials to ensure a robust assessment of the observed results [1].

When is a sensitivity analysis valid?

While the importance of conducting sensitivity analysis has been widely acknowledged, what constitutes a valid sensitivity analysis has been unclear. To address this ambiguity, Morris et al. proposed a framework for conducting such analyses [2]. They suggest that an analysis can be classified as a sensitivity analysis if it meets the following criteria: (1) the proposed analysis aims to answer the same question as the primary analysis, (2) there is a possibility that the proposed analysis will lead to conclusions that differ from those of the primary analysis, and (3) there would be uncertainty as to which analysis to believe if the proposed analysis led to different conclusions than the primary analysis. These criteria can guide the conduct of sensitivity analyses and indicate what to consider when interpreting them.

Criterion 1: do the sensitivity and primary analyses answer the same question?

The first criterion aims to ascertain whether the two analyses answer the same question. If an analysis addresses a different question from the primary analysis, it should be referred to as a supplementary (or secondary) analysis. This may seem obvious, but it is important to consider: if the questions being answered are different, a difference in results could lead to unwarranted uncertainty regarding the robustness of the primary conclusions.

This misconception is commonly observed in trials where a primary analysis according to the intention-to-treat (ITT) principle is followed by a per-protocol (PP) analysis, which many consider a sensitivity analysis. The ITT analysis considers the effect of a decision to treat regardless of whether the treatment was received, while the PP analysis considers the effect of actually receiving treatment as intended. While the results of the PP analysis may be of value to certain stakeholders, the PP analysis is not a sensitivity analysis to a primary ITT analysis. Because the analyses address two distinct questions, it would not be surprising if the results differ. However, failure to appreciate that they ask different questions could lead to confusion over the robustness of the primary conclusions.
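To make the distinction concrete, the following minimal sketch computes both quantities on a small, entirely hypothetical dataset. The records, arm labels, and function names are illustrative assumptions rather than data or methods from any trial discussed here, and the per-protocol calculation is deliberately naive (it simply drops participants who did not receive treatment as assigned).

```python
from statistics import mean

# Hypothetical records: (assigned_arm, received_as_assigned, outcome).
records = [
    ("treatment", True, 10.0), ("treatment", True, 12.0),
    ("treatment", True, 11.0), ("treatment", True, 11.0),
    ("treatment", False, 5.0), ("treatment", False, 5.0),
    ("control", True, 5.0), ("control", True, 7.0),
    ("control", True, 6.0), ("control", True, 6.0),
    ("control", True, 5.0), ("control", True, 7.0),
]


def arm_mean(rows, arm):
    """Mean outcome among rows assigned to the given arm."""
    return mean([outcome for assigned, _, outcome in rows if assigned == arm])


def mean_difference(rows):
    """Treatment minus control difference in mean outcome."""
    return arm_mean(rows, "treatment") - arm_mean(rows, "control")


# ITT: everyone is analysed in the arm they were randomized to.
itt_difference = mean_difference(records)

# Naive PP: keep only participants who received treatment as assigned.
per_protocol_rows = [row for row in records if row[1]]
pp_difference = mean_difference(per_protocol_rows)

print(f"ITT difference: {itt_difference:.1f}")
print(f"PP difference:  {pp_difference:.1f}")
```

The two numbers differ because they describe different effects, not because the primary result is fragile, which is why the second analysis is better labelled supplementary.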

Criterion 2: could the sensitivity analysis yield different results than the primary analysis?

The second criterion relates to the assumptions made for the sensitivity analysis: if these assumptions will always lead to conclusions equivalent to those of the primary analysis, then we have learned nothing about the true sensitivity of the trial conclusions. Thus, a sensitivity analysis must be designed so that its findings could plausibly differ from those of the primary analysis.

Consider the sensitivity analysis used in the LEAVO trial, which assessed the effect of aflibercept and bevacizumab versus ranibizumab for patients with macular oedema secondary to central retinal vein occlusion [3]. The primary outcome of this study was the change from baseline in best-corrected visual acuity (BCVA) for aflibercept, or bevacizumab, versus ranibizumab. At the end of the study, the BCVA score was missing for some patients. To impute the missing data, the investigators considered a range of values (from −20 to 20) as assumed values for the mean difference in BCVA scores between patients with observed and missing data. This criterion would not have been met if only a mean difference of 0 had been used to impute BCVA scores for the missing patients, as this would be equivalent to re-running the primary analysis and would lead to the same conclusions. It would give a misleading impression of robustness, because the “sensitivity” analysis would not actually fulfill the criterion required to be labeled as such.

On the other hand, modifying the assumptions to differ from the primary analysis by varying the mean difference from −20 to 20 provides a useful assessment of the sensitivity of the primary analysis across a range of values that the missing participants may have had. One could reasonably postulate that assuming a mean difference in BCVA scores anywhere from −20 to 20 when imputing missing data could affect the primary analysis findings, as these values span what one might consider the “worst” and “best” case scenarios for participants with missing data. In the LEAVO trial, the authors demonstrated that, under these scenarios, the results of the sensitivity analysis support the primary conclusions of the trial.
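The idea of shifting the imputed values by an assumed difference can be sketched as follows. This is a simplified illustration under stated assumptions, not the LEAVO analysis itself: the data are hypothetical, each missing value is imputed as the observed arm mean plus a shift `delta`, the shift is applied to the treated arm only, and only the point estimate is computed (a real analysis would also propagate imputation uncertainty, for example through multiple imputation).

```python
from statistics import mean


def arm_mean_under_delta(observed, n_missing, delta):
    """Arm mean when each missing value is imputed as (observed mean + delta)."""
    observed_mean = mean(observed)
    n_total = len(observed) + n_missing
    imputed_sum = n_missing * (observed_mean + delta)
    return (sum(observed) + imputed_sum) / n_total


def difference_under_delta(treated, n_missing_treated,
                           control, n_missing_control,
                           delta_treated, delta_control=0.0):
    """Treated minus control mean difference under the assumed shifts."""
    treated_mean = arm_mean_under_delta(treated, n_missing_treated, delta_treated)
    control_mean = arm_mean_under_delta(control, n_missing_control, delta_control)
    return treated_mean - control_mean


# Hypothetical observed changes in BCVA score, with 2 missing values per arm.
treated = [12.0, 8.0, 15.0, 10.0, 5.0, 14.0, 9.0, 11.0]
control = [10.0, 7.0, 12.0, 9.0, 6.0, 13.0, 8.0, 11.0]

# Vary the assumed shift for missing treated participants from -20 to 20.
results = {
    delta: difference_under_delta(treated, 2, control, 2, delta)
    for delta in range(-20, 21, 5)
}

for delta, difference in results.items():
    print(f"delta = {delta:+3d}: estimated difference = {difference:+.2f}")
```

In this sketch, `delta = 0` reproduces the estimate obtained by treating missing values as similar to observed ones, so it carries no new information, whereas the extreme shifts can move or even reverse the estimate. That is the property Criterion 2 asks for.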

Criterion 3: what should one believe if the sensitivity and primary analyses differ?

The third criterion assesses whether there would be uncertainty as to which analysis to believe if the proposed analysis led to a different conclusion than the primary analysis. If one analysis will always be believed over the other, then the analysis that will not be believed is not worth performing, as it cannot change our understanding of the outcome. Consider a trial in which individuals are randomized to intervention or control, and the primary outcome is measured for each eye. Because the results from the two eyes of a given patient are not independent, if researchers perform analyses both accounting for and not accounting for this dependence, it is clear that the analysis accounting for the dependence will be preferred. This is not a proper sensitivity analysis. In this situation, the analysis accounting for the dependence should be the primary analysis, and the analysis not accounting for the dependence should either not be performed or be designated a secondary outcome.
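The following sketch illustrates why the analysis that ignores the dependence would not be believed. The measurements are hypothetical, and averaging the two eyes within each patient is just one simple way to respect the dependence, chosen here for brevity; mixed-effects models or generalized estimating equations are common alternatives.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical (left eye, right eye) outcomes per patient.
treated_patients = [(10, 11), (14, 13), (6, 7), (12, 12), (8, 9)]
control_patients = [(7, 8), (11, 10), (3, 4), (9, 9), (5, 6)]


def difference_and_se(treated_values, control_values):
    """Difference in means and its standard error, treating values as independent."""
    difference = mean(treated_values) - mean(control_values)
    se = sqrt(variance(treated_values) / len(treated_values)
              + variance(control_values) / len(control_values))
    return difference, se


# Naive analysis: every eye is treated as an independent observation.
treated_eyes = [eye for patient in treated_patients for eye in patient]
control_eyes = [eye for patient in control_patients for eye in patient]
naive_diff, naive_se = difference_and_se(treated_eyes, control_eyes)

# Dependence-aware analysis: one averaged value per patient.
treated_means = [mean(patient) for patient in treated_patients]
control_means = [mean(patient) for patient in control_patients]
cluster_diff, cluster_se = difference_and_se(treated_means, control_means)

print(f"Naive:         difference = {naive_diff:.2f}, SE = {naive_se:.2f}")
print(f"Patient-level: difference = {cluster_diff:.2f}, SE = {cluster_se:.2f}")
```

With these data the point estimates coincide, but the naive standard error is smaller because it counts correlated eyes as independent information. Since the overly optimistic analysis would never be preferred if the two disagreed, it does not qualify as a sensitivity analysis.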

Conclusions

Sensitivity analyses are important for assessing the robustness of a trial's conclusions. It is critical to distinguish between sensitivity analyses and supplementary or other analyses, and the three criteria above can inform an understanding of what constitutes a sensitivity analysis. Sensitivity analyses are often underreported in trial publications, making it difficult to assess whether appropriate sensitivity analyses were performed. We recommend that sensitivity analysis be considered a key part of any clinical trial SAP and be consistently and clearly reported with trial outcomes.