Regional Information Center for Science and Technology - Using propensity scores to control for confounding: the influence of causal models on the validity of effect estimates

Abstract:

Purpose: The propensity score, a data-based approach used to control for confounding, has been increasingly used in large-scale datasets. Recent developments in epidemiologic methods, however, suggest that data-based approaches to confounding that lack conceptual grounding may lead to less valid estimates. We examined whether the underlying causal scenario influences the extent to which the propensity score improves the validity of effect estimates.

Methods: Hypothetical cohorts were simulated for two alternative causal scenarios of the relationship between prescription ibuprofen and death. To investigate the effect of the causal model on the validity of the propensity score-adjusted effect estimate, we specified the relationships among several hypothetical causes of death in these models and varied the prevalence of each cause. Each cohort was analyzed using logistic and Poisson regression to derive odds ratios and risk ratios (RRs), respectively, to calculate the effect estimates and propensity scores.

Results: Statistical analyses of these simulations demonstrate that propensity scores derived from logistic regression may be problematic when the drug treatment of interest is common. In addition, mediators of the drug-disease process are often disguised as confounders and thus may be inadvertently included in the propensity score, leading to less valid estimates. Finally, propensity scores may provide invalid estimates when single variables are included in the composite score without consideration of the joint effects among the variables. Under the most extreme scenario, the propensity score-adjusted estimate (RRprop) reversed the direction of the true RR (RRtrue) (e.g., RRtrue = 1.72 vs. RRprop = 0.65).

Conclusion: Under realistic causal models, propensity scores may lead to less valid effect estimates because they rely solely on data to identify confounding. Hypothesizing the causal model prior to generating propensity scores for an outcome of interest may lead to more valid estimates.
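
The mediator problem described in the results can be illustrated with a small sketch. It is not the authors' simulation: instead of simulated cohorts analyzed with logistic and Poisson regression, it uses expected cell counts and Mantel-Haenszel stratification. All parameter values and names (`P_TREAT`, `P_MED`, `BASE_RISK`, `RR_MED`) are hypothetical, and the scenario assumes a treatment T that affects death only through a single binary mediator M, with no confounding. Because a propensity score built from one binary covariate takes only two values, stratifying on that score is the same as stratifying on M.

```python
from itertools import product

# Hypothetical parameters (assumptions for illustration, not values from the study).
N = 100_000                # cohort size
P_TREAT = 0.5              # prevalence of treatment T
P_MED = {1: 0.6, 0: 0.2}   # P(M=1 | T): treatment raises the mediator M
BASE_RISK = 0.05           # risk of death when M=0
RR_MED = 3.0               # effect of M on death; T has no direct effect


def expected_cohort():
    """Return expected (persons, deaths) for each (t, m) cell."""
    cells = {}
    for t, m in product((0, 1), repeat=2):
        p_t = P_TREAT if t else 1 - P_TREAT
        p_m = P_MED[t] if m else 1 - P_MED[t]
        persons = N * p_t * p_m
        risk = BASE_RISK * (RR_MED if m else 1.0)
        cells[(t, m)] = (persons, persons * risk)
    return cells


def crude_rr(cells):
    """Unadjusted risk ratio; equals the true total effect here (no confounding)."""
    risk = {}
    for t in (0, 1):
        persons = sum(cells[(t, m)][0] for m in (0, 1))
        deaths = sum(cells[(t, m)][1] for m in (0, 1))
        risk[t] = deaths / persons
    return risk[1] / risk[0]


def propensity(cells, m):
    """Propensity score P(T=1 | M=m) computed from the cell counts."""
    treated = cells[(1, m)][0]
    return treated / (treated + cells[(0, m)][0])


def ps_stratified_rr(cells):
    """Mantel-Haenszel risk ratio across propensity-score strata.

    Each value of M defines exactly one propensity-score stratum.
    """
    num = den = 0.0
    for m in (0, 1):
        n1, a = cells[(1, m)]   # treated persons, treated deaths
        n0, c = cells[(0, m)]   # untreated persons, untreated deaths
        total = n1 + n0
        num += a * n0 / total
        den += c * n1 / total
    return num / den


if __name__ == "__main__":
    cohort = expected_cohort()
    print("propensity by M:", {m: round(propensity(cohort, m), 3) for m in (0, 1)})
    print("crude (true) RR:", round(crude_rr(cohort), 2))
    print("PS-stratified RR:", round(ps_stratified_rr(cohort), 2))
```

Under these assumed values the crude RR is about 1.57, which is the true total effect because nothing confounds T and death, while the RR stratified on the mediator-based propensity score is about 1.00. Adjusting for a mediator mistaken for a confounder removes the very effect being estimated, which is why the causal model needs to be hypothesized before the score is built.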