Balancing Tradeoffs in Causal Inference Research Approaches to Advance Racial Equity



Historically, research funded through Evidence for Action (E4A) has focused on evaluating the impact of interventions on health outcomes. We’ve expanded the type of research we fund through our racial equity call for proposals (CFP), but we’re still especially interested in whether and how programs, policies, and institutional practices affect racial equity and health and well-being. Studies addressing these questions face inherent tradeoffs, which may become even more pronounced when conducting research focused on advancing racial equity. 

Possible Approaches

An ideal study of how a social policy or initiative affects racial health equity would start with one or more questions relevant to the priorities of the people affected and use high-quality measurements that reflect those questions. It would include large, representative samples sufficient to precisely estimate effects in diverse groups of people, including potential heterogeneity in effects across intersectionally defined groups (e.g., Native American women over the age of 65).

What would a perfect study look like? An ideal study would offer a rigorous approach to causal identification, based on randomization or quasi-randomization, with questions and measurements capturing the experiences of people impacted by the issue. The evidence would be strong enough that people would be convinced by the findings, regardless of their prior beliefs about the effect of the intervention. In practice, such ideal studies are rare because of tradeoffs in both cost and logistical feasibility. Serving the multiple goals of rigorous causal identification, good measurement, nuanced reflections of the lived experiences of diverse participants, and adequate statistical power is challenging. For example, randomized controlled trials almost never include population-representative samples, and many trials fail to recruit large or diverse samples.

Statistical power is often best achieved by using large administrative datasets, which rarely include high-quality, tailored measures. Some administrative datasets have near-complete population coverage and can therefore be informative for even relatively small subgroups. Measurement quality, however, is usually higher in smaller, more intensive or focused studies. Experimental or quasi-experimental designs that draw data from large administrative sources are likely to provide accurate information about how some people are affected and offer a clear path to scale-up, but these approaches often rely on a rigid definition of the “intervention,” providing little guidance on how to improve the intervention or its implementation. Using administrative data sources typically requires sacrificing nuance about how and why different people are affected.

Some of these historical research dilemmas become even more complicated when designing studies to center racial equity. Many administrative datasets lack population coverage and racially and ethnically marginalized groups are missing or systematically underrepresented. Recruitment-based research studies that did not prioritize a specific, intersectional group rarely have sufficient information for research on that group. For example, a study seeking to understand the impact of an intervention on individuals identifying as Black trans youth or Latinx LGBTIQA+ is unlikely to be adequately powered without targeted recruitment. High-quality race and ethnicity data is not always collected or available in secondary data sources, and other measures of key social risk factors and resources are even less frequently collected. Even common data collection instruments may not be validated for all populations. Primary data collection can overcome many of these issues, but it is resource intensive and vulnerable to selection bias.
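
To make the point about statistical power concrete, here is a rough sketch of the arithmetic, not a method E4A prescribes. It uses the standard normal-approximation formula for comparing two group means. The effect size (0.3 standard deviations), the 2% subgroup share, and the function names `n_per_arm` and `total_to_recruit` are all hypothetical values chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to detect a standardized
    mean difference with a two-sided test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def total_to_recruit(n_subgroup, subgroup_share):
    """Total recruits per arm needed for the subgroup to reach n_subgroup
    if the subgroup makes up subgroup_share of an untargeted sample."""
    return ceil(n_subgroup / subgroup_share)


# Hypothetical inputs: effect of 0.3 SD, subgroup is 2% of the sample.
needed = n_per_arm(0.3)
print(needed, total_to_recruit(needed, 0.02))
```

Under these assumed inputs, a subgroup-specific estimate needs about 175 people per arm, which implies recruiting thousands of people per arm if the subgroup is reached only incidentally. Targeted recruitment shrinks that total toward the subgroup requirement itself.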

Putting Evidence into Practice

As will be explored in depth in future blog posts, we are optimistic that mixed-methods studies will help address the tensions among multiple goals. Qualitative inquiry can fill gaps in quantitative data, guide better quantitative studies, or explain quantitative findings. Incorporating more precise measurement or braiding qualitative inquiry into analyses of larger administrative datasets may yield studies with both good statistical power for precise estimates and meaningful interpretation of findings. In purely quantitative studies, triangulation approaches (for example, negative control outcomes or populations) can enhance the credibility of results. We also have high standards for ensuring representation in data sources and sampling strategies. Clear integration with prior theory and motivation based on specific hypothesized mechanisms of action helps ensure we learn the most from proposed studies, even when results are not as expected.

Our mandate at E4A is to fund the best feasible research that can yield action to improve population health, wellbeing, and racial equity. This means we take both these strengths and limitations into consideration when making funding decisions. We recognize that a “perfect” study may not be possible, and we try not to let the perfect be the enemy of the good. 

Tools & Resources

All data sources entail tradeoffs. The strongest designs for causal inference are RCTs or natural or quasi-experiments. The strongest for statistical power (sample size and efficiency) are volunteer cohorts or administrative or medical record data. The strongest for generalizability and diversity (sampling frame) are population-representative cohorts and population-linked administrative data. The strongest for measurement quality are clinic-based studies, qualitative research, or small, intensive quantitative projects.



About the author(s)

Maria Glymour, ScD, MS, is an Associate Director for Evidence for Action (E4A). She also leads the E4A Methods Laboratory.

Erin Hagan, PhD, MBA, is the deputy director of Evidence for Action. 
