Causal models for qualitative inference and mixed methods
Materials for a two-day introduction to causal models for process tracing and mixed methods
Day 1:
- Lecture 1: Causality and Causal Models
- Exercises 1: Make some causal models
- Lecture 2: Queries and inferences
- Exercises 2: Defining questions, Bayesian updating
Day 2:
Other helpers:
- Shiny app to do process tracing by hand
- 64 types interpretation guide
Handouts:
Lecture 1 + Exercise 1
After the first session you should have an understanding of:
- the limits of design-based inference
- the limits of non-explicit qualitative inference
- what a potential outcome is
- what a causal effect is (under the potential outcomes model)
- what a causal model is
- what a DAG is
- what an arrow or its absence means in a DAG
- what a “causal type” is
- how to construct a model in CausalQueries
Lecture 2 + Exercise 2
After the second session you should have an understanding of:
- the difference between case-level and population-level queries
- causal queries as questions about a case’s causal type (i.e., about the \(\theta\)s)
- how to write a query in CausalQueries
- what Bayes’ rule is and how it works for discrete and continuous queries
- what conditional independence is and how to read it from a causal graph
- how conditional independence relates to the informativeness of clues for understanding queries: we hope you will start being able to see queries and clues as nodes on a DAG and understand when a clue is informative for a theory
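For the discrete case, Bayes’ rule can be written out for a hypothesis \(H\) (for example, that X caused Y in a given case) and an observed clue \(K\). The numbers in the second line are illustrative assumptions only, not values from the course materials:

```latex
% Bayes' rule for a binary hypothesis H and an observed clue K
\[
\Pr(H \mid K) =
  \frac{\Pr(K \mid H)\,\Pr(H)}
       {\Pr(K \mid H)\,\Pr(H) + \Pr(K \mid \neg H)\,\Pr(\neg H)}
\]
% Assumed values: prior Pr(H) = 0.5, Pr(K | H) = 0.8, Pr(K | not H) = 0.2
\[
\Pr(H \mid K) =
  \frac{0.8 \times 0.5}{0.8 \times 0.5 + 0.2 \times 0.5}
  = \frac{0.4}{0.5} = 0.8
\]
```

Under these assumed values, observing the clue moves the belief in \(H\) from 0.5 to 0.8. A clue that is equally likely under \(H\) and \(\neg H\) would leave the prior unchanged.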
Lecture 3 + Exercise 3
After the third session you should have an understanding of:
- when process data are potentially informative about causal queries
- why clues are not just mediators: they can occupy different positions on a DAG
- why a model needs substantive assumptions, beyond the structural assumptions, to allow informative inferences from process data
- a procedure for drawing qualitative inferences
- the benefits of an explicit strategy for process tracing
- how to draw qualitative inferences in CausalQueries
Lecture 4 + Exercise 4
After the fourth session you should have an understanding of:
- what some key population queries are
- what the parameters of a mixed methods model are (the \(\lambda\)s)
- Bayesian approaches to updating on population queries
- what mixtures of quantitative and qualitative data look like
- why, fundamentally, updating procedures are the same whether you update on treatment-outcome data, process data, or mixed data
- how qualitative (process) data can help quantitative inferences
- how quantitative data can help qualitative inferences
- what data gathering you might use in practice in a mixed methods evaluation
- how to do all this in CausalQueries
What we have not covered but you can read more about in the book:
- how to assess how dependent your inferences are on your model
- how to assess whether your model is doing more harm than good!
Advance watching and reading
The causal models approach we’ll be teaching is fairly complex, and you will get much more out of the course if you prepare in advance by engaging with our book, Integrated Inferences.
First, watch these four introductory videos to get the general idea - they will make the reading a bit easier: https://integrated-inferences.github.io/videos.html
Then, read the following chapters from the open access pre-print: https://integrated-inferences.github.io/book/
Chapters 2, 4, 5, 7, 9
The text is fairly dense, so don’t let yourself get bogged down if there are things you don’t understand. Just keep going. There will be plenty of time for questions and further explanation in class, but it’s critical that you come in with this overview.
Software installation
In class, we will be doing hands-on exercises with CausalQueries, the software package that implements the approach. You will therefore need to have all necessary software installed before the start of class on the first day.
- Make sure you have an up-to-date installation of R
- Make sure you have an up-to-date installation of RStudio
- Install CausalQueries from CRAN, e.g. in RStudio via install.packages("CausalQueries")
- To check that the package is working, try to make and update a model like this: model <- make_model() |> update_model()
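A slightly fuller check, as a minimal sketch that assumes the installation above succeeded. The model statement `X -> Y` and the query are illustrative choices, not part of the course materials:

```r
# install.packages("CausalQueries")  # run once, then load the package
library(CausalQueries)

# A two-node model in which X may affect Y (illustrative choice)
model <- make_model("X -> Y")

# No data are supplied here, so updating simply draws from the prior
model <- update_model(model)

# Average effect of X on Y, evaluated with the updated model
query_model(model, queries = "Y[X=1] - Y[X=0]", using = "posteriors")
```

If this runs without errors and prints a table of query estimates, the package and its dependencies are set up for the exercises.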
Identify data (Optional)
During the class we will practice forming causal models and drawing inferences from them. We will work with simulated examples and some real examples that we used in our book. However, if you have an application in mind, we encourage you to identify data, even if imperfect, on which you can try this all out. Good data for this would have the following features:
- A key outcome variable you particularly care about
- One or two possible causes that you care about
- Data on at least some of the cases that capture either relevant features of context, or aspects of processes connecting causes and outcomes
- All variables can be plausibly dichotomized (measured as 0 or 1)
- Keep things simple: use a setting where each observation is independent, so avoid data with complex hierarchical, clustering, or time dependence features.
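As a rough sketch of what such data might look like once prepared, the base R snippet below dichotomizes a cause and an outcome and keeps a process clue that is observed for only some cases. The data frame, column names, values, and cutoffs are all hypothetical and chosen purely for illustration:

```r
# Hypothetical case data: every name and value here is made up
cases <- data.frame(
  case   = c("A", "B", "C", "D"),
  aid    = c(12.5, 0.0, 3.1, 40.2),  # possible cause, continuous
  growth = c(2.1, -0.4, 0.3, 3.8),   # outcome, continuous
  K      = c(1L, NA, NA, 0L)         # process clue, seen in two cases only
)

# Dichotomize with explicit cutoffs (assumed; choose yours on
# substantive grounds)
cases$X <- as.integer(cases$aid > 5)
cases$Y <- as.integer(cases$growth > 1)

# One row per independent case, all variables coded 0/1 (or NA)
cases[, c("case", "X", "K", "Y")]
```

Missing values in the clue column reflect the situation described above, where process data are available for only some of the cases.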