Process tracing
Macartan
The big picture: Intuition
Simple insight: If you believe this model, then seeing \(M\) should tell you something about a query regarding the \(I\), \(D\) relationship in a case.
For instance, we might have the intuition: If there was no mobilization in a high-inequality case that democratized, then inequality didn’t cause the transition.
But how do we formalize this strategy?
Key formal insight:
If you believe this model, then seeing \(M\) should tell you something about the \(\theta\)s—which are what define the effect of \(I\) on \(D\).
We can tell when a piece of evidence could matter
Three clues for figuring out the effect of \(X\) on \(Y\)
Alan
Let’s walk through the intuition with a case
We start with:
What we don’t know is whether any of those conditions caused the outcome
So:
Suppose we observe democratization as the outcome in Malawi: \(D=1\)
We also observe high inequality in Malawi: \(I=1\)
We want to know: did \(I=1\) cause \(D=1\) in Malawi?
Suppose we go to the field and we learn that mass mobilization DID occur in Malawi
What can we conclude?
NOTHING YET!
We knew \(I=1\), \(D=1\)
We then saw \(M=1\)
But which is more consistent with \(I=1\) causing \(D=1\): \(M=1\) or \(M=0\)?
The evidence in process tracing never speaks for itself
Our inferences from PT evidence always depend on theory
Most process tracing is either silent about these beliefs or expresses them informally
We can formalize these beliefs
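A minimal sketch of what formalizing these beliefs can look like, using Bayes' rule. All numbers are hypothetical and chosen only for illustration; `posterior()` is a helper defined here, not part of any package. It shows that the same observation, \(M=1\), can raise, lower, or leave unchanged our belief that \(I=1\) caused \(D=1\), depending on the theory we bring to it.

```r
# Hypothetical numbers throughout: one prior and three rival theories about
# how likely mobilization (M = 1) is when inequality did or did not cause
# democratization.
posterior <- function(prior, p_m_if_caused, p_m_if_not) {
  # Bayes' rule for the hypothesis "I = 1 caused D = 1" after observing M = 1
  prior * p_m_if_caused /
    (prior * p_m_if_caused + (1 - prior) * p_m_if_not)
}

prior <- 0.5

# Theory A (assumed): mobilization is more likely when inequality is the cause
post_a <- posterior(prior, p_m_if_caused = 0.9, p_m_if_not = 0.3)

# Theory B (assumed): mobilization is less likely when inequality is the cause
post_b <- posterior(prior, p_m_if_caused = 0.3, p_m_if_not = 0.9)

# Theory C (assumed): mobilization is equally likely either way
post_c <- posterior(prior, p_m_if_caused = 0.5, p_m_if_not = 0.5)

print(c(A = post_a, B = post_b, C = post_c))
```

Under Theory A the belief rises above the prior, under Theory B it falls below it, and under Theory C it does not move at all. The evidence is identical in all three runs; only the theory differs.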
Macartan
for process tracing with causal models
Figure 1: Logic of simple updating on arbitrary queries.
Our DAG is:
\[X \rightarrow M \rightarrow Y\]
And we believe:
What are the types? How likely is each one? How likely is each given the data?
| Type | X | M | Y | prob | Query? | Data ? |
|---|---|---|---|---|---|---|
| X = 0, X causes M, M causes Y | 0 | 0 | 0 | 1/8 | ✓ | ✓ |
| X = 0, X causes M, M does not cause Y | 0 | 0 | 0 | 1/8 | ✓ | |
| X = 0, X does not cause M, M causes Y | 0 | 0 | 0 | 1/8 | ✓ | |
| X = 0, X does not cause M, M does not cause Y | 0 | 0 | 0 | 1/8 | ✓ | |
| X = 1, X causes M, M causes Y | 1 | 1 | 1 | 1/8 | | |
| X = 1, X causes M, M does not cause Y | 1 | 1 | 0 | 1/8 | | |
| X = 1, X does not cause M, M causes Y | 1 | 0 | 0 | 1/8 | | |
| X = 1, X does not cause M, M does not cause Y | 1 | 0 | 0 | 1/8 | | |
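The three questions above can be combined into a single updating rule. The notation here is an assumption introduced for illustration, not taken from the slides: let \(p(\theta)\) be the prior on causal type \(\theta\), \(Q\) the set of types for which the query holds, and \(D\) the set of types consistent with the observed data. Then

\[\Pr(Q \mid D) = \frac{\sum_{\theta \in Q \cap D} p(\theta)}{\sum_{\theta \in D} p(\theta)}\]

For example, the four \(X=0\) types in the table all imply \(X=0, M=0, Y=0\), and each has prior \(1/8\). Only one of them has \(X\) causing \(M\) and \(M\) causing \(Y\), so the probability that \(X\) causes \(Y\) given those data is \((1/8)/(4/8) = 1/4\).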
CausalQueries: Step 1. Define a model
CausalQueries: Step 2. Get types consistent with query
CausalQueries: Step 3. Get mapping from causal types to consistent data types
CausalQueries: Step 4. Get prior probabilities of each causal type
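library(CausalQueries)
library(dplyr)
library(knitr)

# Assumes `model` and `query` were defined in Steps 1 and 2
model |>
  grab(what = "ambiguities_matrix") |>  # causal types (rows) by data types (columns)
  data.frame() |>
  mutate(
    in_query = get_query_types(model, query)$types,    # is each type in the query?
    priors = CausalQueries:::get_type_prob(model)) |>  # prior probability of each type
  kable()

| | X0M0Y0 | X1M0Y0 | X1M1Y0 | X1M1Y1 | in_query | priors |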
|---|---|---|---|---|---|---|
| X0M00Y00 | 1 | 0 | 0 | 0 | FALSE | 0.125 |
| X1M00Y00 | 0 | 1 | 0 | 0 | FALSE | 0.125 |
| X0M01Y00 | 1 | 0 | 0 | 0 | FALSE | 0.125 |
| X1M01Y00 | 0 | 0 | 1 | 0 | FALSE | 0.125 |
| X0M00Y01 | 1 | 0 | 0 | 0 | FALSE | 0.125 |
| X1M00Y01 | 0 | 1 | 0 | 0 | FALSE | 0.125 |
| X0M01Y01 | 1 | 0 | 0 | 0 | TRUE | 0.125 |
| X1M01Y01 | 0 | 0 | 0 | 1 | TRUE | 0.125 |
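The table above holds everything needed for updating. As a rough base-R sketch (it does not call CausalQueries; the matrix, `in_query`, and `priors` are copied by hand from the table), the posterior on the query for each possible data pattern is the prior mass on types consistent with both the data and the query, divided by the prior mass on types consistent with the data.

```r
# Ambiguities matrix copied from the table above:
# rows are causal types, columns are the data types each one produces.
types <- c("X0M00Y00", "X1M00Y00", "X0M01Y00", "X1M01Y00",
           "X0M00Y01", "X1M00Y01", "X0M01Y01", "X1M01Y01")
A <- matrix(c(1, 0, 0, 0,
              0, 1, 0, 0,
              1, 0, 0, 0,
              0, 0, 1, 0,
              1, 0, 0, 0,
              0, 1, 0, 0,
              1, 0, 0, 0,
              0, 0, 0, 1),
            ncol = 4, byrow = TRUE,
            dimnames = list(types, c("X0M0Y0", "X1M0Y0", "X1M1Y0", "X1M1Y1")))
in_query <- c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)
priors   <- rep(0.125, 8)

# Numerator: prior mass on types consistent with the data AND the query.
# Denominator: prior mass on types consistent with the data.
posterior <- colSums(A * priors * in_query) / colSums(A * priors)
print(posterior)
```

With these inputs, observing \(X=0, M=0, Y=0\) leaves a posterior of 0.25 that \(X\) causes \(Y\), observing \(X=1, M=1, Y=1\) gives 1, and the other two data patterns give 0.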
CausalQueries: Three clues for figuring out the effect of \(X\) on \(Y\)
observed <- c("Y==1 & X==1",
              "Y==1 & X==1 & K1==1",
              "Y==1 & X==1 & K2==1",
              "Y==1 & X==1 & K3==1",
              "Y==1 & X==1 & K1==1 & K2==1 & K3==1")

query_model(model = model,
            query = "Y[X=1] > Y[X=0]",
            given = observed)
Causal queries generated by query_model (all at population level)
|label |using | mean|
|:---------------------------------------------------------|:----------|----:|
|Y[X=1] > Y[X=0] given Y==1 & X==1 |parameters | 0.25|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K1==1 |parameters | 0.25|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K2==1 |parameters | 0.25|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K3==1 |parameters | 0.25|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K1==1 & K2==1 & K3==1 |parameters | 0.25|
model |>
  set_restrictions(decreasing("X", "K1")) |>
  set_restrictions(decreasing("K1", "Y")) |>
  set_restrictions(decreasing("K2", "Y")) |>
  set_restrictions(decreasing("Y", "K3")) |>
  set_parameters(given = "Y.0001", nodal_type = "11", parameters = 0.9) |>
  query_model(
    query = "Y[X=1] > Y[X=0]",
    given = observed)
Causal queries generated by query_model (all at population level)
|label |using | mean|
|:---------------------------------------------------------|:----------|-----:|
|Y[X=1] > Y[X=0] given Y==1 & X==1 |parameters | 0.200|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K1==1 |parameters | 0.250|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K2==1 |parameters | 0.154|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K3==1 |parameters | 0.212|
|Y[X=1] > Y[X=0] given Y==1 & X==1 & K1==1 & K2==1 & K3==1 |parameters | 0.224|
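To read the table above, it can help to express each row as a shift from the first row, which conditions only on \(X=1, Y=1\). The sketch below is plain arithmetic on the reported means (copied by hand; the short labels are introduced here for convenience).

```r
# Posterior means copied from the restricted-model output above
post <- c(none = 0.200, K1 = 0.250, K2 = 0.154, K3 = 0.212, all = 0.224)

# Change in belief relative to conditioning on X = 1, Y = 1 alone
shift <- round(post[-1] - post["none"], 3)
print(shift)
```

In the restricted model, observing \(K1=1\) or \(K3=1\) raises the probability that \(X\) caused \(Y\), while observing \(K2=1\) lowers it. In the unrestricted model every row is 0.25, so the same clues produce no shift at all.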
CausalQueries: Also try our shiny app
Key advantages of using a causal model:
Alan
Could also come from: