Adjust parameter matrix to allow confounding.

Usage

set_confound(model, confound = NULL)

Arguments

model

A causal_model. A model object generated by make_model.

confound

A list of statements indicating pairs of nodes whose types are jointly distributed (e.g. list("A <-> B", "C <-> D")).

Value

An object of class causal_model with updated parameters_df and parameter matrix.

Details

Confounding between X and Y arises when the nodal types for X and Y are not independently distributed. In the X -> Y graph, for instance, there are 2 nodal types for X and 4 for Y. There are thus 8 joint nodal types:


|     |     | t^X = 0            | t^X = 1            | Sum        |
|-----|-----|--------------------|--------------------|------------|
| t^Y | 00  | Pr(t^X=0 & t^Y=00) | Pr(t^X=1 & t^Y=00) | Pr(t^Y=00) |
|     | 10  | .                  | .                  | .          |
|     | 01  | .                  | .                  | .          |
|     | 11  | .                  | .                  | .          |
|     | Sum | Pr(t^X=0)          | Pr(t^X=1)          | 1          |

This table has 8 interior elements that must sum to 1, so an unconstrained joint distribution has 7 degrees of freedom. A no-confounding assumption means that Pr(t^X | t^Y) = Pr(t^X), or equivalently Pr(t^X, t^Y) = Pr(t^X)Pr(t^Y). In this case there are 3 degrees of freedom for Y and 1 for X, for a total of 4 rather than 7.
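
The count above can be checked with a small base R sketch. It builds the joint distribution under no confounding as the product of the two marginals. The probability values are made up for illustration and are not produced by the package.

```r
# Marginal distributions over nodal types (illustrative values only)
p_x <- c("0" = 0.4, "1" = 0.6)
p_y <- c("00" = 0.1, "10" = 0.2, "01" = 0.3, "11" = 0.4)

# Under no confounding the joint is the outer product of the marginals;
# rows index the types of Y, columns index the types of X
joint <- outer(p_y, p_x)

# The table margins recover the two marginal distributions
stopifnot(isTRUE(all.equal(rowSums(joint), p_y)))
stopifnot(isTRUE(all.equal(colSums(joint), p_x)))

# Degrees of freedom: each simplex loses one free parameter
df_independent   <- (length(p_x) - 1) + (length(p_y) - 1)  # 1 + 3
df_unconstrained <- length(joint) - 1                      # 8 - 1
```

Here `df_independent` is 4 and `df_unconstrained` is 7, which is the gap that confounding parameters have to fill.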

set_confound lets you relax this assumption by increasing the number of parameters that characterize the joint distribution. Using the fact that P(A,B) = P(A)P(B|A), new parameters are introduced to capture P(B|A=a) rather than simply P(B). In the X -> Y case, for instance, two parameters (with 1 degree of freedom) govern the distribution of types for X, and for each of the two types of X a separate set of four parameters (with 3 degrees of freedom) governs the types for Y given that type of X, for a total of 1+3+3 = 7 degrees of freedom.
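
The factorization can be sketched in base R as follows. This only illustrates the parameter count and is not how the package stores parameters internally. The probability values are hypothetical.

```r
# P(X): 2 parameters, 1 degree of freedom (illustrative values only)
p_x <- c("0" = 0.5, "1" = 0.5)

# P(Y | X = x): one column of 4 parameters per type of X,
# each column is its own simplex with 3 degrees of freedom
p_y_given_x <- cbind(
  "0" = c("00" = 0.1, "10" = 0.2, "01" = 0.3, "11" = 0.4),
  "1" = c("00" = 0.4, "10" = 0.3, "01" = 0.2, "11" = 0.1)
)

# Joint: P(X = x, Y = y) = P(X = x) * P(Y = y | X = x),
# obtained by scaling each column by the matching entry of p_x
joint <- sweep(p_y_given_x, 2, p_x, `*`)

# Parameter and degrees-of-freedom counts
n_params <- length(p_x) + length(p_y_given_x)            # 2 + 8
n_df <- (length(p_x) - 1) +
  ncol(p_y_given_x) * (nrow(p_y_given_x) - 1)            # 1 + 3 + 3
```

With these values `n_params` is 10, matching the ten parameter names listed for 'X -> Y; X <-> Y' in the Examples, and `n_df` is 7. The joint no longer equals the product of its margins, which is what confounding means here.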

Examples


make_model('X -> Y; X <-> Y') |>
  grab("parameters")
#> Model parameters with associated probabilities: 
#> 
#> X.0 X.1 Y.00_X.0 Y.10_X.0 Y.01_X.0 Y.11_X.0 Y.00_X.1 Y.10_X.1 Y.01_X.1 Y.11_X.1
#> 0.5 0.5 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25

make_model('X -> M -> Y; X <-> Y') |>
  grab("parameters")
#> Model parameters with associated probabilities: 
#> 
#> X.0 X.1 M.00 M.10 M.01 M.11 Y.00_X.0 Y.10_X.0 Y.01_X.0 Y.11_X.0 Y.00_X.1 Y.10_X.1 Y.01_X.1 Y.11_X.1
#> 0.5 0.5 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25

model <- make_model('X -> M -> Y; X <-> Y; M <-> Y')
model$parameters_df
#> Mapping of model parameters to nodal types: 
#> 
#> ----------------------------------------------------------------
#> 
#>  param_names: name of parameter
#>  node: name of endogeneous node associated with the parameter
#>  gen: partial causal ordering of the parameter's node
#>  param_set: parameter groupings forming a simplex
#>  given: if model has confounding gives conditioning nodal type
#>  param_value: parameter values
#>  priors: hyperparameters of the prior Dirichlet distribution 
#> 
#> ----------------------------------------------------------------
#> 
#> 
#>  first 10 rows: 
#>      param_names node gen  param_set nodal_type     given param_value priors
#> 1            X.0    X   1          X          0                  0.50      1
#> 2            X.1    X   1          X          1                  0.50      1
#> 3           M.00    M   2          M         00                  0.25      1
#> 4           M.10    M   2          M         10                  0.25      1
#> 5           M.01    M   2          M         01                  0.25      1
#> 6           M.11    M   2          M         11                  0.25      1
#> 7  Y.00_M.00_X.0    Y   3 Y.M.00.X.0         00 M.00, X.0        0.25      1
#> 8  Y.10_M.00_X.0    Y   3 Y.M.00.X.0         10 M.00, X.0        0.25      1
#> 9  Y.01_M.00_X.0    Y   3 Y.M.00.X.0         01 M.00, X.0        0.25      1
#> 10 Y.11_M.00_X.0    Y   3 Y.M.00.X.0         11 M.00, X.0        0.25      1

# Example where set_confound is implemented after restrictions
make_model("A -> B -> C") |>
  set_restrictions(increasing("A", "B")) |>
  set_confound("B <-> C") |>
  grab("parameters")
#> Model parameters with associated probabilities: 
#> 
#> A.0 A.1 B.00 B.10 B.11 C.00_B.00 C.10_B.00 C.01_B.00 C.11_B.00 C.00_B.10 C.10_B.10 C.01_B.10 C.11_B.10 C.00_B.11 C.10_B.11 C.01_B.11 C.11_B.11
#> 0.5 0.5 0.3333333 0.3333333 0.3333333 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25

# Example where two parents are confounded
make_model('A -> B <- C; A <-> C') |>
  set_parameters(node = "C", c(0.05, .95, .95, 0.05)) |>
  make_data(n = 50) |>
  cor()
#> Warning: A specified condition matches multiple parameters. In these cases it is unclear which parameter value should be assigned to which parameter. Assignment thus defaults to the order in which parameters appear in 'parameters_df'. We advise checking that parameter assignment was carried out as you intended. 
#> Warning: You are altering parameters on confounded nodes. Alterations will be applied across all 'param_sets'. If this is not the alteration behavior you intended, try specifying the 'param_set' or 'given' option to more clearly indicate parameters whose values you wish to alter.
#>             A           C           B
#> A  1.00000000 -0.91987179 -0.08353438
#> C -0.91987179  1.00000000  0.08353438
#> B -0.08353438  0.08353438  1.00000000

# Example with two confounds, added sequentially
model <- make_model('A -> B -> C') |>
  set_confound(list("A <-> B", "B <-> C"))
model$statement
#> [1] "A -> B -> C; B <-> A; C <-> B"
# plot(model)