causalinference package

This package contains the CausalModel class, the main interface for accessing the tools of Causalinference.

CausalModel

class causalinference.causal.CausalModel(Y, D, X)

Bases: object

Class that provides the main tools of Causal Inference.

reset()

Reinitializes data to original inputs, and drops any estimated results.

est_propensity(lin='all', qua=None)

Estimates the propensity scores given a list of covariates to include linearly or quadratically.

The propensity score is the conditional probability of receiving the treatment given the observed covariates. Estimation is done via a logistic regression.

Parameters:

lin: string or list, optional

Column numbers (zero-based) of variables of the original covariate matrix X to include linearly. Defaults to the string ‘all’, which uses the whole covariate matrix.

qua: list, optional

Tuples indicating which columns of the original covariate matrix to multiply and include. E.g., [(1,1), (2,3)] indicates squaring the 2nd column and including the product of the 3rd and 4th columns. Default is to not include any quadratic terms.
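
To make the zero-based reading of lin and qua concrete, the following sketch builds the implied regressor columns by hand. The helper build_terms is hypothetical and not part of the package, and the package's internal construction (for example, whether an intercept is added) may differ.

```python
def build_terms(X, lin="all", qua=None):
    """Return regressor rows implied by lin and qua (illustrative helper only)."""
    n_cols = len(X[0])
    # 'all' means every column enters linearly; otherwise use the listed columns.
    lin_cols = list(range(n_cols)) if lin == "all" else list(lin)
    qua_pairs = list(qua) if qua else []
    rows = []
    for row in X:
        linear = [row[j] for j in lin_cols]
        # Each tuple (i, j) contributes the product of columns i and j.
        quadratic = [row[i] * row[j] for i, j in qua_pairs]
        rows.append(linear + quadratic)
    return rows


X = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.5, 2.5, 3.5],
]
# (1, 1) squares the 2nd column; (2, 3) multiplies the 3rd and 4th columns.
terms = build_terms(X, lin=[0, 1], qua=[(1, 1), (2, 3)])
print(terms[0])  # [1.0, 2.0, 4.0, 12.0]
```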

est_propensity_s(lin_B=None, C_lin=1, C_qua=2.71)

Estimates the propensity score with covariates selected using the algorithm suggested by [R1].

The propensity score is the conditional probability of receiving the treatment given the observed covariates. Estimation is done via a logistic regression.

The covariate selection algorithm is based on a sequence of likelihood ratio tests.

Parameters:

lin_B: list, optional

Column numbers (zero-based) of variables of the original covariate matrix X that are always included linearly, without being subjected to the selection algorithm. Defaults to an empty list, meaning every column of X is subjected to the selection algorithm.

C_lin: scalar, optional

Critical value used in likelihood ratio tests to decide whether candidate linear terms should be included. Defaults to 1 as in [R1].

C_qua: scalar, optional

Critical value used in likelihood ratio tests to decide whether candidate quadratic terms should be included. Defaults to 2.71 as in [R1].
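
The role of the two critical values can be sketched as a single selection step. This is a minimal illustration of a likelihood-ratio-based forward step as commonly described for this kind of algorithm, not the package's code: the function pick_next_term is hypothetical, the log-likelihood values are made up for the example, and the use of a strict inequality is an assumption.

```python
def pick_next_term(loglik_current, loglik_candidates, critical_value):
    """Return the candidate with the largest LR statistic above the critical
    value, or None if no candidate passes (illustrative helper only)."""
    best_name, best_stat = None, critical_value
    for name, loglik in loglik_candidates.items():
        # LR statistic for adding one term to the current model.
        lr_stat = 2.0 * (loglik - loglik_current)
        if lr_stat > best_stat:
            best_name, best_stat = name, lr_stat
    return best_name


# Made-up log-likelihoods of models that each add one linear term.
linear_pick = pick_next_term(
    -120.0, {"X0": -119.8, "X1": -118.0, "X2": -119.0}, critical_value=1
)
# Made-up log-likelihoods of models that each add one quadratic term.
quadratic_pick = pick_next_term(
    -118.0, {"X0*X0": -117.0, "X1*X1": -117.5}, critical_value=2.71
)
print(linear_pick, quadratic_pick)  # X1 None
```

In a full procedure this step would be repeated, refitting the logistic regression each time, until no remaining candidate passes the test.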

References

[R1] Imbens, G. & Rubin, D. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences: An Introduction.

trim()

Trims data based on propensity score to create a subsample with better covariate balance.

The default cutoff value is set to 0.1. To set a custom cutoff value, modify the object attribute named cutoff directly.

This method should only be executed after the propensity score has been estimated.
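
The effect of the cutoff can be sketched as follows, assuming the rule keeps units whose estimated propensity score lies between cutoff and 1 - cutoff. The function trim_by_cutoff is hypothetical, and treating both boundaries as inclusive is an assumption.

```python
def trim_by_cutoff(pscores, cutoff=0.1):
    """Return indices of units with cutoff <= score <= 1 - cutoff
    (illustrative helper only)."""
    return [i for i, p in enumerate(pscores) if cutoff <= p <= 1.0 - cutoff]


pscores = [0.03, 0.12, 0.50, 0.88, 0.97]
kept = trim_by_cutoff(pscores)
print(kept)  # [1, 2, 3]
```

Units with scores very close to 0 or 1 have few comparable counterparts in the other treatment group, which is why dropping them tends to improve covariate balance.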

trim_s()

Trims data based on propensity score using the cutoff selection algorithm suggested by [R2].

This method should only be executed after the propensity score has been estimated.

References

[R2] Crump, R., Hotz, V., Imbens, G., & Mitnik, O. (2009). Dealing with Limited Overlap in Estimation of Average Treatment Effects. Biometrika, 96, 187-199.

stratify()

Stratifies the sample based on propensity score.

By default the sample is divided into five equal-sized bins. The number of bins can be set by modifying the object attribute named blocks. Alternatively, custom-sized bins can be created by setting blocks equal to a sorted list of numbers between 0 and 1 indicating the bin boundaries.

This method should only be executed after the propensity score has been estimated.
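
The two ways of specifying bins can be sketched in plain Python. Both helpers are hypothetical and only illustrate the idea: equal_sized_bins splits units by rank of their propensity score, and assign_bins uses a sorted list of boundaries. How the package treats scores that fall exactly on a boundary is not specified here, so the closed-on-the-left convention below is an assumption.

```python
from bisect import bisect_right


def equal_sized_bins(pscores, n_bins=5):
    """Assign each unit to one of n_bins bins of (near) equal size by score rank."""
    n = len(pscores)
    order = sorted(range(n), key=lambda i: pscores[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // n
    return bins


def assign_bins(pscores, boundaries):
    """Assign each unit to a bin given sorted boundaries such as [0, 0.3, 0.7, 1]."""
    inner = boundaries[1:-1]  # interior cut points only
    return [bisect_right(inner, p) for p in pscores]


pscores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
by_rank = equal_sized_bins(pscores)
by_bounds = assign_bins(pscores, [0, 0.3, 0.7, 1])
print(by_rank)    # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
print(by_bounds)  # [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
```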

stratify_s()

Stratifies the sample based on propensity score using the bin selection procedure suggested by [R3].

The bin selection algorithm is based on a sequence of two-sample t tests performed on the log-odds ratio.

This method should only be executed after the propensity score has been estimated.

References

[R3] Imbens, G. & Rubin, D. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences: An Introduction.

est_via_ols(adj=2)

Estimates average treatment effects using least squares.

Parameters:

adj: int (0, 1, or 2)

Indicates how covariate adjustments are to be performed. Set adj = 0 to not include any covariates. Set adj = 1 to include treatment indicator D and covariates X separately. Set adj = 2 to additionally include interaction terms between D and X. Defaults to 2.
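
The three settings of adj can be read as three sets of regressors. The sketch below lists the regressor row for a single unit under each setting; ols_regressors is a hypothetical helper, and the leading intercept and the use of raw (not demeaned) interaction terms are assumptions rather than a description of the package's internals.

```python
def ols_regressors(d, x, adj=2):
    """Return the regressor row for one unit (illustrative helper only).

    adj=0: intercept and D
    adj=1: intercept, D, and X
    adj=2: intercept, D, X, and the interactions D * X
    """
    if adj not in (0, 1, 2):
        raise ValueError("adj must be 0, 1, or 2")
    row = [1.0, float(d)]
    if adj >= 1:
        row += list(x)
    if adj == 2:
        row += [d * xj for xj in x]
    return row


for adj in (0, 1, 2):
    print(adj, ols_regressors(1, [2.0, 3.0], adj=adj))
# 0 [1.0, 1.0]
# 1 [1.0, 1.0, 2.0, 3.0]
# 2 [1.0, 1.0, 2.0, 3.0, 2.0, 3.0]
```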

est_via_blocking(adj=1)

Estimates average treatment effects using regression within blocks.

This method should only be executed after the sample has been stratified.

Parameters:

adj: int (0, 1, or 2)

Indicates how covariate adjustments are to be performed for each within-bin regression. Set adj = 0 to not include any covariates. Set adj = 1 to include treatment indicator D and covariates X separately. Set adj = 2 to additionally include interaction terms between D and X. Defaults to 1.

est_via_weighting()

Estimates average treatment effects using a doubly-robust version of the Horvitz-Thompson weighting estimator.
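
For orientation, the plain Horvitz-Thompson estimator of the average treatment effect can be sketched as below. This is the basic inverse-propensity-weighted form, not the doubly-robust version this method uses, and horvitz_thompson_ate is a hypothetical helper rather than part of the package.

```python
def horvitz_thompson_ate(y, d, pscores):
    """Plain inverse-propensity-weighted ATE estimate (illustrative helper only)."""
    n = len(y)
    # Treated outcomes are weighted by 1 / e(x), controls by 1 / (1 - e(x)).
    treated = sum(di * yi / pi for yi, di, pi in zip(y, d, pscores))
    control = sum((1 - di) * yi / (1.0 - pi) for yi, di, pi in zip(y, d, pscores))
    return (treated - control) / n


y = [3.0, 1.0, 4.0, 2.0]
d = [1, 0, 1, 0]
pscores = [0.5, 0.5, 0.5, 0.5]
ate = horvitz_thompson_ate(y, d, pscores)
print(ate)  # 2.0
```

With a constant propensity score of 0.5, the estimate reduces to the difference in group means, (3 + 4) / 2 - (1 + 2) / 2 = 2.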

est_via_matching(weights='inv', matches=1, bias_adj=False)

Estimates average treatment effects using nearest-neighbor matching.

Matching is done with replacement. The method supports multiple matches per subject, as well as correction of the bias that arises from imperfect matches. For details on methodology, see [R4].

Parameters:

weights: str or positive definite square matrix

Specifies weighting matrix used in computing distance measures. Defaults to string ‘inv’, which does inverse variance weighting. String ‘maha’ gives the weighting matrix used in the Mahalanobis metric.

matches: int

Number of matches to use for each subject.

bias_adj: bool

Specifies whether bias adjustments should be attempted.
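
The core idea can be sketched for the simplest case: one match per treated unit, matching with replacement, and no bias adjustment. The sketch estimates only the average effect on the treated, and it interprets inverse variance weighting as scaling each covariate's squared difference by the inverse of its sample variance; both the helper names and that interpretation are assumptions, not the package's implementation.

```python
from statistics import variance


def inverse_variance_weights(X):
    """One weight per covariate: the inverse of its sample variance."""
    return [1.0 / variance(col) for col in zip(*X)]


def weighted_distance(a, b, weights):
    """Weighted squared distance between two covariate vectors."""
    return sum(w * (aj - bj) ** 2 for aj, bj, w in zip(a, b, weights))


def att_by_matching(Y, D, X):
    """Average effect on the treated using one nearest control per treated unit."""
    weights = inverse_variance_weights(X)
    controls = [i for i, d in enumerate(D) if d == 0]
    treated = [i for i, d in enumerate(D) if d == 1]
    effects = []
    for i in treated:
        # With replacement: the same control may be matched to several treated units.
        j = min(controls, key=lambda c: weighted_distance(X[i], X[c], weights))
        effects.append(Y[i] - Y[j])
    return sum(effects) / len(effects)


Y = [5.0, 7.0, 4.0, 5.5]
D = [1, 1, 0, 0]
X = [[1.0, 100.0], [2.0, 300.0], [1.1, 120.0], [2.2, 280.0]]
att = att_by_matching(Y, D, X)
print(att)  # 1.25
```

Scaling by the inverse variance keeps a covariate measured on a large scale, such as the second column here, from dominating the distance.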

References

[R4] Imbens, G. & Rubin, D. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences: An Introduction.