causalinference.utils package
causalinference.utils.tools module
- causalinference.utils.tools.random_data(N=5000, K=3, unobservables=False, **kwargs)
Function that generates data according to one of two simple models that satisfy the unconfoundedness assumption.
- The covariates and error terms are generated according to
X ~ N(mu, Sigma), epsilon ~ N(0, Gamma).
- The counterfactual outcomes are generated by
Y0 = X*beta + epsilon_0, Y1 = delta + X*(beta+theta) + epsilon_1.
- Selection is done according to the following propensity score function:
P(D=1|X) = Lambda(X*beta).
Here Lambda is the standard logistic CDF.
- Parameters
- N: int
Number of units to draw. Defaults to 5000.
- K: int
Number of covariates. Defaults to 3.
- unobservables: bool
If True, returns the potential outcomes and the true propensity score in addition to the observed outcomes and covariates. Defaults to False.
- mu, Sigma, Gamma, beta, delta, theta: NumPy ndarrays, optional
Parameter values appearing in the data generating process, passed as keyword arguments.
- Returns
- tuple
A tuple of the form (Y, D, X) or (Y, D, X, Y0, Y1), containing the observed outcomes, treatment indicators, covariate matrix, and (in the second form) potential outcomes.
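The data generating process described above can be sketched with the Python standard library alone. This is a minimal illustration, not the library's implementation: the function name `simulate_unconfounded` is hypothetical, and the parameter values (mu = 0, Sigma = I, Gamma = I, beta and theta equal to vectors of ones, delta = 3) are assumptions chosen for the sketch rather than documented defaults. The observed outcome is taken to be Y1 for treated units and Y0 for untreated units, which is the usual potential-outcomes convention.

```python
import math
import random


def simulate_unconfounded(n=5000, k=3, delta=3.0, seed=None):
    """Draw (Y, D, X, Y0, Y1, pscore) from the model described above.

    Assumed for this sketch (not taken from the library):
    mu = 0, Sigma = I, Gamma = I, beta = theta = vector of ones.
    """
    rng = random.Random(seed)
    beta = [1.0] * k
    theta = [1.0] * k

    Y, D, X, Y0, Y1, pscore = [], [], [], [], [], []
    for _ in range(n):
        # X ~ N(0, I): k independent standard normal covariates.
        x = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # epsilon ~ N(0, I): independent errors for the two potential outcomes.
        eps0 = rng.gauss(0.0, 1.0)
        eps1 = rng.gauss(0.0, 1.0)

        # Y0 = X*beta + epsilon_0, Y1 = delta + X*(beta + theta) + epsilon_1.
        xb = sum(xi * bi for xi, bi in zip(x, beta))
        y0 = xb + eps0
        y1 = delta + sum(xi * (bi + ti) for xi, bi, ti in zip(x, beta, theta)) + eps1

        # P(D=1|X) = Lambda(X*beta), with Lambda the standard logistic CDF.
        p = 1.0 / (1.0 + math.exp(-xb))
        d = 1 if rng.random() < p else 0

        # Only the potential outcome matching the treatment status is observed.
        y = y1 if d == 1 else y0

        X.append(x)
        Y0.append(y0)
        Y1.append(y1)
        pscore.append(p)
        D.append(d)
        Y.append(y)
    return Y, D, X, Y0, Y1, pscore
```

Because treatment depends on the covariates only through the propensity score, and the error terms are drawn independently of D, the simulated data satisfy unconfoundedness by construction. Calling `simulate_unconfounded(n=1000, seed=0)` returns plain Python lists; the library function itself works with NumPy arrays.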