Estimate and variance of a policy evaluation via non-contextual weighting.

Computes the estimate and variance of a policy evaluation based on non-contextual weights, AIPW scores, and a policy matrix.

Usage

estimate(w, gammahat, policy)

Arguments

w: Numeric vector. Non-contextual weights, length A. Must not contain NA values.
gammahat: Numeric matrix. AIPW scores, shape [A, K]. Must not contain NA values.
policy: Numeric matrix. Policy matrix \(\pi(X_t, w)\), shape [A, K]. Must have the same shape as gammahat and must not contain NA values.
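
The requirements listed above can be checked before calling estimate(). The helper below is a minimal sketch under stated assumptions: the name check_estimate_inputs is hypothetical and not part of the package, and it only encodes the constraints documented here (numeric inputs, no NA values, w of length A, gammahat and policy of identical shape).

```r
# Hypothetical pre-flight check, not part of the package.
# Stops with an error if any documented requirement is violated.
check_estimate_inputs <- function(w, gammahat, policy) {
  stopifnot(
    is.numeric(w), is.numeric(gammahat), is.numeric(policy),
    is.matrix(gammahat), is.matrix(policy),
    !anyNA(w), !anyNA(gammahat), !anyNA(policy),
    length(w) == nrow(gammahat),            # one weight per row (length A)
    identical(dim(gammahat), dim(policy))   # both matrices are [A, K]
  )
  invisible(TRUE)
}
```

Calling check_estimate_inputs(w, gammahat, policy) first gives an early failure on malformed inputs, independent of whatever validation estimate() itself performs.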

Value

Named numeric vector with elements estimate and var, representing the estimated policy value and the variance of the estimate, respectively.
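
To make the two returned quantities concrete, here is a base-R sketch of one computation that reproduces the output shown in the Examples section. It is an assumption about what is being computed, not the package source: the estimate is taken as the w-weighted mean of the row sums of gammahat * policy, and the variance as the sum of squared weighted deviations divided by the squared sum of weights. The name estimate_sketch is hypothetical.

```r
# Hypothetical re-implementation for illustration only.
estimate_sketch <- function(w, gammahat, policy) {
  # Per-row policy value: sum over the K columns of gammahat * policy.
  row_value <- rowSums(gammahat * policy)
  # Weighted average of the per-row values.
  est <- sum(w * row_value) / sum(w)
  # Plug-in variance of the weighted average.
  v <- sum(w^2 * (row_value - est)^2) / sum(w)^2
  c(estimate = est, var = v)
}

# Same inputs as in the Examples section.
w <- c(0.5, 1, 0.5, 1.5)
scores <- matrix(c(0.5, 0.8, 0.6,
                   0.3, 0.9, 0.2,
                   0.5, 0.7, 0.4,
                   0.8, 0.2, 0.6), ncol = 3, byrow = TRUE)
policy <- matrix(c(0.2, 0.3, 0.5,
                   0.6, 0.1, 0.3,
                   0.4, 0.5, 0.1,
                   0.2, 0.7, 0.1), ncol = 3, byrow = TRUE)
gammahat <- scores - policy
res <- estimate_sketch(w, gammahat, policy)
# res agrees with the documented output:
# estimate -0.052857143, var 0.006466056
```

With these inputs the per-row values are 0.26, -0.13, 0.17 and -0.18, so the weighted sum is -0.185 and dividing by sum(w) = 3.5 gives the estimate.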

Examples

w <- c(0.5, 1, 0.5, 1.5)
scores <- matrix(c(0.5, 0.8, 0.6,
                   0.3, 0.9, 0.2,
                   0.5, 0.7, 0.4,
                   0.8, 0.2, 0.6), ncol = 3, byrow = TRUE)
policy <- matrix(c(0.2, 0.3, 0.5,
                   0.6, 0.1, 0.3,
                   0.4, 0.5, 0.1,
                   0.2, 0.7, 0.1), ncol = 3, byrow = TRUE)
gammahat <- scores - policy
estimate(w = w, gammahat = gammahat, policy = policy)
#>     estimate          var 
#> -0.052857143  0.006466056