Computes AIPW (doubly robust) scores from observed rewards, pulled arms, and inverse
probability scores. If mu_hat is provided, AIPW scores are computed; otherwise IPW scores are computed.
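Reading the example output at the end of this page against its inputs, the score for observation \(t\) and arm \(w\) appears to take the standard AIPW form below. The symbols \(\hat\Gamma_t(w)\), \(Y_t\) (an entry of yobs) and \(\hat\mu_t(w)\) (an entry of mu_hat) are notation assumed here for illustration; they do not come from the package.

\[
\hat\Gamma_t(w) = \hat\mu_t(w) + \frac{1[W_t = w]}{e_t(w)} \bigl( Y_t - \hat\mu_t(w) \bigr)
\]

When mu_hat is NULL, the \(\hat\mu_t(w)\) terms drop out and the IPW score \(\frac{1[W_t = w]}{e_t(w)} Y_t\) remains.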
Arguments
- yobs
Numeric vector. Observed rewards. Must not contain NA values.
- ws
Integer vector. Pulled arms. Must not contain NA values. Length must match yobs.
- balwts
Numeric matrix. Inverse probability score \(1[W_t=w]/e_t(w)\) of pulling arms, shape [A, K], where A is the number of observations and K is the number of arms. Must not contain NA values.
- K
Integer. Number of arms. Must be a positive integer.
- mu_hat
Optional numeric matrix. Plug-in estimator of arm outcomes, shape [A, K], or NULL. Must not contain NA values if provided.
Examples
aw_scores(yobs = c(0.5, 1, 0, 1.5),
          ws = c(1, 2, 2, 3),
          balwts = matrix(c(0.5, 2, 1, 0.5,
                            1, 1.5, 0.5, 1.5,
                            2, 1.5, 0.5, 1),
                          ncol = 3),
          K = 3,
          mu_hat = matrix(c(0.5, 0.8, 0.6, 0.3,
                            0.9, 0.2, 0.5, 0.7,
                            0.4, 0.8, 0.2, 0.6),
                          ncol = 3))
#>      [,1] [,2] [,3]
#> [1,]  0.5 0.90  0.4
#> [2,]  0.8 1.40  0.8
#> [3,]  0.6 0.25  0.2
#> [4,]  0.3 0.70  1.5
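
To make the arithmetic behind this output easier to follow, here is a minimal base-R sketch of the scoring rule. The name aw_scores_sketch is hypothetical and this is not the package's own implementation. It assumes that only the balwts entry of the pulled arm contributes to each row, which is what the output above suggests.

```r
# Hypothetical helper for illustration only; not the package's aw_scores().
aw_scores_sketch <- function(yobs, ws, balwts, K, mu_hat = NULL) {
  A <- length(yobs)
  stopifnot(length(ws) == A, all(dim(balwts) == c(A, K)))
  # Indicator matrix: pulled[t, w] is 1 when arm w was pulled at observation t.
  pulled <- matrix(0, nrow = A, ncol = K)
  pulled[cbind(seq_len(A), ws)] <- 1
  # Assumption: keep the weight of the pulled arm only, i.e. 1[W_t = w] / e_t(w).
  wts <- balwts * pulled
  # IPW part: weighted observed reward (yobs is recycled down each column).
  scores <- wts * yobs
  if (!is.null(mu_hat)) {
    # AIPW correction: add (1 - weight) times the plug-in estimate.
    scores <- scores + (1 - wts) * mu_hat
  }
  scores
}

# Same inputs as the example above.
yobs   <- c(0.5, 1, 0, 1.5)
ws     <- c(1, 2, 2, 3)
balwts <- matrix(c(0.5, 2, 1, 0.5,
                   1, 1.5, 0.5, 1.5,
                   2, 1.5, 0.5, 1), ncol = 3)
mu_hat <- matrix(c(0.5, 0.8, 0.6, 0.3,
                   0.9, 0.2, 0.5, 0.7,
                   0.4, 0.8, 0.2, 0.6), ncol = 3)

aipw <- aw_scores_sketch(yobs, ws, balwts, K = 3, mu_hat = mu_hat)
ipw  <- aw_scores_sketch(yobs, ws, balwts, K = 3)  # mu_hat omitted: IPW scores
```

For the pulled arm in row 2 (arm 2), the sketch gives 1.5 * 1 + (1 - 1.5) * 0.2 = 1.40, matching the output above. The unpulled arms in that row simply carry their mu_hat values. With mu_hat omitted, every unpulled arm scores 0.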