Run an experiment using Thompson Sampling. — run

Runs a bandit experiment using either linear Thompson Sampling (LinTS) or non-contextual Thompson Sampling (TS), given potential outcomes and, optionally, covariates.

Usage

run_experiment(
  ys,
  floor_start,
  floor_decay,
  batch_sizes,
  xs = NULL,
  balanced = NULL
)

Arguments

ys: Matrix. Potential outcomes of shape [A, K], where A is the number of observations and K is the number of arms. Must not contain NA values.
floor_start: Numeric. Specifies the initial value for the assignment probability floor. It ensures that at the start of the process, no assignment probability falls below this threshold. Must be a positive number.
floor_decay: Numeric. Decay rate of the floor. The floor shrinks as the number of observations in the experiment grows: at each point in time, the applied floor is floor_start/(s^{floor_decay}), where s is the starting index of the current batch in a batched experiment, or the observation index in an online experiment. Must be a number between 0 and 1 (inclusive).
batch_sizes: Integer vector. Size of each batch. Must be positive integers.
xs: Optional matrix. Covariates of shape [A, p], where p is the number of features. Supply it only if the LinTSModel is contextual. Default is NULL. Must not contain NA values.
balanced: Optional logical. Indicates whether to balance the batches. Default is NULL.
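
To make the floor formula concrete, the following sketch computes floor_start/(s^{floor_decay}) for each batch of a batched experiment. It assumes that s is the 1-based index of the first observation in each batch; that reading of "starting index" is an assumption, and the variable names batch_starts and floors are illustrative only, not part of the function's interface.

```r
# Illustrative inputs (same values as in the Examples section)
floor_start <- 1 / 4
floor_decay <- 0.9
batch_sizes <- c(250, 250, 250, 250)

# Assumption: s is the 1-based index of the first observation in each batch
batch_starts <- cumsum(c(1, head(batch_sizes, -1)))

# Floor applied to the assignment probabilities in each batch
floors <- floor_start / batch_starts^floor_decay

print(data.frame(batch = seq_along(batch_sizes), s = batch_starts, floor = floors))
```

Under this reading the first batch uses floor_start unchanged (since s = 1), and every later batch uses a strictly smaller floor whenever floor_decay is greater than 0. With floor_decay = 0 the floor stays constant at floor_start.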

Value

A list containing the pulled arms (ws), observed rewards (yobs), assignment probabilities (probs), and the fitted bandit model (fitted_bandit_model).
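
The sketch below illustrates how ws and yobs might be summarised after a run. It does not call run_experiment: ws is a stand-in drawn uniformly at random, and the relationship between yobs and ys (the observed reward is the potential outcome of the pulled arm) is an assumption based on the potential-outcomes description, not something this page states. The names pulls_per_arm and mean_reward_per_arm are illustrative.

```r
set.seed(123)
A <- 8  # small number of observations, for readability
K <- 3  # number of arms
ys <- matrix(rbinom(A * K, 1, 0.5), nrow = A, ncol = K)

# Stand-in for results$ws: arms drawn uniformly here, NOT by Thompson Sampling
ws <- sample.int(K, A, replace = TRUE)

# Assumed relationship: observed reward = potential outcome of the pulled arm
yobs <- ys[cbind(seq_len(A), ws)]

# Per-arm summaries; arms that were never pulled get a count of 0 and a mean of NA
arm <- factor(ws, levels = seq_len(K))
pulls_per_arm <- table(arm)
mean_reward_per_arm <- tapply(yobs, arm, mean)

print(pulls_per_arm)
print(mean_reward_per_arm)
```

With real output, the same summaries would be computed from results$ws and results$yobs.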

Examples

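set.seed(123)
A <- 1000  # number of observations
K <- 4     # number of arms
# Covariates of shape [A, p]; here p happens to equal K
xs <- matrix(runif(A * K), nrow = A, ncol = K)
# Binary potential outcomes, one column per arm
ys <- matrix(rbinom(A * K, 1, 0.5), nrow = A, ncol = K)
# Four batches of 250 observations each
batch_sizes <- c(250, 250, 250, 250)
# xs is supplied, so covariates are available to the bandit model
results <- run_experiment(ys = ys,
                          floor_start = 1/K,
                          floor_decay = 0.9,
                          batch_sizes = batch_sizes,
                          xs = xs)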
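
Before calling run_experiment, it can help to check the constraints listed under Arguments. The helper below is a hypothetical sketch, not part of the package. The requirement that xs has as many rows as ys follows from the documented shapes [A, p] and [A, K]; the warning when sum(batch_sizes) differs from the number of rows in ys is an assumption that this page does not state.

```r
# Hypothetical helper (not part of the package): checks the documented constraints
check_inputs <- function(ys, floor_start, floor_decay, batch_sizes, xs = NULL) {
  stopifnot(
    is.matrix(ys), !anyNA(ys),
    is.numeric(floor_start), length(floor_start) == 1, floor_start > 0,
    is.numeric(floor_decay), length(floor_decay) == 1,
    floor_decay >= 0, floor_decay <= 1,
    is.numeric(batch_sizes), all(batch_sizes > 0),
    all(batch_sizes == round(batch_sizes))
  )
  if (!is.null(xs)) {
    # xs and ys are both documented as having A rows
    stopifnot(is.matrix(xs), !anyNA(xs), nrow(xs) == nrow(ys))
  }
  # Assumption (not stated in the documentation): batches cover all A observations
  if (sum(batch_sizes) != nrow(ys)) {
    warning("sum(batch_sizes) differs from nrow(ys)")
  }
  invisible(TRUE)
}

# Usage with inputs shaped like those in the example above
set.seed(123)
A <- 1000
K <- 4
xs <- matrix(runif(A * K), nrow = A, ncol = K)
ys <- matrix(rbinom(A * K, 1, 0.5), nrow = A, ncol = K)
ok <- check_inputs(ys, floor_start = 1 / K, floor_decay = 0.9,
                   batch_sizes = c(250, 250, 250, 250), xs = xs)
```

The helper stops with an error on the first violated constraint and returns TRUE invisibly otherwise.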