Environment class

Single-state environments that contain a list of rewards for actions.

class bandit.environment.Environment(rewards: List[T])

A single-state environment that contains a list of rewards for actions.

action(i: int) → Union[float, int]

Given a choice of action, produce a reward for that action.

Parameters: i (int) – action to be taken
Returns: (float) reward from that action
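
The call pattern for action can be sketched as follows. This is not the library's implementation: SketchEnvironment is a hypothetical stand-in, and because the reference does not say what T is, the sketch assumes each reward is either a fixed number or a zero-argument sampler.

```python
import random
from typing import Callable, List, Union

Number = Union[float, int]
# Assumption: a reward is a fixed number or a zero-argument sampler.
Reward = Union[Number, Callable[[], Number]]


class SketchEnvironment:
    """Hypothetical stand-in for bandit.environment.Environment."""

    def __init__(self, rewards: List[Reward]) -> None:
        self.rewards = rewards

    def action(self, i: int) -> Number:
        """Return a reward for action i, sampling if the reward is callable."""
        reward = self.rewards[i]
        return reward() if callable(reward) else reward


rng = random.Random(0)  # seeded so repeated runs draw the same values
env = SketchEnvironment([1, lambda: rng.gauss(0.0, 1.0), 2.5])

fixed = env.action(0)  # constant reward
noisy = env.action(1)  # one draw from a standard normal
```

Because there is only one state, the index i is the whole input: the environment keeps no history between calls.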
expected_rewards() → float

Produce the expected rewards for all possible actions.

Returns: (List[float]) expected rewards (true values)
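
As a rough illustration of what "true values" means here, the sketch below assumes each action's reward is normally distributed, so its expected reward is simply the distribution mean. GaussianReward and the free function expected_rewards are hypothetical names for this example only. The sketch follows the Returns line (one value per action) rather than the float annotation in the signature.

```python
from typing import List, NamedTuple


class GaussianReward(NamedTuple):
    """Hypothetical reward description: a normal distribution."""
    mean: float
    std: float


def expected_rewards(rewards: List[GaussianReward]) -> List[float]:
    """True value of each action: the mean of its reward distribution."""
    return [r.mean for r in rewards]


arms = [
    GaussianReward(0.0, 1.0),
    GaussianReward(0.5, 1.0),
    GaussianReward(-0.2, 2.0),
]
values = expected_rewards(arms)

# The optimal action is the one with the largest true value.
best_action = max(range(len(values)), key=values.__getitem__)
```

A bandit agent never sees these values directly; they are typically used to score the agent, for example by checking how often it picks best_action.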
moments(kind: str = 'mv') → List[float]

Statistical moments of all actions.

Parameters: kind (str) – which moments to compute; default is 'mv'
Returns: (np.ndarray) statistical moments of actions
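
The reference does not define the kind string. The sketch below assumes it follows the SciPy convention, where 'm' selects the mean and 'v' the variance, so the default 'mv' requests both. GaussianReward, the free function moments, and the row-per-moment layout are all assumptions made for illustration, and plain lists stand in for the np.ndarray the reference mentions.

```python
from typing import List, NamedTuple


class GaussianReward(NamedTuple):
    """Hypothetical reward description: a normal distribution."""
    mean: float
    std: float


def moments(rewards: List[GaussianReward], kind: str = "mv") -> List[List[float]]:
    """Return one row per requested moment, one column per action."""
    getters = {
        "m": lambda r: r.mean,      # first moment: mean
        "v": lambda r: r.std ** 2,  # second central moment: variance
    }
    unknown = set(kind) - set(getters)
    if unknown:
        raise ValueError(f"unsupported moment codes: {sorted(unknown)}")
    return [[getters[code](r) for r in rewards] for code in kind]


arms = [GaussianReward(0.0, 1.0), GaussianReward(0.5, 2.0)]
means, variances = moments(arms)     # default kind="mv"
only_means = moments(arms, kind="m")
```

Under these assumptions the mean row matches what expected_rewards would return, and the variance row indicates how noisy each action is.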