Environment class

Single-state environments that contain a list of rewards for actions.

class bandit.environment.Environment(rewards: List[T])

A single-state environment that contains a list of rewards for actions.

action(i: int) → Union[float, int]

Given a choice of action, produce a reward for that action.

Parameters:	i (int) – index of the action to take
Returns:	(float or int) reward from that action
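
The following is a minimal sketch of how a single-state environment of this kind might behave, not the library's actual implementation. `SketchEnvironment` is a hypothetical stand-in, and it assumes that each entry of `rewards` is either a constant or a zero-argument callable that draws a sample; the real `Environment` may represent rewards differently.

```python
import random
from typing import Callable, List, Union

Number = Union[float, int]
# Assumption: a reward source is either a constant or a zero-argument sampler.
Reward = Union[Number, Callable[[], Number]]


class SketchEnvironment:
    """Hypothetical stand-in, not the actual bandit.environment.Environment."""

    def __init__(self, rewards: List[Reward]) -> None:
        self.rewards = rewards

    def action(self, i: int) -> Number:
        # Look up the reward source for action i and sample it if it is callable.
        reward = self.rewards[i]
        return reward() if callable(reward) else reward


rng = random.Random(0)  # seeded so that repeated runs draw the same samples
env = SketchEnvironment([1.0, lambda: rng.gauss(0.0, 1.0), 5])

constant_reward = env.action(0)  # the constant stored for action 0
sampled_reward = env.action(1)   # one draw from a standard normal
```

Because there is only one state, the action index alone determines which reward source is used.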

expected_rewards() → float

Produce the expected rewards for all possible actions.

Returns:	(List[float]) expected rewards (true values)
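
As an illustration of what "true values" means here, the sketch below assumes Gaussian arms described by hypothetical `(mean, standard deviation)` pairs, so the expected reward of each action is simply its mean. The helper function and the arm representation are assumptions made for the example, not part of the documented API.

```python
from typing import List, Tuple

# Assumption: each arm is a Gaussian given as (mean, standard deviation).
arms: List[Tuple[float, float]] = [(0.2, 1.0), (1.5, 0.5), (0.9, 2.0)]


def expected_rewards(arms: List[Tuple[float, float]]) -> List[float]:
    # The true value of a Gaussian arm is its mean.
    return [mean for mean, _ in arms]


true_values = expected_rewards(arms)
# Index of the action with the highest true value.
best_action = max(range(len(true_values)), key=true_values.__getitem__)
```

Knowing the true values makes it possible to score an agent against the best available action, for example when computing regret.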

moments(kind: str = 'mv') → List[float]

Statistical moments of all actions.

Parameters:	kind (str) – which moments to compute; default is “mv”
Returns:	(np.ndarray) statistical moments of actions
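
The default of 'mv' resembles the `moments` argument of `scipy.stats` distributions, where 'm' requests the mean and 'v' the variance. Assuming that convention applies here (the source does not confirm it), a sketch for Gaussian arms given as hypothetical `(mean, standard deviation)` pairs could look like this; the shape of the result, one row per requested moment, is also an assumption.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: each arm is a Gaussian given as (mean, standard deviation).
arms: List[Tuple[float, float]] = [(0.2, 1.0), (1.5, 0.5)]

# Assumption: 'm' selects the mean and 'v' the variance, as in scipy.stats.
MOMENT_FUNCS: Dict[str, Callable[[float, float], float]] = {
    "m": lambda mean, std: mean,
    "v": lambda mean, std: std ** 2,
}


def moments(arms: List[Tuple[float, float]], kind: str = "mv") -> List[List[float]]:
    # One row per requested moment, one column per action.
    # An unrecognized letter in `kind` raises KeyError.
    return [[MOMENT_FUNCS[k](mean, std) for mean, std in arms] for k in kind]


means, variances = moments(arms)
only_variances = moments(arms, kind="v")
```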