viberl.typing
Custom typing classes for reinforcement learning using Pydantic.
Classes:
Name | Description |
---|---|
Action | An action taken by an agent, optionally with log probabilities. |
Transition | A single transition in an episode. |
Trajectory | A complete trajectory (episode) consisting of multiple transitions. |
Action
Action(**data: Any)
Bases: BaseModel
An action taken by an agent, optionally with log probabilities.
Attributes:
Name | Type | Description |
---|---|---|
model_config | | Pydantic configuration allowing arbitrary field types (e.g. Tensor). |
action | int | The action taken by the agent. |
logprobs | Tensor \| None | Log probabilities associated with the action, if recorded. |
model_config (class-attribute, instance-attribute)
model_config = ConfigDict(arbitrary_types_allowed=True)
action (instance-attribute)
action: int
logprobs (class-attribute, instance-attribute)
logprobs: Tensor | None = None
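A minimal usage sketch (an illustrative addition, assuming viberl.typing is importable and torch is installed); logprobs accepts a torch Tensor because model_config allows arbitrary types:

```python
import torch

from viberl.typing import Action

# Discrete action without log probabilities; logprobs defaults to None.
greedy = Action(action=2)

# Action stored together with the policy's log probabilities.
sampled = Action(action=2, logprobs=torch.log(torch.tensor([0.1, 0.2, 0.7])))
```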
Transition
Transition(**data: Any)
Bases: BaseModel
A single transition in an episode.
Attributes:
Name | Type | Description |
---|---|---|
model_config | | Pydantic configuration allowing arbitrary field types (e.g. ndarray). |
state | ndarray | The state observed before taking the action. |
action | Action | The action taken in that state. |
reward | float | The reward received for the transition. |
next_state | ndarray | The state observed after taking the action. |
done | bool | Whether the episode ended at this transition. |
info | dict[str, Any] | Additional environment info (defaults to an empty dict). |
model_config (class-attribute, instance-attribute)
model_config = ConfigDict(arbitrary_types_allowed=True)
state (instance-attribute)
state: ndarray
action (instance-attribute)
action: Action
reward (instance-attribute)
reward: float
next_state (instance-attribute)
next_state: ndarray
done (instance-attribute)
done: bool
info (class-attribute, instance-attribute)
info: dict[str, Any] = {}
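A construction sketch for Transition (an illustrative addition, assuming viberl.typing and numpy are available); state and next_state are plain numpy arrays, as declared above:

```python
import numpy as np

from viberl.typing import Action, Transition

step = Transition(
    state=np.zeros(4, dtype=np.float32),
    action=Action(action=1),
    reward=1.0,
    next_state=np.ones(4, dtype=np.float32),
    done=False,
)  # info is omitted here and defaults to an empty dict
```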
Trajectory
Trajectory(**data: Any)
Bases: BaseModel
A complete trajectory (episode) consisting of multiple transitions.
Methods:
Name | Description |
---|---|
from_transitions | Create a trajectory from a list of transitions. |
to_dict | Convert trajectory to dictionary format for agent learning. |
Attributes:
Name | Type | Description |
---|---|---|
model_config | | Pydantic configuration allowing arbitrary field types. |
transitions | list[Transition] | The ordered transitions that make up the episode. |
total_reward | float | Total reward accumulated over the episode. |
length | int | Number of transitions in the episode. |
model_config (class-attribute, instance-attribute)
model_config = ConfigDict(arbitrary_types_allowed=True)
transitions (instance-attribute)
transitions: list[Transition]
total_reward (instance-attribute)
total_reward: float
length (instance-attribute)
length: int
from_transitions (classmethod)
from_transitions(transitions: list[Transition]) -> Trajectory
Create a trajectory from a list of transitions.
Source code in viberl/typing.py
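A usage sketch based only on the signature above (an illustrative addition); how total_reward and length are derived is an assumption noted in the comments, since the method body is not reproduced here:

```python
import numpy as np

from viberl.typing import Action, Trajectory, Transition

step = Transition(
    state=np.zeros(4, dtype=np.float32),
    action=Action(action=0),
    reward=1.0,
    next_state=np.ones(4, dtype=np.float32),
    done=True,
)

traj = Trajectory.from_transitions([step])

# Assumption: total_reward aggregates the per-transition rewards and length
# counts the transitions; the actual semantics live in viberl/typing.py.
print(traj.length, traj.total_reward)
```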
to_dict
to_dict() -> dict
Convert trajectory to dictionary format for agent learning.
Source code in viberl/typing.py
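A usage sketch for to_dict (an illustrative addition); only the dict return type is documented above, so the key names are not assumed here and are left to inspection:

```python
# Continuing from the Trajectory `traj` built in the previous sketch.
batch = traj.to_dict()

# The exact keys are defined in viberl/typing.py; inspect them rather than
# assuming specific names.
for key, value in batch.items():
    print(key, type(value))
```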