# Gym and Gymnasium

The Gym/Gymnasium backend adapts standard Gym-like environments to TorchWM's image-first training interface. It accepts either an environment ID string or a pre-built environment instance and returns observations as `{"image": ...}`.

## Install

Gymnasium and Gym are included in TorchWM's base dependency set. Optional environment families may require extras:

```bash
pip install torchwm[gym]
pip install "gymnasium[classic-control,box2d,atari]"
```

Install the extras needed by the specific Gymnasium environment ID you plan to use.

## Main APIs

```python
from torchwm import GymImageEnv, make_gym_env

env = make_gym_env("Pendulum-v1", seed=0, size=(64, 64), render_mode="rgb_array")
obs = env.reset()
```

You can also wrap an already-created environment:

```python
import gymnasium as gym
from torchwm import GymImageEnv

base_env = gym.make("CartPole-v1", render_mode="rgb_array")
env = GymImageEnv(base_env, seed=123, size=(64, 64))
```

## Dreamer configuration

```python
from torchwm import DreamerConfig

cfg = DreamerConfig()
cfg.env_backend = "gym"
cfg.env = "Pendulum-v1"
cfg.gym_render_mode = "rgb_array"
cfg.image_size = 64
```

`env_backend` can be `"gym"`, `"gymnasium"`, or `"generic"`. If `cfg.env_instance` is provided, Dreamer wraps that instance with `GymImageEnv` regardless of backend string.

## Observation conversion

`GymImageEnv` always exposes:

```python
{"image": uint8 array with shape (3, H, W)}
```

The wrapper handles several observation styles:

- Tuple reset/step outputs from Gymnasium by taking the first item as the observation.
- Dict observations by preferring image-like keys such as `image`, `pixels`, `rgb`, `observation`, or `state`.
- Vector observations by rendering simple vertical intensity bands into an RGB image.
- HWC, CHW, grayscale, and RGBA images by converting to RGB, resizing, and transposing to CHW.

When the wrapped environment supports `render()`, TorchWM attempts to use rendered frames for visual observations. If rendering fails or only vector observations are available, it falls back to vector-to-image synthesis.

## Action conversion

For continuous action spaces, `GymImageEnv.action_space` mirrors the wrapped environment's `Box` bounds.

For discrete action spaces, `GymImageEnv.action_space` is a continuous `Box` of shape `(n,)` in `[-1, 1]`. The wrapper expects a one-hot-like action vector and converts it to the discrete index with `argmax` before stepping the base environment. Its `sample()` method returns one-hot vectors with `1.0` at the selected action and `-1.0` elsewhere.

## Example environments

The lightweight catalog lists common IDs such as:

- Classic control: `CartPole-v1`, `Pendulum-v1`, `Acrobot-v1`, `MountainCarContinuous-v0`
- MuJoCo-style IDs: `HalfCheetah-v4`, `Humanoid-v4`, `Hopper-v4`, `Walker2d-v4`, `Ant-v4`
- Box2D: `LunarLander-v3`, `LunarLanderContinuous-v3`, `BipedalWalker-v3`, `CarRacing-v3`
- Toy text: `Blackjack-v1`, `FrozenLake-v1`, `Taxi-v3`

## CLI collection

The CLI can collect random-policy rollouts from Gym-like environments:

```bash
torchwm collect --env CartPole-v1 --steps 1000 --out cartpole.npz
```

The command first tries `torchwm.make_env()` and falls back to `gym.make()`.

## Troubleshooting

- **Black frames or missing render output**: create the environment with `render_mode="rgb_array"` and pass the same render mode to `GymImageEnv`.
- **Box2D import errors**: install the Box2D Gymnasium extra.
- **Discrete policies produce invalid actions**: emit vectors of length `env.action_space.shape[0]`; the wrapper chooses `argmax`.
- **Custom environment reset signatures**: Gymnasium-style `(obs, info)` and Gym-style `obs` resets are both supported by the wrapper.
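
To make the reset-signature and dict-observation rules from the Observation conversion section concrete, here is a minimal pure-Python sketch of how a raw reset/step result could be reduced to a single observation. It is not TorchWM's actual code: the helper name `extract_observation` and the `PREFERRED_KEYS` constant are hypothetical, and raising `KeyError` when no preferred key is present is an assumption, since the documentation does not say what the wrapper does in that case.

```python
# Illustrative sketch only; names here are hypothetical, not TorchWM API.
# Key order follows the list given in the Observation conversion section.
PREFERRED_KEYS = ("image", "pixels", "rgb", "observation", "state")


def extract_observation(result):
    """Reduce a raw reset/step result to a single observation."""
    # Gymnasium-style outputs are tuples such as (obs, info):
    # keep only the first item.
    if isinstance(result, tuple):
        result = result[0]
    # Dict observations: return the first preferred key that exists.
    if isinstance(result, dict):
        for key in PREFERRED_KEYS:
            if key in result:
                return result[key]
        # Assumption: fail loudly when no known key is present.
        raise KeyError(f"no image-like key found in {sorted(result)}")
    # Gym-style resets return the observation directly.
    return result
```

For example, `extract_observation(({"pixels": frame}, {}))` and `extract_observation({"pixels": frame})` would both return `frame`, which is why Gymnasium-style and Gym-style resets can share one code path. Note that this sketch would misread an observation that is itself a tuple.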
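
The discrete-action convention from the Action conversion section can also be illustrated without any dependencies. The sketch below mimics the described behavior (a sampled vector with `1.0` at the chosen action and `-1.0` elsewhere, decoded with `argmax`). The function names `sample_one_hot` and `to_discrete_index` are hypothetical, and breaking ties by taking the lowest index is an assumption that matches the usual `argmax` behavior.

```python
import random


def sample_one_hot(n, rng):
    """Return a length-n vector with 1.0 at a random index, -1.0 elsewhere."""
    chosen = rng.randrange(n)
    return [1.0 if i == chosen else -1.0 for i in range(n)]


def to_discrete_index(action):
    """Decode a one-hot-like vector to a discrete index via argmax.

    Ties resolve to the lowest index (an assumption of this sketch).
    """
    return max(range(len(action)), key=lambda i: action[i])


rng = random.Random(0)  # seeded for reproducibility
action = sample_one_hot(4, rng)
index = to_discrete_index(action)
```

Because decoding uses `argmax`, a policy does not need to emit an exact one-hot vector: any length-`n` vector works, and the largest entry selects the action. For instance, `to_discrete_index([0.2, 0.9, -0.5])` selects index `1`.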