Gym and Gymnasium

The Gym/Gymnasium backend adapts standard Gym-like environments to TorchWM’s image-first training interface. It accepts either an environment ID string or a pre-built environment instance and returns observations as {"image": ...}.

Install

Gymnasium and Gym are included in TorchWM’s base dependency set. Optional environment families may require extras:

pip install "torchwm[gym]"
pip install "gymnasium[classic-control,box2d,atari]"

Install the extras needed by the specific Gymnasium environment ID you plan to use.

Main APIs

from torchwm import GymImageEnv, make_gym_env

env = make_gym_env("Pendulum-v1", seed=0, size=(64, 64), render_mode="rgb_array")
obs = env.reset()  # dict of the form {"image": uint8 array with shape (3, 64, 64)}

You can also wrap an already-created environment:

import gymnasium as gym
from torchwm import GymImageEnv

base_env = gym.make("CartPole-v1", render_mode="rgb_array")
env = GymImageEnv(base_env, seed=123, size=(64, 64))

Dreamer configuration

from torchwm import DreamerConfig

cfg = DreamerConfig()
cfg.env_backend = "gym"
cfg.env = "Pendulum-v1"
cfg.gym_render_mode = "rgb_array"
cfg.image_size = 64

env_backend can be "gym", "gymnasium", or "generic". If cfg.env_instance is provided, Dreamer wraps that instance with GymImageEnv regardless of the backend string.
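The precedence rule above can be sketched in a few lines. This is an illustrative stand-in, not TorchWM's code: SketchConfig and resolve_env_source are hypothetical names, and only the stated rule (a provided instance wins over the backend string) is modeled.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the DreamerConfig fields used in this section."""
    env_backend: str = "gym"
    env: str = "Pendulum-v1"
    env_instance: Optional[Any] = None


def resolve_env_source(cfg: SketchConfig) -> str:
    """Report whether the environment comes from an instance or from an ID."""
    # A provided instance takes priority, whatever env_backend says.
    if cfg.env_instance is not None:
        return "instance"
    # Otherwise the environment is built from the ID string in cfg.env.
    return "id"
```

With this sketch, a config that sets both env_backend = "generic" and env_instance still resolves to the instance.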

Observation conversion

GymImageEnv always exposes:

{"image": uint8 array with shape (3, H, W)}

The wrapper handles several observation styles:

Tuple reset/step outputs from Gymnasium by taking the first item as the observation.
Dict observations by preferring image-like keys such as image, pixels, rgb, observation, or state.
Vector observations by rendering simple vertical intensity bands into an RGB image.
HWC, CHW, grayscale, and RGBA images by converting to RGB, resizing, and transposing to CHW.
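The dict and image cases listed above can be approximated with NumPy alone. The following is a minimal sketch, not the wrapper's actual implementation: pick_from_dict and to_chw_rgb are hypothetical helpers, the key order, the CHW detection heuristic, the [0, 1] range for float images, the (height, width) reading of size, and the nearest-neighbour resize are all assumptions.

```python
from typing import Any, Mapping, Tuple

import numpy as np

# Key order is an assumption; the text only names these as preferred keys.
PREFERRED_KEYS = ("image", "pixels", "rgb", "observation", "state")


def pick_from_dict(obs: Mapping[str, Any]) -> Any:
    """Return the value stored under the first preferred key found in obs."""
    for key in PREFERRED_KEYS:
        if key in obs:
            return obs[key]
    raise KeyError(f"no image-like key in {sorted(obs)}")


def to_chw_rgb(image: Any, size: Tuple[int, int] = (64, 64)) -> np.ndarray:
    """Convert an HWC, CHW, grayscale, or RGBA image to uint8 RGB in CHW layout."""
    img = np.asarray(image)

    # Grayscale (H, W): add a trailing channel axis.
    if img.ndim == 2:
        img = img[:, :, None]
    if img.ndim != 3:
        raise ValueError(f"expected a 2-D or 3-D image, got shape {img.shape}")

    # Heuristic (assumption): a channel-sized leading axis with a larger
    # trailing axis indicates CHW, so move channels last.
    if img.shape[0] in (1, 3, 4) and img.shape[2] not in (1, 3, 4):
        img = np.transpose(img, (1, 2, 0))

    # Float images are assumed to lie in [0, 1]; integer images in [0, 255].
    if np.issubdtype(img.dtype, np.floating):
        img = np.clip(img, 0.0, 1.0) * 255.0
    img = np.clip(img, 0, 255).astype(np.uint8)

    channels = img.shape[2]
    if channels == 1:
        img = np.repeat(img, 3, axis=2)  # grayscale to RGB
    elif channels == 4:
        img = img[:, :, :3]  # drop the alpha channel
    elif channels != 3:
        raise ValueError(f"unsupported channel count: {channels}")

    # Nearest-neighbour resize via integer index arrays (size is (height, width)).
    in_h, in_w = img.shape[:2]
    out_h, out_w = size
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    img = img[rows][:, cols]

    # HWC to CHW.
    return np.ascontiguousarray(np.transpose(img, (2, 0, 1)))
```

A production wrapper would more likely resize with an image library that interpolates; index-based resizing is used here only to keep the sketch dependency-free beyond NumPy.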

When the wrapped environment supports render(), TorchWM attempts to use rendered frames for visual observations. If rendering fails or only vector observations are available, it falls back to vector-to-image synthesis.
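The render-then-fallback behavior described above might look roughly like this. It is a sketch under stated assumptions rather than TorchWM's code: vector_to_bands and frame_or_fallback are hypothetical helpers, the tanh squashing is an assumed normalization, and size is read as (height, width).

```python
from typing import Any, Tuple

import numpy as np


def vector_to_bands(vector: Any, size: Tuple[int, int] = (64, 64)) -> np.ndarray:
    """Render a vector as vertical intensity bands in an HWC uint8 RGB image."""
    vec = np.asarray(vector, dtype=np.float64).ravel()
    if vec.size == 0:
        raise ValueError("cannot render an empty vector")

    # Squash values into [0, 255]; tanh is an assumed normalization choice.
    levels = ((np.tanh(vec) + 1.0) * 0.5 * 255.0).astype(np.uint8)

    height, width = size
    # Map every output column to one vector entry.
    cols = np.arange(width) * vec.size // width
    gray = np.tile(levels[cols], (height, 1))  # shape (H, W)
    return np.stack([gray, gray, gray], axis=-1)  # shape (H, W, 3)


def frame_or_fallback(env: Any, vector_obs: Any,
                      size: Tuple[int, int] = (64, 64)) -> np.ndarray:
    """Prefer env.render(); synthesize bands if rendering fails or is unusable."""
    try:
        frame = env.render()
    except Exception:
        frame = None
    if isinstance(frame, np.ndarray) and frame.ndim == 3:
        return frame
    return vector_to_bands(vector_obs, size)
```

Catching a broad Exception keeps the fallback robust to backend-specific render errors, at the cost of hiding the underlying cause; logging the exception would be a reasonable addition.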

Action conversion

For continuous action spaces, GymImageEnv.action_space mirrors the wrapped environment’s Box bounds.

For discrete action spaces, GymImageEnv.action_space is a continuous Box of shape (n,) in [-1, 1]. The wrapper expects a one-hot-like action vector and converts it to the discrete index with argmax before stepping the base environment. Its sample() method returns one-hot vectors with 1.0 at the selected action and -1.0 elsewhere.
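The discrete-action encoding described above can be illustrated with NumPy. The helper names one_hot_to_index and sample_one_hot are hypothetical and stand in for the wrapper's internal logic; only the argmax conversion and the 1.0 / -1.0 sampling convention come from the text.

```python
from typing import Any

import numpy as np


def one_hot_to_index(action: Any) -> int:
    """Convert a one-hot-like action vector to a discrete index via argmax."""
    return int(np.argmax(np.asarray(action)))


def sample_one_hot(n: int, rng: np.random.Generator) -> np.ndarray:
    """Sample a random action: 1.0 at the chosen index and -1.0 elsewhere."""
    action = np.full(n, -1.0, dtype=np.float32)
    action[rng.integers(n)] = 1.0
    return action
```

Because the conversion uses argmax, a policy does not need to emit exact one-hot vectors: any length-n vector whose largest entry marks the intended action, such as raw logits, selects the same index.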

Example environments

The lightweight catalog queries the installed Gymnasium registry at runtime rather than maintaining a hardcoded list of versioned environment IDs. To inspect the IDs available in your local installation, including those provided by optional extras, consult Gymnasium’s environment docs or run torchwm envs list.
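To look at the same registry directly from Python, Gymnasium's own registration module can be used. The sketch below is independent of TorchWM's catalog code; list_env_ids is a hypothetical helper, and it returns an empty list when Gymnasium is not installed.

```python
from typing import List


def list_env_ids(prefix: str = "") -> List[str]:
    """Return the sorted registered Gymnasium environment IDs that start with prefix."""
    try:
        # The registry maps environment ID strings to their specs.
        from gymnasium.envs.registration import registry
    except ImportError:
        return []
    return sorted(env_id for env_id in registry if env_id.startswith(prefix))
```

For example, list_env_ids("CartPole") lists the CartPole variants registered in the local installation; the exact IDs depend on the installed Gymnasium version and extras.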

CLI collection

The CLI can collect random-policy rollouts from Gym-like environments:

torchwm collect --env CartPole-v1 --steps 1000 --out cartpole.npz

The command first tries torchwm.make_env() and falls back to gym.make().
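The try-then-fall-back pattern described above can be sketched generically. This is not the CLI's implementation: make_with_fallback is a hypothetical helper, and it assumes each factory accepts the environment ID as its only required argument.

```python
from typing import Any, Callable, List, Sequence


def make_with_fallback(env_id: str,
                       factories: Sequence[Callable[[str], Any]]) -> Any:
    """Return the first environment that a factory builds without raising."""
    errors: List[str] = []
    for factory in factories:
        try:
            return factory(env_id)
        except Exception as exc:
            # Record the failure and move on to the next factory.
            name = getattr(factory, "__name__", repr(factory))
            errors.append(f"{name}: {exc}")
    raise RuntimeError(f"could not create {env_id!r}: " + "; ".join(errors))
```

In practice the list would hold the real factories in priority order, for example torchwm.make_env followed by gym.make, provided both accept the ID as their first argument.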

Troubleshooting

Black frames or missing render output: create the environment with render_mode="rgb_array" and pass the same render mode to GymImageEnv.
Box2D import errors: install the Box2D Gymnasium extra.
Discrete policies produce invalid actions: emit vectors of length env.action_space.shape[0]; the wrapper chooses argmax.
Custom environment reset signatures: Gymnasium-style (obs, info) and Gym-style obs resets are both supported by the wrapper.
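For custom code that has to cope with both API generations itself, the return values can be normalized along these lines. The helpers split_reset and split_step are hypothetical and not part of TorchWM; treating any 2-tuple whose second item is a dict as (obs, info) is a heuristic that could misfire if a Gym-style observation itself has that shape.

```python
from typing import Any, Dict, Tuple


def split_reset(result: Any) -> Tuple[Any, Dict[str, Any]]:
    """Return (obs, info) for Gymnasium-style (obs, info) or Gym-style obs resets."""
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0], result[1]
    # Gym-style reset: the result is the observation itself.
    return result, {}


def split_step(result: Tuple[Any, ...]) -> Tuple[Any, float, bool, Dict[str, Any]]:
    """Return (obs, reward, done, info) from a 4-item or 5-item step result."""
    if len(result) == 5:
        # Gymnasium-style: terminated and truncated are reported separately.
        obs, reward, terminated, truncated, info = result
        return obs, reward, bool(terminated or truncated), info
    # Gym-style: a single done flag.
    obs, reward, done, info = result
    return obs, reward, bool(done), info
```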