# Environment Backends

TorchWM ships environment adapters for pixel-based world-model training, model-based reinforcement learning, and benchmark collection. Each backend page explains the installation requirements, factory functions, observation and action conventions, configuration fields, and common troubleshooting steps for that backend.

```{toctree}
:maxdepth: 1

DeepMind Control Suite
DeepMind Lab
Gym and Gymnasium
Brax
Atari
Procgen
MuJoCo
Gymnasium Robotics
Unity ML-Agents
Vectorized Environments
World Model Env
Wrappers
```

## Choosing a backend

| Backend | Best for | Primary APIs | Typical observations | Typical actions |
| --- | --- | --- | --- | --- |
| [DeepMind Control Suite](dmc.md) | Dreamer-style continuous-control tasks with state and rendered image observations | `DeepMindControlEnv`, `env_backend="dmc"` | Dict with DMC state keys plus `image` | Continuous `Box` from the DMC action spec |
| [DeepMind Lab](dmlab.md) | 3D navigation and puzzle tasks from DeepMind Lab | `DMLabEnv`, `make_dmlab_env`, `env_backend="dmlab"` | Dict with `image` plus requested Lab observations | Normalized one-hot `Box[-1, 1]` over native Lab actions |
| [Gym and Gymnasium](gym.md) | Classic control, Box2D, custom Gym environments, and generic rendered tasks | `GymImageEnv`, `make_gym_env`, `env_backend="gym"` | Dict with `image` only | Original continuous `Box` or one-hot vector for discrete actions |
| DeepMind BSuite | Small diagnostic RL benchmark tasks such as `catch/0` and `deep_sea/0` | `BSuiteImageEnv`, `make_bsuite_env`, `env_backend="bsuite"` | Dict with synthetic `image` only | One-hot vector for discrete actions |
| [Brax](brax.md) | JAX/Brax continuous-control tasks through a Gym-like image adapter | `BraxImageEnv`, `make_brax_env`, `env_backend="brax"` | Dict with synthesized `image`; raw vector in `info["vector_observation"]` | Continuous `Box[-1, 1]` matching `env.action_size` |
| [Atari](atari.md) | Atari 2600 environments through Gymnasium/ALE | `make_atari_env`, `make_atari_vector_env` | ALE RGB/RAM observations | Discrete Atari actions |
| [Procgen](procgen.md) | Procedurally generated benchmark games | `ProcgenImageEnv`, `make_procgen_env`, `env_backend="procgen"` | Dict with `image` | One-hot vector for discrete Procgen actions |
| [MuJoCo](mujoco.md) | Gymnasium MuJoCo task ids and native MJCF/MJB models | `make_mujoco_env` | Image dict via `GymImageEnv`/`MuJoCoImageEnv` | Continuous `Box` |
| [Gymnasium Robotics](robotics.md) | All ids registered by the installed Gymnasium Robotics package, including moved legacy MuJoCo v2/v3 ids | `make_robotics_env`, `list_gymnasium_robotics_envs` | Image dict via `GymImageEnv` | Continuous `Box` |
| [Unity ML-Agents](unity.md) | External Unity executable simulations with continuous-control behaviors | `UnityMLAgentsEnv`, `env_backend="unity_mlagents"` | Dict with `image` | Continuous `Box[-1, 1]` |
| [Vectorized environments](vectorized.md) | Multiprocess/vector rollout collection and native ALE vectorization | `TorchVectorizedEnv`, `make_atari_vector_env` | Batched observations | Batched actions |
| [World Model Env](world_model.md) | Model-based RL, policy optimization, and evaluation inside learned dynamics | `WorldModelEnv`, `make_world_model_env`, `env_backend="world-model"` | Adapter-defined Gymnasium space | Adapter-defined Gymnasium space |
| [Wrappers](wrappers.md) | Shared preprocessing, action conversion, time limits, reward observations, and image transforms | `world_models.envs.wrappers` | Backend-dependent | Backend-dependent |

## Shared conventions

Most TorchWM training code expects image observations as a dictionary entry named `image` with channel-first shape `(3, H, W)` and `uint8` values. Backend adapters that wrap vector-only environments synthesize an image representation so pixel-based agents can still run.

DIAMOND-style Atari support is documented on the Atari page as a preprocessing helper for Atari rollouts. It is not a separate environment backend.

Dreamer environment construction applies a standard wrapper stack after creating DMC, DMLab, Gym/Gymnasium, MuJoCo, Gymnasium Robotics, Procgen, BSuite, Brax, or Unity environments:

1. `ActionRepeat` repeats each selected action for `cfg.action_repeat` environment steps.
2. `NormalizeActions` exposes finite continuous action bounds as normalized `[-1, 1]` policy outputs.
3. `TimeLimit` truncates episodes after `cfg.time_limit // cfg.action_repeat` wrapper steps.

Use `torchwm envs list` to print the lightweight backend catalog used by the CLI.
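
To make the Dreamer wrapper stack described under Shared conventions more concrete, the following is a minimal pure-Python sketch of how the three wrappers could interact. It does not import TorchWM and is not its implementation: `ToyEnv`, `build_stack`, the constructor signatures, the reward summation inside `ActionRepeat`, and the clipping inside `NormalizeActions` are all assumptions made for illustration. Only the wrapper names, their order, and the `time_limit // action_repeat` step budget come from the text above.

```python
"""Toy sketch of an ActionRepeat -> NormalizeActions -> TimeLimit stack.

All classes here are hypothetical stand-ins, not TorchWM code.
"""


class ToyEnv:
    """Hypothetical environment with a scalar action bounded by [0, 10]."""

    low, high = 0.0, 10.0

    def __init__(self):
        self.raw_steps = 0       # number of underlying environment steps
        self.last_action = None  # last action seen in native units

    def reset(self):
        self.raw_steps = 0
        return {"image": None}

    def step(self, action):
        self.raw_steps += 1
        self.last_action = action
        # Gymnasium-style tuple: obs, reward, terminated, truncated, info
        return {"image": None}, 1.0, False, False, {}


class ActionRepeat:
    """Repeat each action `repeat` times (summing rewards is an assumption)."""

    def __init__(self, env, repeat):
        self.env, self.repeat = env, repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total = 0.0
        for _ in range(self.repeat):
            obs, reward, terminated, truncated, info = self.env.step(action)
            total += reward
            if terminated or truncated:
                break
        return obs, total, terminated, truncated, info


class NormalizeActions:
    """Map policy outputs in [-1, 1] onto the native bounds [low, high]."""

    def __init__(self, env, low, high):
        self.env, self.low, self.high = env, low, high

    def reset(self):
        return self.env.reset()

    def step(self, action):
        clipped = max(-1.0, min(1.0, action))  # clipping is an assumption
        native = self.low + (clipped + 1.0) / 2.0 * (self.high - self.low)
        return self.env.step(native)


class TimeLimit:
    """Truncate the episode after `max_steps` wrapper-level steps."""

    def __init__(self, env, max_steps):
        self.env, self.max_steps, self.count = env, max_steps, 0

    def reset(self):
        self.count = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.count += 1
        truncated = truncated or self.count >= self.max_steps
        return obs, reward, terminated, truncated, info


def build_stack(time_limit, action_repeat):
    """Apply the wrappers in the order listed in the text above."""
    base = ToyEnv()
    env = ActionRepeat(base, action_repeat)
    env = NormalizeActions(env, ToyEnv.low, ToyEnv.high)
    env = TimeLimit(env, time_limit // action_repeat)
    return env, base


if __name__ == "__main__":
    env, base = build_stack(time_limit=1000, action_repeat=2)
    env.reset()
    steps, truncated = 0, False
    while not truncated:
        _, _, _, truncated, _ = env.step(0.0)
        steps += 1
    print("wrapper steps:", steps, "raw steps:", base.raw_steps)
```

Because the time limit is divided by the action repeat, the budget is counted in policy decisions rather than raw simulator steps. When `time_limit` is not a multiple of `action_repeat`, the floor division means an episode uses slightly fewer raw steps than `time_limit`.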