Environment Backends#

TorchWM ships environment adapters for pixel-based world-model training, model-based reinforcement learning, and benchmark collection. Each backend page explains the installation requirements, factory functions, observation and action conventions, configuration fields, and common troubleshooting steps for that backend.

Choosing a backend#

Backend

Best for

Primary APIs

Typical observations

Typical actions

DeepMind Control Suite

Dreamer-style continuous-control tasks with state and rendered image observations

DeepMindControlEnv, env_backend="dmc"

Dict with DMC state keys plus image

Continuous Box from the DMC action spec

DeepMind Lab

3D navigation and puzzle tasks from DeepMind Lab

DMLabEnv, make_dmlab_env, env_backend="dmlab"

Dict with image plus requested Lab observations

Normalized one-hot Box[-1, 1] over native Lab actions

Gym and Gymnasium

Classic control, Box2D, custom Gym environments, and generic rendered tasks

GymImageEnv, make_gym_env, env_backend="gym"

Dict with image only

Original continuous Box or one-hot vector for discrete actions

DeepMind BSuite

Small diagnostic RL benchmark tasks such as catch/0 and deep_sea/0

BSuiteImageEnv, make_bsuite_env, env_backend="bsuite"

Dict with synthetic image only

One-hot vector for discrete actions

Brax

JAX/Brax continuous-control tasks through a Gym-like image adapter

BraxImageEnv, make_brax_env, env_backend="brax"

Dict with synthesized image; raw vector in info["vector_observation"]

Continuous Box[-1, 1] matching env.action_size

Atari

Atari 2600 environments through Gymnasium/ALE

make_atari_env, make_atari_vector_env

ALE RGB/RAM observations

Discrete Atari actions

Procgen

Procedurally generated benchmark games

ProcgenImageEnv, make_procgen_env, env_backend="procgen"

Dict with image

One-hot vector for discrete Procgen actions

MuJoCo

Gymnasium MuJoCo task ids and native MJCF/MJB models

make_mujoco_env

Image dict via GymImageEnv/MuJoCoImageEnv

Continuous Box

Gymnasium Robotics

All ids registered by the installed Gymnasium Robotics package, including moved legacy MuJoCo v2/v3 ids

make_robotics_env, list_gymnasium_robotics_envs

Image dict via GymImageEnv

Continuous Box

Unity ML-Agents

External Unity executable simulations with continuous-control behaviors

UnityMLAgentsEnv, env_backend="unity_mlagents"

Dict with image

Continuous Box[-1, 1]

Vectorized environments

Multiprocess/vector rollout collection and native ALE vectorization

TorchVectorizedEnv, make_atari_vector_env

Batched observations

Batched actions

World Model Env

Model-based RL, policy optimization, and evaluation inside learned dynamics

WorldModelEnv, make_world_model_env, env_backend="world-model"

Adapter-defined Gymnasium space

Adapter-defined Gymnasium space

Wrappers

Shared preprocessing, action conversion, time limits, reward observations, and image transforms

world_models.envs.wrappers

Backend-dependent

Backend-dependent

Shared conventions#

Most TorchWM training code expects image observations as a dictionary entry named image with channel-first shape (3, H, W) and uint8 values. Backend adapters that wrap vector-only environments synthesize an image representation so pixel-based agents can still run.

DIAMOND-style Atari support is documented on the Atari page as a preprocessing helper for Atari rollouts. It is not a separate environment backend.

Dreamer environment construction applies a standard wrapper stack after creating DMC, DMLab, Gym/Gymnasium, MuJoCo, Gymnasium Robotics, Procgen, BSuite, Brax, or Unity environments:

  1. ActionRepeat repeats each selected action for cfg.action_repeat environment steps.

  2. NormalizeActions exposes finite continuous action bounds as normalized [-1, 1] policy outputs.

  3. TimeLimit truncates episodes after cfg.time_limit // cfg.action_repeat wrapper steps.

Use torchwm envs list to print the lightweight backend catalog used by the CLI.