Getting Started#

Installation#

Install from PyPI:

pip install torchwm

Install from source:

git clone https://github.com/ParamThakkar123/torchwm.git
cd torchwm
pip install -e .

For development and tests:

pip install -e ".[dev]"

Logging with Weights & Biases and TensorBoard#

TorchWM supports logging experiment results to Weights & Biases (WandB) and TensorBoard.

Weights & Biases#

To use WandB logging, you must provide an API key; anonymous logins are no longer supported.

  1. Get your WandB API key from wandb.ai.

  2. Set the key in your config:

cfg.enable_wandb = True
cfg.wandb_api_key = "your-api-key-here"
cfg.wandb_project = "torchwm"
cfg.wandb_entity = "your-entity"

TensorBoard#

Enable TensorBoard logging:

cfg.enable_tensorboard = True
cfg.log_dir = "runs"

Logs will be saved to the specified directory and can be viewed with tensorboard --logdir runs.
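As a minimal sketch of how these flags fit together: the `cfg` field names below match this page, but the gating logic itself is an illustrative assumption, not TorchWM's actual implementation.

```python
from types import SimpleNamespace

def select_loggers(cfg):
    """Return the logging backends enabled by a config.

    Illustrative only: mirrors the cfg fields shown above,
    not TorchWM internals.
    """
    loggers = []
    if getattr(cfg, "enable_wandb", False):
        # Anonymous logins are not supported, so an API key is required.
        if not getattr(cfg, "wandb_api_key", None):
            raise ValueError("enable_wandb=True requires wandb_api_key")
        loggers.append("wandb")
    if getattr(cfg, "enable_tensorboard", False):
        loggers.append("tensorboard")
    return loggers

cfg = SimpleNamespace(enable_wandb=False, enable_tensorboard=True, log_dir="runs")
print(select_loggers(cfg))  # ['tensorboard']
```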

Quick Start: Dreamer#

TorchWM implements multiple world model algorithms. Click on each to see detailed documentation:

Algorithm | Description                           | Quick Start
Dreamer   | Model-based RL with latent dynamics   | Dreamer: Model-Based RL with Latent Dynamics
JEPA      | Self-supervised visual representations | JEPA: Joint Embedding Predictive Architecture
IRIS      | Sample-efficient RL with Transformers | IRIS: Transformers for Sample-Efficient World Models
DiT       | Diffusion models with Transformers    | DiT: Diffusion Transformer

Quick Start: Modular RSSM#

The modular RSSM allows researchers to swap encoders, decoders, and backbones for experimentation:

from world_models.models.modular_rssm import create_modular_rssm, ModularRSSM
from world_models.models.modular_rssm import ConvEncoder, ViTEncoder, GRUBackbone, LSTMBackbone

# Factory function for quick setup
rssm = create_modular_rssm(
    encoder_type="conv",      # "conv", "mlp", or "vit"
    decoder_type="conv",       # "conv" or "mlp"
    backbone_type="gru",      # "gru", "lstm", or "transformer"
    obs_shape=(3, 64, 64),
    action_size=6,
    stoch_size=32,
    deter_size=200,
    embed_size=1024,
)

# Or build manually with custom components (a decoder and reward_decoder,
# omitted here, must also be constructed before assembling ModularRSSM)
encoder = ViTEncoder(input_shape=(3, 64, 64), embed_size=1024, patch_size=8, depth=6)
backbone = LSTMBackbone(action_size=6, stoch_size=32, deter_size=200, hidden_size=200, embed_size=1024)
rssm = ModularRSSM(encoder=encoder, decoder=decoder, backbone=backbone, reward_decoder=reward_decoder)
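The swap pattern itself can be illustrated without the library. The registry below is a stand-in sketch (the `ConvToyEncoder`, `ViTToyEncoder`, `ENCODERS`, and `build_encoder` names are hypothetical, not TorchWM classes) of how a factory like `create_modular_rssm` can map string keys such as `"conv"` or `"vit"` to component classes:

```python
# Hypothetical stand-ins for the library's encoder classes.
class ConvToyEncoder:
    kind = "conv"

class ViTToyEncoder:
    kind = "vit"

# A string-keyed registry lets a factory resolve encoder_type="..."
ENCODERS = {"conv": ConvToyEncoder, "vit": ViTToyEncoder}

def build_encoder(encoder_type: str):
    try:
        return ENCODERS[encoder_type]()
    except KeyError:
        raise ValueError(
            f"unknown encoder_type {encoder_type!r}; choose from {sorted(ENCODERS)}"
        ) from None

print(build_encoder("vit").kind)  # vit
```

Registering a new component is then just one dictionary entry, which is the property that makes encoder/backbone swapping cheap for experimentation.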

Environment Backends#

Dreamer supports multiple backends through DreamerConfig.env_backend:

  • dmc: DeepMind Control Suite tasks (for example walker-walk)

  • gym: Gym/Gymnasium environment IDs or an existing environment instance

  • unity_mlagents: Unity ML-Agents executable environments
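A hedged sketch of how such a backend switch typically dispatches (the `make_env` helper and its return values are illustrative assumptions, not TorchWM code):

```python
def make_env(env_backend: str, env_id: str):
    """Dispatch on the backend string; returns a tag tuple here
    instead of a real environment, purely for illustration."""
    if env_backend == "dmc":
        # DMC tasks are named "domain-task", e.g. "walker-walk"
        domain, task = env_id.split("-", 1)
        return ("dmc", domain, task)
    if env_backend == "gym":
        return ("gym", env_id)
    if env_backend == "unity_mlagents":
        return ("unity_mlagents", env_id)
    raise ValueError(f"unsupported env_backend: {env_backend!r}")

print(make_env("dmc", "walker-walk"))  # ('dmc', 'walker', 'walk')
```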

Typical Training Flow#

  1. Choose an algorithm (Dreamer, JEPA, IRIS, or DiT)

  2. Create a config object for that algorithm

  3. Override dataset/environment and optimization fields

  4. Instantiate the corresponding agent

  5. Call train() and monitor logs/checkpoints
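The five steps above can be sketched end to end; `DummyConfig` and `DummyAgent` below are placeholder stand-ins for an algorithm's real config and agent classes, not TorchWM's API.

```python
from dataclasses import dataclass

@dataclass
class DummyConfig:                  # step 2: create a config object
    env_backend: str = "dmc"
    env_id: str = "walker-walk"     # step 3: override environment fields
    total_steps: int = 1000         # step 3: override optimization fields

class DummyAgent:                   # step 4: instantiate the agent
    def __init__(self, cfg):
        self.cfg = cfg
        self.logs = []

    def train(self):                # step 5: call train() and monitor logs
        for step in range(0, self.cfg.total_steps, 500):
            self.logs.append(f"step={step}")
        return self.logs

cfg = DummyConfig(total_steps=1000)
agent = DummyAgent(cfg)
print(agent.train())  # ['step=0', 'step=500']
```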

For complete API details, see API Reference.