The Training Manager is an experimental tool — built in an evening by the team to scratch an itch. It works, it’s useful, and it ships with the OS. But it’s rough around the edges. Contributions welcome.
The Training Manager is a local web server that runs on your robot and gives you a browser-based dashboard for the entire training pipeline. It’s the power-user complement to the app’s training UI — useful when you need to merge datasets, remove bad episodes, or point a training run at a custom ACT fork.

Launch it

The Training Manager is bundled with the training-client CLI. Run it from inside the robot (via SSH) or inside the Docker container:
python -m training_client.cli ui
This starts a local web server and prints two URLs:
  Training Manager
    Local:   http://localhost:8080
    Network: http://192.168.50.22:8080
Open the Network URL from any device on the same Wi-Fi network — your laptop, phone, or tablet.

CLI options

Flag             Default                       Description
--port           8080                          HTTP port
--skills-dir     ~/skills                      Root skills directory
-s, --server     env TRAINING_SERVER_URL       Orchestrator URL
-t, --token      env INNATE_SERVICE_KEY        Service key
--issuer         env INNATE_AUTH_ISSUER_URL    Auth issuer URL
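For example, a launch that overrides the port and passes the orchestrator settings explicitly might look like this (the URL and flag values are placeholders for your own setup; by default the environment variables above are used):

```shell
# Placeholder values — substitute your own orchestrator URL and service key,
# or omit -s/-t and rely on TRAINING_SERVER_URL / INNATE_SERVICE_KEY.
python -m training_client.cli ui \
  --port 9090 \
  --skills-dir ~/skills \
  --server https://training-v1.innate.bot \
  --token "$INNATE_SERVICE_KEY"
```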

The three tabs

The UI is organized into three tabs: Skills, Datasets, and Training.

Skills tab

Browse every skill directory on the robot. Each card shows the skill name, type, episode count, and whether a trained checkpoint exists. Click a skill to open its detail view, where you can:
  • Edit metadata — change the skill name, guidelines (the text BASIC reads to decide when to use this skill), and execution parameters
  • View the full metadata.json — useful for debugging or verifying that a checkpoint was activated correctly
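Under the hood, these edits are reads and writes to each skill's metadata.json (see the Architecture section). A minimal sketch of that load-modify-save pattern — the field names here are hypothetical, not the actual schema:

```python
import json
import tempfile
from pathlib import Path

# Build a throwaway skills directory with one skill.
skills_dir = Path(tempfile.mkdtemp())
skill = skills_dir / "pick_up_cup"
skill.mkdir()
meta_path = skill / "metadata.json"

# Hypothetical fields — the real metadata.json schema may differ.
meta_path.write_text(json.dumps({
    "name": "pick_up_cup",
    "guidelines": "Use when a cup is visible on the table.",
}))

# Edit metadata the way the Skills tab does: load, modify, write back.
meta = json.loads(meta_path.read_text())
meta["guidelines"] = "Use when any cup or mug is within reach."
meta_path.write_text(json.dumps(meta, indent=2))
```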

Datasets tab

This is where the Training Manager really earns its keep. For each skill, you can:
  • Browse episodes — see every episode in the dataset with timestamps and metadata
  • Play back video — watch the recorded camera feeds for any episode directly in the browser (both main and wrist cameras)
  • Delete episodes — select bad episodes and create a cleaned copy of the dataset without them. The original is preserved; a new skill directory is created with the episodes re-indexed.
  • Merge datasets — combine episodes from multiple skills into a single new dataset. Select which episodes to include from each source. This is useful when you’ve recorded demonstrations across multiple sessions or want to mix data from different setups.
  • Upload to cloud — submit a skill and upload its data to Innate’s training servers, with a progress bar showing compression and upload stages
Merge workflow example: You recorded 30 episodes of “pick up cup” last week and 25 more today with a slightly different cup. Instead of retraining separately, merge both into a 55-episode “pick up cup v2” dataset and train once on the combined data.
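The re-indexing idea behind merge (and the cleaned copies that delete produces) can be sketched as follows — the directory layout and file naming here are illustrative, not the tool's actual on-disk format:

```python
import shutil
import tempfile
from pathlib import Path

def merge_datasets(sources, dest):
    """Copy episode files from several source datasets into one new
    dataset directory, re-numbering episodes sequentially from zero.
    Sources are left untouched."""
    dest.mkdir(parents=True, exist_ok=True)
    index = 0
    for src in sources:
        for episode in sorted(src.glob("episode_*.bin")):
            shutil.copy(episode, dest / f"episode_{index:06d}.bin")
            index += 1
    return index  # total episodes in the merged dataset

# Demo: 30 episodes from last week + 25 from today -> 55 merged.
root = Path(tempfile.mkdtemp())
week1, week2 = root / "cup_v1", root / "cup_v1_session2"
for d, n in [(week1, 30), (week2, 25)]:
    d.mkdir()
    for i in range(n):
        (d / f"episode_{i:06d}.bin").write_bytes(b"demo")

total = merge_datasets([week1, week2], root / "cup_v2")
```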

Training tab

View all training runs across all skills, create new runs, and monitor progress. When creating a new run, you get full control over:

Hyperparameters — all the same parameters from the app, plus more:
Parameter                Default    Description
LEARNING_RATE            5e-5       Transformer learning rate
LEARNING_RATE_BACKBONE   5e-5       Vision backbone (ResNet18) learning rate
BATCH_SIZE               96         Training batch size
MAX_STEPS                120,000    Total training iterations
CHUNK_SIZE               30         Action chunk length
NUM_WORKERS              4          Data loader workers
WORLD_SIZE               4          Number of GPUs
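CHUNK_SIZE is the number of future actions the ACT policy predicts per forward pass, so at execution time a trajectory is covered in chunks rather than one action at a time. A toy illustration (the trajectory lengths are made up, and this ignores temporal ensembling, which re-predicts before a chunk finishes):

```python
import math

CHUNK_SIZE = 30  # actions predicted per policy inference (from the table)

def inference_calls(trajectory_len, chunk_size=CHUNK_SIZE):
    """How many policy forward passes cover a trajectory if every
    predicted chunk is executed to completion."""
    return math.ceil(trajectory_len / chunk_size)

print(inference_calls(900))  # 900-step trajectory -> 30 calls
print(inference_calls(35))   # a partial final chunk still needs a call -> 2
```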
Repository and branch — point the training server at a custom ACT repository and branch. This is the key feature for researchers: fork the ACT training code, modify the architecture or loss function, and run training against your fork without any server-side changes.
Field        Description
Repository   GitHub owner/repo path (e.g. your-org/act-custom)
Ref          Branch name, tag, or commit SHA to check out
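A typical fork workflow might look like this (the repository and branch names are placeholders):

```shell
# Fork the ACT training code on GitHub first, then:
git clone https://github.com/your-org/act-custom.git
cd act-custom
git checkout -b my-loss-experiment
# ...modify the architecture or loss function...
git commit -am "Experiment with a new loss term"
git push origin my-loss-experiment

# Then, in the Training tab, point the run at the fork:
#   Repository: your-org/act-custom
#   Ref:        my-loss-experiment
```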
Infrastructure — configure GPU type, GPU count, time budget, and cost budget. Architecture parameters are shown as read-only for reference (vision backbone, model dimensions, encoder/decoder layers, VAE settings).

Each run card shows its current status with live updates via server-sent events (SSE), so you can watch a run progress through the lifecycle without refreshing.

Log terminal

A collapsible terminal panel at the bottom of every page streams real-time backend logs. This shows every API call, upload progress message, and error — handy for debugging when something doesn’t work as expected.
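Both the run-status and log streams use standard server-sent events framing (`data:` lines separated by blank lines), so any HTTP client can consume them. A minimal stdlib parser sketch — the sample payload below is invented, not the server's actual log format:

```python
def parse_sse(stream_text):
    """Yield the data payload of each server-sent event.
    Events are separated by blank lines; an event may span several
    'data:' lines, which SSE joins with newlines."""
    data_lines = []
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())
        elif line == "" and data_lines:
            yield "\n".join(data_lines)
            data_lines = []
    if data_lines:
        yield "\n".join(data_lines)

# Invented sample stream — the real log lines will differ.
sample = (
    "data: [INFO] upload started\n\n"
    "data: [INFO] compressing episodes\n\n"
)
print(list(parse_sse(sample)))
```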

Architecture

The Training Manager is a FastAPI backend serving a React + Tailwind SPA. The backend delegates all cloud operations to the same training_client library that the ROS training node uses, so there’s no duplicate logic.
Browser ──→ FastAPI server (port 8080)
               ├── /api/skills     → reads/writes ~/skills/*/metadata.json
               ├── /api/datasets   → episode browsing, video streaming, merge, delete
               ├── /api/training   → list runs, create runs, watch status (via SSE)
               ├── /api/logs       → real-time log stream (SSE)
               └── /*              → serves the React SPA
                      │
                      ▼
               training_client.SkillManager
                      │
                      ▼
               Innate Training Orchestrator (training-v1.innate.bot)