LeRobot.js Conventions

Project Overview

lerobot.js is a TypeScript/JavaScript implementation of Hugging Face's lerobot robotics library. Our goal is to bring state-of-the-art AI for real-world robotics directly to the JavaScript ecosystem, enabling robot control without any Python dependencies.

Vision Statement

Lower the barrier to entry for robotics by making cutting-edge robotic AI accessible through JavaScript, the world's most widely used programming language.

Core Rules

Never Start/Stop Dev Server: The development server is already managed by the user. Never run commands to start, stop, or restart it.
Python lerobot Faithfulness: Maintain exact UX/API compatibility with Python lerobot - commands, terminology, and workflows must match identically
Serial API Separation: Always use serialport package for Node.js and Web Serial API for browsers - never mix or bridge these incompatible APIs
Minimal Console Output: Only show essential information - reduce cognitive load for users
Hardware-First Testing: Always validate with real hardware, not just simulation

Project Goals

Primary Objectives

Native JavaScript/TypeScript Implementation: Complete robotics stack running purely in JS/TS
Feature Parity: Implement core functionality from the original Python lerobot
Web-First Design: Enable robotics applications to run in browsers, on edge devices, and in Node.js
Real-World Robot Control: Direct hardware interface without Python bridge
Hugging Face Integration: Seamless model and dataset loading from HF Hub

Core Features to Implement

Pretrained Models: Load and run robotics policies (ACT, Diffusion, TDMPC, VQ-BeT)
Dataset Management: LeRobotDataset format with HF Hub integration
Simulation Environments: Browser-based robotics simulations
Real Robot Support: Hardware interfaces for motors, cameras, sensors
Training Infrastructure: Policy training and evaluation tools
Visualization Tools: Dataset and robot state visualization

Technical Foundation

Core Stack

Runtime: Node.js 18+ / Modern Browsers
Language: TypeScript with strict type checking
Build Tool: Vite (development and production builds)
Package Manager: pnpm
Module System: ES Modules
Target: ES2020

Architecture Principles

1. Python lerobot Faithfulness (Primary Principle)

lerobot.js must maintain UX/API compatibility with Python lerobot

Identical Commands: npx lerobot find-port matches python -m lerobot.find_port
Same Terminology: Use "MotorsBus", not "robot arms" - keep Python's exact wording
Matching Output: Error messages, prompts, and flow identical to Python version
Familiar Workflows: Python lerobot users should feel immediately at home
No "Improvements": Resist urge to add features/UX that Python version doesn't have

Why? Users are already trained on Python lerobot. Our goal is seamless migration to TypeScript, not learning a new tool.

2. Modular Design

lerobot/
├── common/
│   ├── datasets/     # Dataset loading and management
│   ├── envs/         # Simulation environments
│   ├── policies/     # AI policies and models
│   ├── devices/      # Hardware device interfaces
│   └── utils/        # Shared utilities
├── core/             # Core robotics primitives
├── node/             # Node.js-specific implementations
└── web/              # Browser-specific implementations

3. Platform Abstraction

Universal Core: Platform-agnostic robotics logic
Web Adapters: Browser-specific implementations (WebGL, WebAssembly, Web Serial API)
Node Adapters: Node.js implementations (native modules, serialport package)

4. Serial Communication Standards (Critical)

Serial communication must use platform-appropriate APIs - never mix or bridge:

Node.js Platform: ALWAYS use serialport package
- Event-based: port.on('data', callback)
- Programmatic port listing: SerialPort.list()
- Direct system access: new SerialPort({ path: 'COM4', baudRate: 1000000 }) (baudRate is a required option in current serialport releases)
Web Platform: ALWAYS use Web Serial API
- Promise/Stream-based: await reader.read()
- User permission required: navigator.serial.requestPort()
- Browser security model: User must select port via dialog

Why this matters: The APIs are completely incompatible - different patterns, different capabilities, different security models. Mixing them leads to broken implementations.

5. Progressive Enhancement

Core Functionality: Works everywhere (basic policy inference)
Enhanced Features: Leverage platform capabilities (GPU acceleration, hardware access)
Premium Features: Advanced capabilities (real-time training, complex simulations)

Development Standards

Code Style

Formatting: Prettier with default settings
Linting: ESLint with TypeScript recommended rules
Naming:
- camelCase for variables/functions
- PascalCase for classes/types
- snake_case for file names (following lerobot convention)
File Structure: Feature-based organization with index.ts barrels

TypeScript Standards

Strict Mode: All strict compiler options enabled
Type Safety: Prefer type aliases over interfaces for data structures
Generics: Use generics for reusable components
Error Handling: Use Result<T, E> pattern for recoverable errors
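
The Result<T, E> pattern mentioned above could look roughly like the following. This is a minimal sketch, not an existing lerobot.js API: the ok/err helpers and parseBaudRate are hypothetical names chosen for illustration.

```typescript
// Hypothetical Result type: a discriminated union the caller must check before use.
type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Small constructors so call sites stay readable (illustrative names).
const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// A bad baud rate is a recoverable error, so it is returned rather than thrown.
function parseBaudRate(input: string): Result<number, string> {
  const baud = Number(input);
  if (!Number.isInteger(baud) || baud <= 0) {
    return err(`Invalid baud rate: ${input}`);
  }
  return ok(baud);
}

// The compiler only allows access to .value after the ok check.
const parsedBaud = parseBaudRate("1000000");
const baudRateOrDefault = parsedBaud.ok ? parsedBaud.value : 1000000;
```

Because the union is discriminated on ok, strict mode forces every caller to handle the failure branch explicitly.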

Implementation Philosophy

Python First: When in doubt, check how Python lerobot does it
Port, Don't Innovate: Direct ports are better than clever improvements
User Expectations: Maintain the exact experience Python users expect
Terminology Consistency: Use Python lerobot's exact naming and messaging

Development Process

Python Reference: Always check Python lerobot implementation first
UX Matching: Test that commands, outputs, and workflows match exactly
User Story Validation: Validate against real Python lerobot users

Testing Strategy

Unit Tests: Vitest for individual functions and classes
Integration Tests: Test component interactions
E2E Tests: Playwright for full workflow testing
Hardware Tests: Mock/stub hardware interfaces for CI only; mocks complement real-hardware validation and never replace it
UX Compatibility Tests: Verify outputs match Python version

Package Structure

NPM Package Name

Public Package: lerobot (on npm)
Development Name: lerobot.js (GitHub repository)

Dependencies Strategy

Core Dependencies

ML Inference: ONNX Runtime for JavaScript (onnxruntime-web in browsers, onnxruntime-node in Node.js), the successor to the deprecated ONNX.js
Tensor Operations: Custom lightweight tensor lib for data manipulation
Math: Custom math utilities for robotics
Networking: Fetch API (universal)
File I/O: Platform-appropriate abstractions

Optional Enhanced Dependencies

3D Graphics: Three.js for simulation and visualization
Hardware: Platform-specific libraries for device access
Development: Vitest, ESLint, Prettier

Hardware Implementation Lessons

Critical Hardware Compatibility

Baudrate Configuration

Feetech Motors (SO-100): MUST use 1,000,000 baud to match Python lerobot
Python Reference: DEFAULT_BAUDRATE = 1_000_000 in Python lerobot codebase
Common Mistake: Using 9600 baud causes "Read timeout" errors despite device connection
Verification: Always test with real hardware - simulation won't catch baudrate issues

Console Output Philosophy

Minimal Cognitive Load: Reduce console noise to absolute minimum
Silent Operations: Connection, initialization, cleanup should be silent unless error occurs
Error-Only Logging: Only show output when user needs to take action or when errors occur
Professional UX: Robotics tools should have clean, distraction-free interfaces

Calibration Flow Matching

Python Behavior: When user hits Enter during range recording, reading stops IMMEDIATELY
No Final Reads: Never read motor positions after user completes calibration
User Expectation: After Enter, user should be able to release robot (positions will change)
Flow Testing: Always validate against Python lerobot's exact behavior

Development Process Requirements

CLI Build Process

Critical: After TypeScript changes, MUST run pnpm run build to update CLI
Global CLI: lerobot command uses compiled dist/ files, not source
Testing Flow: Edit source → Build → Test CLI → Repeat
Common Mistake: Testing source changes without rebuilding CLI

Hardware Testing Priority

Real Hardware Required: Simulation cannot catch hardware-specific issues
Baudrate Validation: Only real devices will reveal communication problems
User Flow Testing: Test complete calibration workflows with actual hardware
Port Management: Ensure proper port cleanup between testing sessions

CRITICAL: Calibration Implementation Requirements

Calibration File Format (Learned from SO-100 Implementation)

NEVER use array-based format: Calibration files must use motor names as keys, NOT arrays
Python-Compatible Structure: Each motor must be an object with id, drive_mode, homing_offset, range_min, range_max

Wrong Format (causes Python incompatibility):

{
  "homing_offset": [47, 1013, -957, ...],
  "drive_mode": [0, 0, 0, ...],
  "motor_names": ["shoulder_pan", ...]
}

Correct Format (Python-compatible):

{
  "shoulder_pan": {
    "id": 1,
    "drive_mode": 0,
    "homing_offset": 47,
    "range_min": 985,
    "range_max": 3085
  }
}
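
One way to guard against the array-based format is to type the calibration file and validate it before use. The following is a sketch under assumptions: MotorCalibration, CalibrationFile, and isCalibrationFile are hypothetical names, and the check only covers the five fields listed above.

```typescript
// Shape of one motor entry, using the field names from the Python-compatible format.
type MotorCalibration = {
  id: number;
  drive_mode: number;
  homing_offset: number;
  range_min: number;
  range_max: number;
};

// The file is keyed by motor name, e.g. "shoulder_pan".
type CalibrationFile = Record<string, MotorCalibration>;

const REQUIRED_FIELDS = ["id", "drive_mode", "homing_offset", "range_min", "range_max"] as const;

// Type guard (hypothetical helper): rejects arrays at both levels, so the
// array-based layout fails because its top-level values are arrays.
function isCalibrationFile(data: unknown): data is CalibrationFile {
  if (typeof data !== "object" || data === null || Array.isArray(data)) return false;
  const motors = Object.values(data);
  if (motors.length === 0) return false;
  return motors.every((motor) => {
    if (typeof motor !== "object" || motor === null || Array.isArray(motor)) return false;
    const entry = motor as Record<string, unknown>;
    return REQUIRED_FIELDS.every((field) => Number.isInteger(entry[field]));
  });
}
```

Running such a guard on load and before save would catch a regression to the array layout early, before the file reaches Python lerobot.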

Homing Offset Calibration Protocol (Critical for STS3215/Feetech Motors)

MUST Reset Existing Offsets: Before calculating new homing offsets, ALWAYS reset existing homing offsets to 0
Python Reference: Python's set_half_turn_homings() calls reset_calibration() first
Missing Reset Causes: Completely wrong homing offset values (~1000+ unit differences)
Reset Protocol: Write value 0 to Homing_Offset register (address 31) for each motor before reading positions
Verification: Ensure reset commands receive successful responses before proceeding

STS3215 Sign-Magnitude Encoding

Homing_Offset Uses Special Encoding: Bit 11 is sign bit, lower 11 bits are magnitude
Position Reads: Some registers may need sign-magnitude decoding - verify against Python behavior
Encoding Functions: Implement encodeSignMagnitude() and decodeSignMagnitude() for protocol compatibility
Common Symptom: Values differing by ~2048 or ~4096 indicate sign-magnitude encoding issues
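
The two encoding functions named above can be sketched as follows, assuming the layout described here (bit 11 as sign, bits 0-10 as magnitude). The default sign bit constant and the error type are illustrative choices; verify behavior against Python lerobot before relying on it.

```typescript
// Assumption from the text above: Homing_Offset uses bit 11 as the sign bit.
const HOMING_OFFSET_SIGN_BIT = 11;

// Convert a signed integer into sign-magnitude form for writing to a register.
function encodeSignMagnitude(value: number, signBitIndex: number = HOMING_OFFSET_SIGN_BIT): number {
  const maxMagnitude = (1 << signBitIndex) - 1; // 2047 when the sign bit is bit 11
  const magnitude = Math.abs(value);
  if (!Number.isInteger(value) || magnitude > maxMagnitude) {
    throw new RangeError(`Value ${value} does not fit in ${signBitIndex} magnitude bits`);
  }
  const signBit = value < 0 ? 1 : 0;
  return (signBit << signBitIndex) | magnitude;
}

// Convert a raw register value back into a signed integer.
function decodeSignMagnitude(encoded: number, signBitIndex: number = HOMING_OFFSET_SIGN_BIT): number {
  const signBit = (encoded >> signBitIndex) & 1;
  const magnitude = encoded & ((1 << signBitIndex) - 1);
  return signBit === 1 ? -magnitude : magnitude;
}
```

For example, an offset of -957 encodes to 2048 + 957 = 3005. Reading that register without decoding yields a value off by thousands of units, which is the kind of symptom described above.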

Calibration Process Validation

Same Neutral Position: When comparing calibrations, ensure robot is in identical physical position
Expected Accuracy: Properly implemented calibration should match Python within 30 units
Debug Protocol: Log position values, reset confirmations, and calculation steps for troubleshooting
Range Verification: wrist_roll should always use full range (0-4095), other motors use recorded ranges
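
The range rule above (full range for wrist_roll, recorded range for the rest) can be expressed as a small set of pure functions. This is only a sketch: the Range type, the helper names, and the fullTurnMotors parameter are hypothetical, and the list of full-turn motors is passed in rather than hardcoded.

```typescript
type Range = { min: number; max: number };

// Start every range at the motor's current (already centered) position.
function initRanges(start: Record<string, number>): Record<string, Range> {
  const ranges: Record<string, Range> = {};
  for (const [motor, position] of Object.entries(start)) {
    ranges[motor] = { min: position, max: position };
  }
  return ranges;
}

// Widen each range with a new sample taken while the user moves the arm.
function updateRanges(ranges: Record<string, Range>, positions: Record<string, number>): void {
  for (const [motor, position] of Object.entries(positions)) {
    const range = ranges[motor];
    if (range === undefined) continue; // ignore motors that were never initialized
    range.min = Math.min(range.min, position);
    range.max = Math.max(range.max, position);
  }
}

// Override full-turn motors with 0..resolution-1; keep recorded values for the others.
function finalizeRanges(
  ranges: Record<string, Range>,
  fullTurnMotors: readonly string[],
  resolution: number,
): Record<string, Range> {
  const result: Record<string, Range> = {};
  for (const [motor, range] of Object.entries(ranges)) {
    result[motor] = fullTurnMotors.includes(motor)
      ? { min: 0, max: resolution - 1 }
      : { min: range.min, max: range.max };
  }
  return result;
}
```

Keeping these functions free of I/O makes it possible to unit test them with recorded position samples, while the actual position reads stay in the platform-specific layer.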

Common Calibration Mistakes to Avoid

Skipping Homing Reset: Leads to ~1000+ unit differences in homing offsets
Array-Based File Format: Makes calibration files incompatible with Python lerobot
Ignoring Sign-Magnitude Encoding: Causes specific motors (often wrist_roll) to have wrong values
Different Physical Positions: Comparing calibrations done at different robot positions
Missing Motor ID Assignment: Forgetting to assign correct motor IDs (1-6 for SO-100)

Device-Agnostic Calibration Architecture

No Hardcoded Device Values: Calibration logic must be configurable for different robot types
Configuration-Driven Protocol: Motor IDs, register addresses, resolution, etc. should come from device config
Extensible Design: Adding new robot types should only require new config files, not core logic changes
Example Bad Practice: Hardcoding const motorIds = [1,2,3,4,5,6] in calibration logic
Example Good Practice: Using config.motorIds from device-specific configuration
Protocol Abstraction: Register addresses, resolution, encoding details should be configurable per device type

CRITICAL: Calibration Sequence and Hardware State Management

The exact sequence of calibration operations is critical for Python compatibility. Getting this wrong causes major range/offset discrepancies.

The Correct Calibration Sequence (Matching Python Exactly)

Reset Existing Homing Offsets to 0: Write 0 to all Homing_Offset registers
Read Physical Positions: Get actual motor positions (will be raw, non-centered values)
Calculate New Homing Offsets: offset = position - floor((resolution - 1) / 2), i.e. position - 2047 for a 4096-step motor
IMMEDIATELY Write Homing Offsets: Write new offsets to motor registers before range recording
Read Positions for Range Init: Now positions will appear centered (~2047) due to applied offsets
Record Range of Motion: Use centered positions as starting min/max values
Write Hardware Position Limits: Write range_min/range_max to motor limit registers
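
The offset calculation in the sequence above can be sketched as a pure function. Two assumptions are made here and should be checked against Python lerobot: the half-turn midpoint is rounded down to an integer (2047 for a 4096-step motor, matching the ~2047 readings described in this document), and the motor reports its present position as the raw position minus the stored homing offset. The function names are hypothetical.

```typescript
// Offset that moves the current raw position to the middle of the encoder range.
// Assumption: the midpoint is floor((resolution - 1) / 2), i.e. 2047 for 4096 steps.
function computeHomingOffset(rawPosition: number, resolution: number): number {
  const halfTurn = Math.floor((resolution - 1) / 2);
  return rawPosition - halfTurn;
}

// Compute offsets for every motor from one snapshot of raw positions,
// taken after existing offsets were reset to 0.
function computeHomingOffsets(
  rawPositions: Record<string, number>,
  resolution: number,
): Record<string, number> {
  const offsets: Record<string, number> = {};
  for (const [motor, position] of Object.entries(rawPositions)) {
    offsets[motor] = computeHomingOffset(position, resolution);
  }
  return offsets;
}

// Assumption: once the offset is written, the motor reports raw - offset.
function expectedReadingAfterWrite(rawPosition: number, homingOffset: number): number {
  return rawPosition - homingOffset;
}

// A raw reading of 3013 gives an offset of 966, and the motor then reports 2047.
const exampleOffset = computeHomingOffset(3013, 4096);
const exampleReading = expectedReadingAfterWrite(3013, exampleOffset);
```

Under these assumptions, skipping the write in step 4 leaves the reading at the raw value (3013 in this example) rather than the centered 2047, so range recording starts from the wrong place.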

Critical Implementation Details

Homing Offset Writing Must Be Immediate:

// WRONG - Only calculates, doesn't write to motors
async function setHomingOffsets(config) {
  const positions = await readMotorPositions(config);
  const offsets = calculateOffsets(positions);
  return offsets; // ❌ Not written to motors!
}

// CORRECT - Writes offsets to motors immediately
async function setHomingOffsets(config) {
  await resetHomingOffsets(config); // Reset first
  const positions = await readMotorPositions(config);
  const offsets = calculateOffsets(positions);
  await writeHomingOffsetsToMotors(config, offsets); // ✅ Written immediately
  return offsets;
}

Range Recording Initialization Must Read Actual Positions:

// WRONG - Hardcoded center values
const rangeMins = {};
const rangeMaxes = {};
for (const motor of motors) {
  rangeMins[motor] = 2047; // ❌ Hardcoded!
  rangeMaxes[motor] = 2047;
}

// CORRECT - Read actual positions (now centered due to applied homing offsets)
const startPositions = await readMotorPositions(config);
const rangeMins = {};
const rangeMaxes = {};
for (let i = 0; i < motors.length; i++) {
  rangeMins[motors[i]] = startPositions[i]; // ✅ Uses actual values
  rangeMaxes[motors[i]] = startPositions[i];
}

Hardware Position Limits Must Be Written:

// Python writes these registers, so we must too
await writeMotorRegister(config, motorId, MIN_POSITION_LIMIT_ADDR, range_min);
await writeMotorRegister(config, motorId, MAX_POSITION_LIMIT_ADDR, range_max);

Why This Sequence Matters

Problem: User moves robot to same physical position, but Python shows ~2047 and Node.js shows wildly different values (3013, 1200, etc.)

Root Cause: Python applies homing offsets immediately, making subsequent position reads appear centered. Node.js was calculating offsets but not applying them, so position reads remained raw.

Evidence of Correct Implementation: After fixing the sequence, Node.js and Python both show ~2047 for the same physical position, and final calibration ranges match within professional tolerances (±50 units).

Register Addresses for STS3215 Motors

const STS3215_REGISTERS = {
  Present_Position: { address: 56, length: 2 },
  Homing_Offset: { address: 31, length: 2 }, // Sign-magnitude encoded
  Min_Position_Limit: { address: 9, length: 2 },
  Max_Position_Limit: { address: 11, length: 2 },
};

Common Sequence Mistakes That Cause Major Issues

Not Writing Homing Offsets: Calculates but doesn't apply → position reads remain raw → wrong range initialization
Hardcoded Range Initialization: Forces 2047 instead of reading actual positions → doesn't match Python behavior
Missing Hardware Limit Writing: Python constrains motors, Node.js doesn't → different range recording behavior
Wrong Reset Timing: Not resetting existing offsets first → accumulated offset errors
Skipping Intermediate Delays: Not waiting for motor register writes to take effect → inconsistent state

Debugging this sequence took extensive analysis. Future implementations MUST follow this exact pattern to maintain Python compatibility.