---
title: LeRobot Arena - AI Inference Server
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
suggested_hardware: t4-small
suggested_storage: medium
short_description: Real-time ACT model inference server for robot control
tags:
  - robotics
  - ai
  - inference
  - control
  - act-model
  - transformer
  - real-time
  - gradio
  - fastapi
  - computer-vision
pinned: false
fullWidth: true
---

# Inference Server

πŸ€– **Real-time ACT Model Inference Server for Robot Control**

This server provides ACT (Action Chunking Transformer) model inference for robotics applications using the transport server communication system. It includes a user-friendly Gradio web interface for easy setup and management.

## ✨ Features

- **Real-time AI Inference**: Run ACT models for robot control at a 30Hz control frequency
- **Multi-Camera Support**: Handle multiple camera streams with different names
- **Web Interface**: User-friendly Gradio UI for setup and monitoring
- **Session Management**: Create, start, stop, and monitor inference sessions
- **Automatic Timeout**: Sessions automatically clean up after 10 minutes of inactivity
- **Debug Tools**: Built-in debugging and monitoring endpoints
- **Flexible Configuration**: Support for custom model paths and camera configurations
- **No External Dependencies**: Direct Python execution without subprocess calls

## πŸš€ Quick Start

### Prerequisites

- Python 3.12+
- UV package manager (recommended)
- Trained ACT model
- Transport server running

### 1. Installation

```bash
cd backend/ai-server

# Install dependencies using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 2. Launch the Application

#### **πŸš€ Simple Integrated Mode (Recommended)**
```bash
# Everything runs in one process - no subprocess issues!
python launch_simple.py

# Or using the CLI
python -m inference_server.cli --simple
```

This will:
- Run everything on `http://localhost:7860`
- Manage sessions directly, with no HTTP API calls
- Avoid external subprocess dependencies
- Give the most robust and simplest deployment

#### **πŸ”§ Development Mode (Separate Processes)**
```bash
# Traditional approach with separate server and UI
python -m inference_server.cli
```

This will:
- Start the AI server on `http://localhost:8001`
- Launch the Gradio UI on `http://localhost:7860`

This mode is better suited to development and debugging.

### 3. Using the Web Interface

1. **Check Server Status**: The interface will automatically check if the AI server is running
2. **Configure Your Robot**: Enter your model path and camera setup
3. **Create & Start Session**: Click the button to set up AI control
4. **Monitor Performance**: Use the status panel to monitor inference

## 🎯 Workflow Guide

### Step 1: AI Server
- The server status will be displayed at the top
- Click "Start Server" if it's not already running
- Use "Check Status" to verify connectivity

### Step 2: Set Up Robot AI
- **Session Name**: Give your session a unique name (e.g., "my-robot-01")
- **AI Model Path**: Path to your trained ACT model (e.g., "./checkpoints/act_so101_beyond")
- **Camera Names**: Comma-separated list of camera names (e.g., "front,wrist,overhead")
- Click "Create & Start AI Control" to begin

### Step 3: Control Session
- The session ID will be auto-filled after creation
- Use Start/Stop buttons to control inference
- Click "Status" to see detailed performance metrics

## πŸ› οΈ Advanced Usage

### CLI Options

```bash
# Simple integrated mode (recommended)
python -m inference_server.cli --simple

# Development mode (separate processes)
python -m inference_server.cli

# Launch only the server
python -m inference_server.cli --server-only

# Launch only the UI (server must be running separately)  
python -m inference_server.cli --ui-only

# Custom ports
python -m inference_server.cli --server-port 8002 --ui-port 7861

# Enable public sharing
python -m inference_server.cli --share

# For deployment (recommended)
python -m inference_server.cli --simple --host 0.0.0.0 --share
```

### API Endpoints

The server provides a REST API for programmatic access:

- `GET /health` - Server health check
- `POST /sessions` - Create new session
- `GET /sessions` - List all sessions
- `GET /sessions/{id}` - Get session details
- `POST /sessions/{id}/start` - Start inference
- `POST /sessions/{id}/stop` - Stop inference
- `POST /sessions/{id}/restart` - Restart inference
- `DELETE /sessions/{id}` - Delete session

#### Debug Endpoints
- `GET /debug/system` - System information (CPU, memory, GPU)
- `GET /debug/sessions/{id}/queue` - Action queue details
- `POST /debug/sessions/{id}/reset` - Reset session state
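
As a sketch, the session endpoints above can be driven from Python with only the standard library. The payload field names (`session_id`, `policy_path`, `camera_names`) are assumptions, not the documented schema; check the server's FastAPI `/docs` page for the real one:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # development-mode server port

def session_url(session_id=None, action=None):
    """Build a /sessions endpoint URL from the patterns listed above."""
    url = f"{BASE_URL}/sessions"
    if session_id is not None:
        url += f"/{session_id}"
    if action is not None:
        url += f"/{action}"
    return url

def create_session(name, model_path, camera_names):
    """POST /sessions. Field names in the payload are assumptions."""
    payload = {
        "session_id": name,
        "policy_path": model_path,
        "camera_names": camera_names,
    }
    req = urllib.request.Request(
        session_url(),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

# Usage (requires the server to be running):
#   create_session("my-robot-01", "./checkpoints/act_so101_beyond",
#                  ["front", "wrist"])
#   then POST to session_url("my-robot-01", "start") to begin inference
```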

### Configuration

#### Joint Value Convention
- All joint inputs/outputs use **NORMALIZED VALUES**
- Most joints: -100 to +100 (RANGE_M100_100)
- Gripper: 0 to 100 (RANGE_0_100)
- This matches the training data format exactly
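
As an illustration of the convention, a raw joint reading can be mapped into these normalized ranges with a linear rescale. The `lo`/`hi` limits here are placeholder assumptions; real limits come from your robot's calibration:

```python
def normalize_joint(value, lo, hi):
    """Map a raw joint value in [lo, hi] to the -100..100 range (RANGE_M100_100)."""
    return (value - lo) / (hi - lo) * 200.0 - 100.0

def normalize_gripper(value, lo, hi):
    """Map a raw gripper value in [lo, hi] to the 0..100 range (RANGE_0_100)."""
    return (value - lo) / (hi - lo) * 100.0
```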

#### Camera Support
- Supports arbitrary number of camera streams
- Each camera has a unique name (e.g., "front", "wrist", "overhead")
- All camera streams are synchronized for inference
- Images expected in RGB format, uint8 [0-255]
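
A minimal sketch of how the UI's comma-separated camera list could be parsed and checked against incoming frames before inference; both helper names are hypothetical, not part of the server's API:

```python
def parse_camera_names(spec):
    """Split a comma-separated camera list, e.g. "front,wrist,overhead"."""
    return [name.strip() for name in spec.split(",") if name.strip()]

def validate_frames(frames, expected_cameras):
    """Ensure one synchronized frame per configured camera.

    `frames` maps camera name -> image buffer; images should already be
    RGB, uint8, with values in [0, 255].
    """
    missing = [name for name in expected_cameras if name not in frames]
    if missing:
        raise ValueError(f"missing camera streams: {missing}")
    return True
```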

## πŸ“Š Monitoring

### Session Status Indicators
- 🟒 **Running**: Inference active and processing
- 🟑 **Ready**: Session created but inference not started
- πŸ”΄ **Stopped**: Inference stopped
- 🟠 **Initializing**: Session being set up

### Smart Session Control
The UI provides intelligent feedback:
- ℹ️ **Already Running**: When trying to start a running session
- ℹ️ **Already Stopped**: When trying to stop a stopped session
- πŸ’‘ **Smart Suggestions**: Context-aware tips based on current status

### Performance Metrics
- **Inferences**: Total number of model inferences performed
- **Commands Sent**: Joint commands sent to robot
- **Queue Length**: Actions waiting in the queue
- **Errors**: Number of errors encountered
- **Data Flow**: Images and joint states received

## 🐳 Docker Usage

### Build the Image
```bash
cd services/inference-server
docker build -t inference-server .
```

### Run the Container
```bash
# Basic usage
docker run -p 7860:7860 inference-server

# With environment variables
docker run -p 7860:7860 \
  -e DEFAULT_ARENA_SERVER_URL=http://your-server.com \
  -e DEFAULT_MODEL_PATH=./checkpoints/your-model \
  inference-server

# With GPU support
docker run --gpus all -p 7860:7860 inference-server
```

## πŸ”§ Troubleshooting

### Common Issues

1. **Server Won't Start**
   - Check if port 8001 is available
   - Verify model path exists and is accessible
   - Check dependencies are installed correctly

2. **Session Creation Fails**
   - Verify model path is correct
   - Check Arena server is running on specified URL
   - Ensure camera names match your robot configuration

3. **Poor Performance**
   - Monitor system resources in the debug panel
   - Check if GPU is being used for inference
   - Verify control/inference frequency settings

4. **Connection Issues**
   - Verify Arena server URL is correct
   - Check network connectivity
   - Ensure workspace/room IDs are valid

### Debug Mode

Enable debug mode for detailed logging:

```bash
uv run python -m inference_server.cli --debug
```

### System Requirements

- **CPU**: Multi-core recommended for 30Hz control
- **Memory**: 8GB+ RAM recommended
- **GPU**: CUDA-compatible GPU for fast inference (optional but recommended)
- **Network**: Stable connection to Arena server

## πŸ“š Architecture

### Integrated Mode (Recommended)
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚        Single Application           β”‚    β”‚  LeRobot Arena  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   │◄──►│   (Port 8000)   β”‚
β”‚  β”‚ Gradio UI   β”‚  β”‚ AI Server   β”‚   β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚  β”‚    (/)      β”‚  β”‚  (/api/*)   β”‚   β”‚             β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚             β”‚
β”‚       (Port 7860)                   β”‚        Robot/Cameras
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           β”‚
      Web Browser
```

### Development Mode
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Gradio UI     β”‚    β”‚   AI Server     β”‚    β”‚  LeRobot Arena  β”‚
β”‚   (Port 7860)   │◄──►│   (Port 8001)   │◄──►│   (Port 8000)   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚                       β”‚                       β”‚
         β”‚                       β”‚                       β”‚
    Web Browser              ACT Model              Robot/Cameras
                             Inference
```

### Data Flow

1. **Camera Data**: Robot cameras β†’ Arena β†’ AI Server
2. **Joint State**: Robot joints β†’ Arena β†’ AI Server  
3. **AI Inference**: Images + Joint State β†’ ACT Model β†’ Actions
4. **Control Commands**: Actions β†’ Arena β†’ Robot

### Session Lifecycle

1. **Create**: Set up rooms in Arena, load ACT model
2. **Start**: Begin inference loop (3Hz) and control loop (30Hz)
3. **Running**: Process camera/joint data, generate actions
4. **Stop**: Pause inference, maintain connections
5. **Delete**: Clean up resources, disconnect from Arena
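
The Start/Running phases above pair a slow inference loop with a fast control loop sharing an action queue. This toy asyncio sketch (the loop bodies and queue contents are illustrative, not the server's actual code) shows the timing relationship:

```python
import asyncio
from collections import deque

INFERENCE_HZ = 3   # slow loop: the model produces action chunks
CONTROL_HZ = 30    # fast loop: one command to the robot per tick

async def inference_loop(queue, chunk_size=10, steps=3):
    """Refill the action queue with a predicted chunk each inference tick."""
    for step in range(steps):
        # The real server would run the ACT model on images + joint state here.
        queue.extend(f"action-{step}-{i}" for i in range(chunk_size))
        await asyncio.sleep(1 / INFERENCE_HZ)

async def control_loop(queue, steps=30):
    """Drain one action per control tick and 'send' it to the robot."""
    sent = 0
    for _ in range(steps):
        if queue:
            queue.popleft()
            sent += 1
        await asyncio.sleep(1 / CONTROL_HZ)
    return sent

async def run():
    queue = deque()
    _, sent = await asyncio.gather(inference_loop(queue), control_loop(queue))
    return sent
```

Because each inference tick enqueues more actions than the control loop drains between ticks, the robot never starves while the model is busy predicting the next chunk.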

## 🀝 Contributing

1. Follow the existing code style
2. Add tests for new features
3. Update documentation
4. Submit pull requests

## πŸ“„ License

This project follows the same license as the parent LeRobot Arena project.

---

For more information, see the [LeRobot Arena documentation](../../README.md).