---
title: SwarmChat
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.34.2
python_version: '3.10'
app_file: app.py
pinned: true
preload_from_hub:
  - facebook/seamless-m4t-v2-large
  - Inventors-Hub/SwarmChat-models EuroLLM-9B-Instruct-Q4_K_M.gguf
  - >-
    Inventors-Hub/SwarmChat-models
    Falcon3-10B-Instruct-BehaviorTree-3epochs.Q4_K_M.gguf
  - Inventors-Hub/SwarmChat-models llama-guard-3-8b-q4_k_m.gguf
thumbnail: >
  https://cdn-uploads.huggingface.co/production/uploads/66bf0ec6399b14b08f570b0b/oVrcxcvvyl769-b3TffNe.png
short_description: Natural Language Control for Swarm Robotics
---
# SwarmChat: Enabling Intuitive Swarm Robotics with Natural Language
SwarmChat lets users control a robot swarm through natural language. It combines audio transcription, text translation, and safety checks with a live simulation that visualizes a swarm of agents executing behavior trees.
🚀 This project is funded by the European Union’s [UTTER programme](https://he-utter.eu/), in collaboration with the UTTER consortium.
## Features
- **Audio Input Processing**:
  - Record commands via a microphone.
  - Translate speech into English using the `facebook/seamless-m4t-v2-large` model.
  - Perform a safety check on the translated text before execution.
- **Text Input Processing**:
  - Enter text commands for swarm control.
  - Translate text using `EuroLLM` (EuroLLM-9B-Instruct).
  - Detect unsafe or inappropriate content with an integrated safety module.
- **Safety Module**:
  - Uses the `Llama-Guard` model (Llama-Guard-3-8B) for safety classification.
  - Identifies unsafe content across predefined categories (e.g., violent crimes, privacy violations, hate speech).
  - Checks that commands comply with these safety categories before they are executed.
- **Swarm Simulation**:
  - Visualize a swarm of agents in a live simulation powered by the Violet simulator and Pygame.
  - Agents are controlled by behavior trees defined in an XML file (`tree.xml`), using the `py_trees` framework.
  - Real-time simulation updates are streamed via a Gradio web interface.
- **Behavior Tree Generator**:
  - Uses the `Falcon3-10B-Instruct-BehaviorTree` model to generate behavior trees in XML format.
  - Automatically extracts available behaviors from the `SwarmAgent` class and constructs a detailed prompt using a predefined XML template.
  - Generates and saves new behavior tree configurations (updating `tree.xml`) based on user-specified tasks.
- **Integrated Interface**:
  - A unified Gradio web interface for both audio and text inputs.
  - Live streaming of the simulation environment.
  - Seamless switching between different input modalities.
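
The text path described in the features above (translate, safety-check, then generate a tree) can be sketched as a small gate function. This is an illustrative sketch, not the project's code: `parse_guard_verdict`, `handle_command`, and the callables passed in are hypothetical names. The verdict format is also an assumption, based on the plain-text replies Llama Guard 3 models typically emit (`safe`, or `unsafe` followed by a line of category codes such as `S1`).

```python
from typing import Callable, List, Tuple


def parse_guard_verdict(raw: str) -> Tuple[bool, List[str]]:
    """Parse a Llama Guard style reply into (is_safe, category_codes).

    Assumed format: the first non-empty line is "safe" or "unsafe"; for
    "unsafe", the next line lists comma-separated codes such as "S1,S10".
    """
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if not lines:
        # Treat an empty reply as unsafe so the gate fails closed.
        return False, []
    if lines[0].lower() == "safe":
        return True, []
    codes = lines[1].split(",") if len(lines) > 1 else []
    return False, [code.strip() for code in codes if code.strip()]


def handle_command(
    text: str,
    translate: Callable[[str], str],      # hypothetical: wraps the translation model
    classify: Callable[[str], str],       # hypothetical: wraps the safety model
    generate_tree: Callable[[str], str],  # hypothetical: wraps the tree generator
) -> str:
    """Translate a command, gate it on the safety verdict, then generate a tree."""
    english = translate(text)
    is_safe, codes = parse_guard_verdict(classify(english))
    if not is_safe:
        return "Command rejected (categories: " + (", ".join(codes) or "unknown") + ")"
    return generate_tree(english)
```

Passing the three models in as plain callables keeps the gate testable with stubs, for example `handle_command("hola", lambda t: "form a line", lambda t: "safe", lambda t: "<tree/>")`.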
## Technology Stack
- **Backend**:
  - Python
  - [Transformers](https://huggingface.co/transformers/) (Hugging Face)
  - PyTorch
  - Pygame
  - Threading and Queue modules for simulation management
- **Frontend**:
  - [Gradio](https://gradio.app/) for an interactive web-based interface.
- **AI Models**:
  - **Speech Processing**: [Seamless-m4t-v2-large](https://huggingface.co/facebook/seamless-m4t-v2-large) for audio transcription and translation.
  - **Text Processing**: [EuroLLM-9B-Instruct](https://huggingface.co/utter-project/EuroLLM-9B-Instruct) for text translation.
  - **Safety Classification**: [Llama-Guard-3-8B](https://huggingface.co/meta-llama/Llama-Guard-3-8B) for content safety assessment.
  - **Behavior Tree Generation**: [Falcon3-10B-Instruct-BehaviorTree](https://huggingface.co/Inventors-Hub/Falcon3-10B-Instruct-BehaviorTree-3-epochs) for creating and updating behavior trees.
- **Behavior Trees**:
  - Agents use behavior trees, parsed from XML and built with `py_trees`, to decide their actions within the simulation.
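
To illustrate how an XML-defined behavior tree can drive an agent, here is a minimal sketch that uses only the standard library instead of `py_trees`. The XML layout, the node names (`Sequence`, `Selector`, `Action`, `Condition`), and the behavior names are assumptions for illustration; the actual `tree.xml` schema is not shown in this README. The sketch also only models success and failure, with no "running" state.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

# Hypothetical tree layout; the real tree.xml schema may differ.
EXAMPLE_XML = """
<BehaviorTree>
  <Selector>
    <Sequence>
      <Condition name="is_obstacle_detected"/>
      <Action name="avoid_obstacle"/>
    </Sequence>
    <Action name="wander"/>
  </Selector>
</BehaviorTree>
"""


def tick(node: ET.Element, behaviors: Dict[str, Callable[[], bool]]) -> bool:
    """Evaluate one XML node and return True on success, False on failure."""
    if node.tag in ("BehaviorTree", "Sequence"):
        # A sequence succeeds only if every child succeeds, in order.
        return all(tick(child, behaviors) for child in node)
    if node.tag == "Selector":
        # A selector succeeds as soon as one child succeeds.
        return any(tick(child, behaviors) for child in node)
    if node.tag in ("Action", "Condition"):
        # Leaves look up a callable by name and run it.
        return behaviors[node.attrib["name"]]()
    raise ValueError(f"Unknown node type: {node.tag}")


def run_tree(xml_text: str, behaviors: Dict[str, Callable[[], bool]]) -> bool:
    """Parse the XML text and tick the tree once from the root."""
    return tick(ET.fromstring(xml_text), behaviors)
```

With this example tree, an agent that detects no obstacle skips `avoid_obstacle` and falls through to `wander`.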
## Installation
1. **Clone the repository**:
   ```bash
   git clone https://github.com/Inventors-Hub/SwarmChat.git
   cd SwarmChat
   ```
2. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```
3. **Set up AI models**:
   - Place the EuroLLM model file (`EuroLLM-9B-Instruct-Q4_K_M.gguf`) at the path specified in `text_processing.py`.
   - Place the Llama Guard model file (`llama-guard-3-8b-q4_k_m.gguf`) at the path specified in `safety_module.py`.
   - Place the Falcon3 behavior tree model file (`Falcon3-10B-Instruct-BehaviorTree-3-epochs-GGUF`) at the path specified in `bt_generator.py`.
4. **Run the application**:
   ```bash
   python app.py
   ```
5. **Access the interface**:
   Open your browser and navigate to http://127.0.0.1:7860 to start using SwarmChat.
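
Because step 3 depends on model files being in place, a quick check before launching can save a failed startup. The sketch below is a hypothetical helper, not part of the repository: the `models` directory is an assumption (the real paths are set in the individual modules), and the file names are the ones listed in this README's metadata.

```python
from pathlib import Path
from typing import Iterable, List

# File names as listed in this README; the directory layout is an assumption.
MODEL_FILES = [
    "EuroLLM-9B-Instruct-Q4_K_M.gguf",
    "llama-guard-3-8b-q4_k_m.gguf",
    "Falcon3-10B-Instruct-BehaviorTree-3epochs.Q4_K_M.gguf",
]


def missing_models(model_dir: Path, names: Iterable[str] = MODEL_FILES) -> List[str]:
    """Return the names of model files that are not present in model_dir."""
    return [name for name in names if not (model_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_models(Path("models"))  # hypothetical directory
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found; run `python app.py` next.")
```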
## Overview of Modules
- **app.py**
The main application integrates audio/text processing, behavior tree generation, and the live simulation. It sets up the Gradio interface, handles simulation streaming, and routes user inputs to the appropriate processing modules.
- **speech_processing.py**
Implements audio transcription and translation using the `facebook/seamless-m4t-v2-large` model.
- **text_processing.py**
Translates text commands using `EuroLLM` (EuroLLM-9B-Instruct).
- **safety_module.py**
Uses `Llama-Guard` (Llama-Guard-3-8B) to assess the safety of incoming commands, ensuring compliance with safety policies.
- **bt_generator.py**
Generates behavior trees in XML format by extracting behaviors from the `SwarmAgent` class, constructing a prompt, and querying the `Falcon3-10B-Instruct-BehaviorTree` model. The generated XML is saved to `tree.xml` for simulation use.
- **simulator_env.py**
Powers the simulation environment, manages agent behaviors using XML-defined behavior trees, and handles real-time simulation updates.
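
The behavior extraction step described for `bt_generator.py` could look roughly like the sketch below, which uses `inspect` to list an agent class's public methods and fold them into a prompt. Everything here is illustrative: the `SwarmAgent` stand-in, its methods, and the `list_behaviors` and `build_prompt` helpers are hypothetical, and the prompt wording is not the project's predefined XML template.

```python
import inspect
from typing import List


class SwarmAgent:
    """Stand-in agent; the real SwarmAgent class and its methods are not shown here."""

    def wander(self) -> bool:
        """Action: move in a random direction."""
        return True

    def is_obstacle_detected(self) -> bool:
        """Condition: check whether an obstacle is nearby."""
        return False

    def _update_sensors(self) -> None:
        """Private helper; excluded from the behavior list."""


def list_behaviors(agent_cls: type) -> List[str]:
    """Return 'name: first docstring line' for each public method of agent_cls."""
    behaviors = []
    # getmembers returns (name, function) pairs sorted by name.
    for name, func in inspect.getmembers(agent_cls, inspect.isfunction):
        if name.startswith("_"):
            continue
        doc = (inspect.getdoc(func) or "").splitlines()
        behaviors.append(f"{name}: {doc[0] if doc else ''}")
    return behaviors


def build_prompt(task: str, agent_cls: type = SwarmAgent) -> str:
    """Assemble a prompt that lists the available behaviors and the user task."""
    lines = ["Available behaviors:"]
    lines += [f"- {entry}" for entry in list_behaviors(agent_cls)]
    lines += ["", f"Task: {task}", "Respond with a behavior tree in XML."]
    return "\n".join(lines)
```

Deriving the list from the class itself means a newly added agent method shows up in the prompt without a separate registry to maintain.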
## Acknowledgments
This work was funded by the European Union under the [UTTER programme](https://he-utter.eu/).
We gratefully acknowledge the support of the entire UTTER consortium.