title: SwarmChat
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.34.2
python_version: '3.10'
app_file: app.py
pinned: true
preload_from_hub:
- facebook/seamless-m4t-v2-large
- Inventors-Hub/SwarmChat-models EuroLLM-9B-Instruct-Q4_K_M.gguf
- Inventors-Hub/SwarmChat-models Falcon3-10B-Instruct-BehaviorTree-3epochs.Q4_K_M.gguf
- Inventors-Hub/SwarmChat-models llama-guard-3-8b-q4_k_m.gguf
thumbnail: https://cdn-uploads.huggingface.co/production/uploads/66bf0ec6399b14b08f570b0b/oVrcxcvvyl769-b3TffNe.png
short_description: Natural Language Control for Swarm Robotics
SwarmChat: Enabling Intuitive Swarm Robotics with Natural Language
SwarmChat is an innovative project that enables intuitive communication with swarm robotics through natural language. This system integrates advanced audio transcription, text processing, and safety mechanisms with a live simulation environment that visualizes a swarm of agents executing behavior trees.
🚀 This project is funded by the European Union’s UTTER programme, in collaboration with the UTTER consortium.
Features
Audio Input Processing:
- Record commands via a microphone.
- Translate speech into English using the facebook/seamless-m4t-v2-large model.
- Perform a safety check on the translated text before execution.
Text Input Processing:
- Enter text commands for swarm control.
- Translate text using EuroLLM (EuroLLM-9B-Instruct).
- Detect unsafe or inappropriate content with an integrated safety module.
Safety Module:
- Utilizes the Llama-Guard model (Llama-Guard-3-8B) for safety classification.
- Identifies unsafe content across predefined categories (e.g., violent crimes, privacy violations, hate speech).
- Ensures commands comply with safety standards.
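As a rough illustration of the safety check, the sketch below parses a Llama Guard style response, where the first line is "safe" or "unsafe" and an optional second line lists category codes. The names `SafetyVerdict`, `parse_guard_output`, and `CATEGORY_NAMES` are hypothetical and are not taken from safety_module.py; only three codes from the Llama Guard 3 taxonomy are mapped here, and the project's actual parsing may differ.

```python
from dataclasses import dataclass, field

# Subset of the Llama Guard 3 hazard codes; unmapped codes are kept as-is.
CATEGORY_NAMES = {
    "S1": "Violent Crimes",
    "S7": "Privacy",
    "S10": "Hate",
}


@dataclass
class SafetyVerdict:
    """Hypothetical container for a parsed safety decision."""
    is_safe: bool
    categories: list = field(default_factory=list)


def parse_guard_output(raw: str) -> SafetyVerdict:
    """Turn a two-line guard response into a SafetyVerdict."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if lines and lines[0].lower() == "safe":
        return SafetyVerdict(is_safe=True)
    # Fail closed: anything other than an explicit "safe" counts as unsafe.
    codes = []
    if len(lines) > 1:
        codes = [code.strip() for code in lines[1].split(",") if code.strip()]
    names = [CATEGORY_NAMES.get(code, code) for code in codes]
    return SafetyVerdict(is_safe=False, categories=names)
```

Treating an empty or malformed response as unsafe is a design choice of this sketch: a command only reaches the swarm when the model explicitly answers "safe".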
Swarm Simulation:
- Visualize a swarm of agents in a live simulation powered by the Violet simulator and Pygame.
- Agents are controlled by behavior trees defined in an XML file (tree.xml), using the py_trees framework.
- Real-time simulation updates are streamed via a Gradio web interface.
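To make the idea of an XML-defined behavior tree concrete, here is a minimal sketch that parses a small tree and evaluates it once. It deliberately uses a tiny standard-library interpreter instead of py_trees, ignores the RUNNING status, and assumes a hypothetical schema (`Sequence`, `Selector`, and leaf nodes with a `name` attribute); the real tree.xml format and behavior names may differ.

```python
import xml.etree.ElementTree as ET

SUCCESS, FAILURE = "SUCCESS", "FAILURE"

# Hypothetical tree; the actual tree.xml schema is not shown in this README.
TREE_XML = """
<BehaviorTree>
  <Selector>
    <Sequence>
      <Condition name="is_target_detected"/>
      <Action name="move_to_target"/>
    </Sequence>
    <Action name="wander"/>
  </Selector>
</BehaviorTree>
"""


def tick(node, behaviors):
    """Evaluate one XML node and return SUCCESS or FAILURE."""
    if node.tag == "Sequence":
        # A sequence stops at its first failing child.
        for child in node:
            if tick(child, behaviors) == FAILURE:
                return FAILURE
        return SUCCESS
    if node.tag == "Selector":
        # A selector stops at its first succeeding child.
        for child in node:
            if tick(child, behaviors) == SUCCESS:
                return SUCCESS
        return FAILURE
    # Leaf: look up a callable by name; a truthy result means success.
    return SUCCESS if behaviors[node.get("name")]() else FAILURE


def run_tree(xml_text, behaviors):
    """Parse the XML and tick the single child of the root element."""
    root = ET.fromstring(xml_text)
    return tick(root[0], behaviors)
```

With this tree, an agent that detects no target falls through to `wander`, while an agent that detects one runs `move_to_target` and never reaches the fallback.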
Behavior Tree Generator:
- Uses the Falcon3-10B-Instruct-BehaviorTree model to dynamically generate behavior trees in XML format.
- Automatically extracts available behaviors from the SwarmAgent class and constructs a detailed prompt using a predefined XML template.
- Generates and saves new behavior tree configurations (updating tree.xml) based on user-specified tasks.
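The behavior extraction step could look roughly like the sketch below, which inspects a class for public methods and folds their docstrings into a prompt. The `SwarmAgent` class here is a hypothetical stand-in with made-up methods, and `PROMPT_TEMPLATE` is a plain-text placeholder rather than the project's predefined XML template; `list_behaviors` and `build_prompt` are illustrative names, not functions from bt_generator.py.

```python
import inspect


class SwarmAgent:
    """Hypothetical stand-in for the project's agent class."""

    def wander(self):
        """Action: move in a random direction."""

    def is_obstacle_detected(self):
        """Condition: check for an obstacle ahead."""

    def _update_sensors(self):
        """Private helper, excluded from the prompt."""


# Placeholder template; the real one is described as an XML template.
PROMPT_TEMPLATE = """Available behaviors:
{behaviors}

Task: {task}
Respond with a behavior tree in XML."""


def list_behaviors(cls):
    """Map each public method name to the first line of its docstring."""
    behaviors = {}
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers and dunder methods
        doc = inspect.getdoc(member) or ""
        behaviors[name] = doc.splitlines()[0] if doc else ""
    return behaviors


def build_prompt(task, cls=SwarmAgent):
    """Fill the template with the behavior list and the user's task."""
    lines = [f"- {name}: {doc}" for name, doc in list_behaviors(cls).items()]
    return PROMPT_TEMPLATE.format(behaviors="\n".join(lines), task=task)
```

Deriving the list from the class itself means a newly added behavior shows up in the prompt without editing the template by hand.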
Integrated Interface:
- A unified Gradio web interface for both audio and text inputs.
- Live streaming of the simulation environment.
- Seamless switching between different input modalities.
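Taken together, the features above suggest a translate, check, then generate flow for each command. The sketch below shows that ordering with the three stages injected as callables; `handle_command` and its parameters are hypothetical names, and how app.py actually wires the modules together is not shown in this README.

```python
from typing import Callable, Optional


def handle_command(
    text: str,
    translate: Callable[[str], str],
    is_safe: Callable[[str], bool],
    generate_tree: Callable[[str], str],
) -> Optional[str]:
    """Translate, run the safety check, then generate a behavior tree.

    Returns the generated XML, or None when the command is rejected.
    """
    english = translate(text)
    if not is_safe(english):
        # Rejected commands never reach the tree generator.
        return None
    return generate_tree(english)
```

Passing the stages in as functions keeps the sketch independent of any model; in a real setup each callable would wrap the corresponding module.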
Technology Stack
Backend:
- Python
- Transformers (Hugging Face)
- PyTorch
- Pygame
- Threading and Queue modules for simulation management
Frontend:
- Gradio for an interactive web-based interface.
AI Models:
- Speech Processing: Seamless-m4t-v2-large for audio transcription and translation.
- Text Processing: EuroLLM-9B-Instruct for text translation.
- Safety Classification: Llama-Guard-3-8B for content safety assessment.
- Behavior Tree Generation: Falcon3-10B-Instruct-BehaviorTree for creating and updating behavior trees.
Behavior Trees:
- Agents use behavior trees, parsed from XML and built with py_trees, to dictate their actions within the simulation.
Installation
Clone the repository:
git clone https://github.com/Inventors-Hub/SwarmChat.git
cd SwarmChat
Install dependencies:
pip install -r requirements.txt
Setup AI Models:
- Place the EuroLLM model file (EuroLLM-9B-Instruct-Q4_K_M.gguf) at the path specified in text_processing.py.
- Place the Llama Guard model file (llama-guard-3-8b-q4_k_m.gguf) at the path specified in safety_module.py.
- Place the Falcon3 behavior tree model file (Falcon3-10B-Instruct-BehaviorTree-3-epochs-GGUF) at the path specified in bt_generator.py.
Run the Application:
python app.py
Access the Interface:
Open your browser and navigate to http://127.0.0.1:7860 to start using SwarmChat.
Overview of Modules
app.py
The main application integrates audio/text processing, behavior tree generation, and the live simulation. It sets up the Gradio interface, handles simulation streaming, and routes user inputs to the appropriate processing modules.
speech_processing.py
Implements audio transcription and translation using the facebook/seamless-m4t-v2-large model.
text_processing.py
Translates text commands using EuroLLM (EuroLLM-9B-Instruct).
safety_module.py
Uses Llama Guard to assess the safety of incoming commands, ensuring compliance with safety policies.
bt_generator.py
Dynamically generates behavior trees in XML format by extracting behaviors from the SwarmAgent class, constructing a prompt, and querying the Falcon3-10B-Instruct-BehaviorTree model. The generated XML is saved to tree.xml for simulation use.
simulator_env.py
Powers the simulation environment, manages agent behaviors using XML-defined behavior trees, and handles real-time simulation updates.
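The technology stack lists Threading and Queue for simulation management. One common way to combine them is a background producer thread that pushes frames into a bounded queue while a generator drains it. The sketch below shows that pattern under stated assumptions: frames are placeholder strings instead of rendered Pygame images, and `simulation_loop` and `stream_frames` are hypothetical names, not the functions used in app.py or simulator_env.py.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def _put(frames, item, stop_event):
    """Put an item, giving up as soon as the consumer has stopped."""
    while not stop_event.is_set():
        try:
            frames.put(item, timeout=0.1)
            return True
        except queue.Full:
            continue
    return False


def simulation_loop(frames, stop_event, steps):
    """Producer: emit one placeholder frame per simulation step."""
    for step in range(steps):
        if not _put(frames, f"frame-{step}", stop_event):
            return
    _put(frames, _DONE, stop_event)


def stream_frames(steps=5, maxsize=2):
    """Consumer: yield frames from the background thread in order."""
    frames = queue.Queue(maxsize=maxsize)
    stop_event = threading.Event()
    worker = threading.Thread(
        target=simulation_loop, args=(frames, stop_event, steps), daemon=True
    )
    worker.start()
    try:
        while True:
            frame = frames.get()
            if frame is _DONE:
                break
            yield frame
    finally:
        # Runs on normal exhaustion and when the generator is closed early.
        stop_event.set()
        worker.join(timeout=1)
```

The bounded queue keeps the simulation from running far ahead of the viewer, and because Gradio accepts generator functions for streaming output, a generator of this shape is one plausible way to feed the live view.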
Acknowledgments
This work was funded by the European Union under the UTTER programme.
We gratefully acknowledge the support of the entire UTTER consortium.