---
title: GASM Enhanced - Geometric Language AI
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
license: cc-by-nd-4.0
---
# GASM Enhanced - Geometric Attention for Spatial Understanding

> *Bridging natural language and geometric reasoning through SE(3)-invariant neural architectures*

## What Makes This Different?

Traditional AI understands *what* objects are mentioned, but struggles with *where* they are and *how* they relate spatially. GASM targets that gap.

**GASM** (Geometric Attention for Spatial & Mathematical understanding) approaches AI spatial reasoning with four components:

- **Advanced NLP**: Goes beyond keywords with spaCy + semantic categorization
- **Proper 3D Math**: Uses the SE(3) Lie group for mathematically consistent spatial relationships
- **Geometric Optimization**: Minimizes curvature on Riemannian manifolds to arrive at a layout
- **Real-time Visualization**: Shows spatial understanding in live 3D geometry
## What This Enables

### The Spatial Intelligence Gap

Current language models handle the first kind of question far better than the second:

- ✅ "What is a keyboard?" → *An input device*
- ❌ "Where is the keyboard relative to the monitor?" → *Spatial confusion*

GASM bridges this gap through mathematical spatial reasoning.

### Real Applications

This isn't just a demo - GASM addresses actual problems in:

- **Robotics**: "Move the component above the platform" → Precise 3D coordinates
- **Scientific Modeling**: "The electron orbits the nucleus" → Proper geometric relationships
- **Engineering**: "Place the support between the beams" → Constraint satisfaction
- **AR/VR**: Natural language to 3D scene understanding
## Try It Yourself

### Watch GASM in Action

Input any sentence with spatial relationships:

> *"The ball lies left of the table next to the computer, while the book sits between the keyboard and the monitor."*

**GASM Output:**

- **6 entities identified**: ball, table, computer, book, keyboard, monitor
- **5 spatial relations**: left_of, next_to, between
- **3D geometric layout** with proper SE(3) positioning
- **Curvature evolution** showing geometric convergence

### More Examples

**Robotics**: *"The robotic arm moves the satellite component above the assembly platform."*

**Scientific**: *"The electron orbits the nucleus while the magnetic field flows through the crystal."*

**Everyday**: *"The red car parks between two buildings near the park entrance."*

### What You'll See

1. **Advanced Entity Recognition**: Far beyond simple keyword matching
2. **Spatial Relationship Extraction**: Understands "left of", "between", "above" in context
3. **3D Visualization**: Real geometric positioning in proper 3D space
4. **Mathematical Convergence**: Curvature evolution showing optimization progress
## Project Structure

```
GASM-Huggingface/
├── app.py                 # Main Gradio application with complete interface
├── gasm_core.py           # Core GASM implementation with SE(3) math
├── fastapi_endpoint.py    # Optional API endpoints (standalone)
├── requirements.txt       # Python dependencies
└── README.md              # This file
```
## The Mathematics Behind GASM

### What Makes It Special

Unlike traditional NLP that treats text as sequences of tokens, GASM models the geometry the text describes:

**1. SE(3) Invariant Processing**

- Uses the Special Euclidean group SE(3) for proper 3D transformations
- Keeps spatial relationships consistent under rotations and translations
- Employs Lie group operations for geometric learning

**2. Advanced Entity Recognition**

- **spaCy NLP**: Part-of-speech tagging + named entity recognition
- **Semantic Filtering**: Domain-specific vocabularies (robotics, scientific, everyday)
- **Contextual Understanding**: Extracts objects from spatial prepositions

**3. Geometric Optimization**

- **Geodesic Distances**: Shortest paths on the SE(3) manifold
- **Discrete Curvature**: Graph Laplacian eigenvalue-based computation
- **Energy Minimization**: Constraint satisfaction via Lagrange multipliers
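
To make the SE(3) ideas above concrete, here is a minimal sketch of a distance between two poses. It assumes one common choice of metric (rotation angle combined with Euclidean translation distance); this README does not state which metric `gasm_core.py` actually uses, and every function name below is hypothetical. The point it illustrates is invariance: moving the whole scene by the same rigid motion leaves the distance between two poses unchanged.

```python
import math

def rot_z(angle):
    """Rotation matrix about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def apply(rot, vec):
    """Rotate a 3-vector."""
    return [sum(rot[i][k] * vec[k] for k in range(3)) for i in range(3)]

def compose(g, h):
    """Group product g * h for poses given as (rotation, translation)."""
    (rg, tg), (rh, th) = g, h
    rotated = apply(rg, th)
    return matmul(rg, rh), [rotated[i] + tg[i] for i in range(3)]

def rotation_angle(rot):
    """Angle of a rotation matrix, from trace(R) = 1 + 2 cos(theta)."""
    cos_theta = (rot[0][0] + rot[1][1] + rot[2][2] - 1.0) / 2.0
    return math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp rounding error

def se3_distance(g, h, weight=1.0):
    """Assumed metric: sqrt(theta^2 + weight * |t_g - t_h|^2)."""
    (rg, tg), (rh, th) = g, h
    theta = rotation_angle(matmul(transpose(rg), rh))
    return math.sqrt(theta ** 2 + weight * math.dist(tg, th) ** 2)

keyboard = (rot_z(0.0), [0.0, 0.0, 0.0])
monitor = (rot_z(math.pi / 2), [0.0, 3.0, 4.0])
d_before = se3_distance(keyboard, monitor)

# Move the whole scene rigidly; the distance between the poses is preserved.
scene_motion = (rot_z(1.0), [5.0, -2.0, 0.5])
d_after = se3_distance(compose(scene_motion, keyboard),
                       compose(scene_motion, monitor))
```

The `weight` parameter trades off rotation against translation; any positive value keeps the invariance property.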
### Technical Architecture

```
Text → spaCy NLP → Entity Extraction → Semantic Filtering
                          ↓
SE(3) Embedding → Attention Mechanism → Geometric Refinement
                          ↓
Constraint Satisfaction → Curvature Optimization → 3D Visualization
```
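
The constraint-satisfaction stage in the diagram can be illustrated with a small sketch. It is not the GASM implementation: it swaps in plain gradient descent on a quadratic penalty energy instead of Lagrange multipliers, and the constraint encodings (`left_of` as a margin along the x-axis, `between` as "sit at the midpoint") are assumptions made for this example. The recorded energy history plays the role that the curvature evolution plot plays in the Space.

```python
def energy_and_grad(pos, left_of, between, margin=1.0):
    """Penalty energy of a layout and its gradient per entity."""
    grad = {name: [0.0, 0.0, 0.0] for name in pos}
    energy = 0.0
    # left_of(a, b): a should sit at least `margin` to the left of b on x.
    for a, b in left_of:
        violation = pos[a][0] + margin - pos[b][0]
        if violation > 0.0:
            energy += violation ** 2
            grad[a][0] += 2.0 * violation
            grad[b][0] -= 2.0 * violation
    # between(a, b, c): a should sit at the midpoint of b and c.
    for a, b, c in between:
        for k in range(3):
            residual = pos[a][k] - 0.5 * (pos[b][k] + pos[c][k])
            energy += residual ** 2
            grad[a][k] += 2.0 * residual
            grad[b][k] -= residual
            grad[c][k] -= residual
    return energy, grad

def optimize_layout(pos, left_of, between, step=0.1, iterations=200):
    """Gradient descent on the penalty energy; returns layout and history."""
    pos = {name: list(p) for name, p in pos.items()}  # do not mutate input
    history = []
    for _ in range(iterations):
        energy, grad = energy_and_grad(pos, left_of, between)
        history.append(energy)
        for name in pos:
            for k in range(3):
                pos[name][k] -= step * grad[name][k]
    return pos, history

start = {
    "ball": [2.0, 0.0, 0.0],
    "table": [0.0, 0.0, 0.0],
    "book": [5.0, 5.0, 0.0],
    "keyboard": [0.0, 1.0, 0.0],
    "monitor": [0.0, 3.0, 0.0],
}
final, history = optimize_layout(
    start,
    left_of=[("ball", "table")],
    between=[("book", "keyboard", "monitor")],
)
```

With these inputs the energy falls steadily toward zero, the ball ends up one unit to the left of the table, and the book settles at the midpoint of keyboard and monitor.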
### Why This Matters

Most AI systems use simple word embeddings that lose spatial meaning. GASM preserves geometric relationships through mathematically principled operations, enabling true spatial understanding.

## Visualizations

The Space provides two main visualizations:

### 1. Curvature Evolution Plot

- Shows geometric convergence over iterations
- Displays SE(3) manifold optimization progress
- Uses matplotlib with dark theme for clarity

### 2. 3D Entity Space Plot

- Interactive 3D positioning of extracted entities
- Color-coded by entity type (robotic, physical, spatial, etc.)
- Shows relationship connections between entities
## How It Works

1. **Text Input**: User provides text for analysis
2. **Entity Extraction**: Regex-based extraction of meaningful entities
3. **Relation Detection**: Identification of spatial, temporal, and physical relations
4. **GASM Processing**: If the GASM core is available, a real SE(3) forward pass through the geometric manifold
5. **Visualization**: Generates the curvature evolution and 3D entity plots
6. **Results**: Comprehensive analysis with JSON output
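
Steps 2 and 3 above can be sketched with the standard `re` module. The vocabulary and patterns here are hypothetical and far smaller than whatever `app.py` uses; the sketch only shows the general shape of regex-based entity and relation extraction on the example sentence from this README.

```python
import re

# Hypothetical vocabulary and patterns, chosen for this example only.
ENTITY_WORDS = ["ball", "table", "computer", "book", "keyboard", "monitor"]
RELATION_PATTERNS = {
    "left_of": r"\bleft of\b",
    "next_to": r"\bnext to\b",
    "between": r"\bbetween\b",
    "above": r"\babove\b",
}

def extract_entities(text):
    """Return known entity words in order of first appearance."""
    pattern = r"\b(" + "|".join(map(re.escape, ENTITY_WORDS)) + r")\b"
    seen = []
    for match in re.finditer(pattern, text.lower()):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen

def extract_relations(text):
    """Return relation labels in the order their phrases appear."""
    lowered = text.lower()
    found = []
    for name, pattern in RELATION_PATTERNS.items():
        for match in re.finditer(pattern, lowered):
            found.append((match.start(), name))
    return [name for _, name in sorted(found)]

sentence = ("The ball lies left of the table next to the computer, "
            "while the book sits between the keyboard and the monitor.")
entities = extract_entities(sentence)
relations = extract_relations(sentence)
```

This sketch counts one match per relation phrase, so its relation count need not equal the number the Space reports for the same sentence.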
## Performance

- **CPU Mode**: Optimized for HuggingFace Spaces CPU allocation
- **GPU Fallback**: Automatic ZeroGPU usage when available
- **Memory Efficient**: ~430MB total memory footprint
- **Fast Processing**: 0.1-0.8s processing time depending on text length
## Space Configuration

This Space is configured with:

- **SDK**: Gradio 4.44.1+
- **Python**: 3.8+
- **GPU**: ZeroGPU compatible (A10G/T4 fallback)
- **Memory**: 16GB RAM allocation
- **Storage**: Persistent storage for model caching

## API Endpoints

The Space also exposes FastAPI endpoints (when `fastapi_endpoint.py` is run separately):

- `POST /process`: Process text with geometric enhancement
- `GET /health`: Health check and memory usage
- `GET /info`: Model configuration information
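
A client call to `POST /process` might look like the sketch below, which uses only the standard library. The base URL `http://localhost:8000` and the `{"text": ...}` request body are assumptions; this README does not document the request schema, so check `fastapi_endpoint.py` before relying on either.

```python
import json
import urllib.request

def build_process_request(text, base_url="http://localhost:8000"):
    """Build (but do not send) a POST /process request with a JSON body."""
    payload = json.dumps({"text": text}).encode("utf-8")  # assumed schema
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/process",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_process_request(
    "The book sits between the keyboard and the monitor."
)

# To send it once the FastAPI server is running:
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.loads(response.read())
```

Building the request separately from sending it makes the payload easy to inspect while the server is not running.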
## Use Cases

Perfect for analyzing:

- **Technical Documentation**: Spatial relationships in engineering texts
- **Scientific Literature**: Physical phenomena and experimental setups
- **Educational Content**: Geometry and physics explanations
- **Robotic Systems**: Assembly instructions and spatial configurations

## Model Details

- **Base Architecture**: Built on transformer foundations
- **Geometric Processing**: SE(3) Lie group operations
- **Attention Mechanism**: Geodesic distance-based attention weighting
- **Curvature Computation**: Discrete Gaussian curvature via graph Laplacian
- **Constraint Handling**: Energy minimization with Lagrange multipliers
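
One plausible reading of "geodesic distance-based attention weighting" is a softmax over negative pairwise distances, so that nearby entities attend to each other more strongly. The sketch below shows that reading only; the function name, the `temperature` parameter, and the distance values are assumptions, and the model's actual attention may differ.

```python
import math

def attention_from_distances(distances, temperature=1.0):
    """Row-wise softmax of negative distances: closer entities weigh more."""
    weights = []
    for row in distances:
        logits = [-d / temperature for d in row]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Made-up pairwise distances between three entities.
distances = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
]
weights = attention_from_distances(distances)
```

Each row of `weights` sums to 1, and a larger `temperature` flattens the weights toward uniform attention.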
## Why This Matters

### Current State of AI

- ✅ Excellent at text understanding and generation
- ✅ Great at image recognition and computer vision
- ❌ **Struggles with spatial reasoning from language**
- ❌ **Can't bridge the gap from text to 3D geometry**

### GASM's Contribution

GASM represents a step toward AI that understands space the way humans do - not just as coordinates, but as meaningful geometric relationships between objects in the world.

**Applications on the horizon:**

- Robots that understand spatial instructions naturally
- AI architects that reason about 3D spaces from descriptions
- Scientific AI that models physical systems geometrically
- Game AI that understands spatial gameplay naturally
## Local Development

```bash
git clone https://github.com/scheitelpunk/GASM-Huggingface
cd GASM-Huggingface
pip install -r requirements.txt
python app.py
```

The system gracefully handles missing dependencies with intelligent fallbacks.
## Contributing

This is active research in spatial AI! We welcome:

- Bug reports and edge cases
- New spatial relationship types
- Additional language support
- Evaluation datasets
- Performance optimizations
## License & Citation

Licensed under CC-BY-NC 4.0. For research use, please cite:

```bibtex
@misc{gasm2025,
  title={GASM: Geometric Attention for Spatial Understanding},
  author={Michael Neuberger, Versino PsiOmega GmbH},
  year={2025},
  url={https://huggingface.co/spaces/scheitelpunk/GASM}
}
```
## Built With

- **Hugging Face Spaces** - Deployment platform
- **spaCy** - Advanced NLP processing
- **PyTorch** - Neural network framework
- **Gradio** - Interactive ML interfaces
- **Geomstats** - Geometric computing

---

*GASM: Where language meets geometry, and AI begins to understand space.*

Built by Michael Neuberger, Versino PsiOmega GmbH