---
title: GASM Enhanced - Geometric Language AI
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.16.0
app_file: app.py
pinned: false
license: cc-by-nd-4.0
---
# 🚀 GASM Enhanced - Geometric Attention for Spatial Understanding
> *Bridging natural language and geometric reasoning through SE(3)-invariant neural architectures*
## What Makes This Different?
Traditional AI understands *what* objects are mentioned, but struggles with *where* they are and *how* they relate spatially. GASM changes this.
**GASM** (Geometric Attention for Spatial & Mathematical understanding) approaches spatial reasoning from text with four components:
- **🧠 Advanced NLP**: Goes beyond keyword matching by combining spaCy with semantic categorization
- **📐 Proper 3D Math**: Uses the SE(3) Lie group to represent rotations and translations consistently
- **🔄 Geometric Optimization**: Minimizes curvature on Riemannian manifolds to refine layouts
- **✨ Real-time Visualization**: Shows the inferred spatial layout as live 3D geometry
## 🌟 What This Enables
### The Spatial Intelligence Gap
Current language models excel at:
- ✅ "What is a keyboard?" → *An input device*
- ❌ "Where is the keyboard relative to the monitor?" → *Spatial confusion*
GASM bridges this gap through mathematical spatial reasoning.
### Real Applications
This isn't just a demo - GASM addresses actual problems in:
- **🤖 Robotics**: "Move the component above the platform" → Precise 3D coordinates
- **🔬 Scientific Modeling**: "The electron orbits the nucleus" → Proper geometric relationships
- **🏗️ Engineering**: "Place the support between the beams" → Constraint satisfaction
- **🥽 AR/VR**: Natural language to 3D scene understanding
## 🎯 Try It Yourself
### Watch GASM in Action
Input any sentence with spatial relationships:
> *"The ball lies left of the table next to the computer, while the book sits between the keyboard and the monitor."*
**GASM Output:**
- ✅ **6 entities identified**: ball, table, computer, book, keyboard, monitor
- 🔗 **5 spatial relations**: left_of, next_to, between
- 🌌 **3D geometric layout** with proper SE(3) positioning
- 📈 **Curvature evolution** showing geometric convergence
### More Examples
**🤖 Robotics**: *"The robotic arm moves the satellite component above the assembly platform."*
**🔬 Scientific**: *"The electron orbits the nucleus while the magnetic field flows through the crystal."*
**🏠 Everyday**: *"The red car parks between two buildings near the park entrance."*
### What You'll See
1. **Advanced Entity Recognition**: Far beyond simple keyword matching
2. **Spatial Relationship Extraction**: Understands "left of", "between", "above" in context
3. **3D Visualization**: Real geometric positioning in proper 3D space
4. **Mathematical Convergence**: Curvature evolution showing optimization progress
## 📁 Project Structure
```
GASM-Huggingface/
├── app.py                 # Main Gradio application with complete interface
├── gasm_core.py           # Core GASM implementation with SE(3) math
├── fastapi_endpoint.py    # Optional API endpoints (standalone)
├── requirements.txt       # Python dependencies
└── README.md              # This file
```
## 🧮 The Mathematics Behind GASM
### What Makes It Special
Unlike traditional NLP that treats text as sequences of tokens, GASM understands geometry:
**1. SE(3) Invariant Processing**
- Uses Special Euclidean Group SE(3) for proper 3D transformations
- Maintains mathematical correctness under rotations and translations
- Employs Lie group operations for geometric learning
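To make the invariance claim concrete, here is a minimal sketch (not code from `gasm_core.py`) showing that a rigid transform leaves the distances between entities unchanged. For brevity it only rotates about the z-axis, which is a subset of SE(3); the function names and the entity positions are hypothetical.

```python
import math

def se3_apply(yaw, translation, point):
    """Rotate `point` about the z-axis by `yaw`, then translate it."""
    c, s = math.cos(yaw), math.sin(yaw)
    x, y, z = point
    tx, ty, tz = translation
    return (c * x - s * y + tx, s * x + c * y + ty, z + tz)

def pairwise_distances(points):
    """Distances between every pair of points, in a fixed pair order."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]

# Hypothetical positions for keyboard, monitor, book (not GASM output).
layout = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.3), (0.0, 0.25, 0.1)]
moved = [se3_apply(math.pi / 3, (2.0, -1.0, 0.5), p) for p in layout]

before = pairwise_distances(layout)
after = pairwise_distances(moved)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after)))
```

Any quantity built only from such distances, for example a "next to" score, therefore does not depend on where the scene sits or how it is oriented.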
**2. Advanced Entity Recognition**
- **spaCy NLP**: Part-of-speech tagging + named entity recognition
- **Semantic Filtering**: Domain-specific vocabularies (robotics, scientific, everyday)
- **Contextual Understanding**: Extracts objects from spatial prepositions
**3. Geometric Optimization**
- **Geodesic Distances**: Shortest paths on SE(3) manifold
- **Discrete Curvature**: Graph Laplacian eigenvalue-based computation
- **Energy Minimization**: Constraint satisfaction via Lagrange multipliers
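As a rough illustration of energy minimization for a spatial constraint, the sketch below relaxes a "between" relation with plain gradient descent on a quadratic penalty. This is a deliberately simpler technique than the Lagrange-multiplier formulation named above, and the function names, step size, and positions are assumptions made for the example.

```python
def between_energy(book, keyboard, monitor):
    """Squared distance from `book` to the midpoint of the other two."""
    mid = [(k + m) / 2 for k, m in zip(keyboard, monitor)]
    return sum((b - c) ** 2 for b, c in zip(book, mid))

def relax_between(book, keyboard, monitor, lr=0.25, steps=20):
    """Gradient descent on the penalty; returns final position and energy trace."""
    book = list(book)
    mid = [(k + m) / 2 for k, m in zip(keyboard, monitor)]
    trace = [between_energy(book, keyboard, monitor)]
    for _ in range(steps):
        # Gradient of ||book - mid||^2 with respect to book is 2 * (book - mid).
        book = [b - lr * 2 * (b - c) for b, c in zip(book, mid)]
        trace.append(between_energy(book, keyboard, monitor))
    return book, trace

# Hypothetical starting layout (not GASM output).
keyboard, monitor = (0.0, 0.0, 0.0), (0.0, 1.0, 0.4)
position, trace = relax_between((1.0, -1.0, 2.0), keyboard, monitor)
print(position, trace[0], trace[-1])
```

The energy trace falls toward zero as the book approaches the midpoint, which is the same kind of convergence signal an optimization plot would display.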
### Technical Architecture
```
Text → spaCy NLP → Entity Extraction → Semantic Filtering
↓
SE(3) Embedding → Attention Mechanism → Geometric Refinement
↓
Constraint Satisfaction → Curvature Optimization → 3D Visualization
```
### Why This Matters
Most AI systems use word embeddings that lose spatial meaning. GASM instead places entities in 3D and refines the layout with SE(3) operations and constraints, so relations such as "above" or "between" stay explicit in the geometry.
## 🎨 Visualizations
The Space provides two main visualizations:
### 1. Curvature Evolution Plot
- Shows geometric convergence over iterations
- Displays SE(3) manifold optimization progress
- Uses matplotlib with dark theme for clarity
### 2. 3D Entity Space Plot
- Interactive 3D positioning of extracted entities
- Color-coded by entity type (robotic, physical, spatial, etc.)
- Shows relationship connections between entities
## 🔬 How It Works
1. **Text Input**: User provides text for analysis
2. **Entity Extraction**: Regex-based extraction of meaningful entities
3. **Relation Detection**: Identification of spatial, temporal, physical relations
4. **GASM Processing**: If available, real SE(3) forward pass through geometric manifold
5. **Visualization**: Generate curvature evolution and 3D entity plots
6. **Results**: Comprehensive analysis with JSON output
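Steps 2 and 3 above can be sketched with a few regular expressions. This is a simplified illustration, not the patterns used in `app.py`: the relation vocabulary and the function name are hypothetical, and a pattern set this small will find fewer relations than the demo reports.

```python
import re

# Hypothetical vocabulary; each pattern allows one optional verb
# between the subject noun and the spatial phrase.
BINARY = {
    "left_of": "left of",
    "next_to": "next to",
    "above": "above",
}
BETWEEN = re.compile(
    r"\bthe (\w+) (?:\w+ )?between the (\w+) and the (\w+)", re.IGNORECASE
)

def extract_relations(text):
    """Return relation tuples such as ("left_of", "ball", "table")."""
    found = []
    for name, phrase in BINARY.items():
        pattern = re.compile(
            rf"\bthe (\w+) (?:\w+ )?{phrase} the (\w+)", re.IGNORECASE
        )
        found += [(name, subj, obj) for subj, obj in pattern.findall(text)]
    found += [("between", s, a, b) for s, a, b in BETWEEN.findall(text)]
    return found

sentence = (
    "The ball lies left of the table next to the computer, while the book "
    "sits between the keyboard and the monitor."
)
relations = extract_relations(sentence)
print(relations)
```

Each tuple names the relation first, followed by the entities it connects, which is a convenient shape for turning relations into geometric constraints later in the pipeline.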
## ⚡ Performance
- **CPU Mode**: Optimized for HuggingFace Spaces CPU allocation
- **GPU Fallback**: Automatic ZeroGPU usage when available
- **Memory Efficient**: ~430MB total memory footprint
- **Fast Processing**: 0.1-0.8s processing time depending on text length
## 📊 Space Configuration
This Space is configured with:
- **SDK**: Gradio 4.44.1+
- **Python**: 3.8+
- **GPU**: ZeroGPU compatible (A10G/T4 fallback)
- **Memory**: 16GB RAM allocation
- **Storage**: Persistent storage for model caching
## 🔍 API Endpoints
The Space also exposes FastAPI endpoints (when `fastapi_endpoint.py` is run separately):
- `POST /process`: Process text with geometric enhancement
- `GET /health`: Health check and memory usage
- `GET /info`: Model configuration information
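A client call to `POST /process` might look like the sketch below. The request schema is not documented here, so the JSON body with a single `text` field, the `http://localhost:8000` base URL, and the helper name are all assumptions; check `fastapi_endpoint.py` for the actual model before relying on them.

```python
import json
import urllib.request

def build_process_request(base_url, text):
    """Build (but do not send) a POST request for the /process endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")  # assumed schema
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_process_request(
    "http://localhost:8000",  # assumed host and port
    "The book sits between the keyboard and the monitor.",
)

# To send it once the server is running:
# with urllib.request.urlopen(request, timeout=10) as response:
#     result = json.load(response)
```

Building the request separately from sending it keeps the example runnable without a live server.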
## 📈 Use Cases
Perfect for analyzing:
- **Technical Documentation**: Spatial relationships in engineering texts
- **Scientific Literature**: Physical phenomena and experimental setups
- **Educational Content**: Geometry and physics explanations
- **Robotic Systems**: Assembly instructions and spatial configurations
## 🎯 Model Details
- **Base Architecture**: Built on transformer foundations
- **Geometric Processing**: SE(3) Lie group operations
- **Attention Mechanism**: Geodesic distance-based attention weighting
- **Curvature Computation**: Discrete Gaussian curvature via graph Laplacian
- **Constraint Handling**: Energy minimization with Lagrange multipliers
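Distance-based attention weighting can be sketched as a softmax over negative pairwise distances, so nearby entities attend to each other more strongly. In this illustration Euclidean distance between positions stands in for the SE(3) geodesic distance, and the function name, temperature parameter, and positions are hypothetical.

```python
import math

def distance_attention(positions, temperature=1.0):
    """Row-normalized weights: softmax of negative pairwise distances."""
    weights = []
    for p in positions:
        scores = [-math.dist(p, q) / temperature for q in positions]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical positions for keyboard, book, monitor (not GASM output).
positions = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.2), (0.0, 3.0, 1.0)]
attention = distance_attention(positions)
for row in attention:
    print([round(w, 3) for w in row])
```

Each row sums to one, and the keyboard gives more weight to the nearby book than to the distant monitor. A lower temperature sharpens that preference.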
## 🚀 Why This Matters
### Current State of AI
- ✅ Excellent at text understanding and generation
- ✅ Great at image recognition and computer vision
- ❌ **Struggles with spatial reasoning from language**
- ❌ **Can't bridge text ↔ 3D geometry gap**
### GASM's Contribution
GASM represents a step toward AI that understands space the way humans do - not just as coordinates, but as meaningful geometric relationships between objects in the world.
**Applications on the horizon:**
- 🤖 Robots that understand spatial instructions naturally
- 🏗️ AI architects that reason about 3D spaces from descriptions
- 🔬 Scientific AI that models physical systems geometrically
- 🎮 Game AI that understands spatial gameplay naturally
## 🛠️ Local Development
```bash
git clone https://github.com/scheitelpunk/GASM-Huggingface
cd GASM-Huggingface
pip install -r requirements.txt
python app.py
```
The system gracefully handles missing dependencies with intelligent fallbacks.
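One common way to implement such a fallback is to probe for the optional package and switch to a simpler code path when it is missing. The sketch below is an assumption about the pattern, not the logic in `app.py`: the function name, the `en_core_web_sm` model choice, and the crude article-based regex are illustrative only.

```python
import importlib.util
import re

def make_entity_extractor():
    """Return (backend_name, extractor); use a regex if spaCy is unavailable."""
    if importlib.util.find_spec("spacy") is not None:
        try:
            import spacy
            nlp = spacy.load("en_core_web_sm")  # assumed model name
            return "spacy", lambda text: [
                token.text for token in nlp(text) if token.pos_ == "NOUN"
            ]
        except Exception:
            pass  # package or model failed to load; use the fallback below
    # Crude fallback: take the word that follows an article.
    pattern = re.compile(r"\b(?:the|a|an) (\w+)", re.IGNORECASE)
    return "regex", lambda text: pattern.findall(text)

backend, extract = make_entity_extractor()
print(backend, extract("The ball lies left of the table."))
```

Returning the backend name alongside the extractor lets the interface report which path is active.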
## 🤝 Contributing
This is active research in spatial AI! We welcome:
- 🐛 Bug reports and edge cases
- 💡 New spatial relationship types
- 🌍 Additional language support
- 📊 Evaluation datasets
- 🔧 Performance optimizations
## 📄 License & Citation
Licensed under CC-BY-NC 4.0. For research use, please cite:
```bibtex
@misc{gasm2025,
title={GASM: Geometric Attention for Spatial Understanding},
author={Michael Neuberger and {Versino PsiOmega GmbH}},
year={2025},
url={https://huggingface.co/spaces/scheitelpunk/GASM}
}
```
## 🙏 Built With
- 🤗 **Hugging Face Spaces** - Deployment platform
- 🌍 **spaCy** - Advanced NLP processing
- 🔢 **PyTorch** - Neural network framework
- 📊 **Gradio** - Interactive ML interfaces
- 📐 **Geomstats** - Geometric computing
---
*GASM: Where language meets geometry, and AI begins to understand space.* 🚀
Built by Michael Neuberger, Versino PsiOmega GmbH