---
title: Collar Multimodal RAG Demo
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.44.1
app_file: app.py
pinned: false
---
# Collar Multimodal RAG Demo - Production Ready
A production-ready multimodal RAG (Retrieval-Augmented Generation) system with team management, chat history, and advanced document processing capabilities.
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
## New Production Features
### 1. **Multi-Page Citations**
- **Complex Query Support**: The AI can now retrieve and cite multiple pages when queries reference information across different documents
- **Smart Citation System**: Automatically identifies and displays which pages contain relevant information
- **Configurable Results**: Users can specify how many pages to retrieve (1-10 pages)
### 2. **Team-Based Repository Management**
- **Folder Uploads**: Upload multiple documents as organized collections
- **Team Isolation**: Each team has access only to their own document collections
- **Master Repository**: Documents are organized in team-specific repositories for easy access
- **Collection Naming**: Optional custom names for document collections
### 3. **Authentication & Team Management**
- **User Authentication**: Secure login system with bcrypt password hashing
- **Team-Based Access**: Separate entry points for Team A and Team B
- **Session Management**: Secure session handling with automatic timeout
- **Access Control**: Users can only access and manage their team's documents
### 4. **Chat History & Persistence**
- **Conversation Tracking**: All queries and responses are saved to a SQLite database
- **Historical Context**: View previous conversations with timestamps
- **Cited Pages History**: Track which pages were referenced in each conversation
- **Team-Specific History**: Each team sees only their own conversation history
### 5. **Advanced Relevance Scoring**
- **Multimodal Embeddings**: ColPali-based semantic understanding of text and visual content
- **Intelligent Ranking**: Sophisticated relevance scoring with cosine similarity and dot product
- **Quality Assessment**: Automatic evaluation of information relevance and completeness
- **Diversity Optimization**: Ensures comprehensive coverage across document collections
## Installation & Setup
### Prerequisites
- Python 3.8+
- Docker Desktop
- Ollama
- CUDA-compatible GPU (recommended)
### 1. Install Dependencies
```bash
pip install -r requirements.txt
```
### 2. Environment Configuration
Create a `.env` file with the following variables:
```env
colpali=your_colpali_model
ollama=your_ollama_model
flashattn=1
temperature=0.8
batchsize=5
metrictype=IP
mnum=16
efnum=500
topk=50
```
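These values can be read at startup with a small loader. The sketch below is a minimal, stdlib-only `.env` parser (the real app may use a library such as `python-dotenv` instead); the key names match the sample file above:

```python
import os

def load_env(path=".env"):
    """Minimal .env parser (stdlib only): KEY=value lines, no quoting."""
    values = {}
    try:
        with open(path) as fh:
            for line in fh:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    values[key.strip()] = value.strip()
    except FileNotFoundError:
        pass  # fall back to process environment / defaults
    return values

cfg = load_env()
# Fall back to the process environment, then to the documented defaults.
temperature = float(cfg.get("temperature", os.getenv("temperature", "0.8")))
top_k = int(cfg.get("topk", os.getenv("topk", "50")))
```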
### 3. Start Services
The application will automatically:
- Start Docker Desktop (Windows)
- Start Ollama server
- Initialize Docker containers
- Create default users
## Default Users
The system creates default users for each team:
| Team | Username | Password |
|------|----------|----------|
| Team A | admin_team_a | admin123_team_a |
| Team B | admin_team_b | admin123_team_b |
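Default user creation follows the salted-hash pattern described under Security Features. The app itself uses bcrypt; the sketch below substitutes stdlib PBKDF2 so it runs without extra dependencies (the function names are illustrative, not the app's own):

```python
import hashlib, hmac, os

def hash_password(password, salt=None):
    """Salted password hash. Production code should use bcrypt; PBKDF2
    (stdlib) is shown here only to keep the sketch dependency-free."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def verify_password(password, salt, expected):
    # Constant-time comparison to avoid timing side channels.
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, expected)

salt, digest = hash_password("admin123_team_a")
```

Change the default passwords immediately in any real deployment.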
## Usage Guide
### 1. **Authentication**
1. Navigate to the "Authentication" tab
2. Enter your username and password
3. Click "Login" to access team-specific features
### 2. **Document Management**
1. Go to the "Document Management" tab
2. Optionally enter a collection name for organization
3. Set the maximum pages to extract per document
4. Upload multiple PPT/PDF files
5. Click "Upload to Repository" to process documents
6. Use "Refresh Collections" to see available document collections
### 3. **Advanced Querying**
1. Navigate to the "Advanced Query" tab
2. Enter your query in the text box
3. Adjust the number of pages to retrieve (1-10)
4. Click "Search Documents" to get AI response with citations
5. View the cited pages and retrieved document images
6. Check relevance scores to understand information quality (see "Relevance Score Calculation" section)
### 4. **Chat History**
1. Go to the "Chat History" tab
2. Adjust the number of conversations to display
3. Click "Refresh History" to view recent conversations
4. Each entry shows query, response, cited pages, and timestamp
### 5. **Data Management**
1. Access the "Data Management" tab
2. Select collections to delete (team-restricted)
3. Configure database parameters for optimal performance
4. Update settings as needed
## Architecture
### Database Schema
- **users**: User accounts with team assignments
- **chat_history**: Conversation tracking with citations
- **document_collections**: Team-specific document organization
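A minimal SQLite sketch of the three tables (column names are illustrative, not the app's exact DDL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the app persists to a file on disk
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    username TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    team TEXT NOT NULL                -- e.g. 'team_a' or 'team_b'
);
CREATE TABLE chat_history (
    id INTEGER PRIMARY KEY,
    team TEXT NOT NULL,               -- scopes history to one team
    query TEXT NOT NULL,
    response TEXT NOT NULL,
    cited_pages TEXT,                 -- e.g. a JSON list of page references
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE document_collections (
    id INTEGER PRIMARY KEY,
    team TEXT NOT NULL,
    name TEXT NOT NULL
);
""")
```

Filtering every query on the `team` column is what enforces the team isolation described above.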
### Security Features
- **Password Hashing**: bcrypt for secure password storage
- **Session Management**: UUID-based session tokens
- **Access Control**: Team-based document isolation
- **Input Validation**: Comprehensive error handling
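The UUID-token session handling with automatic timeout can be sketched as follows (an in-memory store for illustration; names are hypothetical):

```python
import time
import uuid

SESSION_TIMEOUT = 3600  # seconds, matching the SESSION_TIMEOUT env variable
_sessions = {}          # token -> session record

def create_session(username, team):
    """Issue a UUID session token bound to a user and team."""
    token = str(uuid.uuid4())
    _sessions[token] = {"user": username, "team": team, "ts": time.time()}
    return token

def get_session(token):
    """Return the session record, or None if unknown or expired."""
    session = _sessions.get(token)
    if session is None or time.time() - session["ts"] > SESSION_TIMEOUT:
        _sessions.pop(token, None)  # drop stale sessions eagerly
        return None
    return session
```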
### Performance Optimizations
- **Multi-threading**: Concurrent document processing
- **Memory Management**: Efficient image and vector handling
- **Caching**: Session-based caching for improved response times
- **Batch Processing**: Configurable batch sizes for GPU optimization
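The batch-processing point reduces to chunking work into fixed-size groups before sending them to the GPU; a minimal helper (the default of 5 mirrors the `batchsize` env value):

```python
def batched(items, batch_size=5):
    """Yield consecutive fixed-size chunks of a list; the last chunk
    may be shorter. Used to bound per-batch GPU memory."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```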
## Relevance Score Calculation
The system uses sophisticated relevance scoring to determine how well retrieved documents align with user queries. This process is crucial for selecting the most pertinent information for generating accurate and contextually appropriate responses.
### How Relevance Scores Work
#### 1. **Document Embedding Process**
- **Page Segmentation**: Each document page is processed as a complete unit
- **Multimodal Encoding**: Both text and visual elements are captured using ColPali embeddings
- **Vector Representation**: Pages are transformed into high-dimensional numerical vectors (typically 768-1024 dimensions)
- **Semantic Capture**: The embedding captures semantic meaning, not just keyword matches
#### 2. **Query Embedding**
- **Query Processing**: User queries are converted into embeddings using the same ColPali model
- **Semantic Understanding**: The system understands query intent, not just literal words
- **Context Preservation**: Query context and meaning are maintained in the embedding
#### 3. **Similarity Computation**
- **Cosine Similarity**: Primary similarity measure between query and document embeddings
- **Dot Product**: Alternative similarity calculation for high-dimensional vectors
- **Normalized Scores**: Similarity scores are normalized to a 0-1 range
- **Distance Metrics**: Lower distances indicate higher relevance
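The two similarity measures can be sketched in pure Python (a stand-in for the vectorized operations the system actually runs; for unit-normalized embeddings the inner product and cosine similarity coincide, which is why `metrictype=IP` works):

```python
import math

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0

def to_unit_range(score):
    """Map a cosine score from [-1, 1] onto the 0-1 range used below."""
    return (score + 1) / 2
```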
#### 4. **Score Aggregation & Ranking**
- **Individual Page Scores**: Each page gets a relevance score based on similarity
- **Collection Diversity**: Scores are adjusted to promote diversity across document collections
- **Consecutive Page Optimization**: Adjacent pages are considered for better context
- **Final Ranking**: Pages are ranked by their aggregated relevance scores
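One simple way to realize the diversity adjustment is to cap how many pages any single collection may contribute; a sketch (the cap and structure are illustrative, not the app's exact algorithm):

```python
def rank_pages(scored_pages, k=3, per_collection_cap=2):
    """Take the top-k pages by score while limiting how many come
    from any one collection, promoting cross-collection diversity."""
    taken = {}
    result = []
    for page in sorted(scored_pages, key=lambda p: p["score"], reverse=True):
        coll = page["collection"]
        if taken.get(coll, 0) < per_collection_cap:
            result.append(page)
            taken[coll] = taken.get(coll, 0) + 1
        if len(result) == k:
            break
    return result
```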
### Relevance Score Interpretation
| Score Range | Relevance Level | Description |
|-------------|----------------|-------------|
| 0.90 - 1.00 | **Excellent** | Highly relevant, directly answers the query |
| 0.80 - 0.89 | **Very Good** | Very relevant, provides substantial information |
| 0.70 - 0.79 | **Good** | Relevant, contains useful information |
| 0.60 - 0.69 | **Moderate** | Somewhat relevant, may contain partial answers |
| 0.50 - 0.59 | **Basic** | Minimally relevant, limited usefulness |
| < 0.50 | **Poor** | Not relevant, unlikely to be useful |
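The bands in the table map directly onto a small bucketing function (a sketch of how the UI could label scores, not code from the app):

```python
def relevance_label(score):
    """Bucket a 0-1 relevance score into the bands from the table above."""
    if score >= 0.90:
        return "Excellent"
    if score >= 0.80:
        return "Very Good"
    if score >= 0.70:
        return "Good"
    if score >= 0.60:
        return "Moderate"
    if score >= 0.50:
        return "Basic"
    return "Poor"
```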
### Example Relevance Calculation
**Query**: "What are the safety procedures for handling explosives?"
**Document Pages**:
1. **Page 15**: "Safety protocols for explosive materials" → Score: 0.95 (Excellent)
2. **Page 23**: "Equipment requirements for explosive handling" → Score: 0.92 (Very Good)
3. **Page 8**: "General laboratory safety guidelines" → Score: 0.88 (Very Good)
4. **Page 45**: "Chemical storage procedures" → Score: 0.65 (Moderate)
**Selection Process**:
- Pages 15, 23, and 8 are selected for their high relevance
- Page 45 is excluded due to lower relevance
- The system ensures diversity across different aspects of safety procedures
### Advanced Features
#### **Multi-Modal Relevance**
- **Visual Elements**: Images, charts, and diagrams contribute to relevance scores
- **Text-Vision Alignment**: ColPali captures relationships between text and visual content
- **Layout Understanding**: Document structure and formatting influence relevance
#### **Context-Aware Scoring**
- **Query Complexity**: Complex queries may retrieve more pages with varied scores
- **Cross-Reference Detection**: Pages that reference each other get boosted scores
- **Temporal Relevance**: Recent documents may receive slight score adjustments
#### **Quality Assurance**
- **Score Verification**: System validates that selected pages meet minimum relevance thresholds
- **Diversity Optimization**: Ensures selected pages provide comprehensive coverage
- **Redundancy Reduction**: Avoids selecting multiple pages with very similar content
### Configuration Parameters
```env
# Relevance scoring configuration
metrictype=IP # Inner Product similarity
mnum=16 # Number of connections in HNSW graph
efnum=500 # Search depth for high-quality results
topk=50 # Maximum results to consider
```
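The `M`/`efConstruction`/`ef` naming suggests an HNSW index in a Milvus-style vector store; assuming that, the env values map onto index and search parameters roughly as follows (shown as plain dicts; adapt the shape to your vector store's API):

```python
import os

# Hypothetical mapping of the env values above onto HNSW parameters.
index_params = {
    "index_type": "HNSW",
    "metric_type": os.getenv("metrictype", "IP"),  # inner product
    "params": {
        "M": int(os.getenv("mnum", "16")),              # graph connectivity
        "efConstruction": int(os.getenv("efnum", "500")),  # build-time depth
    },
}
search_params = {
    "metric_type": index_params["metric_type"],
    "params": {"ef": int(os.getenv("efnum", "500"))},  # query-time depth
}
top_k = int(os.getenv("topk", "50"))  # maximum candidates to consider
```

Higher `M` and `ef` improve recall at the cost of memory and latency.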
### Performance Impact
- **Search Speed**: Relevance scoring adds minimal overhead (~10-50ms per query)
- **Accuracy**: High-quality embeddings ensure accurate relevance assessment
- **Scalability**: Efficient vector operations support large document collections
- **Memory Usage**: Optimized to handle thousands of document pages efficiently
## Security Considerations
### Production Deployment
1. **HTTPS**: Always use HTTPS in production
2. **Environment Variables**: Store sensitive data in environment variables
3. **Database Security**: Use production-grade database (PostgreSQL/MySQL)
4. **Rate Limiting**: Implement API rate limiting
5. **Logging**: Add comprehensive logging for security monitoring
### Recommended Security Enhancements
```python
# Add to a production deployment. Assumes an existing Flask `app`
# serving as the front end; the demo itself runs on Gradio, so this
# is a sketch of the hardening steps rather than code from app.py.
import logging

from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

# Rate limiting (flask-limiter < 3.0 signature; newer releases use
# Limiter(get_remote_address, app=app, ...))
limiter = Limiter(
    app,
    key_func=get_remote_address,
    default_limits=["200 per day", "50 per hour"],
)

# Security headers
@app.after_request
def add_security_headers(response):
    response.headers['X-Content-Type-Options'] = 'nosniff'
    response.headers['X-Frame-Options'] = 'DENY'
    response.headers['X-XSS-Protection'] = '1; mode=block'
    return response
```
## Deployment
### Docker Deployment
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
```
### Environment Variables for Production
```env
# Database
DATABASE_URL=postgresql://user:password@localhost/dbname
SECRET_KEY=your-secret-key-here
# Security
BCRYPT_ROUNDS=12
SESSION_TIMEOUT=3600
# Performance
WORKER_THREADS=4
MAX_UPLOAD_SIZE=100MB
```
## Monitoring & Analytics
### Key Metrics to Track
- **Query Response Time**: Average time for AI responses
- **Document Processing Time**: Time to index new documents
- **User Activity**: Login frequency and session duration
- **Error Rates**: Failed queries and system errors
- **Storage Usage**: Database and file system utilization
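Timing metrics such as query response time can be captured with a small decorator around the handler (a sketch; `answer_query` is a hypothetical stand-in for the real query function):

```python
import functools
import logging
import time

def timed(metric_name):
    """Log the wall-clock duration of each call under a metric name."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                logging.info("%s took %.1f ms", metric_name, elapsed_ms)
        return wrapper
    return decorator

@timed("query_response")
def answer_query(query):  # stand-in for the real query handler
    return f"answer to: {query}"
```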
### Logging Configuration
```python
import logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('app.log'),  # persist to file
        logging.StreamHandler()          # echo to console
    ]
)
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new features
5. Submit a pull request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Support
For support and questions:
- Create an issue in the repository
- Check the documentation
- Review the troubleshooting guide
---
**Made by Collar** - Enhanced with Team Management & Chat History