---
title: Collar Multimodal RAG Demo
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.44.1
app_file: app.py
pinned: false
---

# Collar Multimodal RAG Demo - Production Ready

A production-ready multimodal RAG (Retrieval-Augmented Generation) system with team management, chat history, and advanced document processing capabilities.

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

## 🚀 New Production Features

### 1. **Multi-Page Citations**
- **Complex Query Support**: The AI can now retrieve and cite multiple pages when queries reference information across different documents
- **Smart Citation System**: Automatically identifies and displays which pages contain relevant information
- **Configurable Results**: Users can specify how many pages to retrieve (1-10 pages)

### 2. **Team-Based Repository Management**
- **Folder Uploads**: Upload multiple documents as organized collections
- **Team Isolation**: Each team has access only to their own document collections
- **Master Repository**: Documents are organized in team-specific repositories for easy access
- **Collection Naming**: Optional custom names for document collections

### 3. **Authentication & Team Management**
- **User Authentication**: Secure login system with bcrypt password hashing
- **Team-Based Access**: Separate entry points for Team A and Team B
- **Session Management**: Secure session handling with automatic timeout
- **Access Control**: Users can only access and manage their team's documents

### 4. **Chat History & Persistence**
- **Conversation Tracking**: All queries and responses are saved to a SQLite database
- **Historical Context**: View previous conversations with timestamps
- **Cited Pages History**: Track which pages were referenced in each conversation
- **Team-Specific History**: Each team sees only their own conversation history

### 5. **Advanced Relevance Scoring**
- **Multimodal Embeddings**: ColPali-based semantic understanding of text and visual content
- **Intelligent Ranking**: Sophisticated relevance scoring with cosine similarity and dot product
- **Quality Assessment**: Automatic evaluation of information relevance and completeness
- **Diversity Optimization**: Ensures comprehensive coverage across document collections

## 🔧 Installation & Setup

### Prerequisites
- Python 3.10+ (Gradio 5.x, the `sdk_version` pinned in the front matter, does not support older interpreters)
- Docker Desktop
- Ollama
- CUDA-compatible GPU (recommended)

### 1. Install Dependencies
```bash
pip install -r requirements.txt
```

### 2. Environment Configuration
Create a `.env` file with the following variables:
```env
colpali=your_colpali_model
ollama=your_ollama_model
flashattn=1
temperature=0.8
batchsize=5
metrictype=IP
mnum=16
efnum=500
topk=50
```
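
The values above are plain strings until the application parses them. The following is a minimal sketch of a typed loader, assuming the variables have already been loaded into the process environment (for example by `python-dotenv`); the function name `load_settings` and the fallback defaults are illustrative and not taken from the application code.

```python
import os

# Defaults mirror the sample .env above; variable names are read verbatim.
DEFAULTS = {
    "flashattn": "1",
    "temperature": "0.8",
    "batchsize": "5",
    "metrictype": "IP",
    "mnum": "16",
    "efnum": "500",
    "topk": "50",
}


def load_settings(env=os.environ):
    """Return typed settings, falling back to the sample defaults (hypothetical helper)."""
    raw = {key: env.get(key, default) for key, default in DEFAULTS.items()}
    return {
        "colpali": env.get("colpali"),  # model identifiers have no default
        "ollama": env.get("ollama"),
        "flashattn": raw["flashattn"] == "1",
        "temperature": float(raw["temperature"]),
        "batchsize": int(raw["batchsize"]),
        "metrictype": raw["metrictype"].upper(),
        "mnum": int(raw["mnum"]),
        "efnum": int(raw["efnum"]),
        "topk": int(raw["topk"]),
    }
```

Parsing once at startup means a malformed value such as `batchsize=five` fails immediately with a `ValueError` instead of surfacing later during indexing.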

### 3. Start Services
The application will automatically:
- Start Docker Desktop (Windows)
- Start Ollama server
- Initialize Docker containers
- Create default users

## 👥 Default Users

The system creates default users for each team:

| Team | Username | Password |
|------|----------|----------|
| Team A | admin_team_a | admin123_team_a |
| Team B | admin_team_b | admin123_team_b |

## 📖 Usage Guide

### 1. **Authentication**
1. Navigate to the "🔐 Authentication" tab
2. Enter your username and password
3. Click "Login" to access team-specific features

### 2. **Document Management**
1. Go to "📁 Document Management" tab
2. Optionally enter a collection name for organization
3. Set the maximum pages to extract per document
4. Upload multiple PPT/PDF files
5. Click "Upload to Repository" to process documents
6. Use "Refresh Collections" to see available document collections

### 3. **Advanced Querying**
1. Navigate to "🔍 Advanced Query" tab
2. Enter your query in the text box
3. Adjust the number of pages to retrieve (1-10)
4. Click "Search Documents" to get AI response with citations
5. View the cited pages and retrieved document images
6. Check relevance scores to understand information quality (see "Relevance Score Calculation" section)

### 4. **Chat History**
1. Go to "💬 Chat History" tab
2. Adjust the number of conversations to display
3. Click "Refresh History" to view recent conversations
4. Each entry shows query, response, cited pages, and timestamp

### 5. **Data Management**
1. Access "⚙️ Data Management" tab
2. Select collections to delete (team-restricted)
3. Configure database parameters for optimal performance
4. Update settings as needed

## 🏗️ Architecture

### Database Schema
- **users**: User accounts with team assignments
- **chat_history**: Conversation tracking with citations
- **document_collections**: Team-specific document organization
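
As a rough illustration of how these three tables could fit together, here is a minimal SQLite sketch. The column names, the comma-separated `cited_pages` encoding, and the helper functions are assumptions made for the example; the application's actual schema may differ.

```python
import sqlite3

# Hypothetical columns; only the three table names come from the list above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id            INTEGER PRIMARY KEY,
    username      TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    team          TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS chat_history (
    id          INTEGER PRIMARY KEY,
    team        TEXT NOT NULL,
    query       TEXT NOT NULL,
    response    TEXT NOT NULL,
    cited_pages TEXT,
    created_at  TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS document_collections (
    id   INTEGER PRIMARY KEY,
    team TEXT NOT NULL,
    name TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    """Open the database and create the tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_chat(conn, team, query, response, cited_pages):
    """Store one exchange together with the pages it cited."""
    conn.execute(
        "INSERT INTO chat_history (team, query, response, cited_pages) "
        "VALUES (?, ?, ?, ?)",
        (team, query, response, ",".join(map(str, cited_pages))),
    )
    conn.commit()


def team_history(conn, team, limit=10):
    """Return the most recent exchanges for one team only."""
    rows = conn.execute(
        "SELECT query, response, cited_pages, created_at FROM chat_history "
        "WHERE team = ? ORDER BY id DESC LIMIT ?",
        (team, limit),
    )
    return rows.fetchall()
```

Filtering by `team` inside the query, rather than after fetching, keeps another team's rows from ever leaving the database layer.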

### Security Features
- **Password Hashing**: bcrypt for secure password storage
- **Session Management**: UUID-based session tokens
- **Access Control**: Team-based document isolation
- **Input Validation**: Comprehensive error handling
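
The session and access-control items can be sketched with the standard library alone. This is an illustrative in-memory store, assuming a one-hour timeout; the class name `SessionStore` and its methods are hypothetical and do not describe the application's real implementation.

```python
import time
import uuid

SESSION_TIMEOUT = 3600  # seconds; assumed value for this sketch


class SessionStore:
    """In-memory session table keyed by UUID token (hypothetical helper)."""

    def __init__(self, timeout=SESSION_TIMEOUT, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._sessions = {}  # token -> (username, team, expires_at)

    def create(self, username, team):
        """Issue a random token for a user who has just logged in."""
        token = uuid.uuid4().hex
        self._sessions[token] = (username, team, self._clock() + self._timeout)
        return token

    def team_for(self, token):
        """Return the team for a live session, or None if missing or expired."""
        entry = self._sessions.get(token)
        if entry is None:
            return None
        _username, team, expires_at = entry
        if self._clock() >= expires_at:
            del self._sessions[token]  # drop expired sessions lazily
            return None
        return team

    def can_access(self, token, collection_team):
        """Allow access only to collections owned by the session's team."""
        team = self.team_for(token)
        return team is not None and team == collection_team
```

Injecting the clock makes the timeout testable without sleeping, and a monotonic clock avoids sessions expiring early or late when the system time changes.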

### Performance Optimizations
- **Multi-threading**: Concurrent document processing
- **Memory Management**: Efficient image and vector handling
- **Caching**: Session-based caching for improved response times
- **Batch Processing**: Configurable batch sizes for GPU optimization
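
A small sketch of how configurable batching and concurrent processing might combine. The `embed_batch` callable is a stand-in for whatever embeds a list of pages, and the default `batch_size` of 5 simply echoes the `batchsize` value in the sample `.env`; none of these names come from the application itself.

```python
from concurrent.futures import ThreadPoolExecutor


def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_in_batches(pages, embed_batch, batch_size=5, workers=4):
    """Run embed_batch over each batch concurrently, preserving page order."""
    batches = list(batched(pages, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order the batches were submitted.
        return [vector for batch in pool.map(embed_batch, batches) for vector in batch]
```

On a single GPU, concurrent batches compete for the same memory, so `workers=1` with a larger `batch_size` may be the better trade-off; the thread pool helps more when `embed_batch` is I/O-bound.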

## 🔍 Relevance Score Calculation

The system uses sophisticated relevance scoring to determine how well retrieved documents align with user queries. This process is crucial for selecting the most pertinent information for generating accurate and contextually appropriate responses.

### How Relevance Scores Work

#### 1. **Document Embedding Process**
- **Page Segmentation**: Each document page is processed as a complete unit
- **Multimodal Encoding**: Both text and visual elements are captured using ColPali embeddings
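- **Vector Representation**: Pages are transformed into numerical vectors; ColPali emits one vector per image patch (128 dimensions each in the published checkpoints) rather than a single 768-1024 dimension vector per page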
- **Semantic Capture**: The embedding captures semantic meaning, not just keyword matches

#### 2. **Query Embedding**
- **Query Processing**: User queries are converted into embeddings using the same ColPali model
- **Semantic Understanding**: The system understands query intent, not just literal words
- **Context Preservation**: Query context and meaning are maintained in the embedding

#### 3. **Similarity Computation**
- **Cosine Similarity**: Primary similarity measure between query and document embeddings
- **Dot Product**: Alternative similarity calculation for high-dimensional vectors
- **Normalized Scores**: Similarity scores are normalized to a 0-1 range
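- **Distance Metrics**: For distance-based metrics, lower distances indicate higher relevance; for the inner product configured by default (`metrictype=IP`), higher values indicate higher relevance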
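
The two similarity measures named above differ only in whether vector length is divided out. The sketch below shows both in plain Python, plus one possible way to map a cosine value onto a 0-1 range; the linear rescaling in `to_unit_interval` is an assumption, since the exact normalization used by the system is not specified here.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Inner product divided by both vector lengths; 0.0 for a zero vector."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def to_unit_interval(cosine):
    """Map a cosine in [-1, 1] linearly onto [0, 1] (assumed normalization)."""
    return (cosine + 1.0) / 2.0
```

When embeddings are already unit-length, the inner product and the cosine similarity give the same ranking, which is one reason an inner-product index can stand in for cosine search.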

#### 4. **Score Aggregation & Ranking**
- **Individual Page Scores**: Each page gets a relevance score based on similarity
- **Collection Diversity**: Scores are adjusted to promote diversity across document collections
- **Consecutive Page Optimization**: Adjacent pages are considered for better context
- **Final Ranking**: Pages are ranked by their aggregated relevance scores
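
One simple way to promote diversity while ranking is a greedy pass that slightly penalizes collections already represented in the result. The sketch below illustrates that idea only; the function `rank_pages`, the tuple layout, and the penalty value are hypothetical and are not the system's actual aggregation logic.

```python
def rank_pages(pages, top_n=3, diversity_penalty=0.05):
    """Greedy re-ranking of (collection, page, score) tuples.

    Each page already chosen from a collection lowers the adjusted score of
    the remaining pages from that collection by diversity_penalty.
    """
    remaining = list(pages)
    picked_per_collection = {}
    selected = []

    def adjusted(item):
        collection, _page, score = item
        return score - diversity_penalty * picked_per_collection.get(collection, 0)

    while remaining and len(selected) < top_n:
        best = max(remaining, key=adjusted)
        remaining.remove(best)
        picked_per_collection[best[0]] = picked_per_collection.get(best[0], 0) + 1
        selected.append(best)
    return selected
```

With `diversity_penalty=0` this reduces to a plain sort by score, so the parameter controls how much raw relevance is traded for coverage across collections.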

### Relevance Score Interpretation

| Score Range | Relevance Level | Description |
|-------------|----------------|-------------|
| 0.90 - 1.00 | **Excellent** | Highly relevant, directly answers the query |
| 0.80 - 0.89 | **Very Good** | Very relevant, provides substantial information |
| 0.70 - 0.79 | **Good** | Relevant, contains useful information |
| 0.60 - 0.69 | **Moderate** | Somewhat relevant, may contain partial answers |
| 0.50 - 0.59 | **Basic** | Minimally relevant, limited usefulness |
| < 0.50 | **Poor** | Not relevant, unlikely to be useful |
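
The table translates directly into a lookup. This sketch treats each band as starting at its lower bound, so a value such as 0.895 that falls between two printed ranges is assigned to the lower band; the function name `relevance_level` is illustrative.

```python
def relevance_level(score):
    """Return the label from the interpretation table for a 0-1 score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    bands = [
        (0.90, "Excellent"),
        (0.80, "Very Good"),
        (0.70, "Good"),
        (0.60, "Moderate"),
        (0.50, "Basic"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "Poor"
```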

### Example Relevance Calculation

**Query**: "What are the safety procedures for handling explosives?"

**Document Pages**:
1. **Page 15**: "Safety protocols for explosive materials" → Score: 0.95 (Excellent)
2. **Page 23**: "Equipment requirements for explosive handling" → Score: 0.92 (Excellent)
3. **Page 8**: "General laboratory safety guidelines" → Score: 0.88 (Very Good)
4. **Page 45**: "Chemical storage procedures" → Score: 0.65 (Moderate)

**Selection Process**:
- Pages 15, 23, and 8 are selected for their high relevance
- Page 45 is excluded due to lower relevance
- The system ensures diversity across different aspects of safety procedures

### Advanced Features

#### **Multi-Modal Relevance**
- **Visual Elements**: Images, charts, and diagrams contribute to relevance scores
- **Text-Vision Alignment**: ColPali captures relationships between text and visual content
- **Layout Understanding**: Document structure and formatting influence relevance

#### **Context-Aware Scoring**
- **Query Complexity**: Complex queries may retrieve more pages with varied scores
- **Cross-Reference Detection**: Pages that reference each other get boosted scores
- **Temporal Relevance**: Recent documents may receive slight score adjustments

#### **Quality Assurance**
- **Score Verification**: System validates that selected pages meet minimum relevance thresholds
- **Diversity Optimization**: Ensures selected pages provide comprehensive coverage
- **Redundancy Reduction**: Avoids selecting multiple pages with very similar content

### Configuration Parameters

```env
# Relevance scoring configuration
metrictype=IP          # Inner Product similarity
mnum=16                # Number of connections in HNSW graph
efnum=500              # Search depth for high-quality results
topk=50                # Maximum results to consider
```

### Performance Impact

- **Search Speed**: Relevance scoring adds minimal overhead (~10-50ms per query)
- **Accuracy**: High-quality embeddings ensure accurate relevance assessment
- **Scalability**: Efficient vector operations support large document collections
- **Memory Usage**: Optimized to handle thousands of document pages efficiently

## 🔒 Security Considerations

### Production Deployment
1. **HTTPS**: Always use HTTPS in production
2. **Environment Variables**: Store sensitive data in environment variables
3. **Database Security**: Use production-grade database (PostgreSQL/MySQL)
4. **Rate Limiting**: Implement API rate limiting
5. **Logging**: Add comprehensive logging for security monitoring

### Recommended Security Enhancements
```python
# Add to production deployment (assumes the app is served through Flask)
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

app = Flask(__name__)

# Rate limiting; keyword arguments work with both Flask-Limiter 2.x and 3.x
limiter = Limiter(
    key_func=get_remote_address,
    app=app,
    default_limits=["200 per day", "50 per hour"],
)

# Security headers added to every response
@app.after_request
def add_security_headers(response):
    response.headers['X-Content-Type-Options'] = 'nosniff'
    response.headers['X-Frame-Options'] = 'DENY'
    response.headers['X-XSS-Protection'] = '1; mode=block'
    return response
```

## 🚀 Deployment

### Docker Deployment
```dockerfile
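FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
# Bind Gradio to all interfaces so the published port is reachable from outside the container
ENV GRADIO_SERVER_NAME=0.0.0.0
EXPOSE 7860

CMD ["python", "app.py"]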
```

### Environment Variables for Production
```env
# Database
DATABASE_URL=postgresql://user:password@localhost/dbname
SECRET_KEY=your-secret-key-here

# Security
BCRYPT_ROUNDS=12
SESSION_TIMEOUT=3600

# Performance
WORKER_THREADS=4
MAX_UPLOAD_SIZE=100MB
```

## 📊 Monitoring & Analytics

### Key Metrics to Track
- **Query Response Time**: Average time for AI responses
- **Document Processing Time**: Time to index new documents
- **User Activity**: Login frequency and session duration
- **Error Rates**: Failed queries and system errors
- **Storage Usage**: Database and file system utilization
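
Timing metrics such as query response time can be collected with a small decorator. The sketch below logs the duration of each call; the logger name `metrics`, the decorator `timed`, and the placeholder `answer_query` function are assumptions made for illustration.

```python
import functools
import logging
import time

logger = logging.getLogger("metrics")


def timed(metric_name):
    """Log the wall-clock duration of each call under metric_name."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs on success and on failure, so error paths are timed too.
                elapsed_ms = (time.perf_counter() - start) * 1000
                logger.info("%s took %.1f ms", metric_name, elapsed_ms)
        return wrapper
    return decorator


@timed("query_response_time")
def answer_query(query):
    # Placeholder for the real retrieval and generation call.
    return f"answer to: {query}"
```

The same decorator can wrap the document indexing entry point to cover document processing time without touching the function bodies.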

### Logging Configuration
```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('app.log'),
        logging.StreamHandler()
    ]
)
```

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new features
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🆘 Support

For support and questions:
- Create an issue in the repository
- Check the documentation
- Review the troubleshooting guide

---

**Made by Collar** - Enhanced with Team Management & Chat History