# Quick Start Guide - Collar Multimodal RAG Demo

Get your production-ready multimodal RAG system up and running in minutes!
## 5-Minute Setup

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Start the Application

```bash
python app.py
```

### 3. Access the Application

Open your browser and go to: http://localhost:7860
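Before opening the browser, you can verify that the server is answering. This is an optional sketch (not part of the app itself) that checks the default port with the standard library:

```python
# Quick reachability check for the local server (illustrative sketch;
# assumes the app is already running on the default port 7860).
from urllib.request import urlopen


def app_is_up(url: str = "http://localhost:7860", timeout: float = 5.0) -> bool:
    """Return True if the application answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False


if __name__ == "__main__":
    print("app reachable:", app_is_up())
```

If this prints `False`, make sure `python app.py` is still running and that nothing else is bound to port 7860.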
### 4. Log in with Default Users

- Team A: `admin_team_a` / `admin123_team_a`
- Team B: `admin_team_b` / `admin123_team_b`
## Key Features to Try

### Enhanced Multi-Page Citations
- Upload multiple documents
- Ask complex queries like: "What are the different types of explosives and their safety procedures?"
- The system automatically detects complex queries and retrieves multiple relevant pages
- See intelligent citations grouped by document collections with relevance scores
- View multiple pages in the gallery display
### Team Repository Management
- Log in as a Team A user
- Upload documents with a collection name such as "Safety Manuals"
- Switch to a Team B user and notice that Team A's documents are not visible
### Chat History

- Make several queries
- Go to the "Chat History" tab
- See your conversation history with timestamps and cited pages
### Advanced Querying
- Set "Number of pages to retrieve" to 5
- Ask a complex question
- View multiple relevant pages and AI response with citations
### Enhanced Detailed Responses
- Ask any question and receive comprehensive, detailed answers
- Get extensive background information and context
- See step-by-step explanations and practical applications
- Receive safety considerations and best practices
- Get technical specifications and measurements
- View quality assessment and recommendations for further research
### CSV Table Generation
- Ask for data in table format: "Show me a table of safety procedures"
- Request CSV data: "Create a CSV with the comparison data"
- Get structured responses with downloadable CSV content
- View table information including rows, columns, and data sources
- Copy CSV content to use in Excel, Google Sheets, or other applications
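If you want to post-process the CSV content programmatically instead of pasting it into a spreadsheet, the standard library is enough. This is a minimal sketch that assumes the response contains a plain CSV string with a header row (the column names below are made up for illustration):

```python
# Parse the CSV text from a table-generation response into a list of dicts.
# The sample data and column names here are hypothetical.
import csv
import io


def parse_csv_response(csv_text: str) -> list:
    """Turn a CSV string (header row first) into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


sample = "procedure,ppe_required\nBlasting,Ear protection\nStorage,Gloves"
rows = parse_csv_response(sample)
print(rows[0]["procedure"])  # → Blasting
```

From here the rows can be filtered, sorted, or written back out with `csv.DictWriter`.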
## Configuration

### Environment Variables (.env file)

```
# AI Models
colpali=colpali-v1.3
ollama=llama2

# Performance
flashattn=1
temperature=0.8
batchsize=5

# Database
metrictype=IP
mnum=16
efnum=500
topk=50
```
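A hedged sketch of how these settings could be read at startup, assuming they are exposed as environment variables (for example, loaded from `.env` with `python-dotenv`). The function name and the set of keys shown are illustrative, not taken from `app.py`:

```python
# Illustrative settings loader: read the .env-style variables with the
# same defaults as the sample configuration above.
import os


def load_settings() -> dict:
    """Read performance-related settings, falling back to the documented defaults."""
    return {
        "temperature": float(os.getenv("temperature", "0.8")),
        "batchsize": int(os.getenv("batchsize", "5")),
        "topk": int(os.getenv("topk", "50")),
        "flashattn": os.getenv("flashattn", "1") == "1",
    }


print(load_settings())
```

Keeping the defaults in one place like this makes the tuning presets below a matter of editing `.env` only.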
### Customizing for Your Use Case

#### For Large Document Collections

```
batchsize=10
topk=100
efnum=1000
```

#### For Faster Processing

```
batchsize=2
flashattn=0
```

#### For Higher Accuracy

```
temperature=0.3
topk=200
```
## File Structure

```
colpali-milvus-multimodal-rag-master/
├── app.py                        # Main application
├── requirements.txt              # Dependencies
├── README.md                     # Full documentation
├── QUICK_START.md                # This file
├── test_production_features.py   # Test suite
├── deploy_production.py          # Production deployment
├── app_database.db               # SQLite database (auto-created)
├── pages/                        # Document pages (auto-created)
├── logs/                         # Application logs
└── uploads/                      # Uploaded files
```
## Testing

Run the test suite to verify everything works:

```bash
python test_production_features.py
```

Test the multi-page citation system:

```bash
python test_multipage_citations.py
```

Test the page count fix:

```bash
python test_page_count_fix.py
```

Test the enhanced detailed responses:

```bash
python test_detailed_responses.py
```

Test the page usage fix:

```bash
python test_page_usage_fix.py
```

Test the table generation functionality:

```bash
python test_table_generation.py
```
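To run the whole suite in one go, a small driver script is handy. This sketch simply invokes each test script listed above with the current interpreter and reports which ones failed; it is a convenience wrapper, not part of the repository:

```python
# Run every test script from this guide in sequence and collect failures.
import subprocess
import sys

TEST_SCRIPTS = [
    "test_production_features.py",
    "test_multipage_citations.py",
    "test_page_count_fix.py",
    "test_detailed_responses.py",
    "test_page_usage_fix.py",
    "test_table_generation.py",
]


def run_all(scripts=TEST_SCRIPTS) -> list:
    """Return the names of the scripts that exited with a non-zero status."""
    failed = []
    for script in scripts:
        result = subprocess.run([sys.executable, script])
        if result.returncode != 0:
            failed.append(script)
    return failed
```

An empty return value means every script passed.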
## Production Deployment

For production deployment, run:

```bash
python deploy_production.py
```

This will:

- Check prerequisites
- Set up the environment
- Install dependencies
- Create the database
- Set up logging
- Create Docker configurations
- Run tests
## Troubleshooting

### Common Issues
"No module named 'bcrypt'"
pip install bcrypt
"Docker not running"
- Start Docker Desktop
- Wait for it to fully initialize
"Ollama not found"
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve
"CUDA out of memory"
Reduce batch size in .env:
batchsize=2
"Database locked"
# Stop the application and restart
# Or delete the database file to start fresh
rm app_database.db
"Getting fewer pages than requested"
- The system now ensures exactly the requested number of pages are returned
- Check the console logs for debugging information
- Run the page count test:
python test_page_count_fix.py
- If issues persist, check that documents have enough content for the query
"LLM only cites 2 pages when 3 are requested"
- The system now verifies that LLM uses all provided pages
- Enhanced prompts explicitly instruct to use ALL pages
- Page usage verification detects missing references
- Run the page usage test:
python test_page_usage_fix.py
- Check console logs for page usage verification messages
### Performance Optimization

#### For GPU Users

```
flashattn=1
batchsize=8
```

#### For CPU Users

```
flashattn=0
batchsize=2
```

#### For Large Datasets

```
topk=200
efnum=1000
mnum=32
```
## Monitoring

### Check Application Status

- View logs in `logs/app.log`
- Monitor the database size: `ls -lh app_database.db`
- Check uploaded documents: `ls -la pages/`
### Performance Metrics
- Query response time
- Document processing time
- Memory usage
- GPU utilization (if applicable)
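If you want to capture the first metric (query response time) yourself, a simple timing decorator is one way to do it. This is a hypothetical helper; the function names are illustrative and not taken from `app.py`:

```python
# Illustrative timing helper: log how long any callable takes, in ms.
import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("metrics")


def timed(fn):
    """Wrap fn so every call logs its elapsed wall-clock time."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            log.info("%s took %.1f ms", fn.__name__, elapsed_ms)
    return wrapper


@timed
def example_query():
    time.sleep(0.01)  # stand-in for real query work
    return "ok"
```

Applying such a decorator to the query path would surface response times in `logs/app.log` alongside the other application logs.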
## Security Best Practices

### For Development

- Use the default passwords (already configured)
- Run on localhost only

### For Production

- Change the default passwords
- Use HTTPS
- Set up proper firewall rules
- Back up the database regularly
- Monitor access logs
## Support

### Getting Help

- Check the troubleshooting section above
- Review the full README.md
- Run the test suite: `python test_production_features.py`
- Check the application logs: `tail -f logs/app.log`
### Feature Requests
- Multi-language support
- Advanced analytics dashboard
- API endpoints
- Mobile app
- Integration with external systems
## What's Next?

After getting familiar with the basic features:
- **Upload Your Documents**: Replace the sample documents with your own
- **Customize Models**: Experiment with different AI models
- **Scale Up**: Add more users and teams
- **Integrate**: Connect with your existing systems
- **Deploy**: Move to production with the deployment script
Happy RAG-ing!

*Made by Collar - Enhanced with Team Management & Chat History*