---
license: apache-2.0
title: Long Context Caching Gemini PDF QA
sdk: docker
emoji: 📚
colorFrom: yellow
---

# 📚 Smart Document Analysis Platform

A modern web application that leverages the Google Gemini API's caching capabilities to provide efficient document analysis. Upload documents once, ask questions forever!

## 🚀 Features

- **Document Upload**: Upload PDF files via drag-and-drop or URL
- **Gemini API Caching**: Documents are cached using Gemini's explicit caching feature
- **Cost-Effective**: Save on API costs by reusing cached document tokens
- **Real-time Chat**: Ask multiple questions about your documents
- **Beautiful UI**: Modern, responsive design with smooth animations
- **Token Tracking**: See how many tokens are cached for cost transparency
- **Smart Error Handling**: Graceful handling of small documents that don't meet caching requirements

## 🎯 Use Cases

This platform is perfect for:

- **Research Analysis**: Upload research papers and ask detailed questions
- **Legal Document Review**: Analyze contracts, legal documents, and policies
- **Academic Studies**: Study course materials and textbooks
- **Business Reports**: Analyze quarterly reports, whitepapers, and presentations
- **Technical Documentation**: Review manuals, specifications, and guides

## ⚡️ Deploy on Hugging Face Spaces

You can deploy this app on [Hugging Face Spaces](https://huggingface.co/spaces) using the **Docker** SDK.

### 1. **Select Docker SDK**

- When creating your Space, choose **Docker** (not Gradio, not Static).

### 2. **Project Structure**

Make sure your repo includes:

- `app.py` (Flask app)
- `requirements.txt`
- `Dockerfile`
- `.env.example` (for reference; do not include secrets)

### 3. **Dockerfile**

A sample Dockerfile is provided:

```dockerfile
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y build-essential && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
```

### 4. **Port Configuration**

The app runs on the port provided by the `PORT` environment variable (default 7860), as required by Hugging Face Spaces.

### 5. **Set Environment Variables**

- In your Space settings, add your `GOOGLE_API_KEY` as a secret environment variable.

### 6. **Push to Hugging Face**

- Push your code to the Space's Git repository.
- The build and deployment will happen automatically.

---

## 📋 Prerequisites

- Python 3.8 or higher
- Google Gemini API key
- Internet connection for API calls

## 🔧 Local Installation

1. **Clone the repository**
   ```bash
   git clone
   cd smart-document-analysis
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Set up environment variables**
   ```bash
   cp .env.example .env
   ```
   Edit `.env` and add your Google Gemini API key:
   ```
   GOOGLE_API_KEY=your_actual_api_key_here
   ```

4. **Get your API key**
   - Visit [Google AI Studio](https://makersuite.google.com/app/apikey)
   - Create a new API key
   - Copy it to your `.env` file

## 🚀 Running the Application Locally

1. **Start the server**
   ```bash
   python app.py
   ```

2. **Open your browser**
   Navigate to `http://localhost:7860`

3. **Upload a document**
   - Drag and drop a PDF file, or
   - Click to select a file, or
   - Provide a URL to a PDF

4. **Start asking questions**
   Once your document is cached, you can ask unlimited questions!

## 💡 How It Works

### 1. Document Upload

When you upload a PDF, the application:

- Uploads the file to Gemini's File API
- Checks whether the document meets the minimum token requirement for caching (4,096 tokens)
- If eligible, creates a cache with the document content
- If too small, returns a helpful error message with suggestions
- Stores cache metadata locally
- Returns a cache ID for future reference
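The upload-time eligibility check can be sketched as a small helper. This is a minimal sketch: `MIN_CACHE_TOKENS` mirrors the 4,096-token minimum described above, but the function name and messages are illustrative, not the app's actual code.

```python
# Minimal sketch of the upload-time eligibility check described above.
# MIN_CACHE_TOKENS mirrors the documented 4,096-token minimum; the
# function name and messages are illustrative, not the app's actual code.
MIN_CACHE_TOKENS = 4096

def check_cache_eligibility(token_count: int) -> tuple[bool, str]:
    """Return (eligible, message) for a document of token_count tokens."""
    if token_count >= MIN_CACHE_TOKENS:
        return True, f"Document eligible for caching ({token_count} tokens)."
    shortfall = MIN_CACHE_TOKENS - token_count
    return False, (
        f"Document too small to cache ({token_count} tokens, "
        f"{shortfall} short of the {MIN_CACHE_TOKENS}-token minimum). "
        "Try uploading a larger document or combining several documents."
    )

print(check_cache_eligibility(10_000)[0])  # True
print(check_cache_eligibility(1_000)[0])   # False
```

When the check fails, the app surfaces the message to the user instead of attempting a cache creation that the API would reject.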
### 2. Question Processing

When you ask a question:

- The question is sent to the Gemini API
- The cached document content is automatically included
- You only pay for the question tokens, not the document tokens
- Responses are generated based on the cached content

### 3. Cost Savings

- **Without caching**: You pay for document tokens + question tokens every time
- **With caching**: You pay for document tokens once + question tokens for each question

## 🔍 API Endpoints

- `GET /` - Main application interface
- `POST /upload` - Upload PDF file
- `POST /upload-url` - Upload PDF from URL
- `POST /ask` - Ask question about cached document
- `GET /caches` - List all cached documents
- `DELETE /cache/` - Delete specific cache

## 📊 Cost Analysis

### Example Scenario

- Document: 10,000 tokens
- Question: 50 tokens
- 10 questions asked

**Without Caching:**

- Cost = (10,000 + 50) × 10 = 100,500 tokens

**With Caching:**

- Cost = 10,000 + (50 × 10) = 10,500 tokens
- **Savings: ~90% cost reduction!**

### Token Requirements

- **Minimum for caching**: 4,096 tokens
- **Recommended minimum**: 5,000 tokens for cost-effectiveness
- **Optimal range**: 10,000 - 100,000 tokens
- **Maximum**: Model-specific limits (check the Gemini API docs)

## 🎨 Customization

### Changing the Model

Edit `app.py` and change the model name:

```python
model="models/gemini-2.0-flash-001"  # Current
model="models/gemini-2.0-pro-001"    # Alternative
```

### Custom System Instructions

Modify the system instruction in the cache creation:

```python
system_instruction="Your custom instruction here"
```

### Cache TTL

Add TTL configuration to the cache creation:

```python
config=types.CreateCachedContentConfig(
    system_instruction=system_instruction,
    contents=[document],
    ttl='24h'  # Cache for 24 hours
)
```

## 🔒 Security Considerations

- API keys are stored in environment variables
- File uploads are validated for PDF format
- Cached content is managed securely through the Gemini API
- No sensitive data is stored locally
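The arithmetic in the Cost Analysis section can be reproduced with a short script (a minimal sketch; the function and variable names are illustrative, not part of the app):

```python
# Minimal sketch reproducing the Cost Analysis arithmetic above.
# Function and variable names are illustrative, not part of the app.
def token_cost(doc_tokens: int, question_tokens: int, n_questions: int):
    """Return (without_caching, with_caching) total token counts."""
    without_caching = (doc_tokens + question_tokens) * n_questions
    with_caching = doc_tokens + question_tokens * n_questions
    return without_caching, with_caching

without, with_cache = token_cost(10_000, 50, 10)
savings = 1 - with_cache / without
print(without, with_cache, f"{savings:.1%}")  # 100500 10500 89.6%
```

The exact saving for the example scenario is 89.6%, which the document rounds to 90%; the saving grows as the document-to-question token ratio and the number of questions increase.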
## 🚧 Production Deployment

For production deployment:

1. **Use a production WSGI server**
   ```bash
   pip install gunicorn
   gunicorn -w 4 -b 0.0.0.0:7860 app:app
   ```

2. **Add database storage**
   - Replace in-memory storage with PostgreSQL/MySQL
   - Add user authentication
   - Implement session management

3. **Add monitoring**
   - Log API usage and costs
   - Monitor cache hit rates
   - Track user interactions

4. **Security enhancements**
   - Add rate limiting
   - Implement file size limits
   - Add input validation

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request

## 📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- Google Gemini API for providing the caching functionality
- The Flask community for the excellent web framework
- The open-source community for inspiration and tools

## 📞 Support

If you encounter any issues:

1. Check the [Gemini API documentation](https://ai.google.dev/docs)
2. Verify your API key is correct
3. Ensure your PDF files are valid
4. Check the browser console for JavaScript errors
5. **For small document errors**: Upload a larger document or combine multiple documents

## 🔮 Future Enhancements

- [ ] Support for multiple file formats (Word, PowerPoint, etc.)
- [ ] User authentication and document sharing
- [ ] Advanced analytics and usage tracking
- [ ] Integration with cloud storage (Google Drive, Dropbox)
- [ ] Mobile app version
- [ ] Multi-language support
- [ ] Advanced caching strategies
- [ ] Real-time collaboration features
- [ ] Document preprocessing to meet token requirements
- [ ] Batch document processing
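The rate-limiting item under Production Deployment could start from an in-memory sliding-window limiter. This is a minimal sketch with illustrative names; a real deployment would more likely use an extension such as Flask-Limiter backed by a shared store, since a per-process dict does not work across multiple Gunicorn workers.

```python
import time
from collections import deque

# Minimal in-memory sliding-window rate limiter (illustrative names).
# Per-process only: with multiple workers, back this with a shared store.
class RateLimiter:
    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits: dict = {}  # client_id -> deque of request timestamps

    def allow(self, client_id: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(client_id, deque())
        # Drop hits that have fallen out of the window.
        while q and now - q[0] > self.window_seconds:
            q.popleft()
        if len(q) >= self.max_requests:
            return False
        q.append(now)
        return True

limiter = RateLimiter(max_requests=3, window_seconds=60)
print([limiter.allow("1.2.3.4", now=t) for t in (0, 1, 2, 3)])
# → [True, True, True, False]
```

In a Flask app this check would run in a `before_request` hook keyed on the client address, returning a 429 response when `allow` is `False`.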