license: apache-2.0
title: Long Context Caching Gemini PDF QA
sdk: docker
emoji: ๐
colorFrom: yellow
Smart Document Analysis Platform
A modern web application that uses the Google Gemini API's explicit context caching to analyze documents efficiently. Upload a document once, then ask as many questions as you like while its cache is live.
Features
- Document Upload: Upload PDF files via drag-and-drop or URL
- Gemini API Caching: Documents are cached using Gemini's explicit caching feature
- Cost-Effective: Save on API costs by reusing cached document tokens
- Real-time Chat: Ask multiple questions about your documents
- Beautiful UI: Modern, responsive design with smooth animations
- Token Tracking: See how many tokens are cached for cost transparency
- Smart Error Handling: Graceful handling of small documents that don't meet caching requirements
Use Cases
This platform is perfect for:
- Research Analysis: Upload research papers and ask detailed questions
- Legal Document Review: Analyze contracts, legal documents, and policies
- Academic Studies: Study course materials and textbooks
- Business Reports: Analyze quarterly reports, whitepapers, and presentations
- Technical Documentation: Review manuals, specifications, and guides
Deploy on Hugging Face Spaces
You can deploy this app on Hugging Face Spaces using the Docker SDK.
1. Select Docker SDK
- When creating your Space, choose Docker (not Gradio, not Static).
2. Project Structure
Make sure your repo includes:
- app.py (Flask app)
- requirements.txt
- Dockerfile
- .env.example (for reference, do not include secrets)
3. Dockerfile
A sample Dockerfile is provided:
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y build-essential && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
4. Port Configuration
The app runs on the port provided by the PORT environment variable (default 7860), as required by Hugging Face Spaces.
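A minimal sketch of how the port could be resolved at startup. The helper name `resolve_port` is hypothetical and is not taken from app.py; only the `PORT` variable and the 7860 default come from the text above.

```python
import os

DEFAULT_PORT = 7860  # default port stated in this README


def resolve_port(environ=os.environ) -> int:
    """Return the port from the PORT variable, falling back to 7860."""
    raw = environ.get("PORT", "").strip()
    return int(raw) if raw else DEFAULT_PORT


if __name__ == "__main__":
    # A Flask app would typically pass this value to app.run(host="0.0.0.0", port=...)
    print(f"Would listen on 0.0.0.0:{resolve_port()}")
```

Passing the environment as an argument keeps the helper easy to test without touching the real process environment.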
5. Set Environment Variables
- In your Space settings, add your GOOGLE_API_KEY as a secret environment variable.
6. Push to Hugging Face
- Push your code to the Space's Git repository.
- The build and deployment will happen automatically.
Prerequisites
- Python 3.8 or higher
- Google Gemini API key
- Internet connection for API calls
Local Installation
Clone the repository
git clone <repository-url>
cd smart-document-analysis
Install dependencies
pip install -r requirements.txt
Set up environment variables
cp .env.example .env
Edit .env and add your Google Gemini API key:
GOOGLE_API_KEY=your_actual_api_key_here
Get your API key
- Visit Google AI Studio
- Create a new API key
- Copy it to your .env file
Running the Application Locally
Start the server
python app.py
Open your browser
Navigate to http://localhost:7860
Upload a document
- Drag and drop a PDF file, or
- Click to select a file, or
- Provide a URL to a PDF
Start asking questions
Once your document is cached, you can ask as many questions as you like.
How It Works
1. Document Upload
When you upload a PDF, the application:
- Uploads the file to Gemini's File API
- Checks if the document meets minimum token requirements (4,096 tokens)
- If eligible, creates a cache with the document content
- If it is too small, returns a helpful error message with suggestions
- Stores cache metadata locally
- Returns a cache ID for future reference
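The decision made in the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in app.py: `register_document`, `UploadResult`, and `CACHE_REGISTRY` are hypothetical names, the token count is passed in rather than fetched from the Gemini API, and a UUID stands in for the cache ID the real cache would carry.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional

MIN_CACHE_TOKENS = 4096  # minimum stated in this README

# Hypothetical in-memory store: cache ID -> metadata
CACHE_REGISTRY: Dict[str, dict] = {}


@dataclass
class UploadResult:
    success: bool
    cache_id: Optional[str] = None
    message: str = ""


def register_document(file_name: str, token_count: int) -> UploadResult:
    """Decide whether a document is cacheable and record its metadata."""
    if token_count < MIN_CACHE_TOKENS:
        missing = MIN_CACHE_TOKENS - token_count
        return UploadResult(
            success=False,
            message=(
                f"Document has {token_count} tokens; caching needs at least "
                f"{MIN_CACHE_TOKENS} ({missing} more). Try a larger document "
                "or combine several documents."
            ),
        )
    # Placeholder ID; a real implementation would use the name of the created cache.
    cache_id = str(uuid.uuid4())
    CACHE_REGISTRY[cache_id] = {"file_name": file_name, "token_count": token_count}
    return UploadResult(success=True, cache_id=cache_id, message="Document cached.")
```

For example, `register_document("note.pdf", 1200)` returns an unsuccessful result with a suggestion, while a 10,000-token document is registered and gets an ID.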
2. Question Processing
When you ask a question:
- The question is sent to Gemini API
- The cached document content is automatically included
- Cached document tokens are billed at a reduced rate rather than the full input rate; only the question tokens are billed in full
- Responses are generated based on the cached content
3. Cost Savings
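- Without caching: You pay the full input rate for document tokens + question tokens on every request
- With caching: You pay for the document tokens in full once, then for question tokens on each request, with the cached document tokens billed at a reduced rate plus a storage charge while the cache is live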
API Endpoints
- GET / - Main application interface
- POST /upload - Upload PDF file
- POST /upload-url - Upload PDF from URL
- POST /ask - Ask question about cached document
- GET /caches - List all cached documents
- DELETE /cache/<cache_id> - Delete specific cache
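As a rough illustration of calling these endpoints from a script, the sketch below builds requests with the standard library without sending them. The JSON field names `cache_id` and `question` are assumptions, since the request schema is not documented here; check app.py for the actual payload shape.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # local address from the run instructions


def build_ask_request(cache_id: str, question: str,
                      base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /ask request.

    The field names "cache_id" and "question" are assumed, not confirmed.
    """
    payload = json.dumps({"cache_id": cache_id, "question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_delete_request(cache_id: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a DELETE /cache/<cache_id> request."""
    return urllib.request.Request(url=f"{base_url}/cache/{cache_id}", method="DELETE")
```

With the server running, a request built this way can be sent with `urllib.request.urlopen(request)`.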
Cost Analysis
Example Scenario
- Document: 10,000 tokens
- Question: 50 tokens
- 10 questions asked
Without Caching:
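- Cost = (10,000 + 50) × 10 = 100,500 tokens
With Caching:
- Cost = 10,000 + (50 × 10) = 10,500 tokens
- Savings: about 90% fewer tokens billed at the full input rate (cached tokens are still billed at a reduced rate, plus cache storage)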
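The arithmetic in this scenario can be reproduced with a few lines of Python. This sketch follows the simplified model used in the example: it counts tokens processed at the full input rate and ignores the reduced-rate charge for cached tokens and any storage cost. The function names are illustrative.

```python
def tokens_without_cache(doc_tokens: int, question_tokens: int, n_questions: int) -> int:
    """Every request resends the full document plus the question."""
    return (doc_tokens + question_tokens) * n_questions


def tokens_with_cache(doc_tokens: int, question_tokens: int, n_questions: int) -> int:
    """The document is processed once; each request adds only the question."""
    return doc_tokens + question_tokens * n_questions


def savings_fraction(doc_tokens: int, question_tokens: int, n_questions: int) -> float:
    """Fraction of full-rate tokens avoided by caching."""
    without = tokens_without_cache(doc_tokens, question_tokens, n_questions)
    cached = tokens_with_cache(doc_tokens, question_tokens, n_questions)
    return 1 - cached / without


if __name__ == "__main__":
    print(tokens_without_cache(10_000, 50, 10))  # 100500
    print(tokens_with_cache(10_000, 50, 10))     # 10500
    print(f"{savings_fraction(10_000, 50, 10):.1%}")  # 89.6%
```

The savings grow with the number of questions and with the ratio of document size to question size.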
Token Requirements
- Minimum for caching: 4,096 tokens
- Recommended minimum: 5,000 tokens for cost-effectiveness
- Optimal range: 10,000 - 100,000 tokens
- Maximum: Model-specific limits (check Gemini API docs)
Customization
Changing the Model
Edit app.py and change the model name:
model="models/gemini-2.0-flash-001" # Current
model="models/gemini-2.0-pro-001" # Alternative
Custom System Instructions
Modify the system instruction in the cache creation:
system_instruction="Your custom instruction here"
Cache TTL
Add TTL configuration to cache creation:
config=types.CreateCachedContentConfig(
    system_instruction=system_instruction,
    contents=[document],
    ttl='86400s'  # Cache for 24 hours (duration given in seconds)
)
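The `ttl` value is a duration string. A small helper can build it from a `timedelta` so the intent stays readable. This sketch assumes the seconds-suffixed form (for example `300s`); `ttl_string` is a hypothetical helper, not part of any SDK.

```python
from datetime import timedelta


def ttl_string(duration: timedelta) -> str:
    """Format a duration as whole seconds with an "s" suffix."""
    seconds = int(duration.total_seconds())
    if seconds <= 0:
        raise ValueError("TTL must be a positive duration")
    return f"{seconds}s"


if __name__ == "__main__":
    print(ttl_string(timedelta(hours=24)))  # 86400s
```

The result can then be passed as the `ttl` argument when creating the cache.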
Security Considerations
- API keys are stored in environment variables
- File uploads are validated for PDF format
- Cached content is managed securely through Gemini API
- No sensitive data is stored locally
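One common way to validate an upload as a PDF is to check both the file extension and the leading bytes of the file. The check below is a sketch under that assumption; the validation actually performed in app.py is not shown here, and `looks_like_pdf` is a hypothetical name.

```python
PDF_MAGIC = b"%PDF-"  # PDF files begin with this signature


def looks_like_pdf(file_name: str, head: bytes) -> bool:
    """Return True if the name ends in .pdf and the content starts with %PDF-."""
    has_pdf_extension = file_name.lower().endswith(".pdf")
    has_pdf_signature = head.startswith(PDF_MAGIC)
    return has_pdf_extension and has_pdf_signature
```

Here `head` would be the first few bytes read from the uploaded stream, for example `file.read(5)` followed by `file.seek(0)`.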
Production Deployment
For production deployment:
Use a production WSGI server
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:7860 app:app
Add database storage
- Replace in-memory storage with PostgreSQL/MySQL
- Add user authentication
- Implement session management
Add monitoring
- Log API usage and costs
- Monitor cache hit rates
- Track user interactions
Security enhancements
- Add rate limiting
- Implement file size limits
- Add input validation
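As one possible shape for the rate limiting mentioned above, here is a minimal in-memory sliding-window limiter. It is a sketch only: the class name and parameters are illustrative, and state kept in one process is not shared across multiple gunicorn workers, so a production setup would likely need a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client ID -> timestamps of recent requests

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler could call `limiter.allow(client_ip)` and return HTTP 429 when it yields `False`.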
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Google Gemini API for providing the caching functionality
- Flask community for the excellent web framework
- The open-source community for inspiration and tools
Support
If you encounter any issues:
- Check the Gemini API documentation
- Verify your API key is correct
- Ensure your PDF files are valid
- Check the browser console for JavaScript errors
- For small document errors: Upload a larger document or combine multiple documents
Future Enhancements
- Support for multiple file formats (Word, PowerPoint, etc.)
- User authentication and document sharing
- Advanced analytics and usage tracking
- Integration with cloud storage (Google Drive, Dropbox)
- Mobile app version
- Multi-language support
- Advanced caching strategies
- Real-time collaboration features
- Document preprocessing to meet token requirements
- Batch document processing