metadata

license: apache-2.0
title: Long Context Caching Gemini PDF QA
sdk: docker
emoji: 📚
colorFrom: yellow

📚 Smart Document Analysis Platform

A modern web application that uses the Google Gemini API's explicit context caching to make document analysis efficient. Upload a document once, then keep asking questions for as long as its cache is alive!

🚀 Features

Document Upload: Upload PDF files via drag-and-drop or URL
Gemini API Caching: Documents are cached using Gemini's explicit caching feature
Cost-Effective: Save on API costs by reusing cached document tokens
Real-time Chat: Ask multiple questions about your documents
Beautiful UI: Modern, responsive design with smooth animations
Token Tracking: See how many tokens are cached for cost transparency
Smart Error Handling: Graceful handling of small documents that don't meet caching requirements

🎯 Use Cases

This platform is perfect for:

Research Analysis: Upload research papers and ask detailed questions
Legal Document Review: Analyze contracts, legal documents, and policies
Academic Studies: Study course materials and textbooks
Business Reports: Analyze quarterly reports, whitepapers, and presentations
Technical Documentation: Review manuals, specifications, and guides

⚡️ Deploy on Hugging Face Spaces

You can deploy this app on Hugging Face Spaces using the Docker SDK.

1. Select Docker SDK

When creating your Space, choose Docker (not Gradio, not Static).

2. Project Structure

Make sure your repo includes:

app.py (Flask app)
requirements.txt
Dockerfile
.env.example (for reference, do not include secrets)

3. Dockerfile

A sample Dockerfile is provided:

```
FROM python:3.10-slim
WORKDIR /app
RUN apt-get update && apt-get install -y build-essential && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
```

4. Port Configuration

The app will run on the port provided by the PORT environment variable (default 7860), as required by Hugging Face Spaces.
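
A minimal sketch of how `app.py` might resolve its port, assuming it reads `PORT` and falls back to 7860. The helper name `resolve_port` is hypothetical and not taken from the project:

```python
import os


def resolve_port(env=None, default=7860):
    """Return the port from the PORT variable, or the default if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "")
    try:
        port = int(raw)
    except ValueError:
        return default
    # Reject values outside the valid TCP port range.
    return port if 0 < port < 65536 else default


# In a Flask app this would typically feed app.run (assumed, not shown in this README):
# app.run(host="0.0.0.0", port=resolve_port())
```

Binding to `0.0.0.0` rather than `127.0.0.1` matters inside a container, since the Space's proxy connects from outside the process.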

5. Set Environment Variables

In your Space settings, add your GOOGLE_API_KEY as a secret environment variable.

6. Push to Hugging Face

Push your code to the Space's Git repository.
The build and deployment will happen automatically.

📋 Prerequisites

Python 3.8 or higher
Google Gemini API key
Internet connection for API calls

🔧 Local Installation

Clone the repository

```
git clone <repository-url>
cd smart-document-analysis
```

Install dependencies
```
pip install -r requirements.txt
```
Set up environment variables
```
cp .env.example .env
```
Edit .env and add your Google Gemini API key:
```
GOOGLE_API_KEY=your_actual_api_key_here
```
Get your API key
- Visit Google AI Studio
- Create a new API key
- Copy it to your .env file
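
A failed startup is easier to diagnose when a missing key is reported immediately. The sketch below assumes the `.env` values have already been loaded into the process environment (for example by python-dotenv's `load_dotenv()`); `load_api_key` is a hypothetical helper, not part of the project:

```python
import os


def load_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    # Treat the placeholder from .env.example as "not configured".
    if not key or key == "your_actual_api_key_here":
        raise RuntimeError(f"{name} is not set; add it to .env or the Space secrets")
    return key
```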

🚀 Running the Application Locally

Start the server
```
python app.py
```
Open your browser and navigate to http://localhost:7860
Upload a document
- Drag and drop a PDF file, or
- Click to select a file, or
- Provide a URL to a PDF
Start asking questions: once your document is cached, you can ask as many questions as you like until the cache expires.

💡 How It Works

1. Document Upload

When you upload a PDF, the application:

Uploads the file to Gemini's File API
Checks if the document meets minimum token requirements (4,096 tokens)
If eligible, creates a cache with the document content
If too small, returns an error message with suggestions instead of creating a cache
Stores cache metadata locally
Returns a cache ID for future reference
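
The steps above can be sketched roughly as follows. This assumes the google-genai SDK (`client.files.upload`, `client.models.count_tokens`, `client.caches.create`) and is not the project's actual code; the function name `cache_document` and the returned dict shape are illustrative:

```python
MIN_CACHE_TOKENS = 4096  # minimum stated in this README


def cache_document(client, path, model, system_instruction, min_tokens=MIN_CACHE_TOKENS):
    """Upload a PDF and cache it if it is large enough.

    `client` is assumed to be a google-genai `genai.Client`.
    Returns a dict describing either the new cache or why caching was skipped.
    """
    # 1. Upload the file to the File API.
    uploaded = client.files.upload(file=path)

    # 2. Count tokens to check the caching minimum before creating a cache.
    token_count = client.models.count_tokens(model=model, contents=[uploaded]).total_tokens
    if token_count < min_tokens:
        return {
            "cached": False,
            "token_count": token_count,
            "error": f"Document has {token_count} tokens; caching needs at least {min_tokens}.",
        }

    # 3. Create the cache; the config is passed as a plain dict.
    cache = client.caches.create(
        model=model,
        config={"contents": [uploaded], "system_instruction": system_instruction},
    )

    # 4. Return metadata the app can store locally and hand back as the cache ID.
    return {"cached": True, "cache_id": cache.name, "token_count": token_count}
```

Checking the token count first avoids a failed cache-creation call for small documents; the real app may instead catch the API error, which works equally well.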

2. Question Processing

When you ask a question:

The question is sent to Gemini API
The cached document content is automatically included
Question tokens are billed at the normal input rate, while the cached document tokens are billed at a reduced rate
Responses are generated based on the cached content
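
A rough sketch of this flow, assuming the google-genai SDK's `client.models.generate_content` with a `cached_content` config entry. The function name `ask_question` is illustrative and not taken from the project:

```python
def ask_question(client, model, cache_id, question):
    """Send a question that references a cached document and return the answer text.

    `client` is assumed to be a google-genai `genai.Client`; `cache_id` is the
    cache name returned when the document was cached.
    """
    response = client.models.generate_content(
        model=model,
        contents=question,
        # Referencing the cache means the document is not resent with the question.
        config={"cached_content": cache_id},
    )
    return response.text
```

The `model` passed here should be the same model the cache was created with.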

3. Cost Savings

Without caching: You pay for document tokens + question tokens every time
With caching: You pay full price for the document tokens once, then question tokens plus a reduced rate for the cached tokens on each question (cache storage is billed separately)

🔍 API Endpoints

GET / - Main application interface
POST /upload - Upload PDF file
POST /upload-url - Upload PDF from URL
POST /ask - Ask question about cached document
GET /caches - List all cached documents
DELETE /cache/<cache_id> - Delete specific cache

📊 Cost Analysis

Example Scenario

Document: 10,000 tokens
Question: 50 tokens
10 questions asked

Without Caching:

Cost = (10,000 + 50) × 10 = 100,500 tokens

With Caching:

Cost = 10,000 + (50 × 10) = 10,500 tokens
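Savings: about 90% fewer full-price input tokens! (Cached tokens are still billed at a reduced rate and cache storage is billed separately, so the actual bill falls by somewhat less.)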
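
The arithmetic above can be reproduced with a few lines of Python. This is a simplified model that counts full-price input tokens only and ignores the reduced-rate charge for cached tokens and any storage fee; the function names are illustrative:

```python
def tokens_without_caching(doc_tokens, question_tokens, n_questions):
    """Every request resends the whole document plus the question."""
    return (doc_tokens + question_tokens) * n_questions


def tokens_with_caching(doc_tokens, question_tokens, n_questions):
    """The document is sent once; each request adds only the question."""
    return doc_tokens + question_tokens * n_questions


def savings_percent(doc_tokens, question_tokens, n_questions):
    """Percentage reduction in full-price input tokens."""
    before = tokens_without_caching(doc_tokens, question_tokens, n_questions)
    after = tokens_with_caching(doc_tokens, question_tokens, n_questions)
    return 100 * (before - after) / before


if __name__ == "__main__":
    print(tokens_without_caching(10_000, 50, 10))     # 100500
    print(tokens_with_caching(10_000, 50, 10))        # 10500
    print(round(savings_percent(10_000, 50, 10), 1))  # 89.6
```

The savings grow with the number of questions: with a single question the two totals are identical, so caching only pays off when a document is queried repeatedly.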

Token Requirements

Minimum for caching: 4,096 tokens
Recommended minimum: 5,000 tokens for cost-effectiveness
Optimal range: 10,000 - 100,000 tokens
Maximum: Model-specific limits (check Gemini API docs)

🎨 Customization

Changing the Model

Edit app.py and change the model name:

model="models/gemini-2.0-flash-001"  # Current
model="models/gemini-2.0-pro-001"    # Alternative (confirm the model ID exists and supports caching)

Custom System Instructions

Modify the system instruction in the cache creation:

system_instruction="Your custom instruction here"

Cache TTL

Add TTL configuration to cache creation:

config=types.CreateCachedContentConfig(
    system_instruction=system_instruction,
    contents=[document],
    ttl='86400s'  # Cache for 24 hours (duration in seconds, ending in 's')
)
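
The Gemini API expresses durations as a number of seconds followed by `s` (for example `86400s` for 24 hours). A small helper, hypothetical and not part of the project, can build that string from a `timedelta`:

```python
from datetime import timedelta


def ttl_string(duration):
    """Format a timedelta as a seconds-based duration string such as '86400s'."""
    seconds = int(duration.total_seconds())
    if seconds <= 0:
        raise ValueError("TTL must be positive")
    return f"{seconds}s"


# Example: ttl=ttl_string(timedelta(hours=24)) in the cache config
```

Keep in mind that cache storage is billed for as long as the cache lives, so a longer TTL is not automatically cheaper.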

🔒 Security Considerations

API keys are stored in environment variables
File uploads are validated for PDF format
Cached content is managed securely through Gemini API
No sensitive data is stored locally
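
As an illustration of the PDF validation mentioned above, a cheap check can combine the file extension with the `%PDF-` signature that PDF files start with. This is a sketch under that assumption, not the project's actual validation; `looks_like_pdf` is a hypothetical name:

```python
def looks_like_pdf(filename, head):
    """Cheap upload check: .pdf extension plus the '%PDF-' signature in the first bytes.

    `head` is the first few bytes of the uploaded file.
    """
    has_extension = filename.lower().endswith(".pdf")
    has_signature = head.startswith(b"%PDF-")
    return has_extension and has_signature
```

A check like this only filters out obvious mistakes; it does not prove the file is a well-formed or safe PDF.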

🚧 Production Deployment

For production deployment:

Use a production WSGI server

```
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:7860 app:app
```

Add database storage
- Replace in-memory storage with PostgreSQL/MySQL
- Add user authentication
- Implement session management
Add monitoring
- Log API usage and costs
- Monitor cache hit rates
- Track user interactions
Security enhancements
- Add rate limiting
- Implement file size limits
- Add input validation
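
As one possible starting point for the rate limiting item, here is a minimal in-memory sliding-window limiter. It is a sketch, not part of the project; the class name and parameters are illustrative:

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key (e.g. client IP)."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)

    def allow(self, key, now=None):
        """Record a request for `key` and return True if it is within the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Because the counters live in process memory, each of the four gunicorn workers above would keep its own counts; a shared store is needed if the limit must hold across workers.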

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Google Gemini API for providing the caching functionality
Flask community for the excellent web framework
The open-source community for inspiration and tools

📞 Support

If you encounter any issues:

Check the Gemini API documentation
Verify your API key is correct
Ensure your PDF files are valid
Check the browser console for JavaScript errors
For small document errors: Upload a larger document or combine multiple documents

🔮 Future Enhancements

Support for multiple file formats (Word, PowerPoint, etc.)
User authentication and document sharing
Advanced analytics and usage tracking
Integration with cloud storage (Google Drive, Dropbox)
Mobile app version
Multi-language support
Advanced caching strategies
Real-time collaboration features
Document preprocessing to meet token requirements
Batch document processing