# Hugging Face Spaces Setup Guide

## Quick Setup

1. **Create a new Space** on Hugging Face with:
   - SDK: Docker
   - Hardware: CPU basic (or GPU if you need faster processing)

2. **Environment Variables/Secrets**:
   Add these in your Space's Settings > Variables and Secrets:

   ```
   NEBIUS_API_KEY=your_nebius_api_key
   MODAL_TOKEN_ID=your_modal_token_id
   MODAL_TOKEN_SECRET=your_modal_token_secret
   GITHUB_TOKEN=your_github_token
   ```

3. **Upload Control** (optional):
   - By default, uploads are **enabled** in Spaces.
   - To disable uploads, add `DISABLE_UPLOADS=true`.

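
A startup check can catch a missing secret before the first request fails. The sketch below is illustrative only: `findMissingVars` is a hypothetical helper, not part of the app, and it assumes all four variables listed in step 2 are required.

```typescript
// Variable names come from step 2 above; treating all of them as
// required is an assumption made for this sketch.
const REQUIRED_VARS = [
  "NEBIUS_API_KEY",
  "MODAL_TOKEN_ID",
  "MODAL_TOKEN_SECRET",
  "GITHUB_TOKEN",
] as const;

// Hypothetical helper: returns the names of required variables that
// are unset or contain only whitespace.
function findMissingVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]?.trim());
}
```

In a Node.js process this could be called as `findMissingVars(process.env)`, logging the returned names so a misconfigured Space is easy to spot in the logs.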
## Storage in Hugging Face Spaces

### Persistent vs Temporary Storage

Hugging Face Spaces provide:
- **Temporary storage**: `/tmp` directory (used by this app)
  - ✅ Survives app restarts
  - ✅ Suitable for uploads and database
  - ❌ Not persistent across Space rebuilds
  - ❌ Limited size (~10GB)

### How This App Handles Storage

The app automatically detects Hugging Face Spaces and stores its data under `/tmp`:

1. **Database**: Stored in `/tmp/knowledgebridge.db`
2. **Uploads**: Stored in `/tmp/uploads/`
3. **Vector Indexes**: Stored in `/tmp/` via Modal

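
The path selection described above might look roughly like the following. This is a sketch under stated assumptions: the Spaces paths are the ones listed in this guide, while `resolveStoragePaths`, the `StoragePaths` shape, and the `./data` default for non-Space environments are hypothetical.

```typescript
interface StoragePaths {
  database: string;
  uploads: string;
  vectorIndexRoot: string;
}

// Paths used on Hugging Face Spaces, as listed in this guide.
const SPACES_PATHS: StoragePaths = {
  database: "/tmp/knowledgebridge.db",
  uploads: "/tmp/uploads/",
  vectorIndexRoot: "/tmp/",
};

// Hypothetical helper: outside a Space, fall back to paths under
// `baseDir` (the "./data" default is an assumption for illustration).
function resolveStoragePaths(isHuggingFaceSpace: boolean, baseDir = "./data"): StoragePaths {
  if (isHuggingFaceSpace) return SPACES_PATHS;
  const root = baseDir.replace(/\/+$/, ""); // drop trailing slashes
  return {
    database: `${root}/knowledgebridge.db`,
    uploads: `${root}/uploads/`,
    vectorIndexRoot: `${root}/`,
  };
}
```

Keeping the choice in one function means a Space rebuild only affects data under `/tmp`, and a local setup can point somewhere durable without touching the rest of the code.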
### Adding Persistent Storage (Advanced)

For truly persistent storage, you have these options:

#### Option 1: External Database
```bash
# Add to your Space environment
DATABASE_URL=postgresql://user:pass@host:port/db
```

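
How the app consumes `DATABASE_URL` is not shown in this guide. As a rough sketch of one plausible approach, the hypothetical `resolveDatabaseTarget` below prefers an external PostgreSQL URL when one is set and otherwise falls back to the `/tmp` database file mentioned earlier. The fallback behavior and the scheme check are assumptions.

```typescript
type DatabaseTarget =
  | { kind: "external"; url: string }
  | { kind: "local"; path: string };

// Hypothetical helper: choose between an external database and the
// temporary database file. Not taken from the app's source.
function resolveDatabaseTarget(env: Record<string, string | undefined>): DatabaseTarget {
  const url = env["DATABASE_URL"]?.trim();
  if (url) {
    // Reject obviously wrong schemes early instead of failing at connect time.
    if (!/^postgres(ql)?:\/\//.test(url)) {
      throw new Error("DATABASE_URL must start with postgresql:// or postgres://");
    }
    return { kind: "external", url };
  }
  return { kind: "local", path: "/tmp/knowledgebridge.db" };
}
```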
#### Option 2: Cloud Storage
```bash
# Add to your Space environment
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
S3_BUCKET=your_bucket
```

#### Option 3: Hugging Face Hub (Git LFS)
For large files, use Git LFS:
```bash
git lfs track "*.pdf" "*.db"
git add .gitattributes
```

## Troubleshooting

### "File uploads disabled" Error

If you see this error, check:

1. **Environment Detection**: Look for this log message:
   ```
   Environment check: { NODE_ENV: 'production', DISABLE_UPLOADS: undefined, isHuggingFaceSpace: true, hasWritableStorage: true, isDocumentUploadEnabled: true }
   ```

2. **Force Enable**: Add this environment variable:
   ```
   ENABLE_UPLOADS=true
   ```

3. **Check Logs**: Look for these success messages:
   ```
   ✅ Created uploads directory: /tmp/uploads
   ✅ Upload directory is writable: /tmp/uploads
   ✅ Document uploads enabled - full functionality available
   ```

### Vector Search Issues

If vector search shows "disabled":

1. **Check Modal Integration**: Ensure `MODAL_TOKEN_ID` and `MODAL_TOKEN_SECRET` are set
2. **Check Logs**: Look for Modal API connectivity errors
3. **Fallback**: The app will use text search if vector search fails

### Performance Optimization

For better performance in Spaces:

1. **Use GPU Hardware**: For faster AI processing
2. **Enable Caching**: Vector indexes are cached in `/tmp`
3. **Batch Operations**: Upload multiple files at once

## Environment Detection

The app detects Hugging Face Spaces using these environment variables:
- `SPACE_ID`
- `HF_SPACE_ID`
- `HUGGINGFACE_SPACE_ID`
- `HF_TOKEN`

When a Space is detected, uploads are enabled by default; set `DISABLE_UPLOADS=true` to turn them off.

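
The detection and upload decision can be sketched as two small pure functions. The variable names and the `hasWritableStorage` / `isDocumentUploadEnabled` fields come from this guide; the precedence shown here (`DISABLE_UPLOADS` first, then `ENABLE_UPLOADS`, then the Space default) is an assumption, and the app's actual logic may differ.

```typescript
type EnvVars = Record<string, string | undefined>;

// Variables this guide lists as indicators of a Hugging Face Space.
const SPACE_MARKERS = ["SPACE_ID", "HF_SPACE_ID", "HUGGINGFACE_SPACE_ID", "HF_TOKEN"];

// True when any marker variable is set to a non-empty value.
function isHuggingFaceSpace(env: EnvVars): boolean {
  return SPACE_MARKERS.some((name) => Boolean(env[name]));
}

// Assumed precedence for this sketch:
//   1. uploads need writable storage,
//   2. DISABLE_UPLOADS=true turns them off,
//   3. ENABLE_UPLOADS=true forces them on,
//   4. otherwise they follow Space detection.
function isDocumentUploadEnabled(env: EnvVars, hasWritableStorage: boolean): boolean {
  if (!hasWritableStorage) return false;
  if (env["DISABLE_UPLOADS"] === "true") return false;
  if (env["ENABLE_UPLOADS"] === "true") return true;
  return isHuggingFaceSpace(env);
}
```

Passing the environment in as a parameter, rather than reading `process.env` inside the functions, keeps the logic easy to test with plain objects.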
## Monitoring

Check your Space logs for these indicators:

### ✅ Successful Setup
```
✅ Document uploads enabled - full functionality available
✅ Created uploads directory: /tmp/uploads
Environment check: { isHuggingFaceSpace: true, isDocumentUploadEnabled: true }
```

### ❌ Issues
```
❌ Failed to create or write to uploads directory
ℹ️ Document uploads disabled - using fallback routes
Environment check: { isDocumentUploadEnabled: false }
```

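
If you copy the Space logs into a file or string, a small scanner can report the upload status for you. The sketch below matches on substrings of the log lines quoted above and ignores their emoji prefixes. `classifyUploadStatus` is a hypothetical helper, and treating any failure line as decisive is a choice made for this example.

```typescript
type UploadStatus = "enabled" | "disabled" | "unknown";

// Substrings taken from the log lines quoted in this guide.
const ENABLED_MARKERS = ["Document uploads enabled", "isDocumentUploadEnabled: true"];
const DISABLED_MARKERS = [
  "Failed to create or write to uploads directory",
  "Document uploads disabled",
  "isDocumentUploadEnabled: false",
];

function containsAny(line: string, markers: string[]): boolean {
  return markers.some((marker) => line.indexOf(marker) !== -1);
}

// Hypothetical helper: any failure marker wins; otherwise report
// "enabled" if a success marker was seen, else "unknown".
function classifyUploadStatus(logText: string): UploadStatus {
  let status: UploadStatus = "unknown";
  for (const line of logText.split(/\r?\n/)) {
    if (containsAny(line, DISABLED_MARKERS)) return "disabled";
    if (containsAny(line, ENABLED_MARKERS)) status = "enabled";
  }
  return status;
}
```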
## Support

If you encounter issues:

1. Check the Space logs for error messages
2. Verify environment variables are set correctly
3. Test with a small PDF file first
4. Check the app health endpoint: `/api/health`