# Hugging Face Spaces Setup Guide
## Quick Setup
1. **Create a new Space** on Hugging Face with:
   - SDK: Docker
   - Hardware: CPU basic (or GPU if you need faster processing)
2. **Environment Variables/Secrets**:
   Add these to your Space's settings > Variables and Secrets:
   ```
   NEBIUS_API_KEY=your_nebius_api_key
   MODAL_TOKEN_ID=your_modal_token_id
   MODAL_TOKEN_SECRET=your_modal_token_secret
   GITHUB_TOKEN=your_github_token
   ```
3. **Upload Control** (Optional):
   - By default, uploads are **enabled** in Spaces
   - To disable uploads, add: `DISABLE_UPLOADS=true`
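
A startup check can report which of the variables from step 2 are missing before the app tries to use them. The following is a minimal TypeScript sketch, not the app's actual code: `EXPECTED_SECRETS` and `missingSecrets` are hypothetical names, and treating all four variables as mandatory is an assumption this guide does not confirm.

```typescript
// Variable names come from the setup steps above.
// ASSUMPTION: all four are treated as required; the real app may tolerate gaps.
const EXPECTED_SECRETS = [
  "NEBIUS_API_KEY",
  "MODAL_TOKEN_ID",
  "MODAL_TOKEN_SECRET",
  "GITHUB_TOKEN",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of expected secrets that are unset or blank.
function missingSecrets(env: Env): string[] {
  return EXPECTED_SECRETS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node process this would be called as `missingSecrets(process.env)`, with the result logged once at startup.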
## Storage in Hugging Face Spaces
### Persistent vs Temporary Storage
Hugging Face Spaces provide:
- **Temporary storage**: `/tmp` directory (used by this app)
  - ✅ Survives app restarts
  - ✅ Suitable for uploads and database
  - ❌ Not persistent across Space rebuilds
  - ❌ Limited size (~10GB)
### How This App Handles Storage
The app automatically detects Hugging Face Spaces and:
1. **Database**: Stored in `/tmp/knowledgebridge.db`
2. **Uploads**: Stored in `/tmp/uploads/`
3. **Vector Indexes**: Stored in `/tmp/` via Modal
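
The path selection described above can be sketched as a small function. This is an illustration only: `resolveStoragePaths` and the `./data` local default are assumptions, and only the `/tmp` paths come from this guide.

```typescript
interface StoragePaths {
  database: string;
  uploads: string;
}

// Picks storage locations based on whether the app runs in a Space.
// ASSUMPTION: "./data" as the non-Space root is hypothetical.
function resolveStoragePaths(onSpace: boolean, localRoot = "./data"): StoragePaths {
  const root = onSpace ? "/tmp" : localRoot;
  return {
    database: `${root}/knowledgebridge.db`,
    uploads: `${root}/uploads/`,
  };
}
```

Keeping the root in one place means a later move to persistent storage only has to change a single value.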
### Adding Persistent Storage (Advanced)
For truly persistent storage, you have these options:
#### Option 1: External Database
```bash
# Add to your Space environment
DATABASE_URL=postgresql://user:pass@host:port/db
```
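
One way an app could honor `DATABASE_URL` is to prefer it when it is set and otherwise fall back to the database file in `/tmp`. The sketch below shows that selection only; `selectDatabase` and `DbConfig` are hypothetical names, and whether KnowledgeBridge reads `DATABASE_URL` exactly this way is not confirmed by this guide.

```typescript
type Env = Record<string, string | undefined>;

type DbConfig =
  | { kind: "external"; url: string }
  | { kind: "file"; file: string };

// Prefers an external PostgreSQL URL; otherwise uses the temporary file.
// ASSUMPTION: only postgres:// and postgresql:// URLs are accepted here.
function selectDatabase(env: Env): DbConfig {
  const url = env.DATABASE_URL?.trim();
  if (url && /^postgres(ql)?:\/\//.test(url)) {
    return { kind: "external", url };
  }
  return { kind: "file", file: "/tmp/knowledgebridge.db" };
}
```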
#### Option 2: Cloud Storage
```bash
# Add to your Space environment
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
S3_BUCKET=your_bucket
```
#### Option 3: Hugging Face Hub (Git LFS)
For large files, use Git LFS:
```bash
git lfs track "*.pdf" "*.db"
git add .gitattributes
```
## Troubleshooting
### "File uploads disabled" Error
If you see this error, check:
1. **Environment Detection**: Look for this log message:
   ```
   🔍 Environment check: { NODE_ENV: 'production', DISABLE_UPLOADS: undefined, isHuggingFaceSpace: true, hasWritableStorage: true, isDocumentUploadEnabled: true }
   ```
2. **Force Enable**: Add this environment variable:
   ```
   ENABLE_UPLOADS=true
   ```
3. **Check Logs**: Look for these success messages:
   ```
   ✅ Created uploads directory: /tmp/uploads
   ✅ Upload directory is writable: /tmp/uploads
   ✅ Document uploads enabled - full functionality available
   ```
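
To reason about why uploads end up disabled, it can help to see how the fields in the environment-check log might be derived. The sketch below is a guess at that logic, not the app's implementation: `buildEnvironmentCheck` is a hypothetical name, and the precedence (`ENABLE_UPLOADS` first, then `DISABLE_UPLOADS`, then the Space default, all gated on writable storage) is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Field names mirror the log message shown in step 1.
interface EnvironmentCheck {
  NODE_ENV: string | undefined;
  DISABLE_UPLOADS: string | undefined;
  isHuggingFaceSpace: boolean;
  hasWritableStorage: boolean;
  isDocumentUploadEnabled: boolean;
}

function buildEnvironmentCheck(
  env: Env,
  isHuggingFaceSpace: boolean,
  hasWritableStorage: boolean,
): EnvironmentCheck {
  const forced = env.ENABLE_UPLOADS === "true";
  const disabled = env.DISABLE_UPLOADS === "true";
  // ASSUMED precedence: explicit enable, then explicit disable, then Space default.
  const requested = forced || (!disabled && isHuggingFaceSpace);
  return {
    NODE_ENV: env.NODE_ENV,
    DISABLE_UPLOADS: env.DISABLE_UPLOADS,
    isHuggingFaceSpace,
    hasWritableStorage,
    // Uploads cannot work without a writable directory, whatever the flags say.
    isDocumentUploadEnabled: requested && hasWritableStorage,
  };
}
```

Under these assumptions, `isDocumentUploadEnabled: false` alongside `hasWritableStorage: false` points to a directory problem, not a flag problem.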
### Vector Search Issues
If vector search shows "disabled":
1. **Check Modal Integration**: Ensure Modal tokens are set
2. **Check Logs**: Look for Modal API connectivity
3. **Fallback**: The app will use text search if vector search fails
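
The fallback in step 3 follows a common pattern: try the vector backend first and switch to text search if it throws. The sketch below illustrates the pattern with hypothetical names (`SearchFn`, `searchWithFallback`); the app's real search functions and result types are not shown in this guide.

```typescript
type SearchFn = (query: string) => Promise<string[]>;

interface SearchOutcome {
  results: string[];
  mode: "vector" | "text";
}

// Tries vector search first and falls back to text search on any error.
async function searchWithFallback(
  query: string,
  vectorSearch: SearchFn,
  textSearch: SearchFn,
): Promise<SearchOutcome> {
  try {
    return { results: await vectorSearch(query), mode: "vector" };
  } catch {
    // Vector backend unavailable (for example, missing Modal tokens).
    return { results: await textSearch(query), mode: "text" };
  }
}
```

Returning the `mode` alongside the results makes it easy to surface in logs which backend actually served a query.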
### Performance Optimization
For better performance in Spaces:
1. **Use GPU Hardware**: For faster AI processing
2. **Enable Caching**: Vector indexes are cached in `/tmp`
3. **Batch Operations**: Upload multiple files at once
## Environment Detection
The app detects Hugging Face Spaces using these environment variables:
- `SPACE_ID`
- `HF_SPACE_ID`
- `HUGGINGFACE_SPACE_ID`
- `HF_TOKEN`
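When any of these variables is set, the app treats the environment as a Space and enables uploads by default. Set `DISABLE_UPLOADS=true` to turn them off (see Upload Control above).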
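
A detection check over these variables could look like the following. This is a sketch with hypothetical names (`SPACE_MARKERS`, `isHuggingFaceSpace` as a function); it assumes that any one non-empty variable is enough, which this guide does not state explicitly.

```typescript
type Env = Record<string, string | undefined>;

// Variable names are the ones listed above.
const SPACE_MARKERS = [
  "SPACE_ID",
  "HF_SPACE_ID",
  "HUGGINGFACE_SPACE_ID",
  "HF_TOKEN",
] as const;

// ASSUMPTION: a single non-empty marker is sufficient for detection.
function isHuggingFaceSpace(env: Env): boolean {
  return SPACE_MARKERS.some((name) => Boolean(env[name]));
}
```

Note that `HF_TOKEN` is often set on developer machines too, so a check like this one may also report `true` outside a Space.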
## Monitoring
Check your Space logs for these indicators:
### ✅ Successful Setup
```
✅ Document uploads enabled - full functionality available
✅ Created uploads directory: /tmp/uploads
🔍 Environment check: { isHuggingFaceSpace: true, isDocumentUploadEnabled: true }
```
```
### ❌ Issues
```
❌ Failed to create or write to uploads directory
ℹ️ Document uploads disabled - using fallback routes
πŸ” Environment check: { isDocumentUploadEnabled: false }
```
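
If you export the Space logs, the indicators above can be scanned for automatically. The helper below is a hypothetical sketch (`classifyLogs` is not part of the app) that matches on the message text shown in this section and ignores the leading emoji.

```typescript
type LogStatus = "ok" | "degraded" | "unknown";

// Substrings taken from the log messages listed in this section.
const FAILURE_MARKERS = [
  "Failed to create or write to uploads directory",
  "Document uploads disabled",
  "isDocumentUploadEnabled: false",
];
const SUCCESS_MARKERS = [
  "Document uploads enabled",
  "isDocumentUploadEnabled: true",
];

// Any failure marker wins over success markers; no marker means "unknown".
function classifyLogs(lines: string[]): LogStatus {
  const hasAny = (markers: string[]): boolean =>
    lines.some((line) => markers.some((marker) => line.includes(marker)));
  if (hasAny(FAILURE_MARKERS)) return "degraded";
  if (hasAny(SUCCESS_MARKERS)) return "ok";
  return "unknown";
}
```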
## Support
If you encounter issues:
1. Check the Space logs for error messages
2. Verify environment variables are set correctly
3. Test with a small PDF file first
4. Check the app health endpoint: `/api/health`
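
The health check in step 4 can be scripted. The sketch below only inspects the HTTP status, because the response body of `/api/health` is not documented here; `checkHealth` and `Fetcher` are hypothetical names, and the fetch function is passed in so the helper stays easy to test.

```typescript
// Minimal shape of what this sketch needs from an HTTP client.
type Fetcher = (url: string) => Promise<{ ok: boolean; status: number }>;

// Requests <baseUrl>/api/health and summarizes the outcome as a string.
async function checkHealth(baseUrl: string, fetcher: Fetcher): Promise<string> {
  const url = baseUrl.replace(/\/+$/, "") + "/api/health";
  try {
    const res = await fetcher(url);
    return res.ok ? `healthy (${res.status})` : `unhealthy (${res.status})`;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return `unreachable: ${reason}`;
  }
}
```

On Node 18 or later, the built-in `fetch` can be passed as the second argument, together with your Space's own URL as the first.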