---
title: Crawl4AI Web Content Extractor
emoji: 🕷️
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---
# Crawl4AI Demo - Docker Deployment
This is a Docker-ready version of the Crawl4AI demo application, specifically designed for deployment on Hugging Face Spaces.
## Features
- Web interface built with Gradio
- Support for multiple crawler types (Basic, LLM, Cosine, JSON/CSS)
- Configurable word count threshold
- Markdown output with metadata
- Sub-page crawling capabilities
- Lazy loading support
- Docker-optimized configuration
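
The word count threshold can be pictured as a filter over extracted text blocks. The following is a minimal sketch, assuming the threshold drops blocks that contain fewer words than the configured value. `filter_blocks` and its parameters are hypothetical names chosen for illustration; they are not taken from `app.py` or from the Crawl4AI API.

```python
def filter_blocks(blocks, word_count_threshold=10):
    """Keep only text blocks with at least `word_count_threshold` words.

    Hypothetical helper: it illustrates the idea of a word count
    threshold and is not the actual Crawl4AI implementation.
    """
    return [b for b in blocks if len(b.split()) >= word_count_threshold]


blocks = [
    "Home",
    "Sign in",
    "Crawl4AI extracts the main content of a page and returns it as Markdown.",
]
# Short navigation labels fall below the threshold and are dropped.
kept = filter_blocks(blocks, word_count_threshold=5)
print(kept)
```

A higher threshold removes more navigation and footer fragments, at the risk of also dropping short but meaningful paragraphs.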
## Deployment Instructions
1. Create a new Space on Hugging Face:
   - Go to huggingface.co/spaces
   - Click "Create new Space"
   - Choose "Docker" as the SDK
   - Choose the hardware (recommended: CPU with 16GB RAM)
2. Upload the files:
   - Upload all files from this directory to your Space
   - Make sure to include:
     - `Dockerfile`
     - `app.py`
     - `requirements.txt`
     - `README.md`
3. The Space will automatically build and deploy the application.
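
Before uploading, it can help to confirm that the four files listed in step 2 are present. The following is a small sketch of such a check. The file names come from the list above, while `missing_files` is a hypothetical helper that is not part of this repository.

```python
from pathlib import Path

# File names come from step 2 of the deployment instructions.
REQUIRED_FILES = ("Dockerfile", "app.py", "requirements.txt", "README.md")


def missing_files(directory="."):
    """Return the required files that are absent from `directory`."""
    root = Path(directory)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing before upload:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Run it from the directory you intend to upload. An empty result means the Space has everything the list above asks for.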
## Environment Variables
No environment variables are required for basic functionality. The application is configured to run out of the box.
## Hardware Requirements
- CPU: 2+ cores recommended
- RAM: 16GB recommended
- Disk: 5GB minimum
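
For local runs, the figures above can be compared against the host machine. This is a rough sketch using only the Python standard library. It assumes the 5GB disk figure refers to free space at the given path, and it reads total RAM through `os.sysconf`, which is unavailable on some platforms (the RAM check is skipped there). The function names are hypothetical.

```python
import os
import shutil

# Thresholds taken from the list above.
MIN_CPU_CORES = 2
RECOMMENDED_RAM_GB = 16
MIN_DISK_GB = 5


def total_ram_gb():
    """Return total physical RAM in GB, or None where it cannot be read."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None  # e.g. Windows, where os.sysconf is unavailable
    return pages * page_size / 1024**3


def check_host(path="."):
    """Compare this machine against the recommendations; return warnings."""
    warnings = []
    cores = os.cpu_count() or 0
    if cores < MIN_CPU_CORES:
        warnings.append(f"CPU: {cores} core(s), {MIN_CPU_CORES}+ recommended")
    ram = total_ram_gb()
    if ram is not None and ram < RECOMMENDED_RAM_GB:
        warnings.append(f"RAM: {ram:.1f} GB, {RECOMMENDED_RAM_GB} GB recommended")
    # Assumption: the disk requirement is checked as free space at `path`.
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < MIN_DISK_GB:
        warnings.append(f"Disk: {free_gb:.1f} GB free, {MIN_DISK_GB} GB minimum")
    return warnings


for line in check_host() or ["Host meets the recommendations."]:
    print(line)
```

The CPU and RAM values are recommendations, so a warning here does not necessarily mean the application will fail to start.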
## Browser Support
The application uses Chrome in headless mode for web crawling. The Dockerfile includes all necessary dependencies.
## Limitations
- Memory usage increases with the number of pages crawled
- Some websites may block automated crawling
- JavaScript-heavy sites may require additional configuration
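
One common signal that a site does not want automated crawling is its `robots.txt` file. The sketch below uses `urllib.robotparser` from the Python standard library to test a URL against a set of rules. It is an illustration only: whether this demo consults `robots.txt` is not stated here, sites can also block crawlers by other means such as rate limiting or bot detection, and `is_allowed` and `EXAMPLE_RULES` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if `robots_txt` permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules; a real check would first download the site's robots.txt.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(EXAMPLE_RULES, "https://example.com/docs/index.html"))
print(is_allowed(EXAMPLE_RULES, "https://example.com/private/data.html"))
```

Passing the rules as text keeps the function usable offline. The second URL falls under the `Disallow` rule, so a polite crawler would skip it.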
## Troubleshooting
If you encounter issues:
1. Check the Space logs for error messages
2. Confirm that headless Chrome starts correctly
3. Verify that the Space has network connectivity to the target site
4. Check memory usage, which grows with the number of pages crawled
## Development
To run locally with Docker:
```bash
docker build -t crawl4ai-demo .
docker run -p 7860:7860 crawl4ai-demo
```
Visit http://localhost:7860 to access the application.
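
The container may need some time after `docker run` before the web interface responds. The following is a minimal sketch of a readiness check that polls the address above until it answers. `wait_until_ready` is a hypothetical helper, and the timeout and interval values are assumptions rather than measured startup times.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://localhost:7860", timeout=60.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # container is not accepting connections yet
        time.sleep(interval)
    return False
```

Calling `wait_until_ready()` in a second terminal after starting the container returns `True` once the page is served, or `False` if nothing answers within the timeout.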
## License
This project is licensed under the MIT License - see the LICENSE file for details.