File size: 2,070 Bytes
a09426f
 
 
 
 
 
 
 
 
43aa272
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
---
title: Crawl4AI Web Content Extractor
emoji: 🕷️
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---

# Crawl4AI Demo - Docker Deployment

This is a Docker-ready version of the Crawl4AI demo application, specifically designed for deployment on Hugging Face Spaces.

## Features

- Web interface built with Gradio
- Support for multiple crawler types (Basic, LLM, Cosine, JSON/CSS)
- Configurable word count threshold
- Markdown output with metadata
- Sub-page crawling capabilities
- Lazy loading support
- Docker-optimized configuration

## Deployment Instructions

1. Create a new Space on Hugging Face:
   - Go to huggingface.co/spaces
   - Click "Create new Space"
   - Choose "Docker" as the SDK
   - Set the hardware requirements (recommended: CPU + 16GB RAM)

2. Upload the files:
   - Upload all files from this directory to your Space
   - Make sure to include:
     - `Dockerfile`
     - `app.py`
     - `requirements.txt`
     - `README.md`

3. The Space will automatically build and deploy the application.

## Environment Variables

No environment variables are required for basic functionality. The application is configured to run out of the box.

## Hardware Requirements

- CPU: 2+ cores recommended
- RAM: 16GB recommended
- Disk: 5GB minimum

## Browser Support

The application uses Chrome in headless mode for web crawling. The Dockerfile includes all necessary dependencies.

## Limitations

- Memory usage increases with the number of pages crawled
- Some websites may block automated crawling
- JavaScript-heavy sites may require additional configuration

## Troubleshooting

If you encounter issues:

1. Check the Space logs for error messages
2. Ensure the Chrome browser is running correctly
3. Verify network connectivity
4. Check memory usage

## Development

To run locally with Docker:

```bash
docker build -t crawl4ai-demo .
docker run -p 7860:7860 crawl4ai-demo
```

Visit http://localhost:7860 to access the application.

## License

This project is licensed under the MIT License - see the LICENSE file for details.