---
title: FormIQ - Intelligent Document Parser
emoji: 📄
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.32.0
app_file: app.py
pinned: false
---

# FormIQ - Intelligent Document Parser

FormIQ is a document parser that combines Tesseract OCR, LayoutLMv3, and Perplexity AI to extract and validate information from images of various document types.

## Features

- Document image upload and processing
- OCR text extraction using Tesseract
- Advanced document understanding using LayoutLMv3
- Structured information extraction using Perplexity AI
- Interactive web interface built with Streamlit
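
To illustrate how the OCR and LayoutLMv3 steps listed above can fit together, here is a minimal sketch. It is not taken from `app.py`; the function name `words_and_boxes` and the sample data are assumptions made for illustration. LayoutLM-family models expect word bounding boxes on a 0-1000 scale, so the sketch converts Tesseract-style pixel boxes (the dictionary shape returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)`) into that scale.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) on a 0-1000 scale


def words_and_boxes(
    ocr: Dict[str, list], width: int, height: int
) -> Tuple[List[str], List[Box]]:
    """Convert Tesseract-style word data into words plus normalized boxes.

    Hypothetical helper: `ocr` is assumed to hold parallel lists under the
    keys "text", "left", "top", "width" and "height", in pixels.
    """
    words: List[str] = []
    boxes: List[Box] = []
    for text, left, top, w, h in zip(
        ocr["text"], ocr["left"], ocr["top"], ocr["width"], ocr["height"]
    ):
        text = text.strip()
        if not text:
            continue  # Tesseract emits empty entries for blocks and lines
        # Scale pixel coordinates to 0-1000 and clamp to the valid range.
        x0 = min(1000 * left // width, 1000)
        y0 = min(1000 * top // height, 1000)
        x1 = min(1000 * (left + w) // width, 1000)
        y1 = min(1000 * (top + h) // height, 1000)
        words.append(text)
        boxes.append((x0, y0, x1, y1))
    return words, boxes


# Hand-written sample standing in for real OCR output on an 800x600 image.
sample = {
    "text": ["", "Invoice", "42"],
    "left": [0, 100, 400],
    "top": [0, 50, 50],
    "width": [800, 200, 80],
    "height": [600, 30, 30],
}
words, boxes = words_and_boxes(sample, width=800, height=600)
print(words, boxes)
```

If the app relies on the Hugging Face `LayoutLMv3Processor` with its built-in OCR enabled, this conversion happens inside the processor and a helper like this is unnecessary.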

## Technologies Used

- **Frontend**: Streamlit
- **OCR**: Tesseract
- **Document Understanding**: LayoutLMv3
- **Text Processing**: Perplexity AI
- **Data Processing**: Pandas, NumPy
- **Visualization**: Plotly
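
A language model used for text processing usually returns free text, so structured extraction needs a step that turns the reply into fields. The sketch below shows one way to do that, assuming the model was asked to answer with a JSON object. The helper name `parse_model_json` and the sample reply are hypothetical and not part of FormIQ or of the Perplexity API.

```python
import json
import re
from typing import Any, Dict


def parse_model_json(reply: str) -> Dict[str, Any]:
    """Return the JSON object embedded in a model reply.

    Hypothetical helper: takes the span from the first "{" to the last "}"
    so that surrounding prose in the reply is ignored.
    """
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    data = json.loads(match.group(0))
    if not isinstance(data, dict):
        raise ValueError("reply did not contain a JSON object")
    return data


# Made-up reply text, used only to exercise the helper.
reply = 'Here is the result:\n{"vendor": "Acme", "total": "42.00"}\nDone.'
fields = parse_model_json(reply)
print(fields["vendor"], fields["total"])
```

Validating the parsed fields (for example, checking that `total` is numeric) would be a natural next step before showing results in the interface.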

## Setup

1. Clone the repository
2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
3. Set the Perplexity API key as an environment variable:
   ```bash
   export PERPLEXITY_API_KEY=your_api_key_here
   ```
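
A missing key is easier to diagnose when the app fails early with a clear message. The following is a minimal sketch of reading `PERPLEXITY_API_KEY` in Python; the function name `get_api_key` is an assumption, and `app.py` may load the key differently (for example, from Spaces secrets or a `.env` file).

```python
import os
from typing import Mapping


def get_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Perplexity API key or raise a descriptive error.

    Hypothetical helper: `env` is injectable so the check can be tested
    without touching the real process environment.
    """
    key = env.get("PERPLEXITY_API_KEY", "").strip()
    # Treat the README placeholder as "not configured".
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "PERPLEXITY_API_KEY is not set; see the Setup section."
        )
    return key


# Example with an explicit mapping instead of the real environment.
print(get_api_key({"PERPLEXITY_API_KEY": " example-key "}))
```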

## Usage

1. Run the Streamlit app:
   ```bash
   streamlit run app.py
   ```
2. Open the URL that Streamlit prints in the terminal (by default `http://localhost:8501`) in your browser
3. Upload a document image
4. Click "Process Document" to extract information

## Hugging Face Spaces Deployment

This project is deployed on Hugging Face Spaces. You can access the live demo at: [Your Spaces URL]

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.