Spaces:
Running
Running
title: Typhoon OCR | |
emoji: π | |
colorFrom: gray | |
colorTo: red | |
sdk: gradio | |
sdk_version: 5.29.1 | |
app_file: app.py | |
pinned: false | |
license: apache-2.0 | |
short_description: Convert Image & PDF to Markdown | |
## Typhoon OCR | |
Typhoon OCR is a model for extracting structured markdown from images or PDFs. It supports document layout analysis and table extraction, returning results in markdown or HTML. This package is a simple Gradio website to demonstrate the performance of Typhoon OCR. | |
### Features | |
- Upload a PDF or image (single page) | |
- Extracts and reconstructs document content as markdown | |
- Supports different prompt modes for layout or structure | |
- Language: English, Thai | |
- Uses a local or remote OpenAI-compatible API (e.g., vllm) | |
### Install | |
```bash | |
pip install -r requirements.txt | |
# edit .env | |
# pip install vllm # optional for hosting a local server | |
``` | |
### Mac specific | |
``` | |
brew install poppler | |
# The following binaries are required and provided by poppler: | |
# - pdfinfo | |
# - pdftoppm | |
``` | |
### Linux specific | |
``` | |
sudo apt-get update | |
sudo apt-get install poppler-utils | |
# The following binaries are required and provided by poppler-utils: | |
# - pdfinfo | |
# - pdftoppm | |
``` | |
### Start vllm | |
```bash | |
vllm serve scb10x/typhoon-ocr-7b --served-model-name typhoon-ocr --dtype bfloat16 --port 8101 | |
``` | |
### Run Gradio demo | |
```bash | |
python app.py | |
``` | |
### Dependencies | |
- openai | |
- python-dotenv | |
- ftfy | |
- pypdf | |
- gradio | |
- vllm (for hosting an inference server) | |
- pillow | |
### License | |
This project is licensed under the Apache 2.0 License. See individual datasets and checkpoints for their respective licenses. |