Spaces:

opentyphoon
/

typhoon-ocr

Running

App Files Files Community

typhoon-ocr / README.md

opentyphoon

Update README.md

b6a381b verified about 2 months ago

preview code

raw

history blame contribute delete

1.6 kB

	---
	title: Typhoon OCR
	emoji: 🌍
	colorFrom: gray
	colorTo: red
	sdk: gradio
	sdk_version: 5.29.1
	app_file: app.py
	pinned: false
	license: apache-2.0
	short_description: Convert Image & PDF to Markdown
	---
	## Typhoon OCR

	Typhoon OCR is a model for extracting structured markdown from images or PDFs. It supports document layout analysis and table extraction, returning results in markdown or HTML. This package is a simple Gradio website to demonstrate the performance of Typhoon OCR.


	### Features
	- Upload a PDF or image (single page)
	- Extracts and reconstructs document content as markdown
	- Supports different prompt modes for layout or structure
	- Language: English, Thai
	- Uses a local or remote OpenAI-compatible API (e.g., vllm)

	### Install
	```bash
	pip install -r requirements.txt
	# edit .env
	# pip install vllm # optional for hosting a local server
	```

	### Mac specific
	```
	brew install poppler
	# The following binaries are required and provided by poppler:
	# - pdfinfo
	# - pdftoppm
	```
	### Linux specific
	```
	sudo apt-get update
	sudo apt-get install poppler-utils
	# The following binaries are required and provided by poppler-utils:
	# - pdfinfo
	# - pdftoppm
	```


	### Start vllm
	```bash
	vllm serve scb10x/typhoon-ocr-7b --served-model-name typhoon-ocr --dtype bfloat16 --port 8101
	```

	### Run Gradio demo
	```bash
	python app.py
	```

	### Dependencies
	- openai
	- python-dotenv
	- ftfy
	- pypdf
	- gradio
	- vllm (for hosting an inference server)
	- pillow

	### License
	This project is licensed under the Apache 2.0 License. See individual datasets and checkpoints for their respective licenses.