---
title: Kraken OCR on Samaritan Manuscripts
emoji: 📜
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.33.0
app_file: app.py
pinned: false
---

# Kraken OCR on Samaritan Manuscripts - Gradio App

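This Gradio web application runs Kraken OCR on images of Samaritan manuscripts. You pick a segmentation model and a recognition model, upload a page image, and get the recognized text back as XML.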

## Setup

1. Install requirements:
```bash
pip install -r requirements.txt
```

2. Place your models:
   - Put segmentation models (`.mlmodel` files) in `app/models/seg/`
   - Put recognition models (`.mlmodel` files) in `app/models/rec/`
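
As a minimal sketch of how these folders could feed the model dropdowns, the helper below lists the `.mlmodel` files in a directory. The name `list_models` is hypothetical and is not taken from `app.py`; only the folder layout comes from the step above.

```python
from pathlib import Path


def list_models(models_dir: Path) -> list[str]:
    """Return the sorted file names of all .mlmodel files in models_dir.

    Hypothetical helper: the real app.py may discover models differently.
    A missing directory yields an empty list instead of raising.
    """
    if not models_dir.is_dir():
        return []
    return sorted(p.name for p in models_dir.glob("*.mlmodel") if p.is_file())


if __name__ == "__main__":
    # Folder layout from the setup step above.
    print("Segmentation models:", list_models(Path("app/models/seg")))
    print("Recognition models:", list_models(Path("app/models/rec")))
```

An empty list here usually means the models were copied to the wrong folder or have a different file extension.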

## Running the App

```bash
python app.py
```

## Usage

1. Select segmentation and recognition models from the dropdown menus
2. Upload an image file (supported formats: PNG, JPG, JPEG, TIF, TIFF)
3. Click "Process Image" to run OCR
4. View the results and download the XML output
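
The supported formats in step 2 can be checked before any OCR work starts. The sketch below is one possible way to do that, assuming the check is made on the file extension alone; `SUPPORTED_EXTENSIONS` and `is_supported_image` are illustrative names, not identifiers from `app.py`.

```python
from pathlib import Path

# Extensions listed in the usage steps above, compared case-insensitively.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def is_supported_image(filename: str) -> bool:
    """Return True if filename ends with one of the supported extensions.

    Hypothetical helper: it looks only at the extension, not the file content.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Rejecting an unsupported file early lets the app show a clear error message instead of failing inside the OCR step.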

## Features

- Dynamic model selection
- Image preview
- XML output display
- Download processed results
- Error handling and progress indicators

## Hugging Face Space Configuration

To run this app on Hugging Face Spaces, you need to:

1. Create a new Space with the Gradio SDK
2. Add the following files to your Space:
   - `app.py`
   - `requirements.txt`
   - `models/` directory with your models
   - `templates/` directory with your templates

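3. Make sure your `requirements.txt` includes the packages below. On Spaces, the Gradio version is set by `sdk_version` in the README front matter (5.33.0 here), so the `gradio` entry must be compatible with it: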
```
gradio>=4.0.0
kraken
Pillow
numpy
opencv-python
jinja2
```

4. The Space should be configured with:
   - Python 3.10 runtime
   - GPU hardware, if available
   - At least 8 GB of RAM

5. Your Space's `app.py` should be in the root directory, not in an `app/` subdirectory

6. Update the model paths in `app.py` to use relative paths:
```python
MODELS_DIR = Path("models")
SEG_MODELS_DIR = MODELS_DIR / "seg"
REC_MODELS_DIR = MODELS_DIR / "rec"
```

7. Make sure all model files are included in your Space's repository

## Troubleshooting

If you encounter issues on Hugging Face Spaces:

1. Check the Space logs for errors
2. Verify all model files are present
3. Ensure all dependencies are in `requirements.txt`
4. Check file permissions and paths
5. Make sure the app is running on the correct port (7860)
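
Checks 2 and 3 can be automated with a small preflight script. The sketch below is an illustration under stated assumptions: the package list is copied from the requirements section above, the `seg`/`rec` folder names come from the model path step, and the function names `missing_requirements` and `empty_model_dirs` are hypothetical.

```python
import re
from pathlib import Path

# Packages listed in the requirements section above.
REQUIRED_PACKAGES = ["gradio", "kraken", "Pillow", "numpy", "opencv-python", "jinja2"]


def missing_requirements(requirements_text: str) -> list[str]:
    """Return the required packages that requirements_text does not mention."""
    declared = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The package name is everything before a version specifier or extras.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0]
        declared.add(name.lower().replace("_", "-"))
    return [p for p in REQUIRED_PACKAGES if p.lower() not in declared]


def empty_model_dirs(models_dir: Path) -> list[str]:
    """Return the model subfolders ("seg", "rec") that hold no .mlmodel file."""
    empty = []
    for sub in ("seg", "rec"):
        folder = models_dir / sub
        if not folder.is_dir() or not any(folder.glob("*.mlmodel")):
            empty.append(sub)
    return empty


if __name__ == "__main__":
    # Hypothetical usage: run from the Space's repository root.
    req_file = Path("requirements.txt")
    text = req_file.read_text() if req_file.is_file() else ""
    print("Missing requirements:", missing_requirements(text))
    print("Model folders without models:", empty_model_dirs(Path("models")))
```

Both functions return empty lists when everything is in place, so any non-empty output points at the item to fix.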