Work in progress.
Gokturk ResNet OCR Model
This is a ResNet-based OCR model trained to recognize Old Turkic (Gokturk) script characters. It is exported to ONNX and runs with ONNX Runtime, so it can be loaded from Rust, Python, or any other language with ONNX Runtime bindings.
Model Description
- Repository: OldTurkicOCR
- Task: Optical Character Recognition (OCR) / Image Classification
- Classes: 75 characters (Unicode range U+10C00–U+10C4A)
- Input Shape: (Batch, 64, 64, 1) (Grayscale)
- Format: ONNX
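The class count can be checked against the quoted range: an inclusive range from U+10C00 to U+10C4A holds 0x4A + 1 = 75 code points. The sketch below only illustrates that arithmetic. The helper names are hypothetical, and the list is in plain code-point order, which is not necessarily the order of the model's outputs (that order is defined by model_labels.json).

```python
# Inclusive bounds of the Unicode range quoted in the model description.
FIRST = 0x10C00
LAST = 0x10C4A


def range_size(first: int, last: int) -> int:
    """Number of code points in an inclusive range."""
    return last - first + 1


def codepoint_label(cp: int) -> str:
    """Format a code point in the usual U+XXXX notation."""
    return f"U+{cp:04X}"


# All code points in the range, in code-point order (an illustration only,
# not the model's class order).
code_points = [codepoint_label(cp) for cp in range(FIRST, LAST + 1)]
```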
How to use
This model is primarily used within the OldTurkicOCR Rust project.
In Rust (with ort crate)
use ort::session::Session;
// Run inside a function that returns a Result, so `?` can propagate errors.
let session = Session::builder()?
    .commit_from_file("gokturk_resnet_v1.onnx")?;
In Python (with onnxruntime)
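import onnxruntime as ort
import numpy as np
session = ort.InferenceSession("gokturk_resnet_v1.onnx")
# Expects a 64x64 grayscale image normalized to [0, 1]
# input_data shape: (1, 64, 64, 1)
# Placeholder input (float32 is an assumption); replace with a real preprocessed image
input_data = np.zeros((1, 64, 64, 1), dtype=np.float32)
result = session.run(None, {"input_1": input_data})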
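To feed a real image, the pixel data has to be brought into the (1, 64, 64, 1) layout with values in [0, 1]. The following is a minimal sketch under assumptions, not the project's own preprocessing: it assumes the image is already a 64x64 array of 8-bit grayscale values, that the model takes float32 input, and that the first output is a (1, 75) array of class scores. The names `preprocess` and `top_class` are hypothetical.

```python
import numpy as np


def preprocess(gray: np.ndarray) -> np.ndarray:
    """Turn a 64x64 uint8 grayscale image into a (1, 64, 64, 1) float32 array in [0, 1]."""
    if gray.shape != (64, 64):
        raise ValueError(f"expected a 64x64 image, got {gray.shape}")
    # Assumption: pixel values are 8-bit (0..255), so dividing by 255 maps them to [0, 1].
    scaled = gray.astype(np.float32) / 255.0
    # Add a leading batch axis and a trailing channel axis.
    return scaled[np.newaxis, :, :, np.newaxis]


def top_class(scores: np.ndarray) -> int:
    """Index of the highest-scoring class for a single-image batch.

    Assumption: `scores` has shape (1, num_classes).
    """
    return int(np.argmax(scores[0]))
```

With a loaded session, this would be used roughly as `scores = session.run(None, {"input_1": preprocess(img)})[0]` followed by `top_class(scores)`. Whether the model expects dark-on-light or light-on-dark glyphs is not stated here, so check against the training data before relying on the result.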
Dataset and Training
The model was trained on a curated dataset of Gokturk script characters, covering various styles and weights of the Orkhon and Yenisei variants.
Files
- gokturk_resnet_v1.onnx: The main inference model.
- model_labels.json: Mapping of model output indices to character labels.
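A predicted index only becomes a character once it is looked up in model_labels.json. The layout of that file is not described here, so the sketch below assumes one of two common shapes: a JSON array of labels, or a JSON object whose keys are stringified indices ("0", "1", ...) that are contiguous from 0. The function name `load_labels` and the sample data are hypothetical.

```python
import json


def load_labels(text: str) -> list[str]:
    """Parse label JSON into a list indexed by model output index.

    Assumption: the JSON is either an array of labels or an object
    keyed by stringified, contiguous indices starting at 0.
    """
    data = json.loads(text)
    if isinstance(data, list):
        return [str(label) for label in data]
    # Sort keys numerically so that "10" comes after "9".
    return [str(data[key]) for key in sorted(data, key=int)]


# Tiny hypothetical stand-in for the real file's contents.
sample = json.dumps({"1": chr(0x10C01), "0": chr(0x10C00)})
labels = load_labels(sample)
```

For the real file, pass the contents read from disk, for example `load_labels(open("model_labels.json", encoding="utf-8").read())`, and index the result with the predicted class.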