# AnyTable

<a href="https://huggingface.co/anyforge/anytable" target="_blank"><img src="https://img.shields.io/badge/%F0%9F%A4%97-HuggingFace-blue"></a>
<a href="https://www.modelscope.cn/models/anyforge/anytable" target="_blank"><img alt="Static Badge" src="https://img.shields.io/badge/%E9%AD%94%E6%90%AD-ModelScope-blue"></a>
<a href=""><img src="https://img.shields.io/badge/Python->=3.6-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-pink.svg"></a>
<a href=""><img alt="Static Badge" src="https://img.shields.io/badge/engine-cpu_gpu_onnxruntime-blue"></a>

```
    ___               ______      __    __   
   /   |  ____  __  _/_  __/___ _/ /_  / /__ 
  / /| | / __ \/ / / // / / __ `/ __ \/ / _ \
 / ___ |/ / / / /_/ // / / /_/ / /_/ / /  __/
/_/  |_/_/ /_/\__, //_/  \__,_/_.___/_/\___/ 
             /____/                          

```

English | [简体中文](./README.md)

<div align="left">
    <img src="./assets/sample1.jpg">
</div>

## 1. Introduction

AnyTable is a modeling toolkit focused on parsing tables from documents and images. It consists of two parts:

- AnyTable-det: table region detection (released)
- AnyTable-rec: table structure recognition (to be released later)

Project links:
- GitHub: [AnyTable](https://github.com/anyforge/anytable)
- Hugging Face: [AnyTable](https://huggingface.co/anyforge/anytable)
- ModelScope: [AnyTable](https://www.modelscope.cn/models/anyforge/anytable)

## 2. Origin

Publicly available table data is of mixed quality, which makes it hard to find a clean, complete dataset and a model trained on one. We therefore collected and curated a large amount of table data and trained our own models.

Detection dataset distribution:

- pubtables: 947642
- synthtabnet.marketing: 149999
- tablebank: 278582
- fintabnet.c: 97475
- pubtabnet: 519030
- synthtabnet.sparse: 150000
- synthtabnet.fintabnet: 149999
- docbank: 24517
- synthtabnet.pubtabnet: 150000
- cTDaRTRACKA: 1639
- SciTSR: 14971
- doclaynet.large: 21185
- IIITAR13K: 9905
- selfbuilt: 121157

Total: more than 2.6M images (approximately 2,633,869).

### Train

- train set: `2.6M` (for datasets larger than 100,000 images, only 42,000 samples were used, since GPU resources are limited)
- eval set: `4k`
- python: 3.12
- pytorch: 2.6.0
- cuda: 12.3
- ultralytics: 8.3.128

### Model introduction

The table detection models are located in the `det` folder:

- YOLO series: YOLO detectors trained with ultralytics
- RT-DETR: RT-DETR detectors trained with ultralytics

Note: You can use the models directly for prediction, or take them as pre-trained weights and fine-tune on a private dataset, as sketched below.
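
A minimal fine-tuning sketch, assuming you have downloaded the detection weights and prepared a standard ultralytics dataset YAML (the paths and hyperparameters below are placeholders, not files from this repository):

```python
from ultralytics import YOLO

# Load the downloaded AnyTable detection weights as the pre-trained model (placeholder path)
model = YOLO("/path/to/download_model")

# Fine-tune on a private dataset described by an ultralytics dataset YAML (placeholder path)
model.train(data="/path/to/your_dataset.yaml", imgsz=960, epochs=10, batch=16)
```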

### Eval

Self-built evaluation set: `4K`

| model | imgsz | epochs | metrics/precision |
|---|---|---|---|
|rt-detr-l|960|10|0.97|
|yolo11s|960|10|0.97|
|yolo11m|960|10|0.964|
|yolo12s|960|10|0.978|
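
For reference, a hedged sketch of how a similar evaluation can be run on your own labeled set with ultralytics (the dataset YAML path below is a placeholder):

```python
from ultralytics import YOLO

# Load the downloaded AnyTable detection weights (placeholder path)
model = YOLO("/path/to/download_model")

# Evaluate on a labeled detection set described by a dataset YAML (placeholder path)
metrics = model.val(data="/path/to/your_eval_dataset.yaml", imgsz=960)
print(metrics.box.mp)     # mean precision
print(metrics.box.map50)  # mAP@0.5
```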


## 3. Usage

### Install dependencies

```bash
pip install ultralytics pillow
```

### Usage

```python
## Simple usage
## After downloading the model, use ultralytics directly

from ultralytics import YOLO, RTDETR

# Load a model (use RTDETR(...) instead of YOLO(...) for the rt-detr weights)
model = YOLO("/path/to/download_model")  # downloaded AnyTable detection weights

# Run batched inference on a list of images
results = model(["/path/to/your_image"], imgsz=960)  # returns a list of Results objects

# Process results list
for result in results:
    boxes = result.boxes  # Boxes object for bounding box outputs (table regions)
    result.show()  # display to screen
    result.save(filename="result.jpg")  # save to disk

```
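
Since Pillow is already a dependency, here is a minimal follow-up sketch for cropping the detected table regions (paths below are placeholders; use `RTDETR` for the rt-detr weights as noted above):

```python
from PIL import Image
from ultralytics import YOLO

model = YOLO("/path/to/download_model")  # placeholder path to the detection weights
image_path = "/path/to/your_image"       # placeholder path to the input image

results = model(image_path, imgsz=960)

# Crop each detected table region for downstream structure recognition
image = Image.open(image_path)
for i, box in enumerate(results[0].boxes.xyxy.tolist()):
    x1, y1, x2, y2 = map(int, box)        # bounding box in pixel coordinates
    table = image.crop((x1, y1, x2, y2))  # cut out the table region
    table.save(f"table_{i}.jpg")
```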

## Buy me a coffee

- WeChat

<div align="left">
    <img src="./zanshan.jpg" width="30%" height="30%">
</div>

## Special thanks

- Ultralytics, for its publicly available training models and documentation
- Various dataset providers