File size: 3,933 Bytes
6333d20 c9ee533 d7fcbe3 223e029 c9ee533 d7fcbe3 c9ee533 cc4193a c9ee533 6333d20 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 |
---
license: apache-2.0
pipeline_tag: object-detection
tags:
- table-detect
- table-recong
---
# AnyTable
<a href="https://huggingface.co/anyforge/anytable" target="_blank"><img src="https://img.shields.io/badge/%F0%9F%A4%97-HuggingFace-blue"></a>
<a href="https://www.modelscope.cn/models/anyforge/anytable" target="_blank"><img alt="Static Badge" src="https://img.shields.io/badge/%E9%AD%94%E6%90%AD-ModelScope-blue"></a>
<a href=""><img src="https://img.shields.io/badge/Python->=3.6-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-pink.svg"></a>
<a href=""><img alt="Static Badge" src="https://img.shields.io/badge/engine-cpu_gpu_onnxruntime-blue"></a>
```
___ ______ __ __
/ | ____ __ _/_ __/___ _/ /_ / /__
/ /| | / __ \/ / / // / / __ `/ __ \/ / _ \
/ ___ |/ / / / /_/ // / / /_/ / /_/ / / __/
/_/ |_/_/ /_/\__, //_/ \__,_/_.___/_/\___/
/____/
```
简体中文 | [English](./README_en.md)
<div align="left">
<img src="./assets/sample1.jpg">
</div>
## 1. 简介
AnyTable是一个专注于从文档或者图片中表格解析的模型工具,主要分成两个部分:
- anytable-det:用于表格区域检测(已开放)
- anytable-rec:用于表格结构识别(未来开放)
项目地址:
- github地址:[AnyTable](https://github.com/anyforge/anytable)
- Hugging Face: [AnyTable](https://huggingface.co/anyforge/anytable)
- ModelScope: [AnyTable](https://www.modelscope.cn/models/anyforge/anytable)
## 2. 缘起
目前市面上表格数据非常多且混杂,很难有一个干净的完整数据和模型,为此我们收集并整理了很多表格数据,训练了我们的模型。
检测数据集分布:
- pubtables: 947642
- synthtabnet.marketing: 149999
- tablebank: 278582
- fintabnet.c: 97475
- pubtabnet: 519030
- synthtabnet.sparse: 150000
- synthtabnet.fintabnet: 149999
- docbank: 24517
- synthtabnet.pubtabnet: 150000
- cTDaRTRACKA: 1639
- SciTSR: 14971
- doclaynet.large: 21185
- IIITAR13K: 9905
- selfbuilt: 121157
数据集总计:大于`2.6M`(大约2633869张图片)。
### 扩展训练
- 训练集:`2.6M(大于10万的部分只抽样了42000, 没办法因为贫穷,卡有限。)`
- 测试集:`4k`
- python: 3.12
- pytorch: 2.6.0
- cuda: 12.3
- ultralytics: 8.3.128
### 模型介绍
表格检测模型位于det文件夹下:
- yolo系列:使用ultralytics训练yolo检测
- rt-detr:使用ultralytics训练rt-detr检测
注释:您可以直接模型预测,也可以作为预训练模型微调私有数据集
### 评估
自建评估集:`4K`
| model | imgsz | epochs | metrics/precision |
|---|---|---|---|
|rt-detr-l|960|10|0.97|
|yolo11s|960|10|0.97|
|yolo11m|960|10|0.964|
|yolo12s|960|10|0.978|
## 3. 使用方法
### 安装依赖
```bash
pip install ultralytics pillow
```
### 使用方法
```python
## simple
## 下载模型后直接使用ultralytics即可
from ultralytics import YOLO,RTDETR
# Load a model
model = YOLO("/path/to/download_model") # pretrained YOLO11n model
# Run batched inference on a list of images
results = model(["/path/to/your_image"],imgsz = 960) # return a list of Results objects
# Process results list
for result in results:
boxes = result.boxes # Boxes object for bounding box outputs
masks = result.masks # Masks object for segmentation masks outputs
keypoints = result.keypoints # Keypoints object for pose outputs
probs = result.probs # Probs object for classification outputs
obb = result.obb # Oriented boxes object for OBB outputs
result.show() # display to screen
result.save(filename="result.jpg") # save to disk
```
## Buy me a coffee
- 微信(WeChat)
<div align="left">
<img src="./zanshan.jpg" width="30%" height="30%">
</div>
## 特别鸣谢
- ultralytics公开的训练模型和文档
- 各种数据集提供者 |