Tingquan committed
Commit 6fe4667 · verified · 1 Parent(s): d073fe6

Upload folder using huggingface_hub

  1. README.md +39 -27
README.md CHANGED
@@ -1,5 +1,14 @@
1
  ---
2
  license: apache-2.0
 
 
 
 
 
 
 
 
 
3
  ---
4
 
5
  # PP-DocBee-7B
@@ -52,16 +61,19 @@ You can quickly experience the functionality with a single command:
52
  ```bash
53
  paddleocr doc_vlm \
54
  --model_name PP-DocBee-7B \
55
- -i "{'image': 'https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png', 'query': '识别这份表格的内容, markdown格式输出'}"
56
  ```
57
 
58
- You can also integrate the model inference of the text recognition module into your project. Before running the following code, please download the sample image to your local machine.
59
 
60
  ```python
61
  from paddleocr import DocVLM
62
  model = DocVLM(model_name="PP-DocBee-7B")
63
  results = model.predict(
64
- input={"image": "medal_table.png", "query": "识别这份表格的内容, 以markdown格式输出"},
 
 
 
65
  batch_size=1
66
  )
67
  for res in results:
@@ -72,29 +84,29 @@ for res in results:
72
  After running, the obtained result is as follows:
73
 
74
  ```bash
75
- {"res": {'image': 'medal_table.png', 'query': '识别这份表格的内容, 以markdown格式输出', 'result': '| 名次 | 国家/地区 | 金牌 | 银牌 | 铜牌 | 奖牌总数 |\n| --- | --- | --- | --- | --- | --- |\n| 1 | 中国(CHN | 48 | 22 | 30 | 100 |\n| 2 | 美国(USA | 36 | 39 | 37 | 112 |\n| 3 | 俄罗斯(RUS | 24 | 13 | 23 | 60 |\n| 4 | 英国(GBR | 19 | 13 | 19 | 51 |\n| 5 | 德国(GER | 16 | 11 | 14 | 41 |\n| 6 | 澳大利亚(AUS | 14 | 15 | 17 | 46 |\n| 7 | 韩国(KOR | 13 | 11 | 8 | 32 |\n| 8 | 日本(JPN | 9 | 8 | 8 | 25 |\n| 9 | 意大利(ITA | 8 | 9 | 10 | 27 |\n| 10 | 法国(FRA | 7 | 16 | 20 | 43 |\n| 11 | 荷兰(NED | 7 | 5 | 4 | 16 |\n| 12 | 乌克兰(UKR | 7 | 4 | 11 | 22 |\n| 13 | 肯尼亚(KEN | 6 | 4 | 6 | 16 |\n| 14 | 西班牙(ESP | 5 | 11 | 3 | 19 |\n| 15 | 牙买加(JAM | 5 | 4 | 2 | 11 |\n'}}
76
  ```
77
 
78
  The visualized result is as follows:
79
 
80
  ```bash
81
- | 名次 | 国家/地区 | 金牌 | 银牌 | 铜牌 | 奖牌总数 |
82
- | --- | --- | --- | --- | --- | --- |
83
- | 1 | 中国(CHN | 48 | 22 | 30 | 100 |
84
- | 2 | 美国(USA | 36 | 39 | 37 | 112 |
85
- | 3 | 俄罗斯(RUS | 24 | 13 | 23 | 60 |
86
- | 4 | 英国(GBR | 19 | 13 | 19 | 51 |
87
- | 5 | 德国(GER | 16 | 11 | 14 | 41 |
88
- | 6 | 澳大利亚(AUS | 14 | 15 | 17 | 46 |
89
- | 7 | 韩国(KOR | 13 | 11 | 8 | 32 |
90
- | 8 | 日本(JPN | 9 | 8 | 8 | 25 |
91
- | 9 | 意大利(ITA | 8 | 9 | 10 | 27 |
92
- | 10 | 法国(FRA | 7 | 16 | 20 | 43 |
93
- | 11 | 荷兰(NED | 7 | 5 | 4 | 16 |
94
- | 12 | 乌克兰(UKR | 7 | 4 | 11 | 22 |
95
- | 13 | 肯尼亚(KEN | 6 | 4 | 6 | 16 |
96
- | 14 | 西班牙(ESP | 5 | 11 | 3 | 19 |
97
- | 15 | 牙买加(JAM | 5 | 4 | 2 | 11 |
98
  ```
99
 
100
  For details on the usage commands and parameter descriptions, please refer to the [documentation](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/module_usage/doc_vlm.html#iii-quick-start).
@@ -111,18 +123,18 @@ The document understanding pipeline is an advanced document processing technolog
111
  Run a single command to quickly experience the OCR pipeline:
112
 
113
  ```bash
114
- paddleocr doc_understanding -i "{'image': 'https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png', 'query': '识别这份表格的内容, markdown格式输出'}"
115
  ```
116
 
117
  Results are printed to the terminal:
118
 
119
  ```bash
120
- {'res': {'image': 'https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png', 'query': '识别这份表格的内容, 以markdown格式输出', 'result': '| 名次 | 国家/地区 | 金牌 | 银牌 | 铜牌 | 奖牌总数 |\n| --- | --- | --- | --- | --- | --- |\n| 1 | 中国(CHN | 48 | 22 | 30 | 100 |\n| 2 | 美国(USA | 36 | 39 | 37 | 112 |\n| 3 | 俄罗斯(RUS | 24 | 13 | 23 | 60 |\n| 4 | 英国(GBR | 19 | 13 | 19 | 51 |\n| 5 | 德国(GER | 16 | 11 | 14 | 41 |\n| 6 | 澳大利亚(AUS | 14 | 15 | 17 | 46 |\n| 7 | 韩国(KOR | 13 | 11 | 8 | 32 |\n| 8 | 日本(JPN | 9 | 8 | 8 | 25 |\n| 9 | 意大利(ITA | 8 | 9 | 10 | 27 |\n| 10 | 法国(FRA | 7 | 16 | 20 | 43 |\n| 11 | 荷兰(NED | 7 | 5 | 4 | 16 |\n| 12 | 乌克兰(UKR | 7 | 4 | 11 | 22 |\n| 13 | 肯尼亚(KEN | 6 | 4 | 6 | 16 |\n| 14 | 西班牙(ESP | 5 | 11 | 3 | 19 |\n| 15 | 牙买加(JAM | 5 | 4 | 2 | 11 |\n'}}
121
  ```
122
 
123
  If save_path is specified, the visualization results will be saved under `save_path`. The visualization output is shown below:
124
 
125
- ![image/png](https://raw.githubusercontent.com/cuicheng01/PaddleX_doc_images/refs/heads/main/images/pipelines/doc_understanding/doc_understanding.png)
126
 
127
  The command-line method is for a quick trial. For project integration, only a few lines of code are needed:
128
 
@@ -134,8 +146,8 @@ pipeline = DocUnderstanding(
134
  )
135
  output = pipeline.predict(
136
  {
137
- "image": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/medal_table.png",
138
- "query": "识别这份表格的内容, markdown格式输出"
139
  }
140
  )
141
  for res in output:
@@ -143,7 +155,7 @@ for res in output:
143
  res.save_to_json("./output/")
144
  ```
145
 
146
- The default model used in pipeline is `PP-DocBee2-3B`, so it is needed that specifing to `PP-DocBee-7B` by argument `doc_understanding_model_name`. And you can also use the local model file by argument `doc_understanding_model_dir`. For details about usage command and descriptions of parameters, please refer to the [Document](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/pipeline_usage/doc_understanding.html#2-quick-start).
147
 
148
  ## Links
149
 
 
1
  ---
2
  license: apache-2.0
3
+ library_name: PaddleOCR
4
+ language:
5
+ - en
6
+ - zh
7
+ pipeline_tag: image-to-text
8
+ tags:
9
+ - OCR
10
+ - PaddlePaddle
11
+ - PaddleOCR
12
  ---
13
 
14
  # PP-DocBee-7B
 
61
  ```bash
62
  paddleocr doc_vlm \
63
  --model_name PP-DocBee-7B \
64
+ -i "{'image': 'https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png', 'query': 'Recognize the content of this table and output it in markdown format.'}"
65
  ```
66
 
67
+ You can also integrate the model inference of the document visual-language module into your project. Before running the following code, please download the sample image to your local machine.
68
 
69
  ```python
70
  from paddleocr import DocVLM
71
  model = DocVLM(model_name="PP-DocBee-7B")
72
  results = model.predict(
73
+ input={
74
+ "image": "https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png",
75
+ "query": "Recognize the content of this table and output it in markdown format."
76
+ },
77
  batch_size=1
78
  )
79
  for res in results:
 
84
  After running, the obtained result is as follows:
85
 
86
  ```bash
87
+ {'res': {'image': 'medal_table_en.png', 'query': 'Recognize the content of this table and output it in markdown format', 'result': '| Rank | Country/Region | Gold | Silver | Bronze | Total Medals |\n|---|---|---|---|---|---|\n| 1 | China (CHN) | 48 | 22 | 30 | 100 |\n| 2 | United States (USA) | 36 | 39 | 37 | 112 |\n| 3 | Russia (RUS) | 24 | 13 | 23 | 60 |\n| 4 | Great Britain (GBR) | 19 | 13 | 19 | 51 |\n| 5 | Germany (GER) | 16 | 11 | 14 | 41 |\n| 6 | Australia (AUS) | 14 | 15 | 17 | 46 |\n| 7 | South Korea (KOR) | 13 | 11 | 8 | 32 |\n| 8 | Japan (JPN) | 9 | 8 | 8 | 25 |\n| 9 | Italy (ITA) | 8 | 9 | 10 | 27 |\n| 10 | France (FRA) | 7 | 16 | 20 | 43 |\n| 11 | Netherlands (NED) | 7 | 5 | 4 | 16 |\n| 12 | Ukraine (UKR) | 7 | 4 | 11 | 22 |\n| 13 | Kenya (KEN) | 6 | 4 | 6 | 16 |\n| 14 | Spain (ESP) | 5 | 11 | 3 | 19 |\n| 15 | Jamaica (JAM) | 5 | 4 | 2 | 11 |\n'}}
88
  ```
89
 
90
  The visualized result is as follows:
91
 
92
  ```bash
93
+ | Rank | Country/Region | Gold | Silver | Bronze | Total Medals |
94
+ |---|---|---|---|---|---|
95
+ | 1 | China (CHN) | 48 | 22 | 30 | 100 |
96
+ | 2 | United States (USA) | 36 | 39 | 37 | 112 |
97
+ | 3 | Russia (RUS) | 24 | 13 | 23 | 60 |
98
+ | 4 | Great Britain (GBR) | 19 | 13 | 19 | 51 |
99
+ | 5 | Germany (GER) | 16 | 11 | 14 | 41 |
100
+ | 6 | Australia (AUS) | 14 | 15 | 17 | 46 |
101
+ | 7 | South Korea (KOR) | 13 | 11 | 8 | 32 |
102
+ | 8 | Japan (JPN) | 9 | 8 | 8 | 25 |
103
+ | 9 | Italy (ITA) | 8 | 9 | 10 | 27 |
104
+ | 10 | France (FRA) | 7 | 16 | 20 | 43 |
105
+ | 11 | Netherlands (NED) | 7 | 5 | 4 | 16 |
106
+ | 12 | Ukraine (UKR) | 7 | 4 | 11 | 22 |
107
+ | 13 | Kenya (KEN) | 6 | 4 | 6 | 16 |
108
+ | 14 | Spain (ESP) | 5 | 11 | 3 | 19 |
109
+ | 15 | Jamaica (JAM) | 5 | 4 | 2 | 11 |
110
  ```
111
 
112
  For details on the usage commands and parameter descriptions, please refer to the [documentation](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/module_usage/doc_vlm.html#iii-quick-start).
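
The `result` field shown above is a plain markdown table string, so downstream code will often want it as structured rows. The following is a minimal sketch in plain Python, assuming you have already extracted that string from the prediction output; `parse_markdown_table` is a hypothetical helper written for this example and is not part of PaddleOCR.

```python
def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Convert a pipe-delimited markdown table into a list of row dicts."""
    lines = [line.strip() for line in markdown.splitlines() if line.strip()]
    if len(lines) < 2:
        return []

    def split_row(line: str) -> list[str]:
        # Drop the outer pipes, then split on the inner ones.
        return [cell.strip() for cell in line.strip("|").split("|")]

    header = split_row(lines[0])
    rows = []
    for line in lines[2:]:  # lines[1] is the |---|---| separator row
        rows.append(dict(zip(header, split_row(line))))
    return rows


# A shortened copy of the table from the sample output above.
sample = (
    "| Rank | Country/Region | Gold | Silver | Bronze | Total Medals |\n"
    "|---|---|---|---|---|---|\n"
    "| 1 | China (CHN) | 48 | 22 | 30 | 100 |\n"
    "| 2 | United States (USA) | 36 | 39 | 37 | 112 |\n"
)
rows = parse_markdown_table(sample)
print(rows[0]["Country/Region"], rows[0]["Gold"])
```

All cell values stay as strings; convert numeric columns with `int()` if you need arithmetic. The helper assumes a well-formed table with no escaped pipes inside cells.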
 
123
  Run a single command to quickly experience the OCR pipeline:
124
 
125
  ```bash
126
+ paddleocr doc_understanding -i "{'image': 'https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png', 'query': 'Recognize the content of this table and output it in markdown format.'}"
127
  ```
128
 
129
  Results are printed to the terminal:
130
 
131
  ```bash
132
+ {'res': {'image': 'medal_table_en.png', 'query': 'Recognize the content of this table and output it in markdown format', 'result': '| Rank | Country/Region | Gold | Silver | Bronze | Total Medals |\n|---|---|---|---|---|---|\n| 1 | China (CHN) | 48 | 22 | 30 | 100 |\n| 2 | United States (USA) | 36 | 39 | 37 | 112 |\n| 3 | Russia (RUS) | 24 | 13 | 23 | 60 |\n| 4 | Great Britain (GBR) | 19 | 13 | 19 | 51 |\n| 5 | Germany (GER) | 16 | 11 | 14 | 41 |\n| 6 | Australia (AUS) | 14 | 15 | 17 | 46 |\n| 7 | South Korea (KOR) | 13 | 11 | 8 | 32 |\n| 8 | Japan (JPN) | 9 | 8 | 8 | 25 |\n| 9 | Italy (ITA) | 8 | 9 | 10 | 27 |\n| 10 | France (FRA) | 7 | 16 | 20 | 43 |\n| 11 | Netherlands (NED) | 7 | 5 | 4 | 16 |\n| 12 | Ukraine (UKR) | 7 | 4 | 11 | 22 |\n| 13 | Kenya (KEN) | 6 | 4 | 6 | 16 |\n| 14 | Spain (ESP) | 5 | 11 | 3 | 19 |\n| 15 | Jamaica (JAM) | 5 | 4 | 2 | 11 |\n'}}
133
  ```
134
 
135
  If save_path is specified, the visualization results will be saved under `save_path`. The visualization output is shown below:
136
 
137
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/kFGo9nlHuHs2uyN1voSTg.png)
138
 
139
  The command-line method is for a quick trial. For project integration, only a few lines of code are needed:
140
 
 
146
  )
147
  output = pipeline.predict(
148
  {
149
+ "image": "https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/l5xpHbfLn75dKInhQZ84I.png",
150
+ "query": "Recognize the content of this table and output it in markdown format."
151
  }
152
  )
153
  for res in output:
 
155
  res.save_to_json("./output/")
156
  ```
157
 
158
+ The default model used in the pipeline is `PP-DocBee2-3B`, so you need to set `doc_understanding_model_name` to `PP-DocBee-7B`. You can also use a local model file via the `doc_understanding_model_dir` argument. For details on the usage commands and parameter descriptions, please refer to the [documentation](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/pipeline_usage/doc_understanding.html#2-quick-start).
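
As a rough sketch of how the two arguments mentioned above might be assembled, the snippet below builds the constructor keyword arguments in one place. `build_pipeline_kwargs` and `run_demo` are hypothetical helpers for illustration only, and passing both the model name and a local directory together is an assumption; check the linked documentation for the exact behavior.

```python
from typing import Optional


def build_pipeline_kwargs(model_name: str = "PP-DocBee-7B",
                          model_dir: Optional[str] = None) -> dict:
    """Collect the constructor arguments named in the paragraph above."""
    kwargs = {"doc_understanding_model_name": model_name}
    if model_dir is not None:
        # Assumption: a local directory can be supplied alongside the name.
        kwargs["doc_understanding_model_dir"] = model_dir
    return kwargs


def run_demo(image: str, query: str, model_dir: Optional[str] = None) -> None:
    """Run the pipeline once; requires paddleocr to be installed."""
    from paddleocr import DocUnderstanding  # imported lazily on purpose

    pipeline = DocUnderstanding(**build_pipeline_kwargs(model_dir=model_dir))
    for res in pipeline.predict({"image": image, "query": query}):
        res.save_to_json("./output/")


print(build_pipeline_kwargs())
```

`run_demo` is defined but not called here, so the snippet runs without downloading any model; call it with an image path or URL and a query once `paddleocr` is installed.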
159
 
160
  ## Links
161