---
datasets:
- Caraaaaa/non_text_image_captioning
pipeline_tag: image-to-text
---

This is a [GenerativeImage2Text](https://huggingface.co/microsoft/git-base) (GIT) model fine-tuned on [non-text images](https://huggingface.co/datasets/Caraaaaa/non_text_image_captioning) extracted from documents (i.e., PDFs). It analyzes the content of an image and produces a descriptive caption.
It is part of a [project](https://github.com/caraaaaa/doc_accessibility?tab=readme-ov-file) to build a software solution that processes offline documents (PDF, Word, PowerPoint, etc.) to detect WCAG accessibility issues.
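
A minimal sketch of how a caption might be generated with the `transformers` library, assuming this checkpoint loads through the same `AutoProcessor` and `AutoModelForCausalLM` classes as the base GIT model. The helper names (`load_captioner`, `generate_caption`, `caption_file`) and the `max_length` value are illustrative, and `microsoft/git-base` is only a stand-in: replace it with this model's repository ID.

```python
# Stand-in model ID (assumption): swap in the repository ID of this fine-tuned model.
MODEL_ID = "microsoft/git-base"


def load_captioner(model_id=MODEL_ID):
    """Load the processor and model for a GIT-style captioning checkpoint."""
    # Imported lazily so generate_caption can be used without loading transformers.
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return processor, model


def generate_caption(image, processor, model, max_length=50):
    """Return a caption string for a single image."""
    # Convert the image into the pixel tensor the model expects.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # Generate caption token IDs conditioned on the image.
    generated_ids = model.generate(pixel_values=pixel_values, max_length=max_length)
    # Decode the first (only) sequence back into text.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()


def caption_file(path, model_id=MODEL_ID):
    """Convenience wrapper: open an image file and caption it."""
    from PIL import Image

    image = Image.open(path).convert("RGB")
    processor, model = load_captioner(model_id)
    return generate_caption(image, processor, model)
```

Calling `caption_file("figure.png")` on an image extracted from a document would return the generated caption as a string; `figure.png` is a hypothetical path.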

Example document with non-text images:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64b539ab4dd3e248953a6e69/IlcbNsHuzK5JHHixh_dwN.png)
Extracted image:
![Alt text](https://datasets-server.huggingface.co/assets/Caraaaaa/non_text_image_captioning/--/ca73cb435a60096ff7194f9616a54fde01f69039/--/default/train/10/image/image.jpg?Expires=1707337881&Signature=EpH8a0j4oVQZq2zM52KdkLURUseDcAXIlrUH3Grli8DQH2JzmJdl8J7AnEnwBi7oiO8fmFqkHP5bp-SmRehi-5pZkEQKzPUmbvgzzZJWKYttcyql1MnafITBoIpDbAQB8YkFeAnzJ7leKE6E1wSzlolMIorfFYO~x8Xzq-N5dg6CtiCmO6WIY0BMJgMliNpyUJqcVytJ1p95wZckOZmKxZ6CFPBDLF6jQEAbYVvV2f8cDDZBOkd7bsHlAZg0Zvxfau06v3nu26frvqhHxXq8LY3v2FvEdQ1CljuvrLOYqWiyHxZCm1aNQrhhtN6aJDlGbMSzCDhGwuf2cM6q9STXEw__&Key-Pair-Id=K3EI6M078Z3AC3)
Generated caption:
"Indication of correct signature"