Update README.md
README.md
CHANGED
@@ -168,3 +168,26 @@ print(processor.decode(predictions[0], skip_special_tokens=True))
 # Contribution
 
 This model was originally contributed by Kenton Lee, Mandar Joshi et al. and added to the Hugging Face ecosystem by [Younes Belkada](https://huggingface.co/ybelkada).
+
+# Citation
+
+If you want to cite this work, please consider citing the original paper:
+```
+@misc{https://doi.org/10.48550/arxiv.2210.03347,
+doi = {10.48550/ARXIV.2210.03347},
+
+url = {https://arxiv.org/abs/2210.03347},
+
+author = {Lee, Kenton and Joshi, Mandar and Turc, Iulia and Hu, Hexiang and Liu, Fangyu and Eisenschlos, Julian and Khandelwal, Urvashi and Shaw, Peter and Chang, Ming-Wei and Toutanova, Kristina},
+
+keywords = {Computation and Language (cs.CL), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
+
+title = {Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding},
+
+publisher = {arXiv},
+
+year = {2022},
+
+copyright = {Creative Commons Attribution 4.0 International}
+}
+```