This model repository presents "TinyPubMedBERT", a distilled version of [PubMedBERT (Gu et al., 2021)](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract).
The model has 4 layers and was distilled following the method introduced in the [TinyBERT paper](https://aclanthology.org/2020.findings-emnlp.372/) (Jiao et al., 2020).
* For the framework, please visit https://github.com/AstraZeneca/KAZU
* For the demo, please visit http://kazu.korea.ac.kr
* For details about the model, please see our paper, **Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework** (EMNLP 2022 Industry Track).

TinyPubMedBERT provides the initial weights for training [dmis-lab/KAZU-NER-module-distil-v1.0](https://huggingface.co/dmis-lab/KAZU-NER-module-distil-v1.0), the NER module of the KAZU (Korea University and AstraZeneca) framework.
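
As a minimal sketch of how the distilled encoder might be loaded, the snippet below uses the Hugging Face `transformers` auto classes. The Hub ID in `MODEL_ID` is an assumption (it is not stated in this README), and the helper names `load_encoder` and `embed` are hypothetical; replace the ID with this repository's actual ID.

```python
# ASSUMPTION: this Hub ID is a guess at this repository's ID; it is not
# given in the README. Replace it with the real ID before running.
MODEL_ID = "dmis-lab/TinyPubMedBERT-v1.0"


def load_encoder(model_id: str = MODEL_ID):
    """Download and return (tokenizer, model) for the distilled encoder."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


def embed(text: str, tokenizer, model):
    """Return the final-layer hidden states for one input string."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    # Shape: (1, number_of_tokens, hidden_size)
    return outputs.last_hidden_state
```

After `tokenizer, model = load_encoder()`, inspecting `model.config.num_hidden_layers` is a quick way to check that the checkpoint matches the 4-layer description above. For NER, the KAZU framework and the fine-tuned module linked above are the intended entry points; this sketch only exposes the raw encoder.
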
### Citation info
Joint-first authorship of **Richard Jackson** (AstraZeneca) and **WonJin Yoon** (Korea University).
<br>Please cite the paper using the simplified citation below, or see the [full citation information here](https://aclanthology.org/2022.emnlp-industry.63.bib).
```
@inproceedings{YoonAndJackson2022BiomedicalNER,
title="Biomedical {NER} for the Enterprise with Distillated {BERN}2 and the Kazu Framework",
author="Yoon, Wonjin and Jackson, Richard and Ford, Elliot and Poroshin, Vladimir and Kang, Jaewoo",
booktitle="Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track",
month = dec,
year = "2022",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.emnlp-industry.63",
pages = "619--626",
}
```
This model uses resources from the [PubMedBERT paper](https://dl.acm.org/doi/10.1145/3458754) and the [TinyBERT paper](https://aclanthology.org/2020.findings-emnlp.372/).
```
Gu, Yu, et al. "Domain-specific language model pretraining for biomedical natural language processing."
ACM Transactions on Computing for Healthcare (HEALTH) 3.1 (2021): 1-23.
```
```
Jiao, Xiaoqi, et al. "TinyBERT: Distilling BERT for Natural Language Understanding."
Findings of the Association for Computational Linguistics: EMNLP 2020. 2020.
```
### Contact Information
For help or issues using the code or model (NER module of KAZU) in this repository, please contact WonJin Yoon (wonjin.info (at) gmail.com) or submit a GitHub issue.