---
language:
- uz
tags:
- Text Generation
- PyTorch
- TensorFlow
- Transformers
- gpt2
license: apache-2.0
widget:
- text: "Covid-19 га қарши эмлаш бошланди,"
  example_title: "Example 1"
- text: "Суъний интеллект энг ривожланган"
  example_title: "Example 2"
---
<p><b>GPTuz model</b>
GPTuz is a state-of-the-art language model for Uzbek, based on the GPT-2 small model.
It was trained for more than a day using transfer learning and fine-tuning on an NVIDIA V100 32GB GPU, with 0.53 GB of text collected from kun.uz.
<p><b>How to use</b>
<pre><code class="language-python">
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("rifkat/GPTuz")
model = AutoModelForCausalLM.from_pretrained("rifkat/GPTuz")
tokenizer.model_max_length = 1024
</code></pre>
<p><b>Generate a single token</b>
<pre><code class="language-python">
text = "Covid-19 га қарши эмлаш бошланди,"
inputs = tokenizer(text, return_tensors="pt")

outputs = model(**inputs, labels=inputs["input_ids"])
loss, logits = outputs[:2]

# pick the most probable token after the last input position
predicted_index = torch.argmax(logits[0, -1, :]).item()
predicted_text = tokenizer.decode([predicted_index])

print('input text:', text)
print('predicted text:', predicted_text)
</code></pre>
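The `loss` returned by the forward pass above is the mean negative log-likelihood of the input tokens, so exponentiating it gives the model's perplexity on that text. A minimal helper to make that explicit (the function name is ours, not part of the card):

```python
import math

def perplexity(mean_nll):
    # perplexity is the exponential of the mean negative log-likelihood per token
    return math.exp(mean_nll)

# e.g. perplexity(loss.item()) after the forward pass above
```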
<p><b>Generate one full sequence</b>
<pre><code class="language-python">
text = "Covid-19 га қарши эмлаш бошланди, "
inputs = tokenizer(text, return_tensors="pt")

sample_outputs = model.generate(inputs.input_ids,
                                pad_token_id=50256,
                                do_sample=True,
                                max_length=50,  # set the desired number of tokens
                                top_k=40,
                                num_return_sequences=1)

for i, sample_output in enumerate(sample_outputs):
    print(">> Generated text {}\n\n{}".format(i + 1, tokenizer.decode(sample_output.tolist())))
</code></pre>
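The `top_k=40` argument above restricts sampling to the 40 most probable next tokens at each step. The idea can be sketched in plain Python, independent of the model (the function and variable names here are ours, for illustration only):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    # indices of the k largest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # softmax over the kept logits only (shift by the max for numerical stability)
    mx = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - mx) for i in top]
    # sample one token index in proportion to those probabilities
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy argmax decoding, the same behavior as the single-token example earlier.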
<p><b>Citation</b>
<pre><code class="language-bibtex">
@misc{rifkat_davronov_2022,
  author    = {Adilova Fatima and Rifkat Davronov and Samariddin Kushmuratov and Ruzmat Safarov},
  title     = {GPTuz (Revision 2a7e6c0)},
  year      = 2022,
  url       = {https://huggingface.co/rifkat/GPTuz},
  doi       = {10.57967/hf/0143},
  publisher = {Hugging Face}
}
</code></pre>