peterjandre committed on
Commit
42b1426
·
verified ·
1 Parent(s): ced1a58

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +47 -3
  2. model_card.json +16 -0
README.md CHANGED
@@ -1,3 +1,47 @@
- ---
- license: mit
- ---
+
+ # 🚀 CodeT5 VB.NET → C# Translator
+
+ This is a fine-tuned version of [Salesforce/CodeT5-base](https://huggingface.co/Salesforce/codet5-base) for translating VB.NET to C#.
+
+ ---
+
+ ## 📊 Evaluation Metrics
+
+ **BLEU Score:** 0.4506
+ - 1-gram: 0.6698
+ - 2-gram: 0.5402
+ - 3-gram: 0.4656
+ - 4-gram: 0.4132
+ - Brevity penalty: 0.8773
+ - Length ratio: 0.8843
+
+ **ROUGE Scores:**
+ - ROUGE-1: 0.5836
+ - ROUGE-2: 0.4586
+ - ROUGE-L: 0.5378
+ - ROUGE-Lsum: 0.5781
+
+ ---
+
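The BLEU components listed above can be cross-checked against the headline score. A minimal sketch, assuming the standard corpus-level definition (brevity penalty times the geometric mean of the 1- to 4-gram precisions with uniform weights); the function name `bleu_from_components` is illustrative and not part of this commit:

```python
import math

def bleu_from_components(precisions, length_ratio):
    """Recombine n-gram precisions and a length ratio into a BLEU score.

    Assumes uniform weights over the precisions and the usual brevity
    penalty: exp(1 - 1/ratio) when the output is shorter than the
    reference (ratio < 1), otherwise 1.
    """
    if length_ratio >= 1.0:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - 1.0 / length_ratio)
    # Geometric mean computed in log space.
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
    return brevity_penalty, brevity_penalty * geo_mean

# Values copied from the README in this commit.
bp, bleu = bleu_from_components([0.6698, 0.5402, 0.4656, 0.4132], 0.8843)
print(f"brevity penalty = {bp:.4f}, BLEU = {bleu:.4f}")
```

Both results agree with the reported 0.8773 and 0.4506 to about three decimal places, the small difference coming from the rounded inputs. A length ratio below 1 means the generated C# is shorter than the reference overall, which is what triggers the penalty.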
+ ## 🔧 Usage
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ model = AutoModelForSeq2SeqLM.from_pretrained("{repo_id}")  # "{repo_id}" is an unfilled placeholder for this model's Hub repository ID
+ tokenizer = AutoTokenizer.from_pretrained("{repo_id}")
+
+ vb_code = "Dim x As Integer = 5"
+ inputs = tokenizer(f"translate VB.NET to C#: {vb_code}", return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=128)  # illustrative limit; the default length cap may truncate longer translations
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+ ## Dataset Format
+
+ The training data is in JSONL format, one record per line, with two fields:
+ - `vb_code`: the VB.NET input
+ - `csharp_code`: the corresponding C# output
+
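A record in the dataset format described above would be a single JSON object on its own line. A minimal sketch of reading such data with the standard library; the helper name `load_pairs` and the sample record are illustrative assumptions, and only the two field names come from the README:

```python
import io
import json

def load_pairs(lines):
    """Yield (vb_code, csharp_code) tuples from an iterable of JSONL lines."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        try:
            yield record["vb_code"], record["csharp_code"]
        except KeyError as missing:
            raise ValueError(f"line {line_no}: missing field {missing}") from None

# Hypothetical sample record, not taken from the actual training data.
sample = io.StringIO(
    json.dumps({"vb_code": "Dim x As Integer = 5", "csharp_code": "int x = 5;"}) + "\n"
)
pairs = list(load_pairs(sample))
print(pairs)
```

Passing an open file handle instead of the `io.StringIO` object works the same way, since both iterate line by line.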
+ ## 📄 License
+
+ MIT
model_card.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "language": [
+     "vbnet",
+     "csharp"
+   ],
+   "tags": [
+     "code",
+     "code-to-code",
+     "translation",
+     "codet5"
+   ],
+   "license": "mit",
+   "pipeline_tag": "text2text-generation",
+   "library_name": "transformers",
+   "model_type": "CodeT5"
+ }
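This commit removes the YAML front-matter block from README.md (the `license: mit` lines) and adds the same kind of fields as a separate JSON file. Model card metadata on the Hub is normally declared in a YAML block at the top of README.md, and nothing in this commit shows whether `model_card.json` is consumed anywhere. A minimal sketch of rendering these fields back into front matter, assuming only flat string values and lists of strings with no YAML special characters; the helper name `to_front_matter` is hypothetical:

```python
import json

def to_front_matter(metadata):
    """Render a flat dict of strings and string lists as a YAML front-matter block.

    Only handles plain scalars and lists of plain scalars; values that need
    YAML quoting or nesting are outside the scope of this sketch.
    """
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)

# Contents of model_card.json as added in this commit.
model_card = json.loads("""
{
  "language": ["vbnet", "csharp"],
  "tags": ["code", "code-to-code", "translation", "codet5"],
  "license": "mit",
  "pipeline_tag": "text2text-generation",
  "library_name": "transformers",
  "model_type": "CodeT5"
}
""")
print(to_front_matter(model_card))
```

The printed block could be pasted at the top of README.md; whether every key here is a recognized metadata field is not something this commit confirms.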