---
language:
- en
license:
- apache-2.0
- cc-by-nc-4.0
tags:
- generated_from_trainer
- instruct
- instructions
- code
- instructiongen
datasets: pszemraj/fleece2instructions-codealpaca
metrics:
- rouge
widget:
- text: 'git lfs install

    huggingface-cli lfs-enable-largefiles .

    git lfs track "*.bin"

    git add .

    git commit -a -m "add fp32 chkpt"

    git push

    '
  example_title: bash
- text: "export interface DocumentParams {\n  pageContent: string;\n\n  // eslint-disable-next-line\
    \ @typescript-eslint/no-explicit-any\n  metadata: Record<string, any>;\n}\n\n\
    /**\n * Interface for interacting with a document.\n */\nexport class Document\
    \ implements DocumentParams {\n  pageContent: string;\n\n  // eslint-disable-next-line\
    \ @typescript-eslint/no-explicit-any\n  metadata: Record<string, any>;\n\n  constructor(fields?:\
    \ Partial<DocumentParams>) {\n    this.pageContent = fields?.pageContent ?? this.pageContent;\n\
    \    this.metadata = fields?.metadata ?? {};\n  }\n}\n"
  example_title: js
- text: "def merge(left, right):\n    if len(left) == 0:\n        return right\n\n\
    \    if len(right) == 0:\n        return left\n\n    result = []\n    index_left\
    \ = index_right = 0\n\n    while len(result) < len(left) + len(right):\n     \
    \   if left[index_left] <= right[index_right]:\n            result.append(left[index_left])\n\
    \            index_left += 1\n        else:\n            result.append(right[index_right])\n\
    \            index_right += 1\n\n        if index_right == len(right):\n     \
    \       result += left[index_left:]\n            break\n\n        if index_left\
    \ == len(left):\n            result += right[index_right:]\n            break\n\
    \n    return result\n"
  example_title: merge
- text: "import pandas as pd\nimport plotly.graph_objects as go\n\ndf = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_apple_stock.csv')\n\
    \nfig = go.Figure(go.Scatter(x = df['AAPL_x'], y = df['AAPL_y'],\n           \
    \       name='Share Prices (in USD)'))\n\nfig.update_layout(title='Apple Share\
    \ Prices over time (2014)',\n                   plot_bgcolor='rgb(230, 230,230)',\n\
    \                   showlegend=True)\n\nfig.show()\n"
  example_title: plot
- text: "from spellchecker import SpellChecker\n\nspell = SpellChecker()\n\ndef check_word_spelling(word:\
    \ str):\n    misspelled = spell.unknown([word])\n    return len(misspelled) ==\
    \ 0\n\ndef eval_and_replace(text: str, match_token: str = \"- \"):\n    if match_token\
    \ not in text:\n        return text\n    else:\n        while True:\n        \
    \    full_before_text = text.split(match_token, maxsplit=1)[0]\n            before_text\
    \ = [\n                char for char in full_before_text.split()[-1] if char.isalpha()\n\
    \            ]\n            before_text = \"\".join(before_text)\n           \
    \ full_after_text = text.split(match_token, maxsplit=1)[-1]\n            after_text\
    \ = [char for char in full_after_text.split()[0] if char.isalpha()]\n        \
    \    after_text = \"\".join(after_text)\n            full_text = before_text +\
    \ after_text\n            if check_word_spelling(full_text):\n               \
    \ text = full_before_text + full_after_text\n            else:\n             \
    \   text = full_before_text + \" \" + full_after_text\n            if match_token\
    \ not in text:\n                break\n        return text\n\ntext = \"I- am-\
    \ a go- od- boy\"\neval_and_replace(text)\n"
  example_title: spell check
- text: 'import torch

    from transformers import AutoTokenizer, AutoModelForSequenceClassification


    checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

    sequences = ["I''ve been waiting for a HuggingFace course my whole life.", "So
    have I!"]


    tokens = tokenizer(sequences, padding=True, truncation=True, return_tensors="pt")

    output = model(**tokens)

    '
  example_title: model inference
inference:
  parameters:
    max_length: 96
    num_beams: 4
base_model: facebook/bart-large
---


# bart-large-code-instructiongen

Use this text2text model to find out what LLM instructions might be able to generate an arbitrary piece of code! A minimal usage sketch follows the links below.

- Check out a [basic demo on Spaces](https://huggingface.co/spaces/pszemraj/generate-instructions)
- An example of how to use instructiongen models in a CLI script can be found [here](https://gist.github.com/pszemraj/8b0213e700763106074d3ac15d041c14)
- You can find other models fine-tuned for instruction generation by [searching for the instructiongen tag](https://huggingface.co/models?other=instructiongen)
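
As a quick start, here is a hedged inference sketch using the `transformers` pipeline API. Assumptions not stated elsewhere in this card: the checkpoint ID `pszemraj/bart-large-code-instructiongen` is inferred from the card title, and the generation settings mirror the `inference.parameters` block in the metadata above.

```python
# Minimal inference sketch (assumes `transformers` is installed).
# The checkpoint ID is inferred from this card's title; generation settings
# follow the metadata's inference parameters (max_length=96, num_beams=4).
from transformers import pipeline

generator = pipeline(
    "text2text-generation",
    model="pszemraj/bart-large-code-instructiongen",
)

code = """def add(a, b):
    return a + b
"""

# the model produces an instruction that could have prompted this code
result = generator(code, max_length=96, num_beams=4)
print(result[0]["generated_text"])
```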

## About

This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on the `pszemraj/fleece2instructions-codealpaca` dataset.
It achieves the following results on the evaluation set:
- Loss: 0.9222
- Rouge1: 62.0692
- Rouge2: 36.1947
- Rougel: 57.5128
- Rougelsum: 58.6613
- Gen Len: 31.0060


## Intended uses & limitations

🚨 **note:** as the authors elected to release the [original dataset](https://github.com/sahil280114/codealpaca) under `cc-by-nc`, that license carries over to this model, which therefore **cannot be used for commercial activity**.

Intended use: research on domain adaptation and/or other improvements to LLMs by extending datasets of instruction–text pairs.

## Training and evaluation data

Refer to the linked dataset card for `pszemraj/fleece2instructions-codealpaca` or the [original dataset](https://github.com/sahil280114/codealpaca) repo.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 6e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.03
- num_epochs: 3.0
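
For reference, this is roughly how the settings above might be expressed with `transformers`' `Seq2SeqTrainingArguments`. The actual training script is not included in this card, so treat this as an illustrative sketch rather than the exact configuration used:

```python
# Hedged sketch: maps the hyperparameters listed above onto
# Seq2SeqTrainingArguments. The output_dir is a placeholder; the optimizer
# (Adam, betas=(0.9, 0.999), eps=1e-8) matches the transformers default.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-large-code-instructiongen",  # placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective (total) train batch size: 32
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3.0,
)
```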

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.0914        | 1.0   | 563  | 1.0303          | 60.288  | 34.1884 | 55.9293 | 57.0714   | 30.6267 |
| 0.8688        | 2.0   | 1126 | 0.9333          | 61.0409 | 34.9823 | 56.4887 | 57.6662   | 31.7255 |
| 0.6773        | 3.0   | 1689 | 0.9222          | 62.0692 | 36.1947 | 57.5128 | 58.6613   | 31.0060 |