loubnabnl (HF Staff) committed
Commit 7dca969 · 1 parent: 6d0e573

Update README.md

Files changed (1): README.md (+8 −8)
README.md CHANGED
@@ -16,7 +16,7 @@ pinned: false
 
  <li>
  <p>
- Interactive blog where we compare different code models and explain how they are trained and evaluated: <a
+ Interactive blog: where we compare different code models and explain how they are trained and evaluated <a
  href="https://huggingface.co/spaces/loubnabnl/code-generation-models"
  class="underline">Code generation with 🤗</a
  >
@@ -25,19 +25,19 @@ pinned: false
 
  <li>
  <p>
- Spaces: code generation with: <a href="https://huggingface.co/codeparrot/codeparrot">CodeParrot</a> (1.5B), <a href="https://huggingface.co/facebook/incoder-6B">InCoder</a> (6B) and <a href="https://github.com/salesforce/CodeGen">CodeGen</a> (6B)
+ Spaces: code generation with: <a href="https://huggingface.co/codeparrot/codeparrot" class="underline">CodeParrot (1.5B)</a>, <a href="https://huggingface.co/facebook/incoder-6B" class="underline">InCoder</a> (6B) and <a href="https://github.com/salesforce/CodeGen" class="underline">CodeGen</a> (6B)
  </p>
  </li>
 
  <li>Models: CodeParrot (1.5B) and CodeParrot-small (110M), each repo has different ongoing experiments in the branches.</li>
 
  <li>Datasets:<ul>
- <li><a href="https://huggingface.co/datasets/codeparrot/codeparrot-clean">codeparrot-clean</a>, dataset on which we trained and evaluated CodeParrot, the splits are available under <a href="https://huggingface.co/datasets/codeparrot/codeparrot-clean-train">codeparrot-clean-train</a> and <a href="https://huggingface.co/datasets/codeparrot/codeparrot-clean-valid">codeparrot-clean-valid</a>.</li>
- <li>A more filtered version of codeparrot-clean under <a href="https://huggingface.co/datasets/codeparrot/codeparrot-train-more-filtering">codeparrot-train-more-filtering</a> and <a href="https://huggingface.co/datasets/codeparrot/codeparrot-valid-more-filtering">codeparrot-train-more-filtering</a>.</li>
- <li>CodeParrot dataset after near deduplication since initially only exact match deduplication was performed, it&#39;s available under <a href="https://huggingface.co/datasets/codeparrot/codeparrot-train-near-deduplication">codeparrot-train-near-deduplication</a> and <a href="https://huggingface.co/datasets/codeparrot/codeparrot-valid-near-deduplication">codeparrot-train-near-deduplication</a>.</li>
- <li><a href="https://huggingface.co/datasets/codeparrot/github-code">GitHub-Code</a>, a 1TB dataset of 32 programming languages with 60 from GitHub files.</li>
- <li><a href="https://huggingface.co/datasets/codeparrot/github-jupyter">GitHub-Jupyter</a>, a 16.3GB dataset of Jupyter Notebooks from BigQuery GitHub.</li>
- <li><a href="https://huggingface.co/datasets/codeparrot/apps">APPS</a>, a benchmark for code generation with 10000 problems.</li>
+ <li><a href="https://huggingface.co/datasets/codeparrot/codeparrot-clean" class="underline">codeparrot-clean</a>, the dataset on which we trained and evaluated CodeParrot; the splits are available under <a href="https://huggingface.co/datasets/codeparrot/codeparrot-clean-train" class="underline">codeparrot-clean-train</a> and <a href="https://huggingface.co/datasets/codeparrot/codeparrot-clean-valid" class="underline">codeparrot-clean-valid</a>.</li>
+ <li>A more filtered version of codeparrot-clean under <a href="https://huggingface.co/datasets/codeparrot/codeparrot-train-more-filtering" class="underline">codeparrot-train-more-filtering</a> and <a href="https://huggingface.co/datasets/codeparrot/codeparrot-valid-more-filtering" class="underline">codeparrot-valid-more-filtering</a>.</li>
+ <li>The CodeParrot dataset after near-deduplication (initially only exact-match deduplication was performed), available under <a href="https://huggingface.co/datasets/codeparrot/codeparrot-train-near-deduplication" class="underline">codeparrot-train-near-deduplication</a> and <a href="https://huggingface.co/datasets/codeparrot/codeparrot-valid-near-deduplication" class="underline">codeparrot-valid-near-deduplication</a>.</li>
+ <li><a href="https://huggingface.co/datasets/codeparrot/github-code" class="underline">GitHub-Code</a>, a 1TB dataset of GitHub files covering 32 programming languages with 60 extensions.</li>
+ <li><a href="https://huggingface.co/datasets/codeparrot/github-jupyter" class="underline">GitHub-Jupyter</a>, a 16.3GB dataset of Jupyter Notebooks from BigQuery GitHub.</li>
+ <li><a href="https://huggingface.co/datasets/codeparrot/apps" class="underline">APPS</a>, a benchmark for code generation with 10000 problems.</li>
  </ul>
  </li>
  </ul>
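
For readers landing on this commit, the CodeParrot checkpoints listed in the README above can be tried directly with the `transformers` text-generation pipeline. A minimal sketch, assuming a standard `transformers` install; the prompt and generation settings are illustrative and not part of this commit:

```python
# Minimal sketch: code generation with CodeParrot-small (110M) via transformers.
# The model id matches the Hub repo referenced in the README; settings are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="codeparrot/codeparrot-small")
completion = generator(
    "def fibonacci(n):",
    max_new_tokens=48,
    do_sample=True,
    temperature=0.2,
)
print(completion[0]["generated_text"])
```

The same call works with the larger `codeparrot/codeparrot` (1.5B) checkpoint, at the cost of more memory and slower generation.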
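The larger datasets in the list (e.g., GitHub-Code at roughly 1TB) are most easily inspected in streaming mode with the `datasets` library. A sketch under the assumption that the field names match the github-code dataset card ("code", "language"); adjust for your `datasets` version:

```python
# Minimal sketch: peek at GitHub-Code without downloading the full ~1TB.
# Field names ("code", "language") follow the dataset card; recent datasets
# versions may additionally require trust_remote_code=True for this dataset.
from datasets import load_dataset

ds = load_dataset("codeparrot/github-code", split="train", streaming=True)
sample = next(iter(ds))
print(sample["language"])
print(sample["code"][:300])
```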