---
license: cc-by-4.0
language:
- en
library_name: transformers
pipeline_tag: text-classification
tags:
- code
metrics:
- accuracy
- f1
---
# CodeBERT-SO
Repository for CodeBERT fine-tuned on Stack Overflow snippets. The underlying CodeBERT-base model was pretrained on NL-PL pairs from six programming languages (Python, Java, JavaScript, PHP, Ruby, Go).
## Training Objective
This model is initialized with CodeBERT-base and trained to classify whether a user will drop out given their posts and code snippets.
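The fine-tuned classifier can be queried like any `transformers` sequence-classification model. The snippet below is a minimal sketch: the model id is a placeholder for this repository's path, and the example post is invented.

```python
# Minimal inference sketch using the standard sequence-classification API.
# "username/CodeBERT-SO" is a placeholder; substitute the actual repository path.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "username/CodeBERT-SO"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "How do I reverse a list in Python? my_list[::-1] works but seems slow."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

# The higher-scoring class is the predicted drop-out label.
pred = logits.argmax(dim=-1).item()
print(pred)
```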
## Training Regime
Training ran for 8 epochs with a batch size of 8, a learning rate of 1e-5, and an Adam epsilon (the denominator term in the weight update) of 1e-8. A random 20% sample of the full dataset was used as the validation set.
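For reference, the hyperparameters above map onto the `transformers` Trainer roughly as follows. This is a sketch, not the original training script: the toy dataset, column names, and use of the Trainer API are assumptions; only the epoch count, batch size, learning rate, and epsilon come from this card.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/codebert-base", num_labels=2
)

# Hypothetical toy rows standing in for the Stack Overflow posts and snippets;
# the actual training data is not distributed with this card.
raw = Dataset.from_dict({
    "text": [
        "post with a code snippet ...",
        "another post ...",
        "a third post ...",
        "a fourth post ...",
        "a fifth post ...",
    ],
    "label": [0, 1, 0, 1, 0],
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

# Hold out a random 20% of the data for validation, as described above.
splits = raw.map(tokenize, batched=True).train_test_split(test_size=0.2, seed=42)

args = TrainingArguments(
    output_dir="codebert-so",
    num_train_epochs=8,             # 8 epochs
    per_device_train_batch_size=8,  # batch size 8
    learning_rate=1e-5,             # learning rate 1e-5
    adam_epsilon=1e-8,              # Adam epsilon 1e-8
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    tokenizer=tokenizer,
)
trainer.train()
```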
## Performance
- Final validation accuracy: 0.822
- Final validation F1: 0.809
- Final validation loss: 0.5
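These metrics could be computed during evaluation with a `compute_metrics` hook along the following lines. This is a sketch; the binary F1 averaging mode is an assumption, since the card does not state it.

```python
# Sketch of a compute_metrics hook producing the accuracy and F1 reported above.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds),  # binary averaging assumed
    }
```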