---
library_name: transformers
license: mit
language:
- en
metrics:
- f1
- precision
- recall
base_model:
- microsoft/codebert-base
pipeline_tag: text-classification
---

# CodeBERT base for classifying developer questions

This model classifies questions from developer forums (e.g., Stack Overflow) into one of three classes: 'LQ_CLOSE' (low-quality), 'LQ_EDIT' (low-quality, requires community edits), or 'HQ' (high-quality).

- **Developed by:** Fabian C. Peña, Steffen Herbold
- **Finetuned from:** [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base)
- **Replication kit:** [https://github.com/aieng-lab/senlp-benchmark](https://github.com/aieng-lab/senlp-benchmark)
- **Language:** English
- **License:** MIT
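
The following is a minimal usage sketch, not an official snippet from the authors. It assumes the checkpoint loads through the standard `transformers` text-classification pipeline and that the labels it emits match the three class names described above. If the checkpoint's config uses generic names such as `LABEL_0`, the helper returns them unchanged. The function names, the `LABEL_DESCRIPTIONS` mapping, and the `model_id` argument are illustrative; pass this model's Hub repository ID as `model_id`.

```python
# Descriptions taken from the model card. Assumption: the label strings the
# checkpoint emits match these class names; check the model config to confirm.
LABEL_DESCRIPTIONS = {
    "HQ": "high-quality",
    "LQ_EDIT": "low-quality, requires community edits",
    "LQ_CLOSE": "low-quality",
}


def describe_label(label: str) -> str:
    """Return a human-readable description, or the raw label if it is unknown."""
    return LABEL_DESCRIPTIONS.get(label, label)


def classify_question(text: str, model_id: str) -> dict:
    """Classify one developer question with a text-classification pipeline.

    model_id is a placeholder argument: supply this model's Hub repository ID.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True keeps long posts within the model's maximum input length.
    result = classifier(text, truncation=True)[0]
    return {
        "label": result["label"],
        "description": describe_label(result["label"]),
        "score": result["score"],
    }
```

Calling `classify_question("How do I reverse a list in Python?", model_id)` would return a dictionary with the predicted label, its description, and the confidence score. Building the pipeline once and reusing it is preferable when classifying many questions.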

## Citation

```bibtex
@misc{pena2025benchmark,
  author    = {Fabian Peña and Steffen Herbold},
  title     = {Evaluating Large Language Models on Non-Code Software Engineering Tasks},
  year      = {2025}
}
```