metadata
license: mit
datasets:
  - code_search_net
language:
  - en
  - code
library_name: TraceBERT

Model Information

This model is a Java-only version of the CodeBERT model, trained using the TraceBERT library.

We initialized the model from the original CodeBERT-base and then continued training it only on the Java part of the CodeSearchNet dataset.

The model was used to predict trace links between software architecture documentation and Java source code on traceability link recovery (TLR) benchmarks.
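
To make the idea of predicting trace links concrete, the following is a minimal sketch, not the method used in the paper or the TraceBERT API. It assumes that some model assigns a relatedness score to each (documentation sentence, code file) pair and that pairs scoring at or above a threshold are reported as links. The names `predict_trace_links` and `toy_score`, the threshold value, and the sample data are all hypothetical; `toy_score` is a simple word-overlap stand-in for a trained model.

```python
import re
from typing import Callable, Dict, List, Tuple


def toy_score(sentence: str, source: str) -> float:
    """Hypothetical stand-in for a trained model: fraction of sentence
    words that also occur as identifiers or keywords in the code."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    tokens = set(re.findall(r"[a-z]+", source.lower()))
    if not words:
        return 0.0
    return len(words & tokens) / len(words)


def predict_trace_links(
    doc_sentences: Dict[str, str],
    code_files: Dict[str, str],
    score: Callable[[str, str], float],
    threshold: float = 0.4,  # assumed value, chosen only for this example
) -> List[Tuple[str, str]]:
    """Return (sentence id, file path) pairs whose score reaches the threshold."""
    links = []
    for sentence_id, sentence in doc_sentences.items():
        for path, source in code_files.items():
            if score(sentence, source) >= threshold:
                links.append((sentence_id, path))
    return links


# Made-up documentation sentences and Java sources for illustration.
docs = {
    "s1": "The Parser reads input",
    "s2": "Logging is configured globally",
}
code = {
    "Parser.java": "class Parser { String input; }",
    "Logger.java": "class Logger { void log(String msg) {} }",
}

links = predict_trace_links(docs, code, toy_score)
```

In a real setting, the scoring function would be replaced by the fine-tuned model's output for each pair; the replication package describes the actual setup.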

Training Data

The original CodeBERT-base model was trained further on the Java bimodal data (documentation and code pairs) of CodeSearchNet.
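
As an illustration of what "Java bimodal data" means, the sketch below keeps only Java records that have both a documentation string and code. It is an assumption-laden example, not the preprocessing used for this model: the field names `language`, `docstring`, and `code` follow the CodeSearchNet JSONL layout as far as we know, and the function name `select_java_bimodal` and the sample records are hypothetical.

```python
from typing import Iterable, List, Tuple


def select_java_bimodal(records: Iterable[dict]) -> List[Tuple[str, str]]:
    """Keep (docstring, code) pairs for Java functions that are documented."""
    pairs = []
    for record in records:
        # Skip every language other than Java.
        if record.get("language") != "java":
            continue
        doc = (record.get("docstring") or "").strip()
        code = (record.get("code") or "").strip()
        # Bimodal means both modalities are present: text and code.
        if doc and code:
            pairs.append((doc, code))
    return pairs


# Made-up records in the assumed layout.
sample = [
    {"language": "java", "docstring": "Returns the sum.",
     "code": "int add(int a, int b) { return a + b; }"},
    {"language": "python", "docstring": "Adds two numbers.",
     "code": "def add(a, b): return a + b"},
    {"language": "java", "docstring": "",
     "code": "void noop() {}"},
]

java_pairs = select_java_bimodal(sample)
```

Only the first sample record survives the filter: the second is not Java, and the third has no documentation.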

Usage

For TLR usage scenarios, please see our replication package.

References

  1. CodeBERT-base model
  2. TraceBERT library
  3. Replication Package

Citation

@inproceedings{keim_recovering_2024,
    author       = {Keim, Jan and Corallo, Sophie and Fuchß, Dominik and Hey, Tobias and Telge, Tobias and Koziolek, Anne},
    year         = {2024},
    title        = {Recovering Trace Links Between Software Documentation And Code},
    eventtitle   = {46th International Conference on Software Engineering},
    eventtitleaddon = {ICSE 2024},
    eventdate    = {2024-04-14/2024-04-20},
    venue        = {Lisbon, Portugal},
    booktitle    = {Proceedings of 46th International Conference on Software Engineering (ICSE 2024)},
    isbn         = {979-8-4007-0217-4},
    doi          = {10.1145/3597503.3639130},
    keywords     = {software traceability, software architecture, documentation, transitive links, intermediate artifacts, information retrieval},
    language     = {english}
}