---
license: mit
datasets:
- code_search_net
language:
- en
- code
library_name: TraceBERT
---
# Model Information

<!-- Provide a quick summary of what the model is/does. -->

This model is a Java-only variant of [CodeBERT-base](https://huggingface.co/microsoft/codebert-base), trained using the [TraceBERT library](https://github.com/jinfenglin/TraceBERT).

We initialized the model with the original CodeBERT-base weights and then continued training on only the Java portion of CodeSearchNet.

The model was used to predict trace links between software architecture documentation and Java source code on [traceability link recovery benchmarks](https://github.com/ArDoCo/Benchmark).
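
Conceptually, recovering trace links with a pairwise model amounts to scoring each (documentation sentence, code file) pair and keeping the pairs whose score clears a threshold. The sketch below shows only that selection step. The `score_fn` callable, the 0.5 threshold, and the toy word-overlap scorer are hypothetical stand-ins for illustration, not the model or the settings used in the paper.

```python
from typing import Callable, Dict, List, Tuple


def recover_links(
    sentences: Dict[str, str],
    code_files: Dict[str, str],
    score_fn: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, str, float]]:
    """Score every (sentence, code file) pair and keep those at or above the threshold."""
    links = []
    for sentence_id, text in sentences.items():
        for path, code in code_files.items():
            score = score_fn(text, code)
            if score >= threshold:
                links.append((sentence_id, path, score))
    # Highest-scoring candidate links first
    return sorted(links, key=lambda link: link[2], reverse=True)


def toy_overlap_score(text: str, code: str) -> float:
    """Hypothetical stand-in scorer: fraction of sentence words that occur in the code."""
    words = {w.lower().strip(".,") for w in text.split()}
    code_lower = code.lower()
    hits = sum(1 for w in words if w and w in code_lower)
    return hits / len(words) if words else 0.0


# Made-up artifacts for illustration only
sentences = {"S1": "The logger can write a message."}
code_files = {
    "src/Logger.java": "class Logger { void write(String message) {} }",
    "src/Parser.java": "class Parser { void parse(String input) {} }",
}

links = recover_links(sentences, code_files, toy_overlap_score)
for sentence_id, path, score in links:
    print(f"{sentence_id} -> {path} ({score:.2f})")
```

In a real setup, `score_fn` would wrap the fine-tuned model's relevance prediction for a text and code pair; the replication package linked under Usage documents the actual pipeline.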

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The original CodeBERT-base model was further trained on the Java bimodal data (paired documentation and code) of [CodeSearchNet](https://huggingface.co/datasets/code_search_net).
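
As a rough illustration of what "Java bimodal data" means here, the sketch below keeps only Java records that have both a docstring and code. The field names (`language`, `func_documentation_string`, `func_code_string`) are assumed to match the Hugging Face `code_search_net` schema, and the in-memory records are made up. The preprocessing actually used for training may differ.

```python
from typing import Dict, Iterable, List, Tuple


def java_bimodal_pairs(records: Iterable[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (documentation, code) pairs for Java records that contain both parts."""
    pairs = []
    for record in records:
        if record.get("language") != "java":
            continue  # skip other programming languages
        doc = record.get("func_documentation_string", "").strip()
        code = record.get("func_code_string", "").strip()
        if doc and code:  # bimodal: both natural language and code are present
            pairs.append((doc, code))
    return pairs


# Made-up records for illustration only
sample_records = [
    {
        "language": "java",
        "func_documentation_string": "Returns the sum of two ints.",
        "func_code_string": "int add(int a, int b) { return a + b; }",
    },
    {
        "language": "python",
        "func_documentation_string": "Return the sum.",
        "func_code_string": "def add(a, b): return a + b",
    },
    {
        "language": "java",
        "func_documentation_string": "",  # no docstring, so not bimodal
        "func_code_string": "void noop() {}",
    },
]

pairs = java_bimodal_pairs(sample_records)
print(len(pairs))
```

The same function should work on records streamed from the real dataset, provided the field names above match the version you load.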

### Usage

For traceability link recovery (TLR) usage scenarios, please see our [replication package](https://github.com/ArDoCo/Replication-Package-ICSE24_Recovering-Trace-Links-Between-Software-Documentation-And-Code).

### References
1. [CodeBERT-base model](https://huggingface.co/microsoft/codebert-base)
2. [TraceBERT library](https://github.com/jinfenglin/TraceBERT)
3. [Replication Package](https://github.com/ArDoCo/Replication-Package-ICSE24_Recovering-Trace-Links-Between-Software-Documentation-And-Code)

### Citation
```bibtex
@inproceedings{keim_recovering_2024,
    author       = {Keim, Jan and Corallo, Sophie and Fuchß, Dominik and Hey, Tobias and Telge, Tobias and Koziolek, Anne},
    year         = {2024},
    title        = {Recovering Trace Links Between Software Documentation And Code},
    eventtitle   = {46th International Conference on Software Engineering},
    eventtitleaddon = {ICSE 2024},
    eventdate    = {2024-04-14/2024-04-20},
    venue        = {Lisbon, Portugal},
    booktitle    = {Proceedings of 46th International Conference on Software Engineering (ICSE 2024)},
    isbn         = {979-8-4007-0217-4},
    doi          = {10.1145/3597503.3639130},
    keywords     = {software traceability, software architecture, documentation, transitive links, intermediate artifacts, information retrieval},
    language     = {english}
}
```