regularpooria/blaze_code_embedding


	# Blaze (Finetuned Variant)

	This model is a fine-tuned version of the BLAZE model described in the paper:

	_BLAZE: Cross‑Language and Cross‑Project Bug Localization via Dynamic Chunking and Hard Example Learning_
	DOI: [10.5281/zenodo.15122980](https://doi.org/10.5281/zenodo.15122980)

	---

	## 📘 What’s Inside

	A Transformer-based bug localization model fine-tuned on additional cross-project datasets to enhance its ability to pinpoint bugs in unseen codebases.

	---

	## 🧪 Fine-tuning Details

	- Starting point: the pretrained BLAZE model released on Zenodo.
	- Enhancements: further training on a curated dataset using dynamic chunking and hard-negative sampling, both described in the original manuscript and the accompanying code release.
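
To give a rough feel for what chunking under a size limit looks like, here is a minimal sketch. It is not the algorithm from the BLAZE paper: the function name `chunk_by_budget`, the line-boundary splitting rule, and the whitespace-based token count are all assumptions made for illustration. In practice you would pass the model's own tokenizer as `count_tokens`.

```python
from typing import Callable, List


def whitespace_tokens(text: str) -> int:
    """Stand-in token counter (assumption): counts whitespace-separated words."""
    return len(text.split())


def chunk_by_budget(
    source: str,
    max_tokens: int,
    count_tokens: Callable[[str], int] = whitespace_tokens,
) -> List[str]:
    """Greedily pack whole lines into chunks of at most `max_tokens` tokens.

    Hypothetical helper, not the paper's method. A single line that exceeds
    the budget on its own is kept as one oversized chunk rather than split.
    """
    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for line in source.splitlines():
        cost = count_tokens(line)
        # Close the current chunk if adding this line would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Splitting only at line boundaries keeps each chunk syntactically readable; a real implementation would likely prefer function or class boundaries where they are available.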

	---

	## 🔍 Intended Usage

	- Primary task: Automatic identification of buggy code segments across languages and projects.
	- How to use: Load the model and feed it code snippets to receive localized bug predictions.
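
The usage described above could look roughly like the sketch below. This card does not document an inference API, so several things here are assumptions: that the checkpoint loads through `AutoTokenizer` / `AutoModel`, that mean pooling over the last hidden state yields a usable embedding, and that snippets are ranked by cosine similarity against a bug report. The helper names `rank_snippets` and `make_hf_embedder` are hypothetical. The loader also needs `torch` installed.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_snippets(
    report: str,
    snippets: List[str],
    embed: Callable[[str], Vector],
) -> List[Tuple[float, str]]:
    """Score each snippet against the report and sort best-first."""
    query = embed(report)
    scored = [(cosine(query, embed(s)), s) for s in snippets]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def make_hf_embedder(
    repo_id: str = "regularpooria/blaze_code_embedding",
) -> Callable[[str], List[float]]:
    """Build an `embed` callable from the Hub checkpoint.

    ASSUMPTION: the repo loads via AutoModel and mean pooling is appropriate.
    Neither is confirmed by this model card.
    """
    import torch  # imported lazily so the ranking code runs without it
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModel.from_pretrained(repo_id)
    model.eval()

    def embed(text: str) -> List[float]:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
        # Mean-pool over tokens, ignoring padding positions.
        mask = inputs["attention_mask"].unsqueeze(-1).to(hidden.dtype)
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
        return pooled[0].tolist()

    return embed
```

With the model downloaded, `rank_snippets(bug_report, code_chunks, make_hf_embedder())` would return the chunks ordered from most to least similar to the report, under the assumptions stated above.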

	---

	## 📄 Citation

	If you use this model, please cite:

	> _BLAZE: Cross‑Language and Cross‑Project Bug Localization via Dynamic Chunking and Hard Example Learning_, available at DOI: 10.5281/zenodo.15122980

	---

	## 📁 Contents of This Repo

	- `config.json`
	- `pytorch_model.bin` (or `tf_model.h5`)
	- `tokenizer.json` (if applicable)
	- `README.md`

	---

	## 🛠️ Setup

	```bash
	pip install transformers huggingface_hub