---
base_model: JetBrains/Mellum-4b-base
datasets:
- Etherll/CodeFIM-Rust-Mellum
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- sft
- code
- rust
- fill-in-the-middle
- fim
- text-generation
- llm
license: apache-2.0
language:
- en
library_name: transformers
model-index:
- name: Etherll/Mellum-4b-sft-rust
  results: []
---

# Etherll/Mellum-4b-sft-rust

**Etherll/Mellum-4b-sft-rust** is a large language model (LLM) fine-tuned specifically for **Rust code Fill-in-the-Middle (FIM)** tasks. It is built on the `JetBrains/Mellum-4b-base` model.

It was fine-tuned on the `Etherll/CodeFIM-Rust-Mellum` dataset, which comprises approximately 57,000 Rust-specific FIM examples, to sharpen its ability to complete Rust code snippets accurately and in context.

A GGUF version for CPU inference is also available: [Etherll/Mellum-4b-sft-rust-GGUF](https://huggingface.co/Etherll/Mellum-4b-sft-rust-GGUF).

## Model Description

This model leverages the LLaMA-style architecture of `Mellum-4b-base` (4 billion parameters) and its extensive pre-training on over 4 trillion tokens. Fine-tuning focused on adapting the model to Rust syntax and common Rust coding patterns for FIM tasks.

**Key Features:**

* **Specialized for Rust:** Optimized for Fill-in-the-Middle completion in Rust.
* **Based on Mellum-4b-base:** Benefits from JetBrains' robust base model.
* **Efficient:** A 4B-parameter model suitable for both cloud and local deployment.
* **IDE Integration Ready:** Designed for developer tooling; works particularly well with [Continue.dev](https://www.continue.dev/) as a coding assistant.

## Fine-tuning Data

* **Dataset:** `Etherll/CodeFIM-Rust-Mellum`
* **Size:** ~57,000 rows
* **Focus:** Rust code Fill-in-the-Middle

## FIM Format

This model is trained to recognize a specific format for Fill-in-the-Middle tasks. When providing input for FIM, use the following structure (note that the suffix comes *before* the prefix):

```
<filename>{{{filename}}}
<fim_suffix>{{{suffix_code}}}<fim_prefix>{{{prefix_code}}}<fim_middle>
```
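
To make the template concrete, here is a small, self-contained sketch that assembles a FIM prompt for a Rust function. The file name, the Rust snippet, and the helper name `build_fim_prompt` are illustrative, not part of the model's API:

```python
# Illustrative helper: assemble a FIM prompt in the format shown above.
def build_fim_prompt(filename: str, prefix: str, suffix: str) -> str:
    # Note the order: suffix before prefix, with <fim_middle> marking
    # where the model should generate the missing code.
    return (
        f"<filename>{filename}\n"
        f"<fim_suffix>{suffix}<fim_prefix>{prefix}<fim_middle>"
    )

prefix = "fn factorial(n: u64) -> u64 {\n    "
suffix = "\n}\n"
print(build_fim_prompt("math.rs", prefix, suffix))
```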

## How to Use
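
### With Transformers

Since the card lists `library_name: transformers`, a plain 🤗 Transformers workflow applies. Below is a minimal sketch; the example Rust snippet, the dtype/device options, and the generation settings are assumptions to adapt to your setup:

```python
# Minimal FIM inference sketch with Hugging Face Transformers.
# The Rust example and generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Etherll/Mellum-4b-sft-rust"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)

# Fill in the body of a small Rust function (suffix before prefix, per the FIM format).
prefix = "fn add(a: i32, b: i32) -> i32 {\n    "
suffix = "\n}\n"
prompt = f"<filename>lib.rs\n<fim_suffix>{suffix}<fim_prefix>{prefix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated "middle" tokens, skipping the prompt.
middle = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(prefix + middle + suffix)
```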

### With Continue.dev

For the best integrated development experience, it's highly recommended to use this model with [Continue.dev](https://www.continue.dev/).

Refer to the [Continue.dev documentation](https://www.continue.dev/docs/setup/overview) for instructions on how to add custom LLMs.

### GGUF Version

A GGUF version is available at [Etherll/Mellum-4b-sft-rust-GGUF](https://huggingface.co/Etherll/Mellum-4b-sft-rust-GGUF). This format is suitable for local inference on CPU (and on GPU with appropriate llama.cpp/Ollama builds) using tools such as:

* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [Ollama](https://ollama.ai/)
* [LM Studio](https://lmstudio.ai/)
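
From Python, the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings can fetch the GGUF weights straight from the Hub. This is a hedged sketch: the quantization file pattern and generation settings are assumptions, so check the GGUF repo for the actual file names:

```python
# Illustrative sketch using llama-cpp-python; the *Q4_K_M.gguf pattern
# is an assumed quantization -- check the GGUF repo for real file names.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Etherll/Mellum-4b-sft-rust-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quant; any .gguf file in the repo works
    n_ctx=4096,
)

prefix = "fn is_even(n: u32) -> bool {\n    "
suffix = "\n}\n"
prompt = f"<filename>util.rs\n<fim_suffix>{suffix}<fim_prefix>{prefix}<fim_middle>"

out = llm(prompt, max_tokens=48, temperature=0.0)
print(prefix + out["choices"][0]["text"] + suffix)
```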

## Support & Community

If you need any help, have questions, or just want to chat, feel free to message me on Discord: **etherl**

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)