ReelevateLM-q4f16_1

This is the Meta Llama 3.1 Instruct model, fine-tuned with LoRA and converted to MLC format with q4f16_1 quantization.

The model can be used with the MLC LLM chat CLI, the REST server, and the Python API, as shown below.

Example Usage

Before running any examples, install MLC LLM by following the installation documentation.

Chat (CLI)

```shell
mlc_llm chat HF://pr0methium/ReelevateLM-q4f16_1
```

REST Server

```shell
mlc_llm serve HF://pr0methium/ReelevateLM-q4f16_1
```
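The server exposes an OpenAI-compatible `/v1/chat/completions` endpoint. A minimal sketch of building a request body for it, assuming the default host and port (`127.0.0.1:8000`; adjust if you pass `--host`/`--port` to `mlc_llm serve`):

```python
import json

# Assumed default address of `mlc_llm serve`; change if you configured another.
url = "http://127.0.0.1:8000/v1/chat/completions"

# OpenAI-compatible request body; the model field matches the served model.
payload = {
    "model": "HF://pr0methium/ReelevateLM-q4f16_1",
    "messages": [
        {"role": "user", "content": "Write me a 30 second reel story…"}
    ],
    "stream": False,
}

body = json.dumps(payload)
print(body)
# Send it with any HTTP client, e.g.:
#   curl -X POST $URL -H "Content-Type: application/json" -d "$BODY"
```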

Python API

```python
from mlc_llm import MLCEngine

# Create the engine; the weights are fetched from Hugging Face on first use.
model = "HF://pr0methium/ReelevateLM-q4f16_1"
engine = MLCEngine(model)

# Stream the chat completion and print each delta as it arrives.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Write me a 30 second reel story…"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()

# Release the engine's resources when done.
engine.terminate()
```
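If you want the full reply as a single string rather than printing deltas, the streaming loop above can be wrapped in a small helper. A sketch: the chunk shape mirrors the OpenAI-style objects that `MLCEngine` yields, but `collect_stream` itself is a hypothetical helper, not part of the mlc_llm API:

```python
def collect_stream(chunks):
    """Join the delta contents of a stream of chat-completion chunks."""
    parts = []
    for response in chunks:
        for choice in response.choices:
            # delta.content can be None on the final chunk in OpenAI-style APIs.
            if choice.delta.content:
                parts.append(choice.delta.content)
    return "".join(parts)
```

Usage would look like `reply = collect_stream(engine.chat.completions.create(..., stream=True))`.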

Documentation

For more information on the MLC LLM project, please visit the docs and the GitHub repo.
