alkinun

AtAndDev

AI & ML interests

LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN..

Recent Activity

reacted to sweatSmile's post with ❤️ about 17 hours ago

Teaching a 7B Model to Be Just the Right Amount of Snark

Ever wondered if a language model could get sarcasm? I fine-tuned Mistral-7B using LoRA and 4-bit quantisation, on just ~720 hand-picked sarcastic prompt–response pairs from Reddit, Twitter, and real-life conversations.

The challenge? Keeping it sarcastic but still helpful.

- LoRA rank 16 to avoid overfitting
- 4-bit NF4 quantization to fit on limited GPU memory
- 10 carefully monitored epochs so it didn’t turn into a full-time comedian

Result: a model that understands “Oh great, another meeting” exactly as you mean it.

Read the full journey, tech details, and lessons learned on my blog: Fine-Tuning Mistral-7B for Sarcasm with LoRA and 4-Bit Quantisation

Try the model here on Hugging Face: sweatSmile/Mistral-7B-Instruct-v0.1-Sarcasm

reacted to sweatSmile's post with 🚀 about 17 hours ago


replied to dhruv3006's post about 20 hours ago

GPT 5 for Computer Use agents.

Same tasks, same grounding model; we just swapped GPT 4o for GPT 5 as the thinking model. Left = 4o, right = 5. Watch GPT 5 pull away.

Reasoning model: OpenAI GPT-5
Grounding model: Salesforce GTA1-7B
Action space: CUA Cloud Instances (macOS/Linux/Windows)

The task is: "Navigate to {random_url} and play the game until you reach a score of 5/5". Each task is set up by having Claude generate a random app from a predefined list of prompts (multiple choice trivia, form filling, or color matching).

Try it yourself here: https://github.com/trycua/cua
Docs: https://docs.trycua.com/docs/agent-sdk/supported-agents/composed-agents

liked 2 datasets 1 day ago

databricks/databricks-dolly-15k

Viewer • Updated Jun 30, 2023 • 15k • 15.2k • 847

allenai/WildChat-1M

Viewer • Updated Oct 17, 2024 • 838k • 6.06k • 363

liked a dataset 3 days ago

HuggingFaceTB/smol-smoltalk

Viewer • Updated Feb 6 • 485k • 896 • 58

liked a model 7 days ago

TheDrummer/Cydonia-R1-24B-v4

24B • Updated 6 days ago • 181 • 19

liked 4 models 10 days ago

zai-org/GLM-4.5

Text Generation • 358B • Updated 13 days ago • 18.4k • 1.13k

nvidia/Llama-3_3-Nemotron-Super-49B-v1_5

Text Generation • 50B • Updated 10 days ago • 9.87k • 160

tngtech/DeepSeek-TNG-R1T2-Chimera

Text Generation • 685B • Updated 5 days ago • 3.41k • 234

microsoft/MAI-DS-R1

Text Generation • 671B • Updated May 6 • 814 • 282

liked a dataset 11 days ago

MoLA-LLM/magpie-ultra-5k-11-tasks

Viewer • Updated 5 days ago • 55k • 209 • 2

liked a model 11 days ago

Qwen/Qwen3-30B-A3B-Instruct-2507

Text Generation • 31B • Updated 2 days ago • 188k • 458

liked 3 datasets 12 days ago

facebook/natural_reasoning

Viewer • Updated Feb 21 • 1.15M • 1.52k • 515

MegaScience/MegaScience

Viewer • Updated 17 days ago • 1.25M • 7.79k • 89

HuggingFaceTB/smoltalk

Viewer • Updated Feb 10 • 2.2M • 4.82k • 357

liked 2 models 12 days ago

Tonic/petite-elle-L-aime-3-sft

Text Generation • 3B • Updated 8 days ago • 112 • 1

FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Preview

33B • Updated Jan 25 • 11 • 44

liked a model 16 days ago

Kwaipilot/KAT-V1-40B

Text Generation • 41B • Updated 20 days ago • 989 • 104

liked a model 18 days ago

Qwen/Qwen3-Coder-480B-A35B-Instruct

Text Generation • 480B • Updated 3 days ago • 37.5k • 1.06k

liked a dataset 19 days ago

selimc/orpo-dpo-mix-TR-20k

Viewer • Updated Nov 12, 2024 • 19.9k • 47 • 7

liked 2 datasets 21 days ago

ucekmez/OpenOrca-tr

Viewer • Updated Feb 11, 2024 • 798k • 189 • 22

Team-ACE/ToolACE

Viewer • Updated Sep 4, 2024 • 11.3k • 1.56k • 124