🚀 Run 14B, 12B, 8B… LLMs **for FREE** on Google Colab (15GB VRAM GPU)
🔗 Repo: https://github.com/seyf1elislam/LocalLLM_OneClick_Colab
📌 How to Use
1. Open the notebook → Click "Open in Colab" and enable GPU mode.
2. Enter model details → Provide the Hugging Face repo name & quantization type.
   * Example: `unsloth/Qwen3-8B-GGUF` with quant `Q5_k_m`
3. Run all cells → Wait 1–3 minutes. You'll get a link to the GUI & API (OpenAI-compatible).

💡 Yes, it's really free. Enjoy! ✨
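Once the notebook prints its link, the API speaks the standard OpenAI chat-completions protocol, so any OpenAI-compatible client can talk to it. A minimal sketch of building such a request is below; the `API_BASE` URL is a placeholder (you get the real one from the notebook output), and the `model` value is illustrative, since local servers like KoboldCpp typically serve whichever model they loaded regardless of this field.

```python
import json

# Placeholder -- replace with the link the notebook prints for you.
API_BASE = "https://your-colab-tunnel.example.com/v1"

def build_chat_request(prompt, model="local-model", max_tokens=256):
    """Build an OpenAI-compatible /chat/completions payload.

    Local servers usually ignore the model name (they serve the model
    they loaded), but the field is still part of the request schema.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

payload = build_chat_request("Explain GGUF quantization in one sentence.")
print(json.dumps(payload, indent=2))
# POST this to f"{API_BASE}/chat/completions" with any HTTP client,
# or point the official openai SDK at the server via its base_url option.
```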
---
📋 Supported Models (examples)
* Qwen3 14B → Q5_k_m, Q4_k_m
* Qwen3 8B → Q8_0
* Nemo 12B → Q6_k, Q5_k_m
* Gemma3 12B → Q6_k, Q5_k_m
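Whether a given quant fits the 15GB GPU comes down to bits-per-weight times parameter count, plus headroom for the KV cache and activations. The sketch below is a rough estimator, not an exact rule: the bits-per-weight figures are approximations of llama.cpp's quant sizes, real GGUF files vary somewhat, and the flat 2GB overhead is an assumed ballpark.

```python
# Approximate effective bits per weight for common GGUF quants
# (ballpark figures; actual file sizes differ slightly per model).
BITS_PER_WEIGHT = {
    "Q4_k_m": 4.85,
    "Q5_k_m": 5.69,
    "Q6_k": 6.56,
    "Q8_0": 8.50,
}

def est_vram_gb(params_billions, quant, overhead_gb=2.0):
    """Estimate VRAM: weight bytes plus a flat overhead for KV cache etc."""
    bits = BITS_PER_WEIGHT[quant]
    weights_gb = params_billions * bits / 8  # 1e9 params and 1e9 bytes/GB cancel
    return weights_gb + overhead_gb

for name, size_b, quant in [
    ("Qwen3 14B", 14, "Q5_k_m"),
    ("Qwen3 8B", 8, "Q8_0"),
    ("Nemo 12B", 12, "Q6_k"),
]:
    need = est_vram_gb(size_b, quant)
    verdict = "fits" if need <= 15 else "too big"
    print(f"{name} {quant}: ~{need:.1f} GB -> {verdict} on a 15 GB GPU")
```

This is why the list above pairs larger models with smaller quants: a 14B model at Q8_0 would land near the 15GB ceiling, while Q5_k_m leaves comfortable headroom.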
---
💻 Available Notebooks
1. KoboldCpp (⭐⭐⭐ Recommended: faster setup & inference)
   🔗 https://github.com/seyf1elislam/LocalLLM_OneClick_Colab/blob/main/awesome_koboldcpp_notebook.ipynb
2. TextGen-WebUI (⭐⭐ Recommended)
   🔗 https://github.com/seyf1elislam/LocalLLM_OneClick_Colab/blob/main/Run_any_gguf_model_in_TextGen_webui.ipynb