|
|
|
--- |
|
license: apache-2.0 |
|
language: |
|
- en |
|
library_name: llama.cpp |
|
tags: |
|
- gguf |
|
- quantized |
|
- int8 |
|
- offline-ai |
|
- local-llm |
|
- chatnonet |
|
model_type: causal |
|
inference: true |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# NONET |
|
|
|
**NONET** is a family of **offline**, quantized large language models fine-tuned for **question answering** with **direct, concise answers**. Designed for local execution using `llama.cpp`, NONET is available in multiple sizes and optimized for Android or Python-based environments. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
NONET is intended for lightweight offline use, particularly on local devices like mobile phones or single-board computers. The models have been **fine-tuned for direct-answer QA** and quantized to **int8 (q8_0)** using `llama.cpp`. |
|
|
|
| Model Name | Base Model | Size | |
|
|----------------------------------|--------------------|--------| |
|
| ChatNONET-135m-tuned-q8_0.gguf   | SmolLM             | 135M   |

| ChatNONET-300m-tuned-q8_0.gguf   | SmolLM             | 300M   |

| ChatNONET-1B-tuned-q8_0.gguf     | LLaMA 3.2          | 1B     |

| ChatNONET-3B-tuned-q8_0.gguf     | LLaMA 3.2          | 3B     |
|
|
|
- **Developed by:** McaTech (Michael Cobol Agan) |
|
- **Model type:** Causal decoder-only transformer |
|
- **Languages:** English |
|
- **License:** Apache 2.0 |
|
- **Finetuned from:** |
|
  - SmolLM (135M, 300M variants)
|
- LLaMA 3.2 (1B, 3B variants) |
|
|
|
## Uses |
|
|
|
### Direct Use |
|
|
|
- Offline QA chatbot |
|
- Local assistants (no internet required) |
|
- Embedded Android or Python apps |
|
|
|
### Out-of-Scope Use |
|
|
|
- Long-form text generation |
|
- Tasks requiring real-time web access |
|
- Creative storytelling or coding tasks |
|
|
|
## Bias, Risks, and Limitations |
|
|
|
NONET may reproduce biases present in its base models or fine-tuning data. Outputs should not be relied upon for sensitive or critical decisions. |
|
|
|
### Recommendations |
|
|
|
- Validate important responses |
|
- Choose model size based on your device capability |
|
- Avoid over-reliance for personal or legal advice |
|
|
|
## How to Get Started with the Model |
|
### For Android Devices |
|
- Try the **Android app**: [Download ChatNONET APK](https://drive.google.com/file/d/1-5Ozx_VsOUBS5_b4yS40MCaNZge_5_1f/view?usp=sharing) |
|
### Build llama.cpp yourself and run the model
|
```bash
# Clone llama.cpp and build it
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
# Note: recent llama.cpp versions build with CMake instead:
#   cmake -B build && cmake --build build --config Release
# which places the binaries under ./build/bin/

# Run the model in conversation mode
./llama-cli -m ./ChatNONET-300m-tuned-q8_0.gguf -p "You are ChatNONET AI assistant." -cnv
```
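
The APK and `llama-cli` cover Android and desktop use; for Python-based environments, the same GGUF file can be loaded through the `llama-cpp-python` bindings (`pip install llama-cpp-python`). This is a minimal sketch, not part of the official tooling: the model path, context size, and example question below are illustrative assumptions, and any GGUF-compatible runtime would work equally well.

```python
# Minimal sketch: querying a ChatNONET GGUF model from Python via
# llama-cpp-python. The model path and settings are illustrative.
from pathlib import Path

MODEL_PATH = Path("ChatNONET-300m-tuned-q8_0.gguf")  # adjust to your download
SYSTEM_PROMPT = "You are ChatNONET AI assistant."


def build_messages(question: str) -> list[dict]:
    """Build a chat-completion message list with the ChatNONET system prompt."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]


if MODEL_PATH.exists():
    from llama_cpp import Llama  # requires: pip install llama-cpp-python

    llm = Llama(model_path=str(MODEL_PATH), n_ctx=2048, verbose=False)
    out = llm.create_chat_completion(
        messages=build_messages("What is the capital of France?")
    )
    # Print only the assistant's direct answer
    print(out["choices"][0]["message"]["content"])
```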
|
|
|
## Training Details |
|
|
|
* **Finetuning Goal:** Direct-answer question answering |
|
* **Precision:** FP16 mixed precision |
|
* **Frameworks:** PyTorch, Transformers, Bitsandbytes |
|
* **Quantization:** int8 GGUF (`q8_0`) via `llama.cpp` |
|
|
|
## Evaluation |
|
|
|
* Evaluated internally on short QA prompts |
|
* Produces direct answers to factual and logical questions
|
* Larger models perform better on reasoning tasks |
|
|
|
## Technical Specifications |
|
|
|
* **Architecture:** |
|
|
|
  * SmolLM (135M, 300M)
|
* LLaMA 3.2 (1B, 3B) |
|
* **Format:** GGUF |
|
* **Quantization:** q8\_0 (int8) |
|
* **Deployment:** Mobile (Android) and desktop via `llama.cpp` |
|
|
|
## Citation |
|
|
|
```bibtex
@misc{chatnonet2025,
  title={ChatNONET: Offline Quantized Q\&A Models},
  author={Michael Cobol Agan},
  year={2025},
  howpublished={\url{https://huggingface.co/McaTech/Nonet}},
}
```
|
|
|
## Contact |
|
|
|
* **Author:** Michael Cobol Agan (McaTech) |
|
* **Facebook:** [FB Profile](https://www.facebook.com/michael.cobol.agan.2025/) |
|
|