Collections including paper arxiv:2402.17764

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 84
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 152
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 24

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 152
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 14
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 61
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 49

- Rewnozom/agent-zero-v1-a-01
  Text Generation • 4B • Updated • 3 • 1
- TheBloke/MythoMax-L2-13B-GGUF
  13B • Updated • 125k • 172
- DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF
  Text Generation • 18B • Updated • 109k • 311
- QuantFactory/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF
  Text Generation • 8B • Updated • 21.8k • 110

- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 5.73k • 1.15k
- microsoft/bitnet-b1.58-2B-4T-bf16
  Text Generation • 2B • Updated • 2.5k • 33
- microsoft/bitnet-b1.58-2B-4T-gguf
  Text Generation • 2B • Updated • 4.89k • 190
- BitNet b1.58 2B4T Technical Report
  Paper • 2504.12285 • Published • 74

- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  Paper • 2208.07339 • Published • 5
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
  Paper • 2210.17323 • Published • 8
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
  Paper • 2211.10438 • Published • 6
- QLoRA: Efficient Finetuning of Quantized LLMs
  Paper • 2305.14314 • Published • 55

- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 373
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 624
- meta-llama/Llama-4-Scout-17B-16E-Instruct
  Image-Text-to-Text • 109B • Updated • 754k • 1.05k
- keras-io/GauGAN-Image-generation
  Updated • 11 • 4