Collections
Discover the best community collections!
Collections including paper arxiv:2307.09288

- Attention Is All You Need
  Paper • 1706.03762 • Published • 77
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 19
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 9
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 17
- Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
  Paper • 2211.04325 • Published
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 19
- On the Opportunities and Risks of Foundation Models
  Paper • 2108.07258 • Published • 1
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
  Paper • 2204.07705 • Published • 2
- Mistral 7B
  Paper • 2310.06825 • Published • 51
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 243
- OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
  Paper • 2309.11235 • Published • 15
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
  Paper • 2501.12948 • Published • 416
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 373
- Qwen2.5-Coder Technical Report
  Paper • 2409.12186 • Published • 151
- Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement
  Paper • 2409.12122 • Published • 4
- Qwen2.5-VL Technical Report
  Paper • 2502.13923 • Published • 200
- Qwen Technical Report
  Paper • 2309.16609 • Published • 37
- Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
  Paper • 2311.07919 • Published • 10
- Qwen2 Technical Report
  Paper • 2407.10671 • Published • 167
- Qwen2-Audio Technical Report
  Paper • 2407.10759 • Published • 60
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 624
- Qwen2.5 Technical Report
  Paper • 2412.15115 • Published • 373
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 257
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 258
- black-forest-labs/FLUX.1-dev
  Text-to-Image • Updated • 1.59M • 11.2k
- openai/whisper-large-v3-turbo
  Automatic Speech Recognition • 0.8B • Updated • 3.42M • 2.55k
- meta-llama/Llama-3.2-11B-Vision-Instruct
  Image-Text-to-Text • 11B • Updated • 815k • 1.5k
- deepseek-ai/DeepSeek-V2.5
  Text Generation • 236B • Updated • 2.94k • 724