anthonyivn (Anthony Ivan S)

upvoted 2 papers 11 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 253

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10, 2025 • 153

upvoted an article 11 months ago

Open-source DeepResearch – Freeing our search agents

Article • Feb 4, 2025 • 1.31k

upvoted a paper 12 months ago

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41

upvoted 2 articles 12 months ago

🪆 Introduction to Matryoshka Embedding Models

Article • Feb 23, 2024 • 185

Train 400x faster Static Embedding Models with Sentence Transformers

Article • Jan 15, 2025 • 222

upvoted a paper 12 months ago

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9, 2025 • 102

upvoted a paper about 1 year ago

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69

upvoted an article over 1 year ago

Document Similarity Search with ColPali

Article • Sep 21, 2024 • 52

upvoted 3 papers over 1 year ago

upvoted an article over 1 year ago

The Rise of Agentic Data Generation

Article • Jul 15, 2024 • 89

upvoted a paper over 1 year ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12, 2024 • 138

upvoted a collection over 1 year ago

InternLM2.5

Collection • 14 items • Updated 5 days ago • 72

upvoted 3 papers over 1 year ago

LongIns: A Challenging Long-context Instruction-based Exam for LLMs

Paper • 2406.17588 • Published Jun 25, 2024 • 23

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 99

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

upvoted 2 articles over 1 year ago

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 751

Putting RL back in RLHF

Article • Jun 12, 2024 • 109
