- Decoding Strategies in Large Language Models, by mlabonne (Oct 29, 2024, 76 upvotes)
- KV Caching Explained: Optimizing Transformer Inference Efficiency, by not-lain (Jan 30, 108 upvotes)
- You could have designed state of the art positional encoding, by FL33TW00D-HF (Nov 25, 2024, 336 upvotes)
- Efficient LLM Pretraining: Packed Sequences and Masked Attention, by sirluk (Oct 7, 2024, 45 upvotes)
- SmolLM - blazingly fast and remarkably powerful, by loubnabnl and 2 others (Jul 16, 2024, 405 upvotes)