yuchenglow

AI & ML interests

Graphs, Interpretability, Performance. Pragmatic Bayesian.

Recent Activity

commented on their article 5 months ago

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

upvoted an article 6 months ago

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

published an article 6 months ago

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

commented on From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub 5 months ago

We LZ4-compress everything automatically, but when we encounter a model, we also apply a format-agnostic byte grouping, inspired by ZipNN, before the LZ4 step. Empirically, this saves about 20%.
https://github.com/huggingface/xet-core/blob/main/cas_object/src/byte_grouping/bg4.rs
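
To illustrate the byte grouping idea from the comment above, here is a minimal Python sketch. It is not the code in bg4.rs: the function names `byte_group4` and `byte_ungroup4` are hypothetical, the group width of 4 is an assumption taken from the file name, and `zlib` stands in for LZ4 only because LZ4 is not in the Python standard library. The transform moves byte 0 of every 4-byte word to the front, then byte 1, and so on, so that similar bytes (for example, float exponent bytes) end up next to each other before compression.

```python
import struct
import zlib


def byte_group4(data: bytes) -> bytes:
    """Regroup bytes by their position within each 4-byte word.

    Output layout: all bytes at offsets 0, 4, 8, ... first, then offsets
    1, 5, 9, ..., and so on. Works for any length, including lengths that
    are not a multiple of 4.
    """
    return b"".join(data[i::4] for i in range(4))


def byte_ungroup4(grouped: bytes) -> bytes:
    """Invert byte_group4, restoring the original byte order."""
    n = len(grouped)
    out = bytearray(n)
    start = 0
    for i in range(4):
        # Number of original offsets i, i+4, i+8, ... that are below n.
        size = (n - i + 3) // 4
        out[i::4] = grouped[start:start + size]
        start += size
    return bytes(out)


# Illustrative input: little-endian float32 values close to each other,
# loosely resembling a tensor of model weights (made-up data).
values = [1.0 + i * 1e-3 for i in range(4096)]
raw = struct.pack(f"<{len(values)}f", *values)

grouped = byte_group4(raw)

# zlib is a stand-in compressor here; the comment above describes LZ4.
plain_size = len(zlib.compress(raw))
grouped_size = len(zlib.compress(grouped))
print(f"raw: {len(raw)} bytes, compressed: {plain_size}, "
      f"grouped then compressed: {grouped_size}")
```

The grouping is lossless and independent of file format, which is why it can be applied to any model file: decompress, then call `byte_ungroup4` to recover the exact original bytes. How much it helps depends on the data and the compressor, so the sizes printed by this sketch should not be read as a reproduction of the 20% figure.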

upvoted an article 6 months ago

Article

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

and 3 others •

Feb 12

• 72

published an article 6 months ago

Article

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

and 3 others •

Feb 12

• 72

published an article 10 months ago

Article

Improving Parquet Dedupe on Hugging Face Hub

and 1 other •

Oct 5, 2024

• 38

upvoted an article 10 months ago

Article

Improving Parquet Dedupe on Hugging Face Hub

and 1 other •

Oct 5, 2024

• 38

published an article about 1 year ago

Article

XetHub is joining Hugging Face!

and 1 other •

Aug 8, 2024

• 106
