Amazon SageMaker Community

non-profit

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

DrishtiSharma authored a paper 27 days ago

Robust and Fine-Grained Detection of AI Generated Texts

nouamanetazi authored a paper about 1 month ago

SmolVLM: Redefining small and efficient multimodal models

birgermoell authored a paper about 1 month ago

Medical Reasoning in LLMs: An In-Depth Analysis of DeepSeek R1

View all activity

amazon-sagemaker-community's activity

jeffboudier

posted an update 16 minutes ago

Post

Transcribing 1 hour of audio for less than $0.01 🤯

@mfuntowicz cooked with 8x faster Whisper speech recognition - whisper-large-v3-turbo transcribes at 100x real time on a $0.80/hr L4 GPU!

How they did it: https://huggingface.co/blog/fast-whisper-endpoints

1-click deploy with HF Inference Endpoints: https://endpoints.huggingface.co/new?repository=openai%2Fwhisper-large-v3-turbo&vendor=aws&region=us-east&accelerator=gpu&instance_id=aws-us-east-1-nvidia-l4-x1&task=automatic-speech-recognition&no_suggested_compute=true

jeffboudier

posted an update 6 days ago

Post

2940

So many orgs on HF would really benefit from security and governance built into Enterprise Hub - I wrote a guide on why and how upgrade: jeffboudier/how-to-upgrade-to-enterprise

For instance, did you know about Resource Groups?

julien-c

posted an update 18 days ago

Post

4184

BOOOOM: Today I'm dropping TINY AGENTS

the 50 lines of code Agent in Javascript 🔥

I spent the last few weeks working on this, so I hope you will like it.

I've been diving into MCP (Model Context Protocol) to understand what the hype was all about.

It is fairly simple, but still quite powerful: MCP is a standard API to expose sets of Tools that can be hooked to LLMs.

But while doing that, came my second realization:

Once you have a MCP Client, an Agent is literally just a while loop on top of it. 🤯

➡️ read it exclusively on the official HF blog: https://huggingface.co/blog/tiny-agents

1 reply

philschmid

posted an update 26 days ago

Post

2532

Gemini 2.5 Flash is here! We excited launch our first hybrid reasoning Gemini model. In Flash 2.5 developer can turn thinking off.

**TL;DR:**
- 🧠 Controllable "Thinking" with thinking budget with up to 24k token
- 🌌 1 Million multimodal input context for text, image, video, audio, and pdf
- 🛠️ Function calling, structured output, google search & code execution.
- 🏦 $0.15 1M input tokens; $0.6 or $3.5 (thinking on) per million output tokens (thinking tokens are billed as output tokens)
- 💡 Knowledge cut of January 2025
- 🚀 Rate limits - Free 10 RPM 500 req/day
- 🏅Outperforms 2.0 Flash on every benchmark

Try it ⬇️
https://aistudio.google.com/prompts/new_chat?model=gemini-2.5-flash-preview-04-17

1 reply

DrishtiSharma

authored a paper 27 days ago

Robust and Fine-Grained Detection of AI Generated Texts

Paper • 2504.11952 • Published 28 days ago • 11

nouamanetazi

authored a paper about 1 month ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 181

jeffboudier

posted an update about 1 month ago

Post

2193

Llama4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co

jeffboudier

posted an update about 1 month ago

Post

1556

Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 free usage per org member (e.g. an Enterprise org with 1,000 members share $2,000 free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing

2 replies

birgermoell

authored a paper about 1 month ago

Medical Reasoning in LLMs: An In-Depth Analysis of DeepSeek R1

Paper • 2504.00016 • Published Mar 27 • 1

philschmid

authored a paper about 2 months ago

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25 • 50

birgermoell

authored 2 papers about 2 months ago

The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification

Paper • 2503.04802 • Published Mar 3

Artificial Humans

Paper • 2503.16502 • Published Mar 12

philschmid

posted an update about 2 months ago

Post

2967

Gemini 2.5 Pro, thinking by default! We excited launch our best Gemini model for reasoning, multimodal and coding yet! #1 on LMSYS, Humanity’s Last Exam, AIME and GPQA and more!

TL;DR:
- 💻 Best Gemini coding model yet, particularly for web development (excels on LiveCodeBench).
- 🧠 Default "Thinking" with up to 64k token output
- 🌌 1 Million multimodal input context for text, image, video, audio, and pdf
- 🛠️ Function calling, structured output, google search & code execution.
- 🏆 #1 on LMArena & sota on AIME, GPQA, Humanity's Last Exam
- 💡 Knowledge cut of January 2025
- 🤗 Available for free as Experimental in AI Studio, Gemini API & Gemini APP
- 🚀 Rate limits - Free 2 RPM 50 req/day

Try it ⬇️

https://aistudio.google.com/?model=gemini-2.5-pro-exp-03-25

3 replies

w11wo

authored a paper 2 months ago

COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition

Paper • 2503.07259 • Published Mar 10

julien-c

posted an update 2 months ago

Post

3775

Important notice 🚨

For Inference Providers who have built support for our Billing API (currently: Fal, Novita, HF-Inference – with more coming soon), we've started enabling Pay as you go (=PAYG)

What this means is that you can use those Inference Providers beyond the free included credits, and they're charged to your HF account.

You can see it on this view: any provider that does not have a "Billing disabled" badge, is PAYG-compatible.