Mohammed Hamdy's picture

Mohammed Hamdy

mmhamdy

·

AI & ML interests

TechBio | AI4Sci | NLP | Reinforcement Learning

Recent Activity

upvoted an article 18 days ago

Tiny Agents: a MCP-powered agent in 50 lines of code

upvoted a paper about 1 month ago

SmolVLM: Redefining small and efficient multimodal models

posted an update about 1 month ago

What inspired the Transformer architecture in the "Attention Is All You Need" paper? And how were various ideas combined to create this groundbreaking model? In this lengthy article, I explore the story and the origins of some of the ideas introduced in the paper. We'll explore everything from the fundamental attention mechanism that lies at its heart to the surprisingly simple explanation for its name, Transformer. 💡 Examples of ideas explored in the article: ✅ What was the inspiration for the attention mechanism? ✅ How did we go from attention to self-attention? ✅ Did the team have any other names in mind for the model? and more... I aim to tell the story of Transformers as I would have wanted to read it, and hopefully, one that appeals to others interested in the details of this fascinating idea. This narrative draws from video interviews, lectures, articles, tweets/Xs, and some digging into the literature. I have done my best to be accurate, but errors are possible. If you find inaccuracies or have any additions, please do reach out, and I will gladly make the necessary updates. Read the article: https://huggingface.co/blog/mmhamdy/pandemonium-the-transformers-story

View all activity

Organizations

mmhamdy's activity

upvoted an article 18 days ago

Article

Tiny Agents: a MCP-powered agent in 50 lines of code

19 days ago

• 231

upvoted a paper about 1 month ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 181

upvoted a paper 2 months ago

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 123

upvoted an article 2 months ago

Article

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

Mar 4

• 74

upvoted a collection 2 months ago

Cohere Labs Aya Vision

Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 28 days ago • 67

upvoted a collection 3 months ago

CHASE

Generate challenging synthetic data to evaluate LLMs • 5 items • Updated Feb 21 • 4

upvoted a paper 3 months ago

How to Get Your LLM to Generate Challenging Problems for Evaluation

Paper • 2502.14678 • Published Feb 20 • 17

upvoted a collection 3 months ago

Reasoning Datasets

44 items • Updated 4 days ago • 7

upvoted 3 papers 3 months ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19 • 34

From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions

Paper • 2502.13791 • Published Feb 19 • 5

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 121

upvoted 5 papers 4 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 98

METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring

Paper • 2501.02045 • Published Jan 3 • 21

EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation

Paper • 2501.01895 • Published Jan 3 • 56

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Paper • 2406.19314 • Published Jun 27, 2024 • 23

ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 83

upvoted 2 papers 5 months ago

PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion

Paper • 2412.17780 • Published Dec 23, 2024 • 5

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 9

upvoted a collection 5 months ago

Falcon3

Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated Feb 13 • 86