---
title: SSM Blog Posts
emoji: 📝
colorFrom: purple
colorTo: yellow
sdk: static
pinned: false
---
A French version is available on my blog.
On October 7, 2021, while wondering whether AK was a bot or a human, I came across one of his tweets:
a link to a publication on open-review.net, accompanied by the following image:

Intrigued by the announced results, I went to read about this S3 model, which would be renamed S4 less than a month later (here is the link to the version from when it was still called S3, for those interested).
It's the only scientific paper that has ever given me goosebumps, I found it so beautiful. At the time, I was convinced that State Space Models (SSMs) would replace transformers within a few months. Two years later, I have to admit I was completely wrong, given the tidal wave of LLMs dominating the news in NLP.
Nevertheless, on Monday, December 4, 2023, the announcement of Mamba by Albert Gu and Tri Dao sparked renewed interest. The phenomenon intensified four days later with Together AI's announcement of StripedHyena.
A good opportunity for me to write a few words about SSM developments over the past two years.
I'm planning three articles to start with, aiming to illustrate the basics of SSMs through S4 (the "Attention is all you need" of the field) before surveying how SSMs have evolved since that first paper:
Later on, time permitting, I hope to go into detail about the architectures of some specific SSMs, with animations ✨