lbourdois committed on
Commit a969571 · 1 Parent(s): 2e5e12d

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -16,14 +16,14 @@ October 7, 2021, while wondering whether [AK](https://hf.co/akhaliq) was a bot o
  <img src="https://cdn-uploads.huggingface.co/production/uploads/613b0a62a14099d5afed7830/QMpNVGwdQV2jRw-jYalxa.png" alt="alt text" width="800" height="450">
  </center>
 
- Intrigued by the results announced, I went to read what this S3 model consisted of, which would be renamed less than a month later to [S4](https://twitter.com/_albertgu/status/1456031299194470407) ([link](https://github.com/lbourdois/blog/blob/master/assets/efficiently_modeling_long_sequences_s3.pdf) of the "original" version from when it was still called S3 for those interested).
- It's the only scientific article that gave me goosebumps when I read it, so beautiful did I find it. At that time, I was convinced that State Space Models (SSM) would replace transformers in the following months. Two years later, I'm forced to admit that I was completely mistaken in the face of the tidal wave of LLMs making the news in NLP.
- Nevertheless, on Monday December 4, 2023, the announcement of Mamba by [Albert Gu](https://twitter.com/_albertgu/status/1731727672286294400) and [Tri Dao](https://twitter.com/tri_dao/status/1731728602230890895) aroused some interest. The phenomenon was accentuated 4 days later with the announcement of [StripedHyena](https://twitter.com/togethercompute/status/1733213267185762411) by Together AI.
- A good opportunity for me to write a few words about SSM developments over the past two years.
+ Intrigued by the announced results, I decided to read up on this S3 model, which would be renamed [S4](https://twitter.com/_albertgu/status/1456031299194470407) less than a month later (here is a [link](https://github.com/lbourdois/blog/blob/master/assets/efficiently_modeling_long_sequences_s3.pdf) to the version from when it was still called S3, for those interested).
+ This brilliant article impressed me. At the time, I was convinced that State Space Models (SSM) were going to be a revolution, replacing transformers in the coming months. Two years later, I'm forced to admit that I was completely wrong, given the tsunami of LLMs making the news in NLP.
+ Nevertheless, on Monday, December 4, 2023, the announcement of Mamba by [Albert Gu](https://twitter.com/_albertgu/status/1731727672286294400) and [Tri Dao](https://twitter.com/tri_dao/status/1731728602230890895) revived interest in SSMs. The phenomenon was amplified 4 days later with the announcement of [StripedHyena](https://twitter.com/togethercompute/status/1733213267185762411) by Together AI.
+ A good opportunity for me to write a few words about SSM developments over the last two years.
 
- I'm planning three articles to start with, where the aim is to illustrate the basics of SSM with S4 (the "Attention is all you need" of the field) before carrying out a literature review of the evolution of SSM since that first paper:
+ I plan to start with three articles, the aim being to illustrate the basics of SSM with S4 (the "Attention is all you need" of the field) before reviewing the evolution of SSMs since that first paper:
  - [Introduction to SSM and S4](WIP)
  - [SSM evolutions in 2022](WIP)
  - [SSM developments in 2023](WIP)
 
- I hope in a second time, time permitting, to go into detail about the architectures of some specific SSMs with animations ✨
+ Later on, I also hope to go into the details of the architectures of some specific SSMs, with animations ✨