<b><p style="text-align: center; color:red">A version in French is available on my [blog](https://lbourdois.github.io/blog/ssm/)</p></b>
<br>

On October 7, 2021, while wondering whether [AK](https://hf.co/akhaliq) was a bot or a human, I came across one of his [tweets](https://twitter.com/_akhaliq/status/1445931206030282756): a link to a publication on [open-review.net](https://openreview.net/forum?id=uYLFoz1vlAC), accompanied by the following image:

<center>
<img src="https://cdn-uploads.huggingface.co/production/uploads/613b0a62a14099d5afed7830/QMpNVGwdQV2jRw-jYalxa.png" alt="alt text" width="800" height="450">
</center>

Intrigued by the announced results, I went to read what this S3 model consisted of. It would be renamed [S4](https://twitter.com/_albertgu/status/1456031299194470407) less than a month later (here is a [link](https://github.com/lbourdois/blog/blob/master/assets/efficiently_modeling_long_sequences_s3.pdf) to the "original" version from when it was still called S3, for those interested).

It is the only scientific article that has ever given me goosebumps, so beautiful did I find it. At the time, I was convinced that State Space Models (SSMs) would replace transformers within months. Two years later, I'm forced to admit that I was completely mistaken, given the tidal wave of LLMs making the news in NLP.

Nevertheless, on Monday, December 4, 2023, the announcement of Mamba by [Albert Gu](https://twitter.com/_albertgu/status/1731727672286294400) and [Tri Dao](https://twitter.com/tri_dao/status/1731728602230890895) sparked renewed interest, accentuated four days later by Together AI's announcement of [StripedHyena](https://twitter.com/togethercompute/status/1733213267185762411).

A good opportunity for me to write a few words about SSM developments over the past two years.