Update README.md
README.md CHANGED
@@ -64,19 +64,11 @@ Processing single interactions in real-time by **Reactive Language Models** lead
 
 ### RxT-Alpha Open Research
 We are currently working on **Reactive Transformer Proof-of-Concept - RxT-Alpha**, especially on the new reinforcement learning stage - **Memory Reinforcement Learning**,
-that's required for our reactive models, between the _Supervised
-is open, we are publishing the results of all separate steps, just after finishing them.
-
-
-
-- RxT-Alpha-Mini (~70M params) - pre-training in progress - training on real data
-- RxT-Alpha (~530M/0.5B params) - pre-training in progress - training on real data
-
-All the models have theoretically infinite context, limited only for single interaction (message + response), but in practice it's limited by short-term memory
-capacity (it will be improved in Preactor). Limits are:
-- RxT-Alpha-Micro - 256 tokens for single interaction, 6 * 256 for STM size (768kb), expected length of a smooth conversation min. ~4k tokens
-- RxT-Alpha-Mini - 1024 tokens for single interaction, 8 * 1024 for STM size (8mb), expected length of a smooth conversation min. ~16k tokens
-- RxT-Alpha - 2048 tokens for single interaction, 12 * 2048 for STM size (50mb), expected length of a smooth conversation min. ~32k tokens
+that's required for our reactive models, between the _Supervised Memory System Training (SMST)_ and _Reinforcement Learning from Human Feedback for reactive models (RxRLHF)_.
+The research is open; we are publishing the results of all separate steps right after finishing them.
+
+We are currently finishing **MRL** training for the world's first experimental (Proof-of-Concept) reactive model - [RxT-Alpha-Micro-Plus](https://huggingface.co/ReactiveAI/RxT-Alpha-Micro-Plus).
+That's only a micro-scale PoC (~27M params) trained on simple synthetic datasets to demonstrate the memory system. Then we will move to bigger scales and real-world datasets in RxT-Alpha-Mini and RxT-Alpha.
 
 
 ## RxNN Platform
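As a sanity check on the STM size figures in the removed lines (e.g. 6 * 256 slots listed as 768kb for RxT-Alpha-Micro), here is a minimal back-of-envelope sketch. The hidden dimensions (256/512/1024) and fp16 storage are assumptions for illustration, not stated anywhere in the diff:

```python
# Back-of-envelope STM memory footprint for the RxT-Alpha PoC models.
# ASSUMPTIONS (not in the README): fp16 values (2 bytes each) and
# hypothetical hidden sizes of 256 / 512 / 1024 for Micro / Mini / Alpha.

def stm_bytes(layers: int, slots_per_layer: int, hidden_dim: int,
              bytes_per_value: int = 2) -> int:
    """Total bytes for a short-term memory of `layers` memory layers,
    each holding `slots_per_layer` vectors of size `hidden_dim`."""
    return layers * slots_per_layer * hidden_dim * bytes_per_value

# RxT-Alpha-Micro: 6 * 256 STM slots
print(stm_bytes(6, 256, 256) / 1024)        # 768.0 -> the quoted 768kb
# RxT-Alpha-Mini: 8 * 1024 STM slots
print(stm_bytes(8, 1024, 512) / 1024**2)    # 8.0 -> the quoted 8mb
# RxT-Alpha: 12 * 2048 STM slots
print(stm_bytes(12, 2048, 1024) / 1024**2)  # 48.0 -> roughly the quoted 50mb
```

Under these assumed dimensions the first two quoted sizes match exactly, and the third comes out at 48 MB, close to the quoted ~50mb; with different hidden sizes or fp32 storage the numbers would differ.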