AdamF92 committed 6cdb73d (verified) · 1 parent: c0570e3

Update README.md

Files changed (1):
  1. README.md +5 -13
README.md CHANGED
@@ -64,19 +64,11 @@ Processing single interactions in real-time by **Reactive Language Models** lead
 
 ### RxT-Alpha Open Research
 We are currently working on **Reactive Transformer Proof-of-Concept - RxT-Alpha**, especially on the new reinforcement learning stage - **Memory Reinforcement Learning**,
- that's required for our reactive models, between the _Supervised Fine-Tuning_ and _Reinforcement Learning from Human Feedback for reactive models (RxRLHF)_. The research
- is open, we are publishing the results of all separate steps, just after finishing them.
-
- The Proof-of-Concept includes 3 small scale models based on **Reactive Transformer** architecture:
- - RxT-Alpha-Micro (~11M params) - pre-training and fine-tuning finished, MRL in progress - training based on small synthetic datasets
- - RxT-Alpha-Mini (~70M params) - pre-training in progress - training on real data
- - RxT-Alpha (~530M/0.5B params) - pre-training in progress - training on real data
-
- All the models have theoretically infinite context, limited only for single interaction (message + response), but in practice it's limited by short-term memory
- capacity (it will be improved in Preactor). Limits are:
- - RxT-Alpha-Micro - 256 tokens for single interaction, 6 * 256 for STM size (768kb), expected length of a smooth conversation min. ~4k tokens
- - RxT-Alpha-Mini - 1024 tokens for single interaction, 8 * 1024 for STM size (8mb), expected length of a smooth conversation min. ~16k tokens
- - RxT-Alpha - 2048 tokens for single interaction, 12 * 2048 for STM size (50mb), expected length of a smooth conversation min. ~32k tokens
+ which is required for our reactive models between _Supervised Memory System Training (SMST)_ and _Reinforcement Learning from Human Feedback for reactive models (RxRLHF)_.
+ The research is open: we publish the results of each step as soon as it is finished.
+
+ We are currently finishing **MRL** training for the world's first experimental (Proof-of-Concept) reactive model - [RxT-Alpha-Micro-Plus](https://huggingface.co/ReactiveAI/RxT-Alpha-Micro-Plus).
+ That is only a micro-scale PoC (~27M params), trained on simple synthetic datasets to demonstrate the memory system. We will then move to larger scales and real-world datasets with RxT-Alpha-Mini and RxT-Alpha.
 
 
 ## RxNN Platform
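
The STM sizes quoted in the removed limits list are consistent with a simple slots × hidden-width × bytes-per-element calculation. A minimal sketch of that arithmetic, assuming fp16 storage, reading `6 * 256` as 6 memory layers of 256 slots each, and using hypothetical hidden sizes of 256/512/1024 for Micro/Mini/Alpha (none of these values are stated in the diff); under these assumptions the quoted 768kb / 8mb / 50mb figures are reproduced almost exactly:

```python
# Sketch of the STM-footprint arithmetic behind the removed limits list.
# Assumptions (not stated in the diff): fp16 storage (2 bytes/element),
# "layers * slots" read as per-layer STM slot counts, and hypothetical
# hidden sizes of 256/512/1024 for Micro/Mini/Alpha.

BYTES_FP16 = 2

def stm_footprint_bytes(layers: int, slots: int, hidden_dim: int) -> int:
    """Total STM size: layers * slots-per-layer * hidden width * bytes/element."""
    return layers * slots * hidden_dim * BYTES_FP16

# (layers, slots per layer, assumed hidden dim) per model
models = {
    "RxT-Alpha-Micro": (6, 256, 256),     # quoted as 768kb
    "RxT-Alpha-Mini":  (8, 1024, 512),    # quoted as 8mb
    "RxT-Alpha":       (12, 2048, 1024),  # quoted as 50mb
}

for name, (layers, slots, dim) in models.items():
    size = stm_footprint_bytes(layers, slots, dim)
    print(f"{name}: {layers}*{slots} slots -> {size / 1e6:.1f} MB")

# Output:
# RxT-Alpha-Micro: 6*256 slots -> 0.8 MB    (~768 KB)
# RxT-Alpha-Mini: 8*1024 slots -> 8.4 MB    (~8 MB)
# RxT-Alpha: 12*2048 slots -> 50.3 MB       (~50 MB)
```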