---
title: README
emoji: π
colorFrom: blue
colorTo: red
sdk: static
pinned: false
short_description: Reactive AI - Reactive Neural Networks and Event-Driven AI
---
<img src="https://raw.githubusercontent.com/RxAI-dev/RxNN/refs/heads/main/assets/logo/logo_rxai_v2.png" width="350" />

# Reactive AI
We are working on our own ideas of Reactive Neural Networks (RxNN) and Event-Driven AI, advancing from language models to AGI awareness models.

## Reactive Neural Networks and Event-Driven AI
Reactive Neural Networks (RxNN) are memory-augmented neural networks with a higher level of recurrence (inter-sequence, versus intra-sequence in RNNs),
focused on processing single interactions with access to previous interactions via memory layers. We call this _**event-driven real-time processing**_
to distinguish it from the classical _data-driven processing_ of the full conversation history on each interaction. This difference is crucial for
AGI and awareness: a key feature of human awareness is that we remember what we were doing 10 minutes ago without recalling the whole day's history. We
work in real time, just like event-driven _Reactive Neural Networks_.
In Event-Driven AI, models process data in reaction to environment or internal events and emit response events as a result.
The processing of an input event and its output event by the model is called an interaction. An event or interaction can occur at any point in continuous time,
so models have to be stateful and remember data between interactions.

_**Strong Reactive Neural Networks**_ like **Reactor** can emit and listen to their own internal events, while _**Weak Reactive Neural Networks**_
work only on environment events.
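
To make the distinction concrete, here is a minimal, purely illustrative sketch (hypothetical classes and method names, not the RxNN API): a data-driven LLM re-processes the whole growing history on every turn, while an event-driven reactive model handles a single interaction and carries the past in a fixed-size memory state.

```python
# Conceptual sketch only - hypothetical interfaces, not the RxNN API.

class DataDrivenLLM:
    """Classical LLM chat: every turn re-processes the full history."""
    def __init__(self, model):
        self.model = model
        self.history = []  # grows with every interaction

    def chat(self, user_message: str) -> str:
        self.history.append({"role": "user", "content": user_message})
        reply = self.model.generate(self.history)   # cost grows with history length
        self.history.append({"role": "assistant", "content": reply})
        return reply


class EventDrivenReactiveModel:
    """Reactive model: every turn processes only the current interaction;
    past context is carried in a fixed-size memory state."""
    def __init__(self, model, initial_memory):
        self.model = model
        self.memory = initial_memory                # fixed-size state, e.g. STM slots

    def on_query_event(self, user_message: str) -> str:
        reply, self.memory = self.model.interact(user_message, self.memory)
        return reply                                # emitted as a response event
```
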
## Reactor AGI
<img src="https://raw.githubusercontent.com/RxAI-dev/RxNN/refs/heads/main/assets/logo/logo_reactor.png" width="350" />
Our primary architecture, **Reactor**, is planned as the first _**awareness AGI model**_: it models awareness as an _Infinite Chain-of-Thoughts_,
connected to _Short-Term and Long-Term Memory_ (the _Attention-based Memory System_) and to _Receptors/Effectors_ systems for real-time reactive processing.
It will be able to constantly and autonomously learn from interactions in a _Continuous Live Learning_ process.
> Reactor architecture details and the mathematical model were analysed by 30 state-of-the-art LLM/reasoning models, which rated its potential
> to reach AGI at ~4.35/5.
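
Read as a component sketch (a hypothetical layout based only on the description above, not the actual Reactor design), the idea could look roughly like this:

```python
# Purely illustrative component skeleton - hypothetical names, not the Reactor design.

class Reactor:
    """Receptors turn environment signals into events, an internal thought loop
    processes them with Short-Term and Long-Term Memory, and effectors emit
    response events - including the model's own internal (thought) events."""

    def __init__(self, receptors, thought_model, stm, ltm, effectors):
        self.receptors = receptors          # environment -> events
        self.thought_model = thought_model  # produces the next thought/response
        self.stm, self.ltm = stm, ltm       # Attention-based Memory System
        self.effectors = effectors          # events -> actions/responses

    def live(self):
        thought = None
        while True:  # "Infinite Chain-of-Thoughts": runs continuously in real time
            event = self.receptors.poll() or thought  # external or internal event
            thought, self.stm, self.ltm = self.thought_model.step(event, self.stm, self.ltm)
            self.effectors.emit(thought)              # may also feed Continuous Live Learning
```
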
## Reactive Language Models (RxLM)
While **Reactor** is the main goal, it is extremely hard to achieve, as it is by far the most advanced neural network ensemble we have ever designed.
That's why we designed simplified architectures for an incremental transformation from language/reasoning models to the awareness model:
- **Reactive Transformer** introduces the _Attention-based Memory System_ and adds _Short-Term Memory_ to Transformer language models (see the hedged sketch after this list)
- **Preactor** adds _Long-Term Memory_ and the ability to learn from interactions
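
As a rough illustration of the Short-Term Memory idea (a minimal sketch using a standard PyTorch cross-attention sub-layer, not the actual RxNN implementation), a decoder layer could read a fixed-size memory like this:

```python
# Illustrative memory cross-attention sketch - not the RxNN implementation.
import torch
import torch.nn as nn

class MemoryCrossAttentionLayer(nn.Module):
    """Decoder sub-layer that reads a fixed-size Short-Term Memory:
    queries come from the current interaction, keys/values from memory slots."""
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # x:      [batch, seq_len, dim]   - tokens of the single current interaction
        # memory: [batch, mem_slots, dim] - fixed-size STM carried between interactions
        attended, _ = self.attn(query=self.norm(x), key=memory, value=memory)
        return x + attended  # residual connection


# Usage sketch: the memory size stays constant no matter how long the dialog gets.
layer = MemoryCrossAttentionLayer(dim=256, num_heads=8)
x = torch.randn(1, 32, 256)    # current interaction (32 tokens)
stm = torch.randn(1, 64, 256)  # 64 STM slots
out = layer(x, stm)            # [1, 32, 256]
```
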
## RxLM vs. LLM advantages
Processing single interactions in real time lets **Reactive Language Models** deliver **revolutionary** improvements in inference speed and cost:
- LLM inference costs grow quadratically with conversation length (they accumulate with each message), because the full dialog history is reprocessed
- RxLM inference costs are linear, depending only on the tokens of the single interaction (not accumulated) - each subsequent interaction is `number of steps` times cheaper than for an LLM
- the same holds for inference speed - an LLM has to process the full history, while an RxLM processes only the single message (only the first interaction may be slower, because of the encoder/memory-attention overhead)
> For example, in a dialog with **DeepSeek R1** containing ~90k tokens overall, I paid for about 1.5M tokens. With an **RxLM** it would cost only those ~90k tokens, so it
> would be about **15x cheaper** (see the back-of-the-envelope sketch below).
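
A back-of-the-envelope calculation reproduces that ratio under simplified assumptions (30 turns of ~3k tokens each, counting prompt tokens only; the numbers are illustrative, not measured):

```python
# Simplified token accounting: equal-sized turns, prompt tokens only.

def llm_total_tokens(turns: int, tokens_per_turn: int) -> int:
    """Data-driven LLM: each turn re-processes all previous turns."""
    return sum(step * tokens_per_turn for step in range(1, turns + 1))

def rxlm_total_tokens(turns: int, tokens_per_turn: int) -> int:
    """Reactive model: each turn processes only its own tokens."""
    return turns * tokens_per_turn

turns, tokens_per_turn = 30, 3_000                # ~90k tokens of dialog in total
print(llm_total_tokens(turns, tokens_per_turn))   # 1395000 -> ~1.4M tokens billed
print(rxlm_total_tokens(turns, tokens_per_turn))  # 90000 tokens billed (~15x less)
```
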
> The Reactive Transformer architecture was analysed by 10 state-of-the-art LLM/reasoning models for its innovations and market-disruption potential,
> rated at ~4.36/5.0. Check - [Reactive Transformer AI Analysis](https://github.com/RxAI-dev/RxNN/blob/main/docs/research/ReactiveTransformer/ai-analysis.md)
## Reactive Transformer - drafts
- [Architecture introduction](https://github.com/RxAI-dev/RxNN/blob/main/docs/research/ReactiveTransformer/reactive-transformer.md)
- [Supervised Training stages](https://github.com/RxAI-dev/RxNN/blob/main/docs/research/ReactiveTransformer/supervised-training.md)
- [Reinforcement Learning stages](https://github.com/RxAI-dev/RxNN/blob/main/docs/research/ReactiveTransformer/mrl.md)
### RxT-Alpha Open Research
We are currently working on the **Reactive Transformer Proof-of-Concept - RxT-Alpha**, especially on the new reinforcement learning stage - **Memory Reinforcement Learning (MRL)** -
which our reactive models require between _Supervised Memory System Training (SMST)_ and _Reinforcement Learning from Human Feedback for reactive models (RxRLHF)_.
The research is open: we publish the results of each separate step just after finishing it.
We are currently finishing **MRL** training for the world's first experimental (Proof-of-Concept) reactive model - [RxT-Alpha-Micro-Plus](https://huggingface.co/ReactiveAI/RxT-Alpha-Micro-Plus).
It is only a micro-scale PoC (~27M params) trained on simple synthetic datasets to demonstrate the memory system. We will then move to bigger scales and real-world datasets with RxT-Alpha-Mini and RxT-Alpha.
## RxNN Platform
<img src="https://raw.githubusercontent.com/RxAI-dev/RxNN/refs/heads/main/assets/logo/logo_rxnn_v2.png" width="350" />
We are working on a complete Reactive Neural Networks development framework - [RxNN on GitHub](https://github.com/RxAI-dev/RxNN)
## Additional Research
- **Sparse Query Attention (SQA)** - the most cost-effective GQA variant, up to 2-3x faster for long sequences! Research in progress - [draft](https://github.com/RxAI-dev/RxNN/blob/main/docs/research/sparse_query_attention.md) (see the hedged sketch after this list)
- **Flex-SQA** - a combination of Flex Attention and (symmetric) Sparse Query Attention, enabling 4-8x longer sliding windows
- **Flex Memory Attention/Memory Cross-Attention** - connecting spatially sparse attention with memory layers to enable very long single interactions - a smaller sliding window for input sequences attends to the full memory, or the opposite
- **Mixture-of-Experts for Grouped Attention** - a MoE router dynamically selects GQA/SQA groups instead of static selection. Abandoned, because results were worse than for GQA/SQA - [more](https://github.com/RxAI-dev/RxNN/blob/main/docs/research/moe_attention.md)
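
For illustration, here is one possible reading of (symmetric) SQA as a hedged sketch (hypothetical parameter names, not the RxNN implementation): both query and key/value projections use fewer heads than full multi-head attention, so the attention score computation shrinks proportionally.

```python
# Hedged sketch of symmetric Sparse Query Attention - illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseQueryAttention(nn.Module):
    """Uses only `used_heads` of the `total_heads` a full MHA layer would have,
    shrinking the attention score maps (and FLOPs) proportionally; an output
    projection restores the model dimension."""
    def __init__(self, dim: int, total_heads: int, used_heads: int):
        super().__init__()
        assert dim % total_heads == 0 and used_heads <= total_heads
        self.h = used_heads
        self.head_dim = dim // total_heads
        inner = self.h * self.head_dim
        self.q_proj = nn.Linear(dim, inner)
        self.k_proj = nn.Linear(dim, inner)
        self.v_proj = nn.Linear(dim, inner)
        self.o_proj = nn.Linear(inner, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        def heads(proj):  # [b, t, inner] -> [b, h, t, head_dim]
            return proj(x).view(b, t, self.h, self.head_dim).transpose(1, 2)
        q, k, v = heads(self.q_proj), heads(self.k_proj), heads(self.v_proj)
        out = F.scaled_dot_product_attention(q, k, v)  # only h score maps of size t x t
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

# With total_heads=16 and used_heads=8, attention FLOPs roughly halve vs. full MHA.
sqa = SparseQueryAttention(dim=512, total_heads=16, used_heads=8)
y = sqa(torch.randn(2, 1024, 512))  # [2, 1024, 512]
```
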