---
title: README
emoji: 🌖
colorFrom: yellow
colorTo: yellow
sdk: streamlit
pinned: false
---
We are a group of volunteer researchers dedicated to promoting equal access to multimodal and multilingual AI. Our goal is to build a permissive and open stack for developing multimodal LLMs. This initiative is a collaborative effort led by OntocordAI. We began as an effort named MDEL ([Multi-Domain Expert Learning](https://huggingface.co/Multi-Domain-Expert-Learning)).

The -M in Aurora-M reflects our focus on multimodal, multilingual, multi-domain mixture-of-experts (MoE) models, which we aim to explore and develop through ongoing research.

Building on our previous success, [Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code](https://aclanthology.org/2025.coling-industry.56/), we are training a new family of models called [Aurora-M2](https://aurora-lm.github.io/posts/about-us/) that is aligned with laws, regulations, and policies for controllable AI. The series will include models at 3B, 8B, and 21B parameters, aligned with the comprehensive policy framework of the EU AI Act, specifically Annex III of the Act.

As part of our commitment to openness, we plan to open-source the entire training pipeline and experimental process, including data synthesis and the evolving methodologies we employ in model training. Stay tuned!