---
license: mit
language:
- en
pipeline_tag: text-generation
---
# OpenSML: A Family of Open Small Language Models (coming soon)

*William Zebrowski*

We introduce **OpenSML**, a series of **Open** **SM**a**L**l Language Models. The model architectures are built strictly with Apple's [MLX](https://ml-explore.github.io/mlx/build/html/index.html#) framework.

The pre-training dataset is a slice of the OpenWebText dataset containing approximately 2.3 billion tokens.
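
As a rough illustration of what taking a token-budgeted slice of a corpus can look like, here is a minimal sketch in plain Python. It is not the OpenSML data pipeline: the function names `whitespace_tokenize` and `take_token_budget`, the whitespace tokenizer, and the toy corpus are all assumptions made for the example. A real run would use the model's own tokenizer and a budget near 2.3 billion.

```python
from typing import Callable, Iterable, Iterator, List


def whitespace_tokenize(text: str) -> List[str]:
    # Stand-in tokenizer (assumption): the tokenizer OpenSML uses is not stated here.
    return text.split()


def take_token_budget(
    docs: Iterable[str],
    budget: int,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> Iterator[str]:
    """Yield documents in order, stopping before the one that would exceed `budget` tokens."""
    used = 0
    for doc in docs:
        n_tokens = len(tokenize(doc))
        if used + n_tokens > budget:
            break
        used += n_tokens
        yield doc


# Toy corpus (hypothetical); documents have 4, 2, and 5 whitespace tokens.
corpus = ["the quick brown fox", "jumps over", "the lazy dog again today"]
subset = list(take_token_budget(corpus, budget=6))
```

Because the function is a generator, it can be fed a streamed dataset without loading the whole corpus into memory.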



## Bias, Risks, and Limitations

OpenSML is shared to advance open research by granting access to cutting-edge language models. However, because it’s trained on publicly sourced data and released without safety warranties, it may produce content that is inaccurate, harmful, biased, or otherwise objectionable. Users and developers should therefore conduct rigorous safety evaluations and put in place filtering or other safeguards that suit their specific use cases.

## Citation

If you find our work useful, please cite:

```bibtex
@misc{zebrowski2025opensml,
  title={OpenSML: A Family of Small Language Models},
  author={William Zebrowski},
  year={2025},
  howpublished={\url{https://github.com/wzebrowski/opensml}}
}
```