Update README.md
README.md (changed)
@@ -138,7 +138,7 @@ Detailed examination and analysis are pending further development and results fr
 
 ### Model Architecture and Objective
 
-V1 is implemented using PyTorch. The architecture
+V1 is implemented using PyTorch. The architecture includes Transformer-based components but also **custom components designed to achieve state-of-the-art (SOTA) performance**. A key design principle was **parameter efficiency**, and the architecture aims to offer advantages over standard transformers in this regard.
 The training objective was next token prediction (standard language modeling).
 
 Further architectural details (e.g., number of layers, hidden size) are standard for its model class but are not disclosed for this base model release.
@@ -150,4 +150,4 @@ Training was performed on a GPU cluster.
 
 #### Software
 
-- PyTorch,
+- PyTorch, Core Transformers Components, and other standard machine learning libraries were utilized.
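The next-token prediction objective mentioned in the diff is the standard language-modeling loss. A minimal PyTorch sketch of that objective is below; the tensor shapes and random inputs are purely illustrative and do not reflect the actual V1 architecture or training code:

```python
import torch
import torch.nn.functional as F

# Illustrative dimensions only (not the real V1 configuration).
batch, seq_len, vocab_size = 4, 16, 100

# Stand-ins for a model's output logits and the input token ids.
logits = torch.randn(batch, seq_len, vocab_size)
tokens = torch.randint(0, vocab_size, (batch, seq_len))

# Next-token prediction: the logits at position t are scored against
# the token at position t+1, so both tensors are shifted by one step.
pred = logits[:, :-1, :].reshape(-1, vocab_size)  # (batch*(seq_len-1), vocab)
target = tokens[:, 1:].reshape(-1)                # (batch*(seq_len-1),)

loss = F.cross_entropy(pred, target)
```

In real training the logits would come from the model's forward pass and `loss.backward()` would drive the parameter updates; the shifting of predictions against targets is the part that makes this "next token prediction".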