MAMUT-MathBERT (Math Mutator MathBERT)
MAMUT-MathBERT is a pretrained language model based on tbs17/MathBERT, further pretrained on mathematical texts and formulas. It was introduced in MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training.
Although its base model is already math-specific, our additional pretraining improves mathematical understanding even further, as shown in our paper.
Model Details
Overview
MAMUT-MathBERT was pretrained on four math-specific tasks across four datasets.
- Mathematical Formulas (MF): A Masked Language Modeling (MLM) task on math formulas written in LaTeX.
- Mathematical Texts (MT): An MLM task on natural language text containing inline LaTeX math (mathematical texts). The masking probability was biased toward mathematical tokens (inside math environments $...$) and domain-specific terms (e.g., sum, one, ...); see the sketch after this list.
- Named Math Formulas (NMF): A Next-Sentence-Prediction (NSP)-style task: given a formula and the name of a mathematical identity (e.g., Pythagorean Theorem), classify whether they match.
- Math Formula Retrieval (MFR): Another NSP-style task to decide if two formulas describe the same mathematical identity or concept.
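To illustrate the biased masking used in the MT task, the following minimal sketch masks tokens inside $...$ spans and a few math terms with a higher probability than ordinary tokens. The probabilities, the term list, and the plain 100%-[MASK] replacement are simplifications for illustration only; the actual configuration is described in the paper and the pretraining code.

```python
import random
import re
from transformers import AutoTokenizer

# Illustrative term list and probabilities (not the paper's settings)
MATH_TERMS = {"sum", "one", "product", "integral"}
tokenizer = AutoTokenizer.from_pretrained("tbs17/MathBERT")

def biased_mask(text, p_math=0.3, p_other=0.1):
    # Character spans inside $...$ count as mathematical
    math_spans = [m.span() for m in re.finditer(r"\$[^$]*\$", text)]
    enc = tokenizer(text, return_offsets_mapping=True)
    input_ids = enc["input_ids"]
    labels = [-100] * len(input_ids)  # -100 = not masked (ignored by the loss)
    for i, (start, end) in enumerate(enc["offset_mapping"]):
        if start == end:  # special tokens ([CLS], [SEP], ...)
            continue
        token = text[start:end]
        in_math = any(s <= start and end <= e for s, e in math_spans)
        p = p_math if in_math or token.lower() in MATH_TERMS else p_other
        if random.random() < p:
            labels[i] = input_ids[i]
            input_ids[i] = tokenizer.mask_token_id
    return input_ids, labels
```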
Model Sources
- Base Model: tbs17/MathBERT (whose base model is bert-base-cased)
- Pretraining Code: aieng-lab/transformer-math-pretraining
- MAMUT Repository: aieng-lab/math-mutator
- Paper: MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training
Uses
MAMUT-MathBERT is intended for downstream tasks that require improved mathematical understanding, such as:
- Formula classification
- Retrieval of semantically similar formulas
- Math-related question answering
Note: This model was saved without the MLM or NSP heads and requires fine-tuning before use in downstream tasks.
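Since the checkpoint ships without task heads, a typical starting point is to load it with a fresh classification head and fine-tune it, for example on formula pairs as in the MFR task. The snippet below is a minimal sketch; the label count and example formulas are illustrative and not taken from the MAMUT datasets.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "aieng-lab/MathBERT-mamut"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# A new (randomly initialized) classification head is attached here;
# fine-tune on labeled formula pairs before trusting the predictions.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Encode two formulas as a sentence pair (NSP-style input)
inputs = tokenizer("a^2 + b^2 = c^2", "c = \\sqrt{a^2 + b^2}", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
```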
Similarly trained models are MAMUT-BERT (based on bert-base-cased) and MAMUT-MPBERT (based on AnReu/math_structure_bert), the latter being the best of the three models according to our evaluation.
Training Details
Training configurations are described in Appendix C of the MAMUT paper.
Evaluation
The model is evaluated in Section 7 and Appendix C.4 of the MAMUT paper (there referred to as MAMUT-MathBERT).
Environmental Impact
- Hardware Type: 8xA100
- Hours used: 48
- Compute Region: Germany
Citation
BibTeX:
@article{drechsel2025mamut,
  title={{MAMUT}: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training},
  author={Jonathan Drechsel and Anja Reusch and Steffen Herbold},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2025},
  url={https://openreview.net/forum?id=khODmRpQEx}
}