Thoth: Mid-Training Bridges LLMs to Time Series Understanding
Introduction
While Large Language Models (LLMs) demonstrate exceptional proficiency in general reasoning, they often fail to capture intricate temporal dependencies. To bridge this gap, Thoth introduces the first family of mid-trained LLMs that transcends the constraints of task-specific Supervised Fine-Tuning (SFT) through a task- and domain-agnostic mid-training stage. By leveraging an automated synthesis pipeline to achieve bidirectional alignment between time-series-to-text and text-to-time-series generation, Thoth equips models with an intrinsic, foundational understanding of temporal dynamics. This internalized comprehension improves performance across a wide range of complex, knowledge-intensive downstream time series reasoning tasks in real-world scenarios.
Thoth-30B-A3B is a full-parameter fine-tuned version of Qwen3-30B-A3B-Instruct-2507. For more details, please see our paper.
Quickstart
pip install transformers==4.57.1
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "thuml/Thoth-30B-A3B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", dtype=torch.bfloat16, trust_remote_code=True).eval()
# A simple time series anomaly detection task
question = """The following data represents the hourly electricity consumption (in kWh) of an office building over a 24-hour period, starting from midnight (00:00).
Data: [12.5, 11.8, 12.1, 11.5, 12.2, 11.9, 15.6, 32.4, 35.1, 34.8, 36.2, 65.5, 37.0, 35.5, 34.2, 33.9, 35.1, 31.8, 18.2, 14.5, 13.1, 12.8, 12.4, 11.9]
Task: 1. Specify the hour (0-23) when the anomaly occurs. 2. Provide a brief reasoning why you consider it an anomaly."""
messages = [
{"role": "system", "content": "You are an expert in time series understanding and reasoning."},
{"role": "user", "content": question}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
# Generate the reasoning output; do_sample=True is required for temperature to take effect
generated_ids = model.generate(**model_inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, excluding the prompt
output_ids = generated_ids[0][model_inputs.input_ids.shape[1]:]
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)
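As a quick sanity check on the Quickstart example, the spike in the sample data can also be recovered with a simple neighbor-deviation heuristic. This is a minimal sketch, not part of Thoth; the `neighbor_anomalies` helper and the 20 kWh threshold are illustrative choices for this particular series:

```python
# Hourly consumption data from the Quickstart prompt above.
data = [12.5, 11.8, 12.1, 11.5, 12.2, 11.9, 15.6, 32.4, 35.1, 34.8, 36.2,
        65.5, 37.0, 35.5, 34.2, 33.9, 35.1, 31.8, 18.2, 14.5, 13.1, 12.8,
        12.4, 11.9]

def neighbor_anomalies(series, threshold=20.0):
    """Return indices whose value differs from the mean of the two
    adjacent points by more than `threshold` (endpoints are skipped)."""
    hits = []
    for i in range(1, len(series) - 1):
        local_mean = (series[i - 1] + series[i + 1]) / 2
        if abs(series[i] - local_mean) > threshold:
            hits.append(i)
    return hits

print(neighbor_anomalies(data))  # → [11], the 65.5 kWh spike at hour 11
```

Comparing each point against its immediate neighbors (rather than the global mean) keeps the day/night level shift from masking the spike; Thoth's answer to the prompt can be checked against this baseline.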
For detailed evaluation, please visit our GitHub repository: https://github.com/thuml/Thoth.
Release Progress
- Thoth-30B-A3B model weights
- Public benchmark evaluation pipeline
- KnoTS benchmark
- KnoTS evaluation code
Citation
If you find our work useful, please cite our paper as:
@article{lin2026thoth,
  title={Thoth: Mid-Training Bridges LLMs to Time Series Understanding},
  author={Lin, Jiafeng and Wang, Yuxuan and Wu, Jialong and Luo, Huakun and Pei, Zhongyi and Wang, Jianmin},
  journal={arXiv preprint arXiv:2603.01042},
  year={2026}
}
Contact
If you have any questions, feel free to contact:
- Jiafeng Lin (lin-jf21@mails.tsinghua.edu.cn)
- Yuxuan Wang (wangyuxu22@mails.tsinghua.edu.cn)
- Jialong Wu (wujialong0229@gmail.com)
Acknowledgment
We sincerely appreciate the following works for their valuable open-source models and evaluation benchmarks: Qwen3, Time-MQA, ChatTime, ChatTS.