Yong99 committed
Commit 2f40cf7 · verified · 1 Parent(s): a1b1746

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -38,12 +38,12 @@ Not only the mean or quantiles, you can estimate anything about the predictive d
 
 The base version is pre-trained on **1 trillion** time points with **128M** parameters. For more information, please refer to this [paper](https://arxiv.org/pdf/2502.00816).
 
+**Sundial** can be viewed as an **ARMA** model (Auto-Regression and Moving-Average): the Transformer learns auto-regressive token representations, and conditioned on them, TimeFlow transforms random noise into non-deterministic predictions.
+
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64fbe24a2d20ced4e91de38a/B5w-TNPnTBpChexIhsVOp.png)
 
 **Overall Architecture**: The input time series is divided into patch tokens, which are embedded from the original continuous values. The patch embeddings are fed into a decoder-only Transformer, a stable and accelerated variant that learns token representations. The model is optimized using our TimeFlow Loss, a parameterized loss function that models the per-token probability distribution conditioned on the learned representations and generates multiple plausible predictions under the flow-matching framework.
 
-**Sundial** can be viewed as an **ARMA** model (Auto-Regression and Moving-Average): the Transformer learns auto-regressive token representations, and conditioned on them, TimeFlow transforms random noise into non-deterministic predictions.
-
 ## Quickstart
 ```
 pip install transformers==4.40.1  # Use this version and Python 3.10 for stable compatibility
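
The hunk ends right after the install line. A minimal sketch of how such a checkpoint is commonly loaded and sampled is shown below; the repository id `thuml/sundial-base-128m` and the `generate()` arguments (`max_new_tokens`, `num_samples`) are assumptions for illustration, not part of this commit.

```python
# Hypothetical quickstart continuation (not taken from this diff): load a Sundial checkpoint
# via trust_remote_code and draw several plausible forecasts from a context window.
import torch
from transformers import AutoModelForCausalLM

# assumed checkpoint name; substitute the repository this README belongs to
model = AutoModelForCausalLM.from_pretrained("thuml/sundial-base-128m", trust_remote_code=True)

context = torch.randn(1, 2880)  # (batch, lookback) toy input series
# max_new_tokens / num_samples are assumed arguments exposed by the remote generation code
forecast = model.generate(context, max_new_tokens=720, num_samples=20)
print(forecast.shape)  # expected to be (batch, num_samples, horizon)
```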
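The architecture paragraph in the diff says TimeFlow turns random noise into non-deterministic predictions under the flow-matching framework. As a rough, generic illustration of that sampling idea (not Sundial's actual TimeFlow implementation), a conditional flow-matching sampler can be sketched as follows, with `velocity_net` standing in for whatever network predicts the velocity field:

```python
# Generic flow-matching sampling sketch (illustrative only; not Sundial's TimeFlow code).
# Gaussian noise is pushed toward the data distribution by Euler-integrating a learned
# velocity field conditioned on the Transformer's token representation.
import torch

def sample_predictions(velocity_net, cond, pred_len, num_samples=20, num_steps=50):
    """Return `num_samples` plausible forecasts of length `pred_len` conditioned on `cond`."""
    x = torch.randn(num_samples, pred_len)           # start from pure Gaussian noise
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = torch.full((num_samples, 1), step * dt)  # position along the probability path in [0, 1)
        x = x + dt * velocity_net(x, t, cond)        # Euler step along the learned velocity field
    return x                                         # each row is one non-deterministic prediction
```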