Empowering forecasts with precision and efficiency.
- Overview
- Why LTSM-bundle
- Features
- Installation
- Project Structure
- Datasets and Prompts
- Model Access
- Cite This Work
- License
- Acknowledgments
This work investigates the transition from traditional Time Series Forecasting (TSF) to Large Time Series Models (LTSMs), leveraging large transformer-based models like GPT. Training LTSMs on diverse time series data introduces challenges due to varying frequencies, dimensions, and patterns.
We explore multiple design choices, including pre-processing strategies, tokenization, model architectures, and dataset setups. We introduce:
- Time Series Prompt: A statistical prompting strategy (see the sketch below)
- LTSM-bundle: A toolkit encapsulating effective design practices
The project is developed by the Data Lab at Rice University.
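To build intuition for the Time Series Prompt, the idea is to summarize the input series with global statistics and feed that summary to the model alongside the raw values. The snippet below is a minimal illustrative sketch of that idea, not the toolkit's implementation (see `ltsm/prompt_reader` for the real prompt generation); the particular statistics and the tensor layout are assumptions.

```python
import numpy as np
import torch

def make_statistical_prompt(series: np.ndarray) -> torch.Tensor:
    """Summarize a 1-D series with simple global statistics (illustrative only)."""
    stats = np.array([
        series.mean(),           # central tendency
        series.std(),            # spread
        series.min(),            # lower bound
        series.max(),            # upper bound
        np.median(series),       # robust center
        series[-1] - series[0],  # overall trend across the window
    ], dtype=np.float32)
    return torch.from_numpy(stats)

# Example: a noisy sine wave as the historical window
history = np.sin(np.linspace(0, 10, 336)) + 0.1 * np.random.randn(336)
prompt = make_statistical_prompt(history)
print(prompt.shape)  # torch.Size([6]) -- prepended to the series before modeling
```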
The LTSM-bundle leverages HuggingFace transformers, allowing flexible integration of large-scale pre-trained language models for time series tasks. Users can customize the pipeline to fit specific forecasting needs with minimal overhead, making it adaptable across various domains and industries.
Key highlights:
- Plug-and-play with GPT-style backbones
- Modular pipeline for easy experimentation (see the sketch after this list)
- Support for statistical and text prompts
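As a rough sketch of what the modular pipeline looks like in practice, the loop below sweeps two forecasting horizons by swapping the config passed to the pipeline. It reuses only names that appear in the quick-start example further down (`LTSMConfig`, `StatisticalTrainingPipeline`, `seed_all`); treat any other configuration details as assumptions.

```python
from ltsm.data_pipeline import StatisticalTrainingPipeline, seed_all
from ltsm.models.base_config import LTSMConfig

# Sweep two prediction horizons with the same pipeline; other model configs
# (e.g., DLinear or Informer variants) can be dropped in the same way.
for pred_len in (96, 192):
    config = LTSMConfig(seq_len=336, pred_len=pred_len)
    seed_all(config.train_params["seed"])  # reproducibility, as in the quick start
    StatisticalTrainingPipeline(config).run()
```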
| Category | Highlights |
|---|---|
| Architecture | Modular design, GPT-style transformers for time series |
| Prompting | Time Series Prompt & Text Prompt support |
| Performance | GPU acceleration, optimized pipelines |
| Integrations | LoRA support, JSON/CSV-based dataset and prompt interfaces |
| Testing | Unit and integration tests, GitHub Actions CI |
| Data | Built-in data loaders, scalers, and tokenizers |
| Documentation | Available in English and Chinese |
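Regarding the CSV-based dataset interface mentioned above: the readers in `ltsm/data_reader` consume tabular files, and a wide layout (one timestamp column plus one column per variable) is a common convention for the benchmark datasets. The exact schema the readers expect is an assumption here; the snippet only shows what such a file might look like.

```python
import pandas as pd

# Hypothetical wide-format CSV: a timestamp column plus one column per variable.
# Check ltsm/data_reader for the formats the bundled readers actually accept.
df = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=4, freq="h"),
    "T (degC)": [1.2, 1.4, 1.1, 0.9],
    "rain (mm)": [0.0, 0.0, 0.3, 0.1],
})
df.to_csv("my_dataset.csv", index=False)
```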
We recommend using Conda:
```bash
conda create -n ltsm python=3.8.0
conda activate ltsm
```

Then install the package:
```bash
git clone https://github.com/datamllab/ltsm.git
cd ltsm
pip install -e .
pip install -r requirements.txt
```

To train and evaluate a model with the bundled pipeline:

```python
from ltsm.data_pipeline import StatisticalTrainingPipeline, get_args, seed_all
from ltsm.models.base_config import LTSMConfig
from ltsm.common.base_training_pipeline import TrainingConfig

# Option 1: Load config via command-line arguments
config = get_args()

# Option 2: Load config from a JSON file
config = TrainingConfig.load("example.json")

# Option 3: Manually customize a supported model config in code
# (e.g., LTSMConfig, DLinearConfig, InformerConfig, etc.)
config = LTSMConfig(seq_len=336, pred_len=96)

# Set random seeds for reproducibility
seed = config.train_params["seed"]
seed_all(seed)

# Initialize the training pipeline with the loaded config
pipeline = StatisticalTrainingPipeline(config)

# Run the training and evaluation process
pipeline.run()
```

To load a trained LTSM model from Hugging Face and run inference:

```python
import os
import torch
import pandas as pd
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
from ltsm.models import LTSMConfig, ltsm_model

# Download model config and weights from Hugging Face
config_path = hf_hub_download("LSC2204/LTSM-bundle", "config.json")
weights_path = hf_hub_download("LSC2204/LTSM-bundle", "model.safetensors")

# Load model and weights
model_config = LTSMConfig()
model_config.load(config_path)
model = ltsm_model.LTSM(model_config)
state_dict = load_file(weights_path)
model.load_state_dict(state_dict)

# Move model to device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()

# Load your dataset (e.g., weather)
df_weather = pd.read_csv("/path/to/dataset.csv")
print("Loaded data shape:", df_weather.shape)

# Load prompts per feature
feature_prompts = {}
prompt_dir = "/path/to/prompts/"
for feature, filename in {
    "T (degC)": "weather_T (degC)_prompt.pth.tar",
    "rain (mm)": "weather_rain (mm)_prompt.pth.tar",
}.items():
    prompt_tensor = torch.load(os.path.join(prompt_dir, filename))
    feature_prompts[feature] = prompt_tensor.squeeze(0).float().to(device)

# Predict (custom code here depending on your model usage)
# For example:
with torch.no_grad():
    inputs = feature_prompts["T (degC)"].unsqueeze(0)
    preds = model(inputs)
print("Prediction output shape:", preds.shape)
```
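The prediction step above is deliberately left open ("custom code here"); how the historical window and the prompt are combined depends on how the model was trained. Below is one plausible sketch that continues from the snippet above, building a single input window from the weather dataframe and prepending the statistical prompt. The window length and the prompt-prepending layout are assumptions, not the model's confirmed input contract.

```python
import torch

# Continues from the snippet above (model, device, df_weather, feature_prompts)
seq_len = 336  # assumed historical window length; match your training config

# Most recent seq_len observations of one feature as the historical window
history = torch.tensor(
    df_weather["T (degC)"].values[-seq_len:], dtype=torch.float32, device=device
)

# One plausible layout: statistical prompt tokens prepended to the raw window
model_input = torch.cat([feature_prompts["T (degC)"], history]).unsqueeze(0)

with torch.no_grad():
    forecast = model(model_input)
print("Forecast shape:", forecast.shape)
```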
The repository is organized as follows:

```
ltsm-package/
├── datasets
│   └── README.md
├── imgs
│   ├── ltsm_model.png
│   ├── prompt_csv_tsne.png
│   └── stat_prompt.png
├── ltsm
│   ├── common               # Base classes
│   ├── data_pipeline        # Model lifecycle management and training pipeline
│   ├── data_provider        # Dataset construction
│   ├── data_reader          # Read input data from various formats (CSV, JSON, etc.)
│   ├── evaluate_pipeline    # Evaluation workflow for model performance
│   ├── layers               # Custom neural network components
│   ├── models               # Implementations: LTSM, DLinear, Informer, PatchTST
│   ├── prompt_reader        # Prompt generation and formatting
│   ├── sk_interface         # Scikit-learn style interface
│   └── utils                # Shared helper functions
├── multi_agents_pipeline    # Multi-agent time series reasoning framework
│   ├── Readme.md
│   ├── agents               # Agent definitions: Planner, QA, TS, Reward
│   ├── llm-server.py        # Local LLM server interface
│   ├── ltsm_inference.py    # Inference script using LTSM pipeline
│   ├── main.py              # Pipeline entry point
│   └── model_config.yaml    # Configuration file for models and agents
├── requirements.txt
├── setup.py
├── tests                    # Unit tests for LTSM modules
│   ├── common
│   ├── data_pipeline
│   ├── data_provider
│   ├── data_reader
│   ├── evaluate_pipeline
│   ├── models
│   └── test_scripts
└── tutorial
    └── README.md
```
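The `sk_interface` package above exposes forecasters through a scikit-learn style fit/predict workflow. The toy class below only illustrates that calling pattern in self-contained code; the real estimator names and signatures live in `ltsm/sk_interface` and are not reproduced here.

```python
import numpy as np

class NaiveForecaster:
    """Toy stand-in showing the sklearn-style fit/predict pattern (not an LTSM class)."""

    def __init__(self, pred_len: int = 96):
        self.pred_len = pred_len
        self.last_value_ = None

    def fit(self, y: np.ndarray) -> "NaiveForecaster":
        self.last_value_ = float(y[-1])  # remember the most recent observation
        return self

    def predict(self) -> np.ndarray:
        # Naive forecast: repeat the last observed value pred_len steps ahead
        return np.full(self.pred_len, self.last_value_)

y_train = np.sin(np.linspace(0, 10, 336))
print(NaiveForecaster(pred_len=96).fit(y_train).predict().shape)  # (96,)
```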
Download datasets:

```bash
cd datasets
# Google Drive link: https://drive.google.com/drive/folders/1hLFbz0FRxdiDCzgFYtKCOPJYSBVvwW9P
```

Download time series prompts:
```bash
cd prompt_bank/prompt_data_csv
# Same Google Drive link applies
```

You can find our trained LTSM models on Hugging Face:
➡️ https://huggingface.co/LSC2204/LTSM-bundle
If you find this work useful, please cite:
```bibtex
@misc{chuang2025ltsmbundletoolboxbenchmarklarge,
  title={LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting},
  author={Yu-Neng Chuang and Songchen Li and Jiayi Yuan and Guanchu Wang and Kwei-Herng Lai and Songyuan Sui and Leisheng Yu and Sirui Ding and Chia-Yuan Chang and Qiaoyu Tan and Daochen Zha and Xia Hu},
  year={2025},
  eprint={2406.14045},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2406.14045},
}
```

This project is licensed under the MIT License. See the LICENSE file for details.
We thank all contributors and collaborators involved in the LTSM project. Special thanks to the Data Lab at Rice University and the open-source community for enabling fast prototyping and reproducible research.
