Skip to content

UbiquantAI/URM

Repository files navigation

Universal Reasoning Model

paper

Universal transformers (UTs) have been widely used for complex reasoning tasks such as ARC-AGI and Sudoku, yet the specific sources of their performance gains remain underexplored. In this work, we systematically analyze UTs variants and show that improvements on ARC-AGI primarily arise from the recurrent inductive bias and strong nonlinear components of Transformer, rather than from elaborate architectural designs. Motivated by this finding, we propose the Universal Reasoning Model (URM), which enhances the UT with short convolution and truncated backpropagation. Our approach substantially improves reasoning performance, achieving state-of-the-art 53.8% pass@1 on ARC-AGI 1 and 16.0% pass@1 on ARC-AGI 2.

⚠️ For Question Regarding Sudoku Score

The reported score of 87.4% in the TRM paper is obtained using an MLP model, which we believe it is completely different from the TRM architecture in ARC-AGI task. Therefore, for fair comparison, when reproducing the results, we unified the architectures for ARC-AGI 1, ARC-AGI 2, and Sudoku to be exactly the same, which means the architecture used to reproduce Sudoku is the same TRM architecture used to run ARC-AGI.

Reproducing the correct TRM Sudoku score:

git clone https://github.com/SamsungSAILMontreal/TinyRecursiveModels cd TinyRecursiveModels python dataset/build_sudoku_dataset.py --output-dir data/sudoku-extreme-1k-aug-1000 --subsample-size 1000 --num-aug 1000 run_name="pretrain_att_sudoku" python pretrain.py \ arch=trm \ data_paths="[data/sudoku-extreme-1k-aug-1000]" \ evaluators="[]" \ epochs=50000 eval_interval=5000 \ lr=1e-4 puzzle_emb_lr=1e-4 weight_decay=1.0 puzzle_emb_weight_decay=1.0 \ arch.L_layers=2 \ arch.H_cycles=3 arch.L_cycles=6 \ +run_name=${run_name} ema=True

Results:

image

Installation

pip install -r requirements.txt

Login Wandb

wandb login YOUR_API_KEY

Preparing Data

# ARC-AGI-1 python -m data.build_arc_dataset \ --input-file-prefix kaggle/combined/arc-agi \ --output-dir data/arc1concept-aug-1000 \ --subsets training evaluation concept \ --test-set-name evaluation # ARC-AGI-2 python -m data.build_arc_dataset \ --input-file-prefix kaggle/combined/arc-agi \ --output-dir data/arc2concept-aug-1000 \ --subsets training2 evaluation2 concept \ --test-set-name evaluation2 # Sudoku python data/build_sudoku_dataset.py --output-dir data/sudoku-extreme-1k-aug-1000 --subsample-size 1000 --num-aug 1000

Reproducing ARC-AGI 1 Score

bash scripts/URM_arcagi1.sh

Reproducing ARC-AGI 2 Score

bash scripts/URM_arcagi2.sh

Reproducing Sudoku Score

bash scripts/URM_sudoku.sh

Citation

@misc{gao2025universalreasoningmodel, title={Universal Reasoning Model}, author={Zitian Gao and Lynx Chen and Yihao Xiao and He Xing and Ran Tao and Haoming Luo and Joey Zhou and Bryan Dai}, year={2025}, eprint={2512.14693}, archivePrefix={arXiv}, primaryClass={cs.AI}, url={https://arxiv.org/abs/2512.14693}, } 

About

Universal Reasoning Model

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published