Commit 7dcfab9

Update README.md
1 parent f1d1bfc commit 7dcfab9

1 file changed

+2
-0
lines changed

README.md

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,7 @@

Large language models (LLMs) have made impressive strides in reasoning tasks, yet mathematical problem-solving remains a challenge. Traditional "chain-of-thought" reasoning often produces verbose explanations and error-prone arithmetic. **DeepMath** tackles this by combining a small Python executor with a fine-tuned LLM, enabling concise, computation-driven reasoning.

+
## DeepMath

DeepMath is built on **Qwen3-4B Thinking** and fine-tuned with **GRPO (Group Relative Policy Optimization)**. Instead of verbose text, the model emits **tiny Python snippets** for intermediate steps, runs them in a secure sandbox, and folds the results back into its reasoning, reducing errors and output length.
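
The emit, run, and fold-back loop described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `run_snippet` and `fold_result` are hypothetical helpers, the `[python]` / `[result]` markers are an invented format, and a child interpreter with a timeout merely stands in for the secure sandbox (it is not a security boundary and is not DeepMath's actual executor).

```python
import subprocess
import sys


def run_snippet(snippet: str, timeout_s: float = 5.0) -> str:
    """Run a short Python snippet in a child interpreter and return its output.

    Assumption: a separate process with a timeout stands in for the sandbox.
    This is NOT a security boundary.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", snippet],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "error: timed out"
    if proc.returncode != 0:
        # Keep only the last stderr line, which is typically the exception message.
        lines = proc.stderr.strip().splitlines()
        return "error: " + (lines[-1] if lines else "unknown failure")
    return proc.stdout.strip()


def fold_result(reasoning: str, snippet: str) -> str:
    """Append the snippet and its output to the reasoning text (hypothetical format)."""
    return f"{reasoning}\n[python] {snippet}\n[result] {run_snippet(snippet)}"


if __name__ == "__main__":
    print(
        fold_result(
            "Sum of the first 100 squares:",
            "print(sum(k * k for k in range(1, 101)))",
        )
    )
```

Returning errors as text rather than raising lets the caller feed a failed computation back into the reasoning in the same way as a successful one.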
@@ -22,6 +23,7 @@ We evaluate DeepMath on four math datasets: **MATH500, AIME, HMMT, and HLE,** an

- Combining the agent with GRPO yields the largest gains.

+🔗 Blog: https://huggingface.co/blog/intel-deepmath \
👉 Model: <https://huggingface.co/Intel/deepmath-v1>

## Why DeepMath?
