[BIG tensor]Fix the infinite loop issue in argsort. #74434

Difers · 2025-08-05T13:22:03Z

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

Pcard-73145

Background:
In the following case, argsort hangs:

    import numpy
    import paddle

    numpy_tensor = (numpy.random.random([2, 1140850690]) - 0.5).astype("float16")
    paddle_x = paddle.to_tensor(numpy_tensor)
    paddle_out = paddle.argsort(paddle_x)

Root cause
[PHI] Fixed argsort big tensor bug #72712 split the sort into multiple cub kernel calls, but when the batch is computed, each call sorts at most 2^30 elements:

    constexpr int64_t max_elements = 1 << 30;
    const int64_t segment_size = num_cols;
    const int64_t element_per_call = std::min(max_elements, total_elements);
    const int64_t batch_size = (element_per_call / segment_size) * segment_size;

When the dimension being sorted is larger than 2^30, (element_per_call / segment_size) is 0, so batch_size is 0, which leads to an infinite loop.
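The arithmetic can be checked without a GPU. Below is a minimal Python sketch that mirrors the C++ snippet above, using the shape from the failing case. The function names are made up for illustration, and the clamp in `batch_size_clamped` is a hypothetical guard, not necessarily the change made in this PR.

```python
MAX_ELEMENTS = 1 << 30  # same cap as max_elements in the C++ snippet


def batch_size_original(num_rows, num_cols):
    """Mirror of the batch computation quoted above (integer division)."""
    total_elements = num_rows * num_cols
    segment_size = num_cols
    element_per_call = min(MAX_ELEMENTS, total_elements)
    return (element_per_call // segment_size) * segment_size


def batch_size_clamped(num_rows, num_cols):
    """Hypothetical guard: always cover at least one whole segment."""
    return max(batch_size_original(num_rows, num_cols), num_cols)


rows, cols = 2, 1140850690  # shape from the failing case
print(batch_size_original(rows, cols))  # 0
print(batch_size_clamped(rows, cols))   # 1140850690
```

With a batch size of 0, any loop that advances its offset by `batch_size` per iteration would make no progress, which is consistent with the hang described above.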

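Additional error check
When the dimension being sorted is larger than 2^31, calling DeviceSegmentedRadixSort::SortPairs still fails. Following torch's handling, an error message is added for this case.
https://github.com/pytorch/pytorch/blob/aeb5321b6360c899808d3461789b3bbd6265756e/aten/src/ATen/native/cuda/Sort.cpp#L63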
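A minimal sketch of this kind of guard, written in Python for illustration only. The helper name `check_sort_dim`, the exact limit (`INT_MAX`, the largest 32-bit signed integer), and the message text are assumptions; the check added in this PR is in the C++ kernel and may differ.

```python
INT_MAX = 2**31 - 1  # assumed limit: largest 32-bit signed integer


def check_sort_dim(num_cols):
    """Raise early instead of launching a sort the backend cannot handle."""
    if num_cols > INT_MAX:
        raise ValueError(
            f"The dimension being sorted has {num_cols} elements, "
            f"which exceeds the supported maximum of {INT_MAX}."
        )


check_sort_dim(1140850690)  # within the assumed limit, no error
```

Failing fast with a clear message is preferable to a hang or a silently wrong result when the input exceeds what the underlying sort supports.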

paddle-bot · 2025-08-05T13:22:18Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

wanghuancoder

LGTM

fix argsort

85da4ca

lshpku approved these changes Aug 6, 2025


wanghuancoder approved these changes Aug 7, 2025


wanghuancoder merged commit 51c12fb into PaddlePaddle:develop Aug 7, 2025
87 of 90 checks passed

Difers mentioned this pull request Aug 8, 2025

[Big tensor]add some argsort case PFCCLab/PaddleAPITest#505

Merged

Enigmatisms pushed a commit to Enigmatisms/Paddle that referenced this pull request Aug 9, 2025

fix argsort (PaddlePaddle#74434)

ddb38a9
