@cangtianhuang cangtianhuang commented Jul 1, 2025

PR Category

Operator Mechanism

PR Types

Bug fixes

Description

Bug Fixes:

  1. Add a typename IndexType template parameter to distinguish index types
  2. Allocate shared_mem dynamically instead of hardcoding its size
  3. Set grid_dim and block_dim so that they do not exceed device limits
  4. Resolve the write race condition on foundKValue by adopting CudaAtomicMin
  5. Fix int64_t handling in grad_kernel
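
For readers unfamiliar with items 3 and 4, here is a minimal host-side C++ sketch of the two ideas. It is not the code from this PR: `std::atomic` stands in for the device-side `CudaAtomicMin`, the helper names (`clamp_launch_dim`, `atomic_min`, `resolve_last_writer`, `resolve_atomic_min`) are hypothetical, and it assumes the shared slot holds an index, which the description above does not state.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel meaning "no thread has written yet" (hypothetical).
constexpr int64_t kUnset = std::numeric_limits<int64_t>::max();

// Clamp a requested launch dimension into [1, limit]. The limit is an
// assumption here; on a real device it would be queried at runtime.
inline int64_t clamp_launch_dim(int64_t requested, int64_t limit) {
  assert(limit > 0);
  return std::max<int64_t>(1, std::min(requested, limit));
}

// Host-side stand-in for an atomic min: keep the smallest value offered.
inline void atomic_min(std::atomic<int64_t>& slot, int64_t candidate) {
  int64_t current = slot.load();
  // On failure compare_exchange_weak reloads `current`, so the loop ends
  // once our value is stored or a smaller value is already in the slot.
  while (candidate < current &&
         !slot.compare_exchange_weak(current, candidate)) {
  }
}

// Models the race: plain stores mean the last writer wins, so the result
// depends on the order in which candidates arrive.
inline int64_t resolve_last_writer(const std::vector<int64_t>& arrivals) {
  int64_t slot = kUnset;
  for (int64_t c : arrivals) slot = c;
  return slot;
}

// Models the fix: with a min reduction the result is the same for every
// arrival order.
inline int64_t resolve_atomic_min(const std::vector<int64_t>& arrivals) {
  std::atomic<int64_t> slot(kUnset);
  for (int64_t c : arrivals) atomic_min(slot, c);
  return slot.load();
}
```

With arrivals `{7, 3, 5}` and `{5, 7, 3}`, `resolve_last_writer` returns different results while `resolve_atomic_min` returns 3 in both cases, which is the kind of determinism an atomic min provides.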

Tests:

The accuracy diff is caused by multiple identical kth values, which can produce different index positions.

After comparing only the first returned value, all accuracy tests pass.
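
To illustrate the tie case, here is a small C++ sketch of a tolerant check. It is not the test harness used for this PR: `kth_smallest` and `is_valid_kth_result` are hypothetical helpers, and `k` is assumed to be 1-based. The check accepts any index whose element equals the kth value, so two implementations that break ties differently both pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Reference kth smallest value; k is 1-based (assumption).
// The vector is taken by value because nth_element reorders it.
inline float kth_smallest(std::vector<float> data, int64_t k) {
  assert(k >= 1 && k <= static_cast<int64_t>(data.size()));
  std::nth_element(data.begin(), data.begin() + (k - 1), data.end());
  return data[k - 1];
}

// A (value, index) pair is accepted when the value is the kth smallest and
// the index points at an element holding that value. With duplicates,
// several indices satisfy this, so exact index equality is too strict.
inline bool is_valid_kth_result(const std::vector<float>& data, int64_t k,
                                float value, int64_t index) {
  if (index < 0 || index >= static_cast<int64_t>(data.size())) return false;
  return value == kth_smallest(data, k) && data[index] == value;
}
```

For `{3, 1, 2, 2, 5}` with `k = 2`, the kth value is 2 and both index 2 and index 3 are valid answers, while index 0 is not.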

TODO:

  1. Support an int64_t k parameter in the API signature
  2. Replace shared_mem in RadixTopK with dynamic allocation

Pcard-85711

paddle-bot bot commented Jul 1, 2025

Your PR has been submitted. Thank you for contributing to this open-source project!
Please wait for the results of the automated CI tests. See the Paddle CI Manual for details.

@wanghuancoder wanghuancoder left a comment

LGTM

@lshpku lshpku merged commit 60d3a2f into PaddlePaddle:develop Jul 2, 2025
55 checks passed