[PHI] implement masked_fill_op and optimize bool setitem indexing #72788

Eddie-Wang1120 · 2025-05-19T08:23:20Z

PR Category

Performance Optimization

PR Types

Improvements

Description

pcard-67164

Kernel Optimization

Masked_Fill forward kernel performance

x	mask	torch (us)	paddle_before (us)	paddle_after (us)	speedup vs. torch	speedup vs. paddle_before
(1024, 1024)	(1, 1)	48.22	102.29	40.96	117.72%	249.73%
(1024, 1024)	(1024, 1)	49.89	101.71	42.83	116.48%	237.47%
(108, 64, 12288)	(1, 1)	1712.8	1918.77	411.67	416.06%	466.09%
(108, 64, 12288)	(108, 1, 1)	1716.5	1968.26	666.03	257.72%	295.52%
(108, 64, 12288)	(108, 64, 1)	1716.96	1968.53	684.87	250.70%	287.43%
(108, 64, 12288)	(108, 64, 12288)	1802.64	1935.82	1039.35	173.44%	186.25%
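
The two speedup columns appear to be the baseline time divided by the paddle_after time, expressed as a percentage, so values above 100% mean the new kernel is faster. The short sketch below recomputes them for the first forward row as a sanity check. The helper name `speedup_percent` is illustrative and not part of the PR.

```python
def speedup_percent(baseline_us: float, optimized_us: float) -> float:
    """Return baseline / optimized as a percentage, rounded to two decimals."""
    return round(baseline_us / optimized_us * 100.0, 2)


# First forward row: x=(1024, 1024), mask=(1, 1), timings in microseconds.
torch_us, paddle_before_us, paddle_after_us = 48.22, 102.29, 40.96

vs_torch = speedup_percent(torch_us, paddle_after_us)
vs_paddle_before = speedup_percent(paddle_before_us, paddle_after_us)
print(vs_torch, vs_paddle_before)  # matches the table: 117.72 249.73
```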

Masked_Fill backward kernel performance

x	mask	torch (us)	paddle_before (us)	paddle_after (us)	speedup vs. torch	speedup vs. paddle_before
(1024, 1024)	(1, 1)	197.34	116.55	122.38	161.25%	95.24%
(1024, 1024)	(1024, 1)	195.34	114.06	113.74	171.74%	100.28%
(108, 64, 12288)	(1, 1)	1897.44	1927.06	1321.87	143.54%	145.78%
(108, 64, 12288)	(108, 1, 1)	1848.15	1709.19	1559.12	118.54%	109.63%
(108, 64, 12288)	(108, 64, 1)	1854.34	1735.66	1587.32	116.82%	109.35%
(108, 64, 12288)	(108, 64, 12288)	1967.94	1928.49	1952.57	100.79%	98.77%
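
For reference, the gradient that a masked_fill backward pass produces for x is simple to state: positions overwritten by the fill value receive zero gradient, and all other positions pass the upstream gradient through. The pure-Python sketch below assumes a 2-D input and a mask that broadcasts along size-1 dimensions, as in the benchmark shapes. The function names are illustrative, and this is not the PR's kernel.

```python
def broadcast_mask_at(mask, i, j):
    """Read mask[i][j], treating size-1 dimensions as broadcast."""
    row = mask[0] if len(mask) == 1 else mask[i]
    return row[0] if len(row) == 1 else row[j]


def masked_fill_grad_x(grad_out, mask):
    """Gradient w.r.t. x: zero where the mask is True, grad_out elsewhere."""
    return [
        [0.0 if broadcast_mask_at(mask, i, j) else g for j, g in enumerate(row)]
        for i, row in enumerate(grad_out)
    ]


grad_out = [[1.0, 2.0], [3.0, 4.0]]
row_mask = [[True], [False]]  # shape (2, 1), broadcast across columns
grad_x = masked_fill_grad_x(grad_out, row_mask)
print(grad_x)  # [[0.0, 0.0], [3.0, 4.0]]
```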

Logical Optimization

SetItem forward performance

index	value	paddle/torch before	paddle/torch after
(108,) True	0.5	6.28	0.5
(108,) random	0.5	-	0.75
(108, 64,) True	0.5	6.33	0.5
(108, 64,) random	0.5	-	0.82
(108, 64, 12288,) True	0.5	2.07	0.68
(108, 64, 12288,) random	0.5	-	1.06

SetItem backward performance

index	value	paddle/torch before	paddle/torch after
(108,) True	0.5	3.48	0.73
(108,) random	0.5	-	0.81
(108, 64,) True	0.5	2.45	0.76
(108, 64,) random	0.5	-	0.75
(108, 64, 12288,) True	0.5	0.85	0.82
(108, 64, 12288,) random	0.5	-	1.02

paddle-bot · 2025-05-19T08:23:24Z

Your PR has been submitted successfully. Thank you for contributing to this open-source project!
Please watch for the results of the automated CI tests; see the Paddle CI Manual for details.

changeyoung98 · 2025-05-19T08:44:42Z

paddle/phi/kernels/cpu/masked_fill_grad_kernel.cc

@@ -0,0 +1,114 @@
+// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.


xiaoguoguo626807 · 2025-05-19T08:47:12Z

python/paddle/tensor/manipulation.py


- mask = paddle.logical_not(mask)
- out = paddle.where_(mask, x, value)
+ out = _C_ops.masked_fill(x, mask, value)


This needs to be masked_fill_ here.
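
The removed lines and the new call are meant to compute the same result: selecting with the negated mask keeps `x` where the mask is False and writes `value` elsewhere, which is what masked_fill does directly. The trailing underscore presumably refers to the in-place variant, in line with the `where_` call being replaced. A pure-Python sketch on flat lists shows both the equivalence and the in-place versus out-of-place distinction. The helpers are illustrative stand-ins, not Paddle APIs.

```python
def where(cond, a, b):
    """Elementwise select: a[i] if cond[i] else b (b is a scalar here)."""
    return [ai if c else b for c, ai in zip(cond, a)]


def masked_fill(x, mask, value):
    """Out-of-place: return a new list with value written where mask is True."""
    return [value if m else xi for xi, m in zip(x, mask)]


def masked_fill_(x, mask, value):
    """In-place: overwrite x itself and return it."""
    for i, m in enumerate(mask):
        if m:
            x[i] = value
    return x


x = [1.0, 2.0, 3.0, 4.0]
mask = [True, False, True, False]

old_path = where([not m for m in mask], x, 0.5)
new_path = masked_fill(x, mask, 0.5)
assert old_path == new_path == [0.5, 2.0, 0.5, 4.0]
assert x == [1.0, 2.0, 3.0, 4.0]  # out-of-place leaves x untouched

result = masked_fill_(x, mask, 0.5)
assert result is x and x == [0.5, 2.0, 0.5, 4.0]  # in-place mutates x
```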

changeyoung98 · 2025-05-19T08:57:44Z

paddle/phi/kernels/funcs/masked_fill_utils.h

+
+namespace phi {
+namespace funcs {
+inline bool CanShortCutMaskFill(const DDim& input_dim, const DDim& mask_dim) {


It would be better to rename this to something that reflects the comparison the function actually performs.

xiaoguoguo626807 · 2025-05-19T10:33:21Z

paddle/fluid/pybind/slice_utils.h

+inline static bool MaskedFillDispatching(const paddle::Tensor& tensor,
+ const paddle::Tensor& value,
+ std::vector<paddle::Tensor>* indices) {
+ if (value.numel() != 1) return false;


Does the masked_fill kernel only support a single-element value?

According to the parameter definition of masked_fill in python/paddle/tensor/manipulation.py:
value (Scalar or 0-D Tensor): The value used to fill the target tensor.
So masked_fill can be used only when value is a single value.
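
The early return in MaskedFillDispatching can be read as a guard: the fast path applies only when the assigned value has exactly one element. A rough Python sketch of that kind of guard follows. Only the `numel() != 1` check comes from the diff; the boolean-index check and every name here are assumptions made for illustration.

```python
from math import prod


def can_dispatch_to_masked_fill(value_shape, index_dtypes):
    """Hypothetical guard for routing x[mask] = value to masked_fill."""
    # From the diff: a value with more than one element is rejected.
    # prod(()) is 1, so a 0-D value counts as a single element.
    if prod(value_shape) != 1:
        return False
    # Assumption (not shown in the diff): exactly one index, and it is boolean.
    return len(index_dtypes) == 1 and index_dtypes[0] == "bool"


print(can_dispatch_to_masked_fill((), ["bool"]))    # 0-D value, bool mask
print(can_dispatch_to_masked_fill((3,), ["bool"]))  # multi-element value
```

Under these assumptions, an assignment with a multi-element value falls back to the general setitem path.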

zyfncg · 2025-05-21T12:30:52Z

paddle/phi/kernels/cpu/masked_fill_grad_kernel.cc

+ auto x_grad_dims = x_grad->dims();
+ auto mask_dims = mask.dims();


These are usually declared as const auto& to avoid the copy overhead.

OK, I'll submit a follow-up PR to change this.

zyfncg · 2025-05-21T12:31:23Z

paddle/phi/kernels/cpu/masked_fill_kernel.cc

+ auto x_dims = x.dims();
+ auto mask_dims = mask.dims();


implement masked_fill_op and optimize bool setitem indexing

5099be6

paddle-bot bot added the contributor External developers label May 19, 2025

changeyoung98 reviewed May 19, 2025

xiaoguoguo626807 reviewed May 19, 2025

changeyoung98 reviewed May 19, 2025

changeyoung98 previously approved these changes May 19, 2025

xiaoguoguo626807 reviewed May 19, 2025

fix install bugs and typos

132c5bf

Eddie-Wang1120 dismissed changeyoung98’s stale review via 132c5bf May 19, 2025 14:25

Eddie-Wang1120 added 5 commits May 19, 2025 15:25

remove unused

e07570e

remove space

87d83e8

fix setitem inplace bug

256a8a6

fix shape error

02476b8

improve coverage

c8c8fa5

Eddie-Wang1120 changed the title ~~implement masked_fill_op and optimize bool setitem indexing~~ [PHI] implement masked_fill_op and optimize bool setitem indexing May 21, 2025

zyfncg approved these changes May 21, 2025

xiaoguoguo626807 merged commit 15e2e47 into PaddlePaddle:develop May 22, 2025
53 of 57 checks passed

HydrogenSulfate mentioned this pull request Jun 9, 2025

[Prim] Support masked_fill_double_grad for eager mode #73140

Merged

lshpku mentioned this pull request Jun 12, 2025

[PHI] Fix masked_fill & masked_fill_grad kernel for big tensor #73284

Merged
