@AnnaTrainingG (Contributor) commented Feb 17, 2022

PR types

Others

PR changes

OPs

Describe

Modified RandomKernel with Kernel Primitive API

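Background: To raise KP operator coverage and provide operator support for 40+ XPU models, this PR replaces the kernel implementations in gaussian/uniform_random with KP kernel implementations.

PR changes:

  1. [New] A kernel that generates random numbers from the data index: IndexKernel, in paddle/fluid/operators/index_impl.cu.h.
  2. [Code unification] paddle/fluid/operators/uniform_random_inplace_op.cu and paddle/fluid/operators/uniform_random_op.cu duplicated much of their implementation; the duplicated code has been moved into paddle/fluid/operators/uniform_random_op.h.
  3. [New] The InitWithDataIndex API in [primitive/datamover_primitives.h], which places the data index in registers.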
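
For readers unfamiliar with index-driven generation, the idea behind item 1 can be sketched on the host as follows. This is only an illustration and not the code in index_impl.cu.h: `Mix64`, `UniformFromIndex`, and `FillByIndex` are hypothetical names, and the SplitMix64-style mixer stands in for whatever generator the real functors use. The point is that each output element is a pure function of its index and the seed, so results do not depend on how threads are laid out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Paddle code): SplitMix64-style mixer so that
// neighbouring inputs produce unrelated 64-bit outputs.
inline std::uint64_t Mix64(std::uint64_t x) {
  x += 0x9E3779B97F4A7C15ULL;
  x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
  x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
  return x ^ (x >> 31);
}

// Hypothetical functor: maps an element index to a value in [lo, hi),
// up to float rounding for arbitrary bounds.
struct UniformFromIndex {
  std::uint64_t seed;
  float lo;
  float hi;

  float operator()(std::size_t index) const {
    assert(lo < hi);
    std::uint64_t bits = Mix64(seed ^ Mix64(static_cast<std::uint64_t>(index)));
    // Keep the top 24 bits so the fraction in [0, 1) is exact in float.
    float unit = static_cast<float>(bits >> 40) * (1.0f / 16777216.0f);
    return lo + unit * (hi - lo);
  }
};

// Host-side stand-in for an index-driven kernel: out[i] = functor(i).
// On a device, each thread would evaluate the functor for its own indices.
template <typename T, typename Functor>
void FillByIndex(std::vector<T>* out, Functor functor) {
  for (std::size_t i = 0; i < out->size(); ++i) {
    (*out)[i] = static_cast<T>(functor(i));
  }
}
```

Calling `FillByIndex(&buf, UniformFromIndex{42, -1.0f, 1.0f})` twice with the same seed yields identical buffers, which is the property that makes this style easy to unit test.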

Correctness: the unit tests for uniform_random_inplace_op, uniform_random_op, and gaussian_random are already in the develop branch and are covered by the py3 test run.

  1. test_uniform_random_bf16_op
  2. test_gaussian_random_op
  3. test_uniform_random_op
@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@ZzSean (Contributor) left a comment

LGTM for op benchmark

@AnnaTrainingG changed the title from "Modifide RandomKernel with Kernel Primitive API" to "Modified RandomKernel with Kernel Primitive API" on Feb 21, 2022
@zhangting2020 (Contributor)

The PR description should explain the background of the problem and what the PR changes, for example what this PR adds, removes, and modifies.

@limin2021 (Contributor) left a comment

LGTM.

}
}

template <typename T, int NX, int NY, int BlockSize>
Contributor

Should this interface get a comment describing it and its usage scenario? All the other interfaces are documented.

Contributor Author

Yes. Comments and documentation will be added for all of them together in a follow-up.

from paddle.fluid.op import Operator
from paddle.fluid.executor import Executor
from op_test import OpTest
from paddle.fluid.tests.unittests.op_test import OpTest
Contributor

This probably doesn't need to change?

Contributor Author

If this is not changed, the import fails when the module is referenced from another file in a different directory.

int grid = config.block_per_grid.x;
int block = config.thread_per_block.x;
auto stream = dev_ctx.stream();
#endif
Contributor

Would it be better to merge the XPU thread configuration into GetGpuLaunchConfig1D? Then the thread configuration here would not need a branch, and the same applies to obtaining the stream. Worth considering as an optimization.

Contributor Author

Sure, that will be added in a follow-up.
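
To make the suggestion above concrete, here is a rough sketch of what a single, backend-aware configuration helper could look like. It is not Paddle's API: `Backend`, `LaunchConfig1D`, and `GetLaunchConfig1D` are hypothetical names, and every numeric value (block size, grid cap, fixed XPU sizes) is a placeholder chosen for illustration rather than taken from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical backend tag; the real code selects the backend with #ifdef.
enum class Backend { kGpu, kXpu };

// Hypothetical result type holding the 1-D launch shape.
struct LaunchConfig1D {
  int grid;
  int block;
};

// Hypothetical unified helper: the backend branch lives here, so call sites
// request a config once and never branch themselves.
inline LaunchConfig1D GetLaunchConfig1D(Backend backend, std::int64_t numel) {
  assert(numel > 0);
  if (backend == Backend::kXpu) {
    // Placeholder fixed sizes; real values depend on the XPU hardware.
    return {8, 64};
  }
  const int block = 256;                // placeholder GPU block size
  const std::int64_t max_grid = 65535;  // placeholder cap on grid size
  const std::int64_t needed = (numel + block - 1) / block;  // ceil division
  return {static_cast<int>(std::min(needed, max_grid)), block};
}
```

With a helper of this shape, the kernel launch site would read `auto cfg = GetLaunchConfig1D(backend, numel);` on both backends; stream acquisition could be folded in the same way, though that part is not sketched here.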

@AnnaTrainingG merged commit 9f94821 into PaddlePaddle:develop on Feb 22, 2022