Optimize where_op and abs_grad_op by the elementwise interface #39609


Merged

AnnaTrainingG merged 6 commits into PaddlePaddle:develop from huangxu96:drop_out_kp

Feb 24, 2022

Contributor

huangxu96 commented Feb 16, 2022

PR types

Performance optimization

PR changes

OPs

Describe

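Optimized where_op and abs_grad_op through the elementwise interface. The elementwise interface bundles a set of performance optimization techniques, so it gives a general speedup to any op with elementwise behavior. The per-element computation is rewritten as a functor, and the code that previously looped over the elements now invokes that functor through the elementwise interface.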
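
The rewrite described above can be pictured with a small host-side C++ sketch. It is only an analogy for the pattern, not the PR's CUDA code: `CondFunctorSketch`, `WhereLoop`, and `ElementwiseApplySketch` are hypothetical names introduced here, and the real elementwise interface launches a GPU kernel rather than running a CPU loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the PR's CondFunctor: it only describes what
// happens to one element, with no knowledge of indexing or launching.
template <typename T>
struct CondFunctorSketch {
  T operator()(bool cond, T x, T y) const { return cond ? x : y; }
};

// Before: the op owns its own element loop.
template <typename T>
void WhereLoop(const std::vector<bool>& cond, const std::vector<T>& x,
               const std::vector<T>& y, std::vector<T>* out) {
  out->resize(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    (*out)[i] = cond[i] ? x[i] : y[i];
  }
}

// After: a shared helper (hypothetical) owns the traversal, so any tuning
// applied inside it benefits every op that supplies a functor.
template <typename T, typename Functor>
void ElementwiseApplySketch(const std::vector<bool>& cond,
                            const std::vector<T>& x, const std::vector<T>& y,
                            std::vector<T>* out, Functor functor) {
  out->resize(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    (*out)[i] = functor(cond[i], x[i], y[i]);
  }
}
```

Calling `ElementwiseApplySketch(cond, x, y, &out, CondFunctorSketch<float>())` produces the same values as `WhereLoop(cond, x, y, &out)`; the op-specific part shrinks to the functor.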

Optimize the where_op by the elementwise_op funtion

e4e0e06

paddle-bot-old bot commented Feb 16, 2022

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

huangxu96 force-pushed the drop_out_kp branch from 6564ee3 to f376b92 Compare

February 23, 2022 03:07

huangxu96 changed the title ~~Optimize the where_op by the elementwise_op funtion~~ Optimize where_op and abs_grad_op by the elementwise interface

Modified where_op & abs_grad_op by elementwise interface

0a79619

huangxu96 force-pushed the drop_out_kp branch from f376b92 to 0a79619 Compare

February 23, 2022 03:11

huangxu96 added 2 commits

February 23, 2022 06:49

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into…

e185e63

… drop_out_kp

fix compile error

142bf51

AnnaTrainingG previously approved these changes


AnnaTrainingG reviewed


paddle/phi/kernels/impl/abs_grad_kernel_impl.h Outdated

    #include "paddle/phi/kernels/abs_grad_kernel.h"
    #include "paddle/phi/kernels/funcs/complex_functors.h"
    #if defined(__NVCC__) || defined(__HIPCC__)
    #include "paddle/fluid/operators/elementwise/elementwise_op_impl.cu.h"
 

Contributor

AnnaTrainingG Feb 23, 2022

Files under phi must not include files from the fluid path; please change this, using cast as a reference.

Contributor

AnnaTrainingG Feb 23, 2022

Please describe the work more clearly in the PR description, for example which functors were added and which kernel is called.

Contributor Author

huangxu96 Feb 23, 2022

Done.

AnnaTrainingG reviewed


paddle/fluid/operators/where_op.cu

    auto functor = CondFunctor<T>();
    std::vector<const framework::Tensor*> ins = {condition, X, Y};
    std::vector<framework::Tensor*> outs = {out};
    paddle::operators::LaunchSameDimsElementwiseCudaKernel<T>(dev_ctx, ins,
 

Contributor

AnnaTrainingG Feb 23, 2022

I suggest switching to the phi::funcs style of call.

Contributor Author

huangxu96 Feb 23, 2022

Done.

Remove inclued path with fluid

734a73e

huangxu96 dismissed AnnaTrainingG’s stale review via 734a73e

February 23, 2022 08:42

fix ROCM compile bug

a5bfdf0

AnnaTrainingG approved these changes


Liu-xiandong reviewed


paddle/fluid/operators/where_op.cu

    T><<<config.block_per_grid.x, config.thread_per_block.x, 0, stream>>>(
    numel, cond_data, x_data, y_data, out_data);
    auto functor = CondFunctor<T>();
    std::vector<const framework::Tensor*> ins = {condition, X, Y};
 

Member

Liu-xiandong Feb 24, 2022

The related framework Tensor usages can be changed to DenseTensor later.

Liu-xiandong approved these changes


xingfeng01 approved these changes


paddle/fluid/operators/where_op.cu

      
    namespace paddle {  
    namespace operators {

Contributor

xingfeng01 Feb 24, 2022

In the next PR, add a description of what each function does.

AnnaTrainingG merged commit c969955 into PaddlePaddle:develop

Xreki reviewed


paddle/fluid/operators/where_op.cu

    };

    template <typename T>
    __global__ void WhereCUDAKernel(const int N, const bool* cond, const T* x,

Contributor

Xreki Feb 24, 2022

Can this function be removed now?

paddle/fluid/operators/where_op.cu

      
    template <typename T>
    struct CondFunctor {
      HOSTDEVICE inline CondFunctor() {}

Contributor

Xreki Feb 24, 2022

The default constructor does not need to be written explicitly.
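
A minimal host-side C++ sketch of this point, under the assumption that the functor has no data members: `CondFunctorNoCtor` and `CondFunctorWithCtor` are hypothetical names, and the `HOSTDEVICE` macro from the PR is left out so the example compiles without CUDA. The compiler-generated default constructor is enough for `CondFunctorNoCtor<T>()` to work, and it also keeps the type trivially default constructible, which the user-written empty constructor does not.

```cpp
#include <type_traits>

// Hypothetical functor with no user-written constructor.
template <typename T>
struct CondFunctorNoCtor {
  T operator()(bool cond, T x, T y) const { return cond ? x : y; }
};

// Hypothetical functor with an explicit empty constructor, mirroring the
// shape of the line under review.
template <typename T>
struct CondFunctorWithCtor {
  CondFunctorWithCtor() {}
  T operator()(bool cond, T x, T y) const { return cond ? x : y; }
};

// Both can be default-constructed...
static_assert(std::is_default_constructible<CondFunctorNoCtor<float>>::value,
              "implicit default constructor is available");
static_assert(std::is_default_constructible<CondFunctorWithCtor<float>>::value,
              "explicit default constructor is available");

// ...but only the implicit one is trivial.
static_assert(
    std::is_trivially_default_constructible<CondFunctorNoCtor<float>>::value,
    "implicit default constructor is trivial");
static_assert(
    !std::is_trivially_default_constructible<CondFunctorWithCtor<float>>::value,
    "a user-provided constructor is never trivial");
```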

paddle/phi/kernels/funcs/complex_functors.h

      
    template <typename T>
    struct AbsGradCUDAFunctor {
      HOSTDEVICE inline AbsGradCUDAFunctor() {}

Contributor

Xreki Feb 24, 2022

The default constructor does not need to be defined explicitly.

paddle/phi/kernels/funcs/complex_functors.h

    };

    template <>
    struct AbsGradCUDAFunctor<phi::dtype::complex<float>> {

Contributor

Xreki Feb 24, 2022

The functor definition can be simplified; for reference, see:

Paddle/paddle/phi/kernels/gpu/abs_kernel.cu

Lines 29 to 34 in bbe441f

    template <typename T>
    struct CudaAbsFunctor<T, phi::funcs::Complex<T, phi::funcs::Real<T>>> {
      __device__ __forceinline__ phi::funcs::Real<T> operator()(const T x) const {
        return abs(x);
      }
    };
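
One way to read this suggestion is to replace a full specialization per complex type with a single specialization that covers all of them. The host-side C++ sketch below illustrates that idea only. It makes several assumptions: `std::complex` stands in for `phi::dtype::complex`, `AbsGradFunctorSketch` is a hypothetical name, a plain partial specialization is used instead of the trait-based form in the referenced snippet, and the gradient is taken to be `dout * x / |x|` with 0 at `x == 0`.

```cpp
#include <cmath>
#include <complex>

// Primary template for real types: the gradient of |x| is sign(x),
// taken as 0 at x == 0 (an assumption of this sketch).
template <typename T>
struct AbsGradFunctorSketch {
  T operator()(T x, T dout) const {
    if (x == T(0)) return T(0);
    return x > T(0) ? dout : -dout;
  }
};

// A single partial specialization covers every std::complex<U>, so
// complex<float> and complex<double> need no separate definitions.
template <typename U>
struct AbsGradFunctorSketch<std::complex<U>> {
  std::complex<U> operator()(std::complex<U> x, U dout) const {
    const U mag = std::abs(x);
    if (mag == U(0)) return std::complex<U>(0, 0);
    return dout * (x / mag);
  }
};
```

With this shape, `AbsGradFunctorSketch<std::complex<float>>` and `AbsGradFunctorSketch<std::complex<double>>` both resolve to the same specialization body.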
 
