
Conversation

@YuanRisheng (Contributor) commented Nov 8, 2021

PR types

Others

PR changes

OPs

Describe

Background

The elementwise series of kernels needs to be migrated to the new Tensor compute library, and elementwise_add is the first kernel of the series to be migrated. This PR mainly migrates the elementwise_add computation logic together with most of the basic components that the elementwise kernels depend on. It touches a fairly large number of files and forms the foundation for the subsequent elementwise migrations.

Main migration steps

1. Migration of the basic elementwise components

[image]

2. Migration of the elementwise_add kernel

[image]

@paddle-bot-old (bot) commented Nov 8, 2021

Thanks for your contribution!
Please wait for the result of CI first. See the Paddle CI Manual for details.

int axis = PackTensorsIntoVector<T>(ctx, &ins, &outs);
LaunchElementwiseCudaKernel<ElementwiseType::kBinary, T, T>(
    cuda_ctx, ins, &outs, axis, AddFunctor<T>());
auto* x = ctx.Input<framework::LoDTensor>("X");
Contributor:

Can this function be removed? We could directly reuse the kernel from paddle/fluid/operators/elementwise/elementwise_add_op.h below, and simply register it for CUDA at registration time.

Contributor Author:

done
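A minimal sketch of the reviewer's suggestion, assuming the generic ElementwiseAddKernel from elementwise_add_op.h and the standard fluid registration macro (the exact type list is illustrative):

// Hypothetical registration: reuse the common ElementwiseAddKernel for CUDA
// instead of keeping a separate CUDA-only implementation.
REGISTER_OP_CUDA_KERNEL(
    elementwise_add,
    ops::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, float>,
    ops::ElementwiseAddKernel<paddle::platform::CUDADeviceContext, double>);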


#include "paddle/fluid/framework/pten_utils.h"

// only can include the headers in paddle/top/api dirs
Contributor:

The directory in this comment needs updating; it should now be paddle/pten/include.

Contributor Author:

done

#include "paddle/fluid/platform/gpu_info.h"
#include "paddle/fluid/platform/transform.h"

// only can include the headers in paddle/top/api dirs
Contributor:

Same as above.

Contributor Author:

done

is_run_common_broadcast);
}

inline int GetElementwiseIndex(const int *x_dims_array, const int max_dim,
Contributor:

Is it really necessary to keep this wrapper layer?

Contributor Author:

done

}
std::vector<const pten::DenseTensor *> pt_inputs;
std::vector<pten::DenseTensor *> pt_outputs;
// *_tmp for cache DenseTensor
Contributor:

Could you add a TODO comment explaining that this is written this way because DenseTensor does not yet support copy construction, and that it will be optimized later?

Contributor Author:

done
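A small sketch of the kind of TODO the reviewer is asking for (the wording and the _tmp vector names are illustrative):

// TODO(YuanRisheng): the *_tmp vectors cache DenseTensor objects only
// because DenseTensor does not support copy construction yet; remove this
// workaround once DenseTensor becomes copyable.
std::vector<pten::DenseTensor> pt_inputs_tmp;
std::vector<pten::DenseTensor> pt_outputs_tmp;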

cc_library(utils_cpu SRCS utils.cc DEPS dense_tensor kernel_context kernel_factory memory convert_utils)
cc_library(manipulation_cpu SRCS manipulation.cc DEPS dense_tensor kernel_context kernel_factory utils_cpu unary)
cc_library(nn_cpu SRCS nn.cc DEPS dense_tensor kernel_context kernel_factory blas eigen_function)
add_subdirectory(funcs)
Contributor:

We could rename the original functions directory to funcs; wouldn't it be better not to create a new funcs subdirectory under every directory?

Contributor Author:

done

nv_library(manipulation_cuda SRCS manipulation.cu DEPS dense_tensor kernel_context kernel_factory utils_cuda unary)
elseif(WITH_ROCM)
nv_library(nn_cuda SRCS nn.cu DEPS dense_tensor kernel_context kernel_factory eigen_function)
elseif(WITH_ROCM)
Contributor:

The indentation here is a bit off.

Contributor Author:

done

int64_t post_;
};

// #if defined(__NVCC__) || defined(__HIPCC__)
Contributor:

What is going on with these commented-out lines?

Contributor Author:

done

namespace general {

using DDim = paddle::framework::DDim;
using CPUContext = paddle::platform::CPUDeviceContext;
Contributor:

This is CPU code; why is it under general?

Contributor Author:

These are methods shared by the CPU and GPU implementations, which is why they are placed under general.
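For illustration, a hedged sketch of the layout being described, with the helper declaration inferred from the call sites later in this diff (the exact signature is an assumption):

namespace pten {
namespace general {

// Shared by both the CPU and CUDA elementwise paths: decomposes the
// broadcast of y against x along `axis` into (pre, n, post) extents.
inline void get_mid_dims(const DDim &x_dims, const DDim &y_dims, int axis,
                         int *pre, int *n, int *post,
                         int *is_run_common_broadcast);

}  // namespace general
}  // namespace pten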


#pragma once

#include "paddle/pten/kernels/cuda/funcs/elementwise/elementwise_broadcast.cu.h"
Contributor:

Same as above. How about adding cpu and cuda directories under functions instead, rather than creating new subdirectories inside the device directories at the kernel layer? This part will still need adjustment later anyway. Also, if these files all live under the cuda directory, can the cu in the file suffix be dropped as well?

Contributor Author:

done

namespace paddle {
namespace experimental {

Tensor elementwise_add(const Tensor& x, const Tensor& y, int axis) {
Contributor:

The API does not seem to have this axis parameter.

Contributor Author:

done
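A minimal sketch of the fixed declaration, assuming the fix simply drops axis from the public API while the internal kernel signature stays unchanged:

namespace paddle {
namespace experimental {

// Hypothetical post-fix declaration: axis is no longer exposed here.
Tensor elementwise_add(const Tensor& x, const Tensor& y);

}  // namespace experimental
}  // namespace paddle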

@MingMingShangTian (Contributor) left a comment:

LGTM

@chenwhql (Contributor) left a comment:

LGTM


int pre, n, post, is_run_common_broadcast;
get_mid_dims(x_dim, y_dim, axis, &pre, &n, &post, &is_run_common_broadcast);
pten::general::get_mid_dims(x_dim, y_dim, axis, &pre, &n, &post,
@chenwhql (Contributor) commented Nov 11, 2021:

I suggest unifying function naming according to the code style, using CamelCase. The naming here was not standardized to begin with; this can be changed in a follow-up PR.
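As a hedged illustration, the call above would become something like this after the suggested rename (GetMidDims is a hypothetical CamelCase name; the actual rename is deferred to a later PR):

pten::general::GetMidDims(x_dim, y_dim, axis, &pre, &n, &post,
                          &is_run_common_broadcast);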

@Avin0323 (Contributor) left a comment:

LGTM for PR-CI-OP-benchmark

@chenwhql chenwhql merged commit c131034 into PaddlePaddle:develop Nov 12, 2021
@YuanRisheng YuanRisheng deleted the elementwise_add_refactor branch November 19, 2021 02:41
