@typhoonzero typhoonzero commented Nov 20, 2017

Fix #5784

  int group_offset_out =
-     output_channels / groups * output_height * output_width;
+     output_channels / groups * output_height * output_width * output_depth;
  int group_offset_filter = filter->numel() / groups;

Do you think it would be simpler to write it this way?

According to http://www.cplusplus.com/reference/vector/vector/erase/:

"Because vectors use an array as their underlying storage, erasing elements in positions other than the vector end causes the container to relocate all the elements after the segment erased to their new positions."

Erasing the first two elements relocates every remaining element to a new position, which is not efficient.
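
To make the trade-off concrete, here is a minimal sketch of two ways to drop the leading elements of a vector. The snippet under discussion is not shown in this thread, so both helper names are hypothetical, and the "shape vector" use case is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: removes the first n elements with erase().
// erase() keeps the existing buffer but shifts every later element forward.
std::vector<int> drop_front_by_erase(std::vector<int> v, std::size_t n) {
  assert(n <= v.size());
  v.erase(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(n));
  return v;
}

// Hypothetical helper: builds a new vector from the tail only,
// so no element is moved after it has been placed.
std::vector<int> drop_front_by_copy(const std::vector<int>& v, std::size_t n) {
  assert(n <= v.size());
  return std::vector<int>(v.begin() + static_cast<std::ptrdiff_t>(n), v.end());
}
```

Both helpers return the same result. For a short shape vector the cost difference is likely negligible, so readability may reasonably decide which form to use.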

  int group_offset_out =
-     output_channels / groups * output_height * output_width;
+     output_channels / groups * output_height * output_width * output_depth;
  int group_offset_filter = filter->numel() / groups;

Group convolution is supported in cuDNN 7.0.
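
For reference, a rough sketch of why the per-group offset needs the extra output_depth factor when groups are sliced by hand. It assumes a contiguous NCDHW output buffer; the OutputShape5D struct and the group_start helper are hypothetical and are not Paddle types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the per-sample output shape (assumed NCDHW).
struct OutputShape5D {
  int64_t channels;
  int64_t depth;
  int64_t height;
  int64_t width;
};

// Number of elements that belong to one group within a single sample.
int64_t group_offset_out(const OutputShape5D& s, int groups) {
  assert(groups > 0 && s.channels % groups == 0);
  return s.channels / groups * s.depth * s.height * s.width;
}

// Index at which group g starts inside one sample's output buffer.
int64_t group_start(const OutputShape5D& s, int groups, int g) {
  assert(g >= 0 && g < groups);
  return g * group_offset_out(s, groups);
}
```

With 8 channels, 2 groups, and a 4 x 5 x 6 volume, each group spans 4 * 4 * 5 * 6 = 480 elements. Leaving out the depth factor would give 120, so the second group would start inside the first group's data.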

cudnnConvolutionDescriptor_t cudnn_conv_desc =
conv_desc.descriptor<T>(paddings, strides, dilations);

#if CUDNN_VERSION > 6000

Please change #if CUDNN_VERSION > 6000 to #if CUDNN_VERSION >= 7000 or #if CUDNN_VERSION_MIN(7, 0, 0).

This place needs to be changed too.
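
A small sketch of why the comparison matters, assuming the usual cudnn.h encoding of CUDNN_VERSION as major * 1000 + minor * 100 + patch. The two functions below are hypothetical stand-ins so the check can be compiled without cuDNN installed; they are not the actual CUDNN_VERSION_MIN macro.

```cpp
#include <cassert>

// Mirrors how cudnn.h is assumed to compose CUDNN_VERSION.
constexpr int EncodeCudnnVersion(int major, int minor, int patch) {
  return major * 1000 + minor * 100 + patch;
}

// Hypothetical stand-in for CUDNN_VERSION_MIN(major, minor, patch); the
// installed version is passed in explicitly instead of read from a macro.
constexpr bool VersionAtLeast(int installed, int major, int minor, int patch) {
  return installed >= EncodeCudnnVersion(major, minor, patch);
}

// A 6.0.x patch release already exceeds 6000, so "> 6000" would enable
// the cuDNN 7 code path on a cuDNN 6 install.
static_assert(EncodeCudnnVersion(6, 0, 21) > 6000, "6.0.21 passes > 6000");
static_assert(!VersionAtLeast(EncodeCudnnVersion(6, 0, 21), 7, 0, 0),
              "6.0.21 is correctly rejected by a >= 7.0.0 check");
```

Under that encoding, >= 7000 and CUDNN_VERSION_MIN(7, 0, 0) should behave the same; the macro form mostly makes the intended minimum version easier to read.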

layout, framework::vectorize2int(output_grad->dims()), groups);
cudnnFilterDescriptor_t cudnn_filter_desc = filter_desc.descriptor<T>(
layout, framework::vectorize2int(filter->dims()), groups);
cudnnTensorDescriptor_t cudnn_input_grad_desc = nullptr;

cudnn_input_grad_desc and cudnn_input_desc are the same, so you can replace cudnn_input_grad_desc with cudnn_input_desc. Just like this.

@chengduoZH chengduoZH left a comment

LGTM++

@typhoonzero typhoonzero merged commit a06bec1 into PaddlePaddle:develop Nov 27, 2017
@typhoonzero typhoonzero deleted the conv_cudnn_3d branch December 22, 2017 05:43
2 participants