
Conversation

@jakpiase
Contributor

PR types

New features

PR changes

OPs

Describe

Added an exp FP32 FWD/BWD oneDNN kernel and optimized other oneDNN grad kernels by allowing the use of the Out tensor instead of the X tensor in some activation grad kernels. The new version is faster because in some cases the computations become much simpler, e.g. for the exp activation:

forward equation:
out = exp(x)

grad equation:
dx = dout * exp(x)

optimized grad equation (using out instead of x):
dx = dout * out

A simple multiplication is much faster than computing the exponential followed by a multiplication, and the same logic applies to the other kernels that use the "use_dst_for_bwd" versions of oneDNN kernels. The new kernels run up to 10% faster than the old ones.
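The equivalence of the two grad equations can be checked with a minimal NumPy sketch. This is only an illustration of the math, not the actual oneDNN C++ kernel from this PR; all variable names here are illustrative.

```python
import numpy as np

# Illustrative sketch of the optimization: for out = exp(x), the gradient
# dout * exp(x) equals dout * out, so the backward pass can reuse the
# forward output instead of recomputing the exponential.
x = np.random.rand(4).astype(np.float32)
dout = np.random.rand(4).astype(np.float32)

out = np.exp(x)                 # forward: out = exp(x)

dx_from_x = dout * np.exp(x)    # original grad: recomputes exp(x)
dx_from_out = dout * out        # optimized grad: plain multiplication

assert np.allclose(dx_from_x, dx_from_out)
```

Reusing the destination tensor trades a transcendental-function evaluation for a single multiply, which is where the measured speedup comes from.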

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@jakpiase
Contributor Author

jakpiase commented Jan 4, 2022

@Silv3S, @piotrekobiIntel, @wozna please review

@jakpiase jakpiase requested review from Aganlengzi and wozna January 4, 2022 23:45
@ghost left a comment

LGTM

Contributor

@wozna left a comment

Good job. LGTM

@jakpiase
Contributor Author

jakpiase commented Jan 5, 2022

Hi @Aganlengzi, could you please continue your review?

@Silv3S
Member

Silv3S commented Jan 6, 2022

LGTM 👍

Contributor

@Aganlengzi left a comment

LGTM

@Aganlengzi Aganlengzi merged commit 718183f into PaddlePaddle:develop Jan 6, 2022