Performance improvement suggestions for ConvNets #6321

Some general feedback from NVIDIA on profiling the fluid ConvNet (#6179):

cuDNN convolution is not used (I am not sure whether this is intended). The cuDNN convolution operator is not exposed in fluid; see #6089.
When profiling, we normally exclude the first minibatch, or the first several minibatches, from the benchmark result, because they are slow due to memory allocation and algorithm tuning. Doing the same here makes it easier to compare our results with other frameworks and see how well we are doing.
Data pipeline: part of it does not run in parallel with the GPU. It is also slow, and it becomes the bottleneck once GPU performance is reasonable.
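
The warm-up point above can be sketched as a small timing helper. This is only an illustration, not the benchmark script used in the issue: `measure_throughput`, `run_minibatch`, and the `clock` parameter are hypothetical names, and the defaults of 50 minibatches with 10 warm-up minibatches are assumptions.

```python
import time


def measure_throughput(run_minibatch, batch_size, num_batches=50, warmup=10,
                       clock=time.perf_counter):
    """Return images/sec measured only over the minibatches after warm-up.

    run_minibatch(i) is a hypothetical callable that runs one training step.
    On a GPU it should block until the step has finished (for example by
    fetching the loss to the host), otherwise only launch time is measured.
    """
    if not 0 <= warmup < num_batches:
        raise ValueError("warmup must be in [0, num_batches)")
    start = None
    for i in range(num_batches):
        if i == warmup:
            # Start timing only after allocation and algorithm tuning are done.
            start = clock()
        run_minibatch(i)
    elapsed = clock() - start
    return (num_batches - warmup) * batch_size / elapsed
```

Passing the clock as a parameter keeps the helper easy to test; in a real run the default `time.perf_counter` is enough.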

After changing the three things above (using cuDNN, feeding fake numpy data, and measuring speed only over minibatches 10-50), performance on the Titan Xp increased from 53 img/sec to ~108 img/sec.
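
Feeding fake numpy data takes the data pipeline out of the measurement. A minimal sketch of generating such a batch follows; the function name `make_fake_batch`, the image shape, the number of classes, and the dtypes are all assumptions, since the issue does not state the model's input format.

```python
import numpy as np


def make_fake_batch(batch_size, image_shape=(3, 224, 224), num_classes=1000, seed=0):
    """Build one random (images, labels) batch held entirely in host memory.

    The shapes and dtypes here are illustrative assumptions, not values
    taken from the benchmark in the issue.
    """
    rng = np.random.RandomState(seed)
    images = rng.uniform(-1.0, 1.0, size=(batch_size,) + tuple(image_shape))
    labels = rng.randint(0, num_classes, size=(batch_size, 1))
    return images.astype("float32"), labels.astype("int64")
```

Creating the batch once, before the timed loop, and reusing it for every minibatch keeps data generation itself out of the timing.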

Another bug was also caught in #6320.

After changing all four things above, we got a further ~40% speedup, to 150 img/sec, on my Titan.
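
As a quick check of the figures quoted above, the ratios can be computed directly. The throughput values come from the text (the 108 figure is approximate there); the helper name `speedup` is just for illustration.

```python
def speedup(before, after):
    """Ratio of new throughput to old throughput."""
    return after / before


baseline, after_three, after_four = 53.0, 108.0, 150.0  # img/sec, from the text

print(f"first three changes: {speedup(baseline, after_three):.2f}x")    # 2.04x
print(f"fourth change:       {speedup(after_three, after_four):.2f}x")  # 1.39x
print(f"overall:             {speedup(baseline, after_four):.2f}x")     # 2.83x
```

The 1.39x ratio for the last step matches the ~40% quoted.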
