@panyx0718 panyx0718 commented Apr 1, 2018

SE-ResNeXt on 4 Titan X devices goes from ~1.25 to ~0.98.

Some changes in the PR are due to pre-commit cpplint

The main issue is that the nccl_all_reduce stream cannot overlap with the computation streams.
Some slow nccl_all_reduce calls block other GPUs' streams.
[Image: slow_nccl]
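
To make the stall concrete, here is a toy timeline model. It is not code from this PR: the function names, the two-GPU setup, and all timings are made up for illustration, and it assumes an all-reduce for a layer can only start once every GPU has finished that layer. It compares a step where the all-reduce holds up the compute stream with one where communication runs on its own stream and overlaps with compute.

```python
from typing import List


def serialized_step_time(compute: List[List[float]], allreduce: List[float]) -> float:
    """All-reduce cannot overlap with compute: each layer's all-reduce must
    finish on every GPU before any GPU starts the next layer."""
    clocks = [0.0] * len(compute)
    for layer, comm_time in enumerate(allreduce):
        # Every GPU runs its own compute for this layer.
        clocks = [t + gpu[layer] for t, gpu in zip(clocks, compute)]
        # The collective starts once the slowest GPU arrives,
        # and all GPUs are held until it completes.
        finish = max(clocks) + comm_time
        clocks = [finish] * len(clocks)
    return max(clocks)


def overlapped_step_time(compute: List[List[float]], allreduce: List[float]) -> float:
    """All-reduce runs on a separate stream: compute keeps going while
    earlier gradients are still being reduced."""
    compute_clocks = [0.0] * len(compute)
    comm_free_at = 0.0  # time at which the communication stream is idle again
    for layer, comm_time in enumerate(allreduce):
        compute_clocks = [t + gpu[layer] for t, gpu in zip(compute_clocks, compute)]
        # Wait for the slowest GPU and for the previous all-reduce to finish.
        start = max(max(compute_clocks), comm_free_at)
        comm_free_at = start + comm_time
    return max(max(compute_clocks), comm_free_at)


# Hypothetical timings: 2 GPUs, 3 layers; GPU 1 is slow on the middle layer.
COMPUTE = [[1.0, 1.0, 1.0], [1.0, 2.0, 1.0]]
ALLREDUCE = [0.5, 0.5, 0.5]

print(serialized_step_time(COMPUTE, ALLREDUCE))  # 5.5
print(overlapped_step_time(COMPUTE, ALLREDUCE))  # 4.5
```

In this sketch the serialized step pays for every all-reduce on the critical path, while the overlapped step only pays for the last one.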

@panyx0718 panyx0718 force-pushed the group_nccl_all_reduce branch 2 times, most recently from 37257cf to 836f069 April 1, 2018 09:44
@panyx0718 panyx0718 requested review from chengduoZH and reyoung and removed request for reyoung April 1, 2018 10:12
@panyx0718 panyx0718 force-pushed the group_nccl_all_reduce branch 2 times, most recently from 114a6b2 to 4a76d1d April 2, 2018 08:10
chengduoZH previously approved these changes Apr 2, 2018

@chengduoZH chengduoZH left a comment

LGTM

@panyx0718 panyx0718 force-pushed the group_nccl_all_reduce branch 4 times, most recently from be193ab to cb5d752 April 3, 2018 00:38
chengduoZH previously approved these changes Apr 3, 2018

@chengduoZH chengduoZH left a comment

LGTM!

@panyx0718 panyx0718 merged commit 49313d4 into PaddlePaddle:develop Apr 3, 2018