
Conversation


@Johnson-Wang Johnson-Wang commented Jan 28, 2022

Thanks for your contribution, and we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry; just make the pull request and seek help from the maintainers.

Motivation

The training of TOOD often encounters NaN gradients during backpropagation, which then produce NaN tensors in the next training step. Some issues in the original repo (fcjian/TOOD#11) might also be due to this error. The problem is caused by the naive implementation of the sigmoid geometric mean, cls_score = (cls_logits.sigmoid() * cls_prob.sigmoid()).sqrt(). This output can be 0 when cls_logits or cls_prob is a large negative value, which causes an inf or NaN gradient during backpropagation.
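The failure mode described above can be reproduced in a few lines. This is a sketch; the exact logit magnitude at which sigmoid underflows depends on dtype, and whether the resulting gradient is inf or NaN depends on rounding details, but in fp32 very negative logits reliably produce a non-finite gradient:

```python
import torch

# Naive sigmoid geometric mean, as described in the Motivation.
# For very negative logits, sigmoid() underflows toward 0, the product
# rounds to 0, and sqrt() has an unbounded derivative at 0 (d/du sqrt(u)
# = 1 / (2 sqrt(u))), so the backward pass degenerates to inf or NaN.
cls_logits = torch.tensor([-100.0], requires_grad=True)
cls_prob = torch.tensor([-100.0], requires_grad=True)

cls_score = (cls_logits.sigmoid() * cls_prob.sigmoid()).sqrt()
cls_score.sum().backward()

print(cls_logits.grad)  # non-finite (inf or NaN) in fp32
```

Once such a non-finite gradient reaches the optimizer, the affected weights become NaN and every subsequent forward pass is poisoned, which matches the "NaN in the next training step" symptom above.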

Modification

A reimplementation of the SigmoidGeometricMean class as a subclass of torch.autograd.Function is proposed. Its backward function is derived analytically, which avoids inf or NaN gradients during backpropagation.

  • This modification has little influence on the final results (42.3 mAP after the modification vs. 42.4 mAP as reported).
  • This modification enables users to train TOOD without ATSS warmup, at the cost of some performance drop (~41.8 mAP).
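The analytic backward follows from sigmoid'(x) = sigmoid(x)(1 - sigmoid(x)): with z = sqrt(sigmoid(x) * sigmoid(y)), one gets dz/dx = z * (1 - sigmoid(x)) / 2 (and symmetrically for y), which never divides by z and therefore stays finite even when z underflows to 0. A minimal sketch of such a Function (assumed structure, not necessarily the exact merged code):

```python
import torch
from torch.autograd import Function


class SigmoidGeometricMean(Function):
    """sqrt(sigmoid(x) * sigmoid(y)) with a hand-derived backward:
    dz/dx = z * (1 - sigmoid(x)) / 2 (symmetric in y), which avoids the
    1 / (2 sqrt(u)) term that blows up when the product underflows to 0."""

    @staticmethod
    def forward(ctx, x, y):
        x_sigmoid = x.sigmoid()
        y_sigmoid = y.sigmoid()
        z = (x_sigmoid * y_sigmoid).sqrt()
        ctx.save_for_backward(x_sigmoid, y_sigmoid, z)
        return z

    @staticmethod
    def backward(ctx, grad_output):
        x_sigmoid, y_sigmoid, z = ctx.saved_tensors
        grad_x = grad_output * z * (1 - x_sigmoid) / 2
        grad_y = grad_output * z * (1 - y_sigmoid) / 2
        return grad_x, grad_y


sigmoid_geometric_mean = SigmoidGeometricMean.apply
```

At moderate logits this matches the autograd gradient of the naive expression exactly, and at extreme negative logits z (and hence the gradient) simply underflows to 0 instead of producing inf or NaN.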

Checklist

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  • This PR does not involve any function interface change.
  • Docstring has been added.
@Johnson-Wang Johnson-Wang changed the title Add SigmoidGeometricMean Fix NaN grad problem during training TOOD by adding SigmoidGeometricMean Jan 28, 2022
from torch.autograd import Function
from torch.nn import functional as F


class SigmoidGeometricMean(Function):
How about we implement an interface named sigmoid_geometric_mean = SigmoidGeometricMean.apply here so that in tood_head we can simply use sigmoid_geometric_mean(xxx)?
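For illustration, the suggested alias pattern looks like this (forward-only stand-in; the PR's class also defines the analytic backward):

```python
import torch
from torch.autograd import Function


class SigmoidGeometricMean(Function):
    # Forward-only stand-in to illustrate the interface; the PR's class
    # also implements an analytic backward.
    @staticmethod
    def forward(ctx, x, y):
        return (x.sigmoid() * y.sigmoid()).sqrt()


# Module-level functional alias, so tood_head can call it like a plain
# function instead of spelling out SigmoidGeometricMean.apply(...) inline.
sigmoid_geometric_mean = SigmoidGeometricMean.apply

score = sigmoid_geometric_mean(torch.zeros(3), torch.zeros(3))
print(score)  # sqrt(0.5 * 0.5) = 0.5 per element
```

Since Function.apply cannot be overridden with a nicer signature, exposing a module-level alias is the conventional way to give a custom autograd Function a function-style call site.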

@jshilong jshilong left a comment


LGTM


codecov bot commented Jan 28, 2022

Codecov Report

Merging #7090 (72e89e3) into dev (4bdb312) will increase coverage by 0.04%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##              dev    #7090      +/-   ##
==========================================
+ Coverage   62.41%   62.46%   +0.04%
==========================================
  Files         330      330
  Lines       26199    26216      +17
  Branches     4436     4437       +1
==========================================
+ Hits        16353    16375      +22
+ Misses       8976     8966      -10
- Partials      870      875       +5
Flag Coverage Δ
unittests 62.43% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
mmdet/models/dense_heads/tood_head.py 83.79% <100.00%> (+0.06%) ⬆️
mmdet/models/utils/__init__.py 100.00% <100.00%> (ø)
mmdet/models/utils/misc.py 96.66% <100.00%> (+3.80%) ⬆️
mmdet/utils/misc.py 95.23% <0.00%> (-4.77%) ⬇️
mmdet/core/bbox/assigners/max_iou_assigner.py 72.36% <0.00%> (-1.32%) ⬇️
mmdet/models/dense_heads/corner_head.py 69.46% <0.00%> (+1.40%) ⬆️
mmdet/models/detectors/cornernet.py 100.00% <0.00%> (+5.12%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4bdb312...72e89e3. Read the comment docs.

