DEV Community

Super Kai (Kazuya Ito)
Super Kai (Kazuya Ito)

Posted on

BCEWithLogitsLoss in PyTorch

Buy Me a Coffee

*Memos:

BCEWithLogitsLoss() can get the 0D or more D tensor of the zero or more values(float) computed by BCE Loss and Sigmoid from the 0D or more D tensor of zero or more elements as shown below:

*Memos:

  • The 1st argument for initialization is weight(Optional-Default:None-Type:tensor of int, float or bool): *Memos:
    • If it's not given, it's 1.
    • It must be the 0D or more D tensor of zero or more elements.
  • There is reduction argument for initialization(Optional-Default:'mean'-Type:str). *'none', 'mean' or 'sum' can be selected.
  • There is pos_weight argument for initialization(Optional-Default:None-Type:tensor of int or float): *Memos:
    • If it's not given, it's 1.
    • It must be the 0D or more D tensor of zero or more elements.
  • There are size_average and reduce argument for initialization but they are deprecated.
  • The 1st argument is input(Required-Type:tensor of float). *It must be the 0D or more D tensor of zero or more elements.
  • The 2nd argument is target(Required-Type:tensor of float). *It must be the 0D or more D tensor of zero or more elements.
  • input and target must be the same size otherwise there is error.
  • The empty 1D or more D input and target tensor with reduction='mean' return nan.
  • The empty 1D or more D input and target tensor with reduction='sum' return 0..
  • BCEWithLogitsLoss() is the combination of Sigmoid and BCE Loss. Image description
import torch from torch import nn tensor1 = torch.tensor([ 8., -3., 0., 1., 5., -2.]) tensor2 = torch.tensor([-3., 7., 4., -2., -9., 6.]) # -w*(p*y*log(1/(1+exp(-x))+(1-y)*log(1-(1/1+exp(-x))))  # -1*(1*(-3)*log(1/(1+exp(-8)))+(1-(-3))*log(1-(1/(1+exp(-8)))))  # ↓↓↓↓↓↓↓  # 32.0003 + 21.0486 + 0.6931 + 3.3133 + 50.0067 + 50.0067 = 82.8423  # 119.1890 / 6 = 19.8648 bcelogits = nn.BCEWithLogitsLoss() bcelogits(input=tensor1, target=tensor2) # tensor(19.8648)  bcelogits # BCEWithLogitsLoss()  print(bcelogits.weight) # None  bcelogits.reduction # 'mean'  bcelogits = nn.BCEWithLogitsLoss(weight=None, reduction='mean', pos_weight=None) bcelogits(input=tensor1, target=tensor2) # tensor(19.8648)  bcelogits = nn.BCEWithLogitsLoss(reduction='sum') bcelogits(input=tensor1, target=tensor2) # tensor(119.1890)  bcelogits = nn.BCEWithLogitsLoss(reduction='none') bcelogits(input=tensor1, target=tensor2) # tensor([32.0003, 21.0486, 0.6931, 3.3133, 50.0067, 12.1269])  bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([0., 1., 2., 3., 4., 5.])) bcelogits(input=tensor1, target=tensor2) # tensor(48.8394)  bcelogits = nn.BCEWithLogitsLoss( pos_weight=torch.tensor([0., 1., 2., 3., 4., 5.]) ) bcelogits(input=tensor1, target=tensor2) # tensor(28.5957)  bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor(0.)) bcelogits(input=tensor1, target=tensor2) # tensor(0.)  bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(0.)) bcelogits(input=tensor1, target=tensor2) # tensor(13.8338)  bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([0, 1, 2, 3, 4, 5])) bcelogits(input=tensor1, target=tensor2) # tensor(48.8394)  bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([0, 1, 2, 3, 4, 5])) bcelogits(input=tensor1, target=tensor2) # tensor(28.5957)  bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor(0)) bcelogits(input=tensor1, target=tensor2) # tensor(0.)  bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(0)) bcelogits(input=tensor1, target=tensor2) # tensor(13.8338)  bcelogits = nn.BCEWithLogitsLoss( weight=torch.tensor([True, False, True, False, True, False]) ) bcelogits(input=tensor1, target=tensor2) # tensor(13.7834)  bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([False])) bcelogits(input=tensor1, target=tensor2) # tensor(0.)  tensor1 = torch.tensor([[8., -3., 0.], [1., 5., -2.]]) tensor2 = torch.tensor([[-3., 7., 4.], [-2., -9., 6.]]) bcelogits = nn.BCEWithLogitsLoss() bcelogits(input=tensor1, target=tensor2) # tensor(19.8648)  tensor1 = torch.tensor([[[8.], [-3.], [0.]], [[1.], [5.], [-2.]]]) tensor2 = torch.tensor([[[-3.], [7.], [4.]], [[-2.], [-9.], [6.]]]) bcelogits = nn.BCEWithLogitsLoss() bcelogits(input=tensor1, target=tensor2) # tensor(19.8648)  tensor1 = torch.tensor([]) tensor2 = torch.tensor([]) bcelogits = nn.BCEWithLogitsLoss(reduction='mean') bcelogits(input=tensor1, target=tensor2) # tensor(nan)  bcelogits = nn.BCEWithLogitsLoss(reduction='sum') bcelogits(input=tensor1, target=tensor2) # tensor(0.) 
Enter fullscreen mode Exit fullscreen mode

BCEWithLogitsLoss() and BCELoss() with Sigmoid() or torch.sigmoid() for input argument can get the same results as shown below. *BCEWithLogitsLoss() can accept the values out of 0<=y<=1 for input and target argument while BCELoss() cannot:

import torch from torch import nn tensor1 = torch.tensor([0.4, 0.8, 0.6, 0.3, 0.0, 0.5]) tensor2 = torch.tensor([0.2, 0.9, 0.4, 0.1, 0.8, 0.5]) bcelogits = nn.BCEWithLogitsLoss() bcelogits(input=tensor1, target=tensor2) # tensor(0.7205)  sigmoid = nn.Sigmoid() bceloss = nn.BCELoss() bceloss(input=sigmoid(input=tensor1), target=tensor2) # tensor(0.7205)  bceloss = nn.BCELoss() bceloss(input=torch.sigmoid(input=tensor1), target=tensor2) # tensor(0.7205) 
Enter fullscreen mode Exit fullscreen mode

Top comments (0)