Commit 6be19f9

atgambardella authored and soumith committed
fixed divide by zero error
1 parent f3a883c commit 6be19f9

2 files changed, +2 -2 lines changed

reinforcement_learning/actor_critic.py

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ def finish_episode():
         R = r + args.gamma * R
         rewards.insert(0, R)
     rewards = torch.Tensor(rewards)
-    rewards = (rewards - rewards.mean()) / rewards.std()
+    rewards = (rewards - rewards.mean()) / (rewards.std() + np.finfo(np.float32).eps)
     for (action, value), r in zip(saved_actions, rewards):
         action.reinforce(r - value.data.squeeze())
         value_loss += F.smooth_l1_loss(value, Variable(torch.Tensor([r])))

reinforcement_learning/reinforce.py

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ def finish_episode():
         R = r + args.gamma * R
         rewards.insert(0, R)
     rewards = torch.Tensor(rewards)
-    rewards = (rewards - rewards.mean()) / rewards.std()
+    rewards = (rewards - rewards.mean()) / (rewards.std() + np.finfo(np.float32).eps)
     for action, r in zip(model.saved_actions, rewards):
         action.reinforce(r)
     optimizer.zero_grad()
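
The effect of the changed line in both files can be reproduced outside the training scripts. The sketch below is an illustration, not code from this commit: it uses NumPy arrays instead of `torch.Tensor`, and the helper names `discounted_returns` and `normalize` are hypothetical. It assumes NumPy's default population standard deviation, which is exactly zero for a one-step episode; `torch`'s `std()` may use a different default, so the failing case in the real scripts could differ.

```python
import numpy as np

# Same constant the commit adds to the denominator.
EPS = np.finfo(np.float32).eps


def discounted_returns(rewards, gamma):
    """Accumulate discounted returns back to front, like the loop in finish_episode()."""
    R = 0.0
    returns = []
    for r in reversed(rewards):
        R = r + gamma * R
        returns.insert(0, R)
    return np.asarray(returns, dtype=np.float32)


def normalize(returns, eps=0.0):
    """Standardize returns; a positive eps keeps the denominator away from zero."""
    # Silence NumPy's runtime warning so the eps=0 case can be shown on purpose.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (returns - returns.mean()) / (returns.std() + eps)


# A one-step episode has a single return, so its (population) std is zero.
one_step = discounted_returns([1.0], gamma=0.99)
unsafe = normalize(one_step)          # 0 / 0 -> nan
safe = normalize(one_step, eps=EPS)   # 0 / eps -> 0.0

# A longer episode is essentially unaffected by the tiny eps.
three_step = normalize(discounted_returns([1.0, 1.0, 1.0], gamma=1.0), eps=EPS)
```

Adding a machine-epsilon term changes well-conditioned cases only at the level of floating-point rounding, while turning the degenerate zero-variance case into a zero signal instead of NaN.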

0 commit comments