I think the advantage value here should be based on the old actor:
target_v = reward + args.gamma * self.critic_net(next_state)
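To make the concern concrete, here is a minimal sketch in plain Python. It assumes the point is that the target and the advantage should come from a snapshot of the network taken before the update epochs, not from the network that is being modified. `LinearCritic`, `td_advantage`, and the numbers are hypothetical stand-ins for illustration, not code from the repository.

```python
import copy


class LinearCritic:
    """Hypothetical stand-in for critic_net: V(s) = w * s."""

    def __init__(self, w):
        self.w = w

    def __call__(self, state):
        return self.w * state


def td_advantage(critic, state, next_state, reward, gamma):
    # One-step TD target, same form as the line quoted above.
    target_v = reward + gamma * critic(next_state)
    # Advantage estimate: target minus the value of the current state.
    return target_v - critic(state)


critic = LinearCritic(w=0.5)
old_critic = copy.deepcopy(critic)  # frozen snapshot taken before the update epochs

state, next_state, reward, gamma = 1.0, 2.0, 1.0, 0.9

critic.w = 0.8  # simulate one gradient step on the live critic

adv_old = td_advantage(old_critic, state, next_state, reward, gamma)
adv_new = td_advantage(critic, state, next_state, reward, gamma)
print(adv_old, adv_new)  # the two estimates differ once the live network has moved
```

With the snapshot, the advantage for a stored transition stays the same across all update epochs; with the live network it drifts after every gradient step. In a PyTorch version the analogous step would be computing these values once under `torch.no_grad()` before the update loop, though whether that matches the repository's intent is an assumption here.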