[RLlib] Fix crash when kl_coeff is set to 0 (#23063)

Co-authored-by: Jeroen Bédorf <jeroen@minds.ai>
Co-authored-by: Ishant Mrinal Haloi <mrinal.haloi11@gmail.com>
Co-authored-by: Ishant Mrinal <33053278+n30111@users.noreply.github.com>
Jeroen Bédorf 2022-03-11 21:24:52 +01:00 committed by GitHub
parent e9ae784e62
commit bc21a4593d


@@ -98,7 +98,7 @@ def ppo_surrogate_loss(
         action_kl = prev_action_dist.kl(curr_action_dist)
         mean_kl_loss = reduce_mean_valid(action_kl)
     else:
-        mean_kl_loss = 0.0
+        mean_kl_loss = tf.constant(0.0)
     curr_entropy = curr_action_dist.entropy()
     mean_entropy = reduce_mean_valid(curr_entropy)