ray/rllib/evaluation
Sven Mika 5ac5ac9560
[RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021)
* WIP.

* Fix float32 conversion in the OneHot preprocessor (it would otherwise emit float64 in eager mode and fail the first NN matmul; see the first sketch below).
Add proper seq-len + state-in construction in eager_tf_policy.py::_compute_gradients().

* LINT.

* eager_tf_policy.py: Only set samples["seq_lens"] if the policy is recurrent; otherwise eager tracing throws a flattened-dict key-mismatch error (see the second sketch below).

* Move issue code to examples folder.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-02-06 09:44:08 -08:00
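For context on the dtype fix above, here is a minimal NumPy sketch of a one-hot encoding with an explicit float32 cast. This is an illustrative stand-in, not the actual RLlib preprocessor code; one_hot, obs, and num_categories are hypothetical names:

import numpy as np

def one_hot(obs: int, num_categories: int) -> np.ndarray:
    # Without the explicit dtype, np.zeros defaults to float64; in TF eager
    # mode that dtype flows straight into the network, and the first matmul
    # against float32 weights fails.
    vec = np.zeros(num_categories, dtype=np.float32)
    vec[obs] = 1.0
    return vec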
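And a sketch of the conditional "seq_lens" handling the commit describes. Again a simplified stand-in rather than the eager_tf_policy.py source; add_seq_lens_if_rnn and its arguments are hypothetical:

import numpy as np

def add_seq_lens_if_rnn(samples: dict, is_recurrent: bool, batch_size: int) -> dict:
    # Only recurrent policies get a "seq_lens" entry. Adding the key for
    # non-RNN policies would change the flattened input-dict structure that
    # eager tracing expects, triggering the key-mismatch error noted above.
    if is_recurrent:
        # Hypothetical padding: every sequence treated as length 1 here.
        samples["seq_lens"] = np.ones(batch_size, dtype=np.int32)
    return samples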
__init__.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
episode.py [rllib] Feature/histograms in tensorboard (#6942) 2020-01-30 22:02:53 -08:00
interface.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
metrics.py [rllib] Support parallel, parameterized evaluation (#6981) 2020-02-01 22:12:12 -08:00
policy_evaluator.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
policy_graph.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
postprocessing.py [rllib] implemented compute_advantages without gae (#6941) 2020-01-31 22:25:45 -08:00
rollout_metrics.py [rllib] Feature/histograms in tensorboard (#6942) 2020-01-30 22:02:53 -08:00
rollout_worker.py [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00
sample_batch.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
sample_batch_builder.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
sampler.py [RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021) 2020-02-06 09:44:08 -08:00
tf_policy_graph.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
torch_policy_graph.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
worker_set.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00