ray/rllib/evaluation
Sven Mika 5ac5ac9560
[RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021)
* WIP.

* Fix float32 conversion in the OneHot preprocessor (it would otherwise emit float64 in eager mode and fail the first NN matmul; see the first sketch below).
Add proper seq-len + state-in construction in eager_tf_policy.py::_compute_gradients().

* LINT.

* eager_tf_policy.py: Only set samples["seq_lens"] if the policy is recurrent; otherwise eager tracing throws a flattened-dict key-mismatch error (see the second sketch below).

* Move issue code to examples folder.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-02-06 09:44:08 -08:00
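For context on the dtype fix above, here is a minimal NumPy sketch of a one-hot encoding with an explicit float32 cast. This is an illustrative stand-in, not the actual RLlib preprocessor code; one_hot, obs, and num_categories are hypothetical names:

import numpy as np

def one_hot(obs: int, num_categories: int) -> np.ndarray:
    # Without the explicit dtype, np.zeros defaults to float64; in TF eager
    # mode that dtype flows straight into the network, and the first matmul
    # against float32 weights fails.
    vec = np.zeros(num_categories, dtype=np.float32)
    vec[obs] = 1.0
    return vec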
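And a sketch of the conditional "seq_lens" handling the commit describes. Again a simplified stand-in rather than the eager_tf_policy.py source; add_seq_lens_if_rnn and its arguments are hypothetical:

import numpy as np

def add_seq_lens_if_rnn(samples: dict, is_recurrent: bool, batch_size: int) -> dict:
    # Only recurrent policies get a "seq_lens" entry. Adding the key for
    # non-RNN policies would change the flattened input-dict structure that
    # eager tracing expects, triggering the key-mismatch error noted above.
    if is_recurrent:
        # Hypothetical padding: every sequence treated as length 1 here.
        samples["seq_lens"] = np.ones(batch_size, dtype=np.int32)
    return samples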
__init__.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
episode.py [rllib] Feature/histograms in tensorboard (#6942) 2020-01-30 22:02:53 -08:00
interface.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
metrics.py [rllib] Support parallel, parameterized evaluation (#6981) 2020-02-01 22:12:12 -08:00
policy_evaluator.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
policy_graph.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
postprocessing.py [rllib] implemented compute_advantages without gae (#6941) 2020-01-31 22:25:45 -08:00
rollout_metrics.py [rllib] Feature/histograms in tensorboard (#6942) 2020-01-30 22:02:53 -08:00
rollout_worker.py [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00
sample_batch.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
sample_batch_builder.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
sampler.py [RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021) 2020-02-06 09:44:08 -08:00
tf_policy_graph.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
torch_policy_graph.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
worker_set.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00