ray/rllib/policy (latest commit: 2021-09-21 22:00:14 +02:00)
Name                      Last commit                   Message
tests/                    2021-09-14 19:58:10 +02:00    [RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550)
__init__.py               2020-12-26 20:14:18 -05:00    [RLlib] JAXPolicy prep. PR #1. (#13077)
dynamic_tf_policy.py      2021-09-21 22:00:14 +02:00    [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
eager_tf_policy.py        2021-09-21 22:00:14 +02:00    [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
policy.py                 2021-09-09 08:10:42 +02:00    [RLlib] No Preprocessors; preparatory PR #1 (#18367)
policy_map.py             2021-09-08 23:32:23 +02:00    [RLlib] Add locking to PolicyMap in case it is accessed by a RolloutWorker and the same worker's AsyncSampler or the main LearnerThread. (#18444)
policy_template.py        2021-09-05 15:37:05 +02:00    [RLlib] Strictly run evaluation_num_episodes episodes each evaluation run (no matter the other eval config settings). (#18335)
rnn_sequencing.py         2021-09-14 19:58:10 +02:00    [RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550)
sample_batch.py           2021-09-14 19:58:10 +02:00    [RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550)
tf_policy.py              2021-09-21 22:00:14 +02:00    [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
tf_policy_template.py     2021-09-21 22:00:14 +02:00    [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
torch_policy.py           2021-08-21 17:05:48 +02:00    [RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928)
torch_policy_template.py  2021-08-03 18:30:02 -04:00    [RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530)
view_requirement.py       2021-03-23 09:50:18 -07:00    [RLlib] Remove all non-trajectory view API code. (#14860)
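
For orientation: the torch_policy.py entry above records the move from the raw "seq_lens" string key to the SampleBatch.SEQ_LENS constant defined in this directory's sample_batch.py. A minimal sketch of that usage, assuming RLlib at roughly the commits pinned above; the array shapes and values are illustrative only:

    import numpy as np
    from ray.rllib.policy.sample_batch import SampleBatch

    # Two RNN sequences of lengths 3 and 2, flattened into 5 timesteps.
    batch = SampleBatch({
        SampleBatch.OBS: np.zeros((5, 4), dtype=np.float32),
        # Preferred over the literal "seq_lens" key (see #17928 above).
        SampleBatch.SEQ_LENS: np.array([3, 2]),
    })
    print(batch[SampleBatch.SEQ_LENS])  # -> [3 2]

rnn_sequencing.py consumes these sequence lengths when padding and time-slicing batches for recurrent policies.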