Name | Last commit message | Last commit date
documentation | [RLlib; Documentation] RLlib README overhaul. (#20249) | 2021-11-18 18:08:40 +01:00
env | [RLlib] QMIX better defaults + added to CI learning tests (#21332) | 2022-01-04 08:54:41 +01:00
export | [RLlib] Experimental no-flatten option for actions/prev-actions. (#20918) | 2021-12-11 14:57:58 +01:00
inference_and_serving | [RLlib] Issue 20062: Action inference examples missing (#20144) | 2021-11-10 18:49:06 +01:00
models | [RLlib] Use SampleBrach instead of input dict whenever possible (#20746) | 2021-12-02 13:11:26 +01:00
policy | [RLlib] Switch off preprocessors by default for PGTrainer. (#21008) | 2021-12-13 12:04:23 +01:00
serving | [RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). (#18942) | 2021-09-30 08:30:20 +02:00
simulators/sumo | [RLlib] Integration with SUMO Simulator (#11710) | 2020-11-03 09:45:03 +01:00
tune | [Release] Refactor User Tests (#20028) | 2021-11-05 17:28:37 -07:00
__init__.py | [rllib] Try moving RLlib to top level dir (#5324) | 2019-08-05 23:25:49 -07:00
action_masking.py | [RLlib] Document and extend action mask example. (#20390) | 2021-11-16 13:20:41 +01:00
attention_net.py | Revert "Revert [RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy (#20061) (#20399)" (#20417) | 2021-11-16 14:49:41 +01:00
attention_net_supervised.py | [RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) | 2021-01-01 14:06:23 -05:00
autoregressive_action_dist.py | [RLlib] Better example scripts: Description --no-tune and --local-mode CLI options (autoregressive_action_dist.py) (#17705) | 2021-08-16 22:08:13 +02:00
bare_metal_policy_with_custom_view_reqs.py | [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) | 2021-11-03 16:24:00 +01:00
batch_norm_model.py | [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) | 2021-11-03 16:24:00 +01:00
cartpole_lstm.py | [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) | 2021-06-30 12:32:11 +02:00
centralized_critic.py | [RLlib] Trainer sub-class PPO/DDPPO (instead of build_trainer()). (#20571) | 2021-11-23 23:01:05 +01:00
centralized_critic_2.py | [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) | 2021-06-21 13:46:01 +02:00
checkpoint_by_custom_criteria.py | [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) | 2021-05-18 13:18:12 +02:00
coin_game_env.py | [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) | 2021-06-21 13:46:01 +02:00
complex_struct_space.py | [RLlib] Switch off preprocessors by default for PGTrainer. (#21008) | 2021-12-13 12:04:23 +01:00
compute_adapted_gae_on_postprocess_trajectory.py | [RLlib] Example containing a proposal for computing an adapted (time-dependent) GAE used by the PPO algorithm (via callback on_postprocess_trajectory) (#20850) | 2021-12-09 14:48:56 +01:00
curriculum_learning.py | [RLlib] Add simple curriculum learning API and example script. (#15740) | 2021-05-16 17:35:10 +02:00
custom_env.py | [RLlib Testig] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) | 2021-09-15 22:16:48 +02:00
custom_eval.py | [RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) | 2021-12-04 13:26:33 +01:00
custom_experiment.py | [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) | 2021-06-30 12:32:11 +02:00
custom_fast_model.py | [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) | 2021-07-20 14:58:13 -04:00
custom_input_api.py | [RLlib] Rename metrics_smoothing_episodes into metrics_num_episodes_for_smoothing for clarity. (#20983) | 2021-12-11 20:33:35 +01:00
custom_keras_model.py | [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) | 2021-09-30 16:39:05 +02:00
custom_logger.py | [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) | 2021-11-03 16:24:00 +01:00
custom_loss.py | [RLlib] Fix ModelV2 custom metrics for torch. (#16734) | 2021-07-01 13:01:40 +02:00
custom_metrics_and_callbacks.py | [RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783) | 2021-10-29 12:03:56 +02:00
custom_metrics_and_callbacks_legacy.py | [RLlib] Fix all example scripts to run on GPUs. (#11105) | 2020-10-02 23:07:44 +02:00
custom_model_api.py | [RLlib] Use SampleBrach instead of input dict whenever possible (#20746) | 2021-12-02 13:11:26 +01:00
custom_model_loss_and_metrics.py | [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) | 2021-09-30 16:39:05 +02:00
custom_observation_filters.py | [RLlib] No Preprocessors; preparatory PR #1 (#18367) | 2021-09-09 08:10:42 +02:00
custom_rnn_model.py | [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) | 2021-06-30 12:32:11 +02:00
custom_tf_policy.py | [RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) | 2021-10-25 15:00:00 +02:00
custom_torch_policy.py | [RLlib] JAXPolicy prep. PR #1. (#13077) | 2020-12-26 20:14:18 -05:00
custom_train_fn.py | [RLlib Testig] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) | 2021-09-15 22:16:48 +02:00
custom_vector_env.py | [RLlib] Discussion 2294: Custom vector env example and fix. (#16083) | 2021-07-28 10:40:04 -04:00
deterministic_training.py | [rllib] Add deterministic test to gpu (#19306) | 2021-10-26 10:11:39 -07:00
dmlab_watermaze.py | Remove future imports (#6724) | 2020-01-09 00:15:48 -08:00
eager_execution.py | [RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) | 2021-10-25 15:00:00 +02:00
env_rendering_and_recording.py | [RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) | 2021-12-04 13:26:33 +01:00
fractional_gpus.py | [RLlib] Fix ModelV2 custom metrics for torch. (#16734) | 2021-07-01 13:01:40 +02:00
hierarchical_training.py | [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) | 2021-09-17 12:07:11 +02:00
iterated_prisoners_dilemma_env.py | [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) | 2021-06-21 13:46:01 +02:00
lstm_auto_wrapping.py | [RLlib] Preparatory PR for: Documentation on Model Building. (#13260) | 2021-01-08 10:56:09 +01:00
mobilenet_v2_with_lstm.py | [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) | 2021-05-18 13:18:12 +02:00
multi_agent_cartpole.py | [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) | 2021-09-17 12:07:11 +02:00
multi_agent_custom_policy.py | [RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046) | 2021-07-15 05:51:24 -04:00
multi_agent_independent_learning.py | [RLlib] Revert "Revert "updated pettingzoo wrappers, env versions, urls"" (#21339) | 2022-01-04 18:30:26 +01:00
multi_agent_parameter_sharing.py | [RLlib] Revert "Revert "updated pettingzoo wrappers, env versions, urls"" (#21339) | 2022-01-04 18:30:26 +01:00
multi_agent_two_trainers.py | [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) | 2021-09-17 12:07:11 +02:00
nested_action_spaces.py | [RLlib] Fix crash when using StochasticSampling exploration (most PG-style algos) w/ tf and numpy > 1.19.5 (#18366) | 2021-09-06 12:14:00 +02:00
offline_rl.py | [RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) | 2021-12-04 13:26:33 +01:00
parallel_evaluation_and_training.py | [RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) | 2021-12-04 13:26:33 +01:00
parametric_actions_cartpole.py | [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) | 2021-05-18 13:18:12 +02:00
parametric_actions_cartpole_embeddings_learnt_by_model.py | [RLlib] New and changed version of parametric actions cartpole example + small suggested update in policy_client.py (#15664) | 2021-07-28 15:25:09 -04:00
partial_gpus.py | [RLlib] Fix ModelV2 custom metrics for torch. (#16734) | 2021-07-01 13:01:40 +02:00
preprocessing_disabled.py | [RLlib] No Preprocessors (part 2). (#18468) | 2021-09-23 12:56:45 +02:00
random_parametric_agent.py | [RLlib] Unify the way we create local replay buffer for all agents (#19627) | 2021-10-26 20:56:02 +02:00
re3_exploration.py | [RLlib] Support for RE3 exploration algorithm (for tf) (#19551) | 2021-12-07 13:26:34 +01:00
recsim_with_slateq.py | [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) | 2021-05-18 13:18:12 +02:00
remote_base_env_with_custom_api.py | [RLlib] Update a few things to get rid of the remote_vector_env deprecation warning. (#20753) | 2021-12-02 13:10:44 +01:00
remote_envs_with_inference_done_on_main_node.py | [RLlib] Trainer sub-class PPO/DDPPO (instead of build_trainer()). (#20571) | 2021-11-23 23:01:05 +01:00
restore_1_of_n_agents_from_checkpoint.py | [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) | 2021-09-17 12:07:11 +02:00
rnnsac_stateless_cartpole.py | [RLlib] Add RNN-SAC agent (#16577) | 2021-07-25 10:04:52 -04:00
rock_paper_scissors_multiagent.py | [RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984) | 2021-12-21 08:39:05 +01:00
rollout_worker_custom_workflow.py | [Tune] Remove legacy resources implementations in Runner and Executor. (#19773) | 2021-11-12 12:33:39 -08:00
saving_experiences.py | [Core] First pass at privatizing non-public Python APIs. (#14607) | 2021-03-10 22:47:28 -08:00
sb2rllib_rllib_example.py | [RLlib] Better example scripts: Description --no-tune and --local-mode CLI options (#17038) | 2021-07-26 22:25:48 -04:00
sb2rllib_sb_example.py | [RLlib] Examples for training, saving, loading, testing an agent with SB & RLlib (#15897) | 2021-05-19 16:36:59 +02:00
self_play_league_based_with_open_spiel.py | [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) | 2021-09-17 12:07:11 +02:00
self_play_with_open_spiel.py | [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) | 2021-09-17 12:07:11 +02:00
sumo_env_local.py | [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) | 2021-06-21 13:46:01 +02:00
trajectory_view_api.py | [RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. (#18921) | 2021-09-30 15:03:37 +02:00
two_step_game.py | [RLlib] QMIX better defaults + added to CI learning tests (#21332) | 2022-01-04 08:54:41 +01:00
two_trainer_workflow.py | [RLlib] Replay buffer API (cleanups; docstrings; renames; move into rllib/execution/buffers dir) (#20552) | 2021-11-19 11:57:37 +01:00
unity3d_env_local.py | [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) | 2021-05-18 13:18:12 +02:00
vizdoom_with_attention_net.py | [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) | 2021-05-18 13:18:12 +02:00
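Several of the commits above (#15832, #17038, #17705) mention a CLI pattern shared by these example scripts: a --framework flag plus --no-tune and --local-mode options. The sketch below illustrates that skeleton under the assumption of the Ray 1.x API current at the time of these commits; the environment name, the PPO trainer choice, and the stopping criterion are illustrative placeholders rather than values taken from any particular script.

```python
# Minimal sketch of the argparse + Tune skeleton that many of the example
# scripts listed above follow (assumed, based on the flags named in the
# commit messages; not copied from any specific file).
import argparse

import ray
from ray import tune

parser = argparse.ArgumentParser()
parser.add_argument("--framework", choices=["tf", "tf2", "tfe", "torch"], default="tf")
parser.add_argument("--no-tune", action="store_true")
parser.add_argument("--local-mode", action="store_true")
parser.add_argument("--stop-iters", type=int, default=50)

if __name__ == "__main__":
    args = parser.parse_args()
    # --local-mode runs all Ray tasks in a single process, which eases debugging.
    ray.init(local_mode=args.local_mode)

    config = {
        "env": "CartPole-v0",         # placeholder environment
        "framework": args.framework,  # tf vs. torch backend
        "num_workers": 0,
    }
    stop = {"training_iteration": args.stop_iters}

    if args.no_tune:
        # --no-tune: plain train loop on a Trainer instance, no Tune involved.
        from ray.rllib.agents.ppo import PPOTrainer

        trainer = PPOTrainer(config=config)
        for _ in range(args.stop_iters):
            print(trainer.train()["episode_reward_mean"])
    else:
        # Default path: let Tune drive training and handle the stop condition.
        tune.run("PPO", config=config, stop=stop)

    ray.shutdown()
```

A typical debugging invocation under these assumptions would be `python some_example.py --framework torch --local-mode --no-tune`, which keeps everything in one process and bypasses Tune.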