ray/rllib/examples
Avnish Narayan 026bf01071
[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535)
* Fix QMix, SAC, and MADDPG too.

* Unpin gym and deprecate pendulum v0

Many tests in RLlib depended on Pendulum-v0;
however, in gym 0.21, Pendulum-v0 was deprecated
in favor of Pendulum-v1. This may change reward
thresholds, so all of the Pendulum-v1 benchmarks
may need to be rerun, or another environment
used instead (see the config sketch below).
The same applies to FrozenLake-v0 and FrozenLake-v1.

Lastly, all of the RLlib tests have been
moved to Python 3.7.
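
Below is a minimal, hypothetical sketch (not taken from this repo) of the
env-ID switch in an RLlib/Tune run; the algorithm and stop criterion are
arbitrary choices for illustration:

    # Hypothetical sketch: point configs/tests at Pendulum-v1 instead of the
    # deprecated Pendulum-v0 once gym >= 0.21 is installed.
    import ray
    from ray import tune

    if __name__ == "__main__":
        ray.init()
        tune.run(
            "PPO",  # arbitrary algorithm, chosen only for illustration
            config={
                "env": "Pendulum-v1",  # was "Pendulum-v0" before gym 0.21
                "framework": "torch",
                "num_workers": 1,
            },
            stop={"timesteps_total": 10000},
        )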

* Add gym installation based on python version.

Pin Python <= 3.6 to gym 0.19 due to install
issues with Atari ROMs in gym 0.20.
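
A sketch of how such a Python-version-conditional pin can be expressed with
standard PEP 508 environment markers in a requirements file; the exact
version specifiers below are illustrative assumptions, not the repo's
actual pins:

    # Illustrative requirements-file lines (PEP 508 environment markers).
    gym==0.19.0; python_version <= '3.6'
    gym>=0.21.0; python_version > '3.6'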

* Reformatting

* Fixing tests

* Move atari-py install conditional to req.txt

* Migrate to new ALE install method
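
A hedged sketch of what the ALE migration implies for the requirements file,
assuming the atari-py -> ale-py/AutoROM switch that gym made around
0.20/0.21; the package names and extras below are assumptions, not the
repo's exact entries:

    # Illustrative only: older gym versions pulled atari-py, newer ones use
    # ale-py, with Atari ROMs provided via AutoROM's accept-rom-license extra.
    atari-py; python_version <= '3.6'
    ale-py; python_version > '3.6'
    autorom[accept-rom-license]; python_version > '3.6'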

* Make parametric_actions_cartpole return float32 actions/obs

* Add type conversions if obs/actions don't match space

* Add utils to make elements match gym space dtypes
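
A rough sketch of what a dtype-matching utility along these lines could look
like; the helper name and the set of handled space types are assumptions for
illustration, not the actual util added by this PR:

    # Hypothetical helper: coerce an element to the dtype(s) its gym space expects.
    import numpy as np
    import gym

    def match_space_dtype(element, space):
        """Return `element` cast so that it matches `space`'s dtype (sketch only)."""
        if isinstance(space, gym.spaces.Box):
            return np.asarray(element, dtype=space.dtype)
        if isinstance(space, gym.spaces.Discrete):
            return int(element)
        if isinstance(space, gym.spaces.Dict):
            return {k: match_space_dtype(element[k], s) for k, s in space.spaces.items()}
        if isinstance(space, gym.spaces.Tuple):
            return tuple(match_space_dtype(e, s) for e, s in zip(element, space.spaces))
        return element  # leave other space types untouched

For example, a float64 observation coming out of a custom env would be cast
to float32 before being checked against a Box(..., dtype=np.float32)
observation space.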

Co-authored-by: Jun Gong <jungong@anyscale.com>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-03 16:24:00 +01:00
env [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 2021-11-03 16:24:00 +01:00
export [RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) 2021-10-25 15:00:00 +02:00
models [RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982) 2021-11-03 10:00:46 +01:00
policy [RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
serving [RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). (#18942) 2021-09-30 08:30:20 +02:00
simulators/sumo [RLlib] Integration with SUMO Simulator (#11710) 2020-11-03 09:45:03 +01:00
__init__.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
action_masking.py [RLlib] Add simple action-masking example script/env/model (tf and torch). (#18494) 2021-09-11 23:08:09 +02:00
attention_net.py [RLlib] Better PolicyServer example (w/ or w/o tune) and add printing out actual listen port address in log-level=INFO. (#18254) 2021-08-31 22:03:23 +02:00
attention_net_supervised.py [RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
autoregressive_action_dist.py [RLlib] Better example scripts: Description --no-tune and --local-mode CLI options (autoregressive_action_dist.py) (#17705) 2021-08-16 22:08:13 +02:00
bare_metal_policy_with_custom_view_reqs.py [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 2021-11-03 16:24:00 +01:00
batch_norm_model.py [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 2021-11-03 16:24:00 +01:00
cartpole_lstm.py [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
centralized_critic.py [RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982) 2021-11-03 10:00:46 +01:00
centralized_critic_2.py [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
checkpoint_by_custom_criteria.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
coin_game_env.py [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
complex_struct_space.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
curriculum_learning.py [RLlib] Add simple curriculum learning API and example script. (#15740) 2021-05-16 17:35:10 +02:00
custom_env.py [RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) 2021-09-15 22:16:48 +02:00
custom_eval.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
custom_experiment.py [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
custom_fast_model.py [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 2021-07-20 14:58:13 -04:00
custom_input_api.py [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 2021-11-03 16:24:00 +01:00
custom_keras_model.py [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) 2021-09-30 16:39:05 +02:00
custom_logger.py [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 2021-11-03 16:24:00 +01:00
custom_loss.py [RLlib] Fix ModelV2 custom metrics for torch. (#16734) 2021-07-01 13:01:40 +02:00
custom_metrics_and_callbacks.py [RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783) 2021-10-29 12:03:56 +02:00
custom_metrics_and_callbacks_legacy.py [RLlib] Fix all example scripts to run on GPUs. (#11105) 2020-10-02 23:07:44 +02:00
custom_model_api.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
custom_model_loss_and_metrics.py [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) 2021-09-30 16:39:05 +02:00
custom_observation_filters.py [RLlib] No Preprocessors; preparatory PR #1 (#18367) 2021-09-09 08:10:42 +02:00
custom_rnn_model.py [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
custom_tf_policy.py [RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) 2021-10-25 15:00:00 +02:00
custom_torch_policy.py [RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00
custom_train_fn.py [RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) 2021-09-15 22:16:48 +02:00
custom_vector_env.py [RLlib] Discussion 2294: Custom vector env example and fix. (#16083) 2021-07-28 10:40:04 -04:00
deterministic_training.py [rllib] Add deterministic test to gpu (#19306) 2021-10-26 10:11:39 -07:00
dmlab_watermaze.py Remove future imports (#6724) 2020-01-09 00:15:48 -08:00
eager_execution.py [RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) 2021-10-25 15:00:00 +02:00
env_rendering_and_recording.py [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 2021-11-03 16:24:00 +01:00
fractional_gpus.py [RLlib] Fix ModelV2 custom metrics for torch. (#16734) 2021-07-01 13:01:40 +02:00
hierarchical_training.py [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
iterated_prisoners_dilemma_env.py [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
lstm_auto_wrapping.py [RLlib] Preparatory PR for: Documentation on Model Building. (#13260) 2021-01-08 10:56:09 +01:00
mobilenet_v2_with_lstm.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
multi_agent_cartpole.py [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
multi_agent_custom_policy.py [RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046) 2021-07-15 05:51:24 -04:00
multi_agent_independent_learning.py [RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046) 2021-07-15 05:51:24 -04:00
multi_agent_parameter_sharing.py [RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046) 2021-07-15 05:51:24 -04:00
multi_agent_two_trainers.py [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
nested_action_spaces.py [RLlib] Fix crash when using StochasticSampling exploration (most PG-style algos) w/ tf and numpy > 1.19.5 (#18366) 2021-09-06 12:14:00 +02:00
offline_rl.py [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 2021-11-03 16:24:00 +01:00
parallel_evaluation_and_training.py [RLlib] Add support for evaluation_num_episodes=auto (run eval for as long as the parallel train step takes). (#18380) 2021-09-07 08:08:37 +02:00
parametric_actions_cartpole.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
parametric_actions_cartpole_embeddings_learnt_by_model.py [RLlib] New and changed version of parametric actions cartpole example + small suggested update in policy_client.py (#15664) 2021-07-28 15:25:09 -04:00
partial_gpus.py [RLlib] Fix ModelV2 custom metrics for torch. (#16734) 2021-07-01 13:01:40 +02:00
preprocessing_disabled.py [RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
random_parametric_agent.py [RLlib] Unify the way we create local replay buffer for all agents (#19627) 2021-10-26 20:56:02 +02:00
recsim_with_slateq.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
remote_envs_with_inference_done_on_main_node.py [RLlib] Redo #17410: Example script: Remote worker envs with inference done on main node. (#17960) 2021-08-20 08:02:18 +02:00
remote_vector_env_with_custom_api.py [RLlib] Issue 18104: Cannot set remote_worker_envs=True for non local-mode and MultiAgentEnv. (#19133) 2021-10-07 22:39:21 +02:00
restore_1_of_n_agents_from_checkpoint.py [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
rnnsac_stateless_cartpole.py [RLlib] Add RNN-SAC agent (#16577) 2021-07-25 10:04:52 -04:00
rock_paper_scissors_multiagent.py [RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) 2021-10-25 15:00:00 +02:00
rollout_worker_custom_workflow.py [RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
saving_experiences.py [Core] First pass at privatizing non-public Python APIs. (#14607) 2021-03-10 22:47:28 -08:00
sb2rllib_rllib_example.py [RLlib] Better example scripts: Description --no-tune and --local-mode CLI options (#17038) 2021-07-26 22:25:48 -04:00
sb2rllib_sb_example.py [RLlib] Examples for training, saving, loading, testing an agent with SB & RLlib (#15897) 2021-05-19 16:36:59 +02:00
self_play_league_based_with_open_spiel.py [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
self_play_with_open_spiel.py [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
serve_and_rllib.py [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
sumo_env_local.py [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
trajectory_view_api.py [RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. (#18921) 2021-09-30 15:03:37 +02:00
two_step_game.py [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) 2021-09-21 22:00:14 +02:00
two_trainer_workflow.py [RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). (#19623) 2021-10-25 09:42:39 +02:00
unity3d_env_local.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
vizdoom_with_attention_net.py [RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00