Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation ( #19783 )
2021-10-29 12:03:56 +02:00
Rohan138
b9c9cc5946
[RLlib] Updated PettingZoo+RLlib tutorial; Removed pettingzoo example script ( #19069 )
...
* Updated PettingZoo+RLlib tutorial
Updated the tutorial and added link to the blog post by the PettingZoo team.
* Ran linting
* Converted link to tinyurl for linting
* fixed line lengths
* Decrease num_workers to 1
* Added comments
* Decreased num_workers
* Decreased timesteps
* Increased num_workers
* Update links and remove pettingzoo_env.py
* remove pettingzoo.py script from tests
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-10-29 10:57:10 +02:00
Sven Mika
902e854af2
[RLlib; Docs overhaul] Docstring cleanup: Environments. ( #19784 )
...
* wip.
* Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes.
* remove bare_metal_policy_with_custom_view_reqs from tests
2021-10-29 10:46:52 +02:00
gjoliver
39b0faa3ec
[RLlib]: bug fix, should be input_dict['is_training'] ( #19805 )
2021-10-27 23:30:43 +02:00
gjoliver
99a0088233
[RLlib] Unify the way we create local replay buffer for all agents ( #19627 )
...
* [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents.
This change:
1. Gets rid of the try...except clause when we call execution_plan(),
and gets rid of the Deprecation warning as a result.
2. Fixes the execution_plan() call in Trainer._try_recover() too.
3. Most importantly, makes it much easier to create and use different types
of local replay buffers for all our agents.
E.g., allows us to easily create a reservoir sampling replay buffer for
the APPO agent for Riot in the near future.
* Introduce explicit configuration for replay buffer types.
* Fix is_training key error.
* actually deprecate buffer_size field.
2021-10-26 20:56:02 +02:00
Avnish Narayan
ad87ddf93e
[rllib] Add deterministic test to gpu ( #19306 )
...
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-10-26 10:11:39 -07:00
Sven Mika
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). ( #19693 )
2021-10-25 15:00:00 +02:00
gjoliver
89fbfc00f8
[RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). ( #19623 )
2021-10-25 09:42:39 +02:00
gjoliver
c3c42278e4
[RLlib] clean up all the SampleBatch['is_training'] deprecation warnings ( #19652 )
...
* [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings.
* wip
2021-10-25 09:38:56 +02:00
Sven Mika
d439fd7f17
[RLlib] TF2/eager memory leak fixes. ( #19198 )
2021-10-09 00:11:53 +02:00
Sven Mika
fd438d5630
[RLlib] Issue 18104: Cannot set remote_worker_envs=True for non local-mode and MultiAgentEnv. ( #19133 )
2021-10-07 22:39:21 +02:00
Sven Mika
b4300dd532
[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. ( #18937 )
2021-10-04 13:29:00 +02:00
Jiajun Yao
7588bfd315
[Lint] Add flake8-bugbear ( #19053 )
...
* Add flake8-bugbear
* Add flake8-bugbear
2021-10-03 23:24:11 -07:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
Sven Mika
828f5d26b7
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. ( #18921 )
2021-09-30 15:03:37 +02:00
Sven Mika
05a55a9335
[RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). ( #18942 )
2021-09-30 08:30:20 +02:00
mvindiola1
62f5da0b65
[RLlib] Add unit tests for updating episode data in base_env ( #17137 )
2021-09-24 16:08:11 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
a96dbd885b
[RLlib] Reinstate trajectory view API tests. ( #18809 )
2021-09-23 08:31:51 +02:00
Sven Mika
698b4eeed3
[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). ( #18669 )
2021-09-21 22:00:14 +02:00
Sven Mika
fd13bac9b3
[RLlib] Add worker arg (optional) to policy_mapping_fn. ( #18184 )
2021-09-17 12:07:11 +02:00
Sven Mika
8a72824c63
[RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). ( #18591 )
2021-09-15 22:16:48 +02:00
Sven Mika
ea4a22249c
[RLlib] Add simple action-masking example script/env/model (tf and torch). ( #18494 )
2021-09-11 23:08:09 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 ( #18367 )
2021-09-09 08:10:42 +02:00
Sven Mika
1520c3d147
[RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker-option to Trainer.add_policy() ( #18428 )
2021-09-09 07:10:06 +02:00
Sven Mika
45f60e51a9
[RLlib] DDPPO fixes and benchmarks. ( #18390 )
2021-09-08 19:39:01 +02:00
Sven Mika
56f142cac1
[RLlib] Add support for evaluation_num_episodes=auto (run eval for as long as the parallel train step takes). ( #18380 )
2021-09-07 08:08:37 +02:00
Sven Mika
5292b70fc6
[RLlib] Add multi-GPU attention net tests to nightly test suite (+ R2D2 tests for LSTM and attention nets). ( #18368 )
2021-09-06 17:48:05 +02:00
Sven Mika
59f796edf3
[RLlib] Fix crash when using StochasticSampling exploration (most PG-style algos) w/ tf and numpy > 1.19.5 ( #18366 )
2021-09-06 12:14:00 +02:00
Sven Mika
ba58f5edb1
[RLlib] Strictly run evaluation_num_episodes episodes each evaluation run (no matter the other eval config settings). ( #18335 )
2021-09-05 15:37:05 +02:00
Sven Mika
a772c775cd
[RLlib] Set random seed (if provided) to Trainer process as well. ( #18307 )
2021-09-04 11:02:30 +02:00
gjoliver
336e79956a
[RLlib] Make MultiAgentEnv inherit gym.Env to avoid direct class type manipulation ( #18156 )
2021-09-03 08:02:05 +02:00
Sven Mika
2357bbc0c8
[RLlib] Issue 18231: Better (earlier) env validation and error message improvement. ( #18249 )
2021-09-02 09:28:16 +02:00
Sven Mika
82465f9342
[RLlib] Better PolicyServer example (w/ or w/o tune) and add printing out actual listen port address in log-level=INFO. ( #18254 )
2021-08-31 22:03:23 +02:00
Joseph Suarez
8136d2912b
[RLlib] Add policies arg to callback: on_episode_step (already exists in all other episode-related callbacks) ( #18119 )
2021-08-27 16:12:19 +02:00
gjoliver
a8813675f4
[RLlib] Issue 17900: Set seed in single vectorized sub-envs properly, if num_envs_per_worker > 1 ( #18110 )
...
* In case a worker runs multiple envs, make sure a different seed can be deterministically set on all of them.
* Revert a couple of whitespace changes.
* Fix a few style errors.
Co-authored-by: Jun Gong <jungong@mbpro.local>
2021-08-26 11:32:58 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. ( #17928 )
2021-08-21 17:05:48 +02:00
simonsays1980
60aee4a330
[RLlib] Add example script for bare metal Policy with custom view_requirements. ( #17896 )
2021-08-20 12:17:13 +02:00
Sven Mika
8248ba531b
[RLlib] Redo #17410 : Example script: Remote worker envs with inference done on main node. ( #17960 )
2021-08-20 08:02:18 +02:00
Alex Wu
318ba6fae0
Revert "[RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. ( #17410 )" ( #17951 )
...
This reverts commit 8fc16b9a18.
2021-08-19 07:55:10 -07:00
Sven Mika
8fc16b9a18
[RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. ( #17410 )
2021-08-19 12:14:50 +02:00
Sven Mika
a428f10ebe
[RLlib] Add multi-GPU learning tests to nightly. ( #17778 )
2021-08-18 17:21:01 +02:00
Sven Mika
f18213712f
[RLlib] Redo: "fix self play example scripts" PR (17566) ( #17895 )
...
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
2021-08-17 09:13:35 -07:00
Stefan Schneider
eab9c25856
[RLlib] Better example scripts: Description --no-tune and --local-mode CLI options (autoregressive_action_dist.py) ( #17705 )
2021-08-16 22:08:13 +02:00
mguarin0
3e010c5760
[rllib] bug fix for rllib pettingzoo pistonball_v4 example ( #17701 )
...
* bug fix for rllib pettingzoo pistonball_v4 example
* adding test for PR 17701
* ran scripts/format.sh
* ok
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-08-12 00:25:00 -07:00
J K Terry
48e32555c8
[rllib] Update PettingZoo dependency versions ( #17702 )
...
* update pettingzoo dependency versions
* pettingzoo version
* fix tests
2021-08-11 01:19:19 -07:00
Amog Kamsetty
77f28f1c30
Revert "[RLlib] Fix Trainer.add_policy for num_workers>0 (self play example scripts). ( #17566 )" ( #17709 )
...
This reverts commit 3b447265d8.
2021-08-10 10:50:01 -07:00
Sven Mika
3b447265d8
[RLlib] Fix Trainer.add_policy for num_workers>0 (self play example scripts). ( #17566 )
2021-08-05 11:41:18 -04:00
Sven Mika
5107d16ae5
[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. ( #17530 )
2021-08-03 18:30:02 -04:00
kk-55
a7f8dc9d77
[RLlib] New and changed version of parametric actions cartpole example + small suggested update in policy_client.py ( #15664 )
2021-07-28 15:25:09 -04:00