Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 ( #18367 )
2021-09-09 08:10:42 +02:00
Sven Mika
1520c3d147
[RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker-option to Trainer.add_policy()
( #18428 )
2021-09-09 07:10:06 +02:00
Sven Mika
56f142cac1
[RLlib] Add support for evaluation_num_episodes=auto (run eval for as long as the parallel train step takes). ( #18380 )
2021-09-07 08:08:37 +02:00
Sven Mika
ba58f5edb1
[RLlib] Strictly run evaluation_num_episodes
episodes each evaluation run (no matter the other eval config settings). ( #18335 )
2021-09-05 15:37:05 +02:00
Sven Mika
a772c775cd
[RLlib] Set random seed (if provided) to Trainer process as well. ( #18307 )
2021-09-04 11:02:30 +02:00
Sven Mika
82465f9342
[RLlib] Better PolicyServer example (w/ or w/o tune) and add printing out actual listen port address in log-level=INFO. ( #18254 )
2021-08-31 22:03:23 +02:00
Sven Mika
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. ( #18065 )
2021-08-31 14:56:53 +02:00
Sven Mika
4888d7c9af
[RLlib] Replay buffers: Add config option to store contents in checkpoints. ( #17999 )
2021-08-31 12:21:49 +02:00
Sven Mika
9883505e84
[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). ( #18017 )
2021-08-24 21:55:27 +02:00
Sven Mika
8248ba531b
[RLlib] Redo #17410 : Example script: Remote worker envs with inference done on main node. ( #17960 )
2021-08-20 08:02:18 +02:00
Alex Wu
318ba6fae0
Revert "[RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. ( #17410 )" ( #17951 )
...
This reverts commit 8fc16b9a18
.
2021-08-19 07:55:10 -07:00
Sven Mika
8fc16b9a18
[RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. ( #17410 )
2021-08-19 12:14:50 +02:00
Sven Mika
f18213712f
[RLlib] Redo: "fix self play example scripts" PR (17566) ( #17895 )
...
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
* wip.
2021-08-17 09:13:35 -07:00
Amog Kamsetty
77f28f1c30
Revert "[RLlib] Fix Trainer.add_policy
for num_workers>0 (self play example scripts). ( #17566 )" ( #17709 )
...
This reverts commit 3b447265d8
.
2021-08-10 10:50:01 -07:00
Sven Mika
3b447265d8
[RLlib] Fix Trainer.add_policy
for num_workers>0 (self play example scripts). ( #17566 )
2021-08-05 11:41:18 -04:00
Sven Mika
5107d16ae5
[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. ( #17530 )
2021-08-03 18:30:02 -04:00
Sven Mika
924f11cd45
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). ( #17371 )
2021-08-03 11:35:49 -04:00
Rohan138
f30b444bac
[Rllib] set self._allow_unknown_config ( #17335 )
...
Co-authored-by: Sven Mika <sven@anyscale.io>
2021-07-28 11:48:41 +01:00
Sven Mika
0c5c70b584
[RLlib] Discussion 247: Allow remote sub-envs (within vectorized) to be used with custom APIs. ( #17118 )
2021-07-25 16:55:51 -04:00
Sven Mika
7bc4376466
[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). ( #17077 )
2021-07-22 10:59:13 -04:00
Julius Frost
0b1b6222bc
[rllib] Add merge_trainer_config arguments to trainer template ( #17160 )
2021-07-21 15:43:06 -07:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. ( #17169 )
2021-07-20 14:58:13 -04:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. ( #17031 )
2021-07-19 13:16:03 -04:00
Sven Mika
649580d735
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). ( #17046 )
2021-07-15 05:51:24 -04:00
Sven Mika
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). ( #17014 )
2021-07-13 14:01:30 -04:00
Amog Kamsetty
38b5b6d24c
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). ( #16565 )" ( #17036 )
...
This reverts commit e4123fff27
.
2021-07-13 09:57:15 -07:00
Kai Fricke
27d80c4c88
[RLlib] ONNX export for tensorflow (1.x) and torch ( #16805 )
2021-07-13 12:38:11 -04:00
Sven Mika
e4123fff27
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). ( #16565 )
2021-07-13 06:38:14 -04:00
Amog Kamsetty
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action
, not normalize_action
." ( #17002 )
...
This reverts commit 7862dd64ea
.
2021-07-12 11:09:14 -07:00
Julius Frost
a88b217d3f
[rllib] Enhancements to Input API for customizing offline datasets ( #16957 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-10 15:05:25 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action
, not normalize_action
. ( #16774 )
2021-07-08 17:31:34 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes ( #16531 )
2021-06-30 12:32:11 +02:00
Sven Mika
c95dea51e9
[RLlib] External env enhancements + more examples. ( #16583 )
2021-06-23 09:09:01 +02:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. ( #16569 )
2021-06-21 13:46:01 +02:00
Sven Mika
169ddabae7
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. ( #16429 )
2021-06-19 22:42:00 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )" ( #16543 )
...
This reverts commit e78ec370a9
.
2021-06-18 12:21:49 -07:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )
2021-06-18 10:31:30 +02:00
Sven Mika
3d4dc60e2e
[RLlib] CQL iteration count fixes: Remove dummy buffer and unnecessary store op from exec_plan. ( #16332 )
2021-06-10 07:49:17 +02:00
Sven Mika
d89fb82bfb
[RLlib] Add simple curriculum learning API and example script. ( #15740 )
2021-05-16 17:35:10 +02:00
Sven Mika
16ddab49f5
[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. ( #15591 )
2021-05-12 12:16:00 +02:00
Sven Mika
461d73ddf1
[RLlib] simple_optimizer
should not be used by default for tf+MA. ( #15365 )
2021-05-10 16:10:44 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 ( #15527 )
...
* formatting
* format util
* format release
* format rllib/agents
* format rllib/env
* format rllib/execution
* format rllib/evaluation
* format rllib/examples
* format rllib/policy
* format rllib utils and tests
* format streaming
* more formatting
* update requirements files
* fix rllib type checking
* updates
* update
* fix circular import
* Update python/ray/tests/test_runtime_env.py
* noqa
2021-05-03 14:23:28 -07:00
Kai Fricke
2c11a1aff1
[RLlib] Evaluation parallel to training check, key-error hotfix ( #15345 )
2021-04-27 08:38:10 +02:00
Sven Mika
354c960fff
[RLlib] Fix test_dependency_torch and fix custom logger support for RLlib. ( #15120 )
2021-04-24 08:13:41 +02:00
Sven Mika
7ff27dfe07
[RLlib] Remove atari dependency for RLlib (in favor of detailed error message). ( #15292 )
2021-04-20 08:46:58 +02:00
Sven Mika
41968512ca
[RLlib] Partial GPU examples (for learner and workers). ( #15334 )
2021-04-20 08:46:05 +02:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. ( #14709 )
2021-04-16 09:16:24 +02:00
Sven Mika
c90de315e5
[RLlib] APEX returns incorrect default resources (PleacementGroupFactory) colocated missing replay actors. ( #15295 )
2021-04-15 16:50:42 +01:00
Sven Mika
5254d2fb36
[RLlib] Support parallelizing evaluation and training (optional). ( #15040 )
2021-04-13 09:53:35 +02:00
Sven Mika
4f66309e19
[RLlib] Redo issue 14533 tf enable eager exec ( #14984 )
2021-03-29 20:07:44 +02:00