Commit graph

58 commits

Author SHA1 Message Date
Sven Mika
7eb1a29426
[RLlib] Fix ModelV2 custom metrics for torch. (#16734) 2021-07-01 13:01:40 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
This reverts commit e78ec370a9.
2021-06-18 12:21:49 -07:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. (#16359) 2021-06-18 10:31:30 +02:00
Sven Mika
3d4dc60e2e
[RLlib] CQL iteration count fixes: Remove dummy buffer and unnecessary store op from exec_plan. (#16332) 2021-06-10 07:49:17 +02:00
Sven Mika
4b8dadccbd
[RLlib] Fix PR 16162: Having added sleep to _NextValueNotReady causes TD3 tests to become flakey. (#16309) 2021-06-08 07:27:02 -07:00
Sven Mika
e2be41b407
[RLlib] MARWIL + BC: Various fixes and enhancements. (#16218) 2021-06-03 22:29:00 +02:00
Chris Bamford
1e3721ef4a
[RLlib] Remove bad spinlocks to allow pytorch GPU scheduler to interrupt. (#16162) 2021-06-01 16:40:28 +02:00
Sven Mika
2d34216660
[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762) 2021-05-20 09:27:03 +02:00
Sven Mika
2303851c3c
[RLlib] Torch multi-GPU + LSTM/RNN bug fix. (#15492) 2021-05-18 11:51:05 +02:00
Michael Luo
4cbe13cdfd
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603)
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-05-04 19:06:19 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 (#15527)
* formatting

* format util

* format release

* format rllib/agents

* format rllib/env

* format rllib/execution

* format rllib/evaluation

* format rllib/examples

* format rllib/policy

* format rllib utils and tests

* format streaming

* more formatting

* update requirements files

* fix rllib type checking

* updates

* update

* fix circular import

* Update python/ray/tests/test_runtime_env.py

* noqa
2021-05-03 14:23:28 -07:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. (#14709) 2021-04-16 09:16:24 +02:00
Sven Mika
dfc116ea27
[RLlib] Discussion 681: Metrics prepends newest episodes instead of appending. (#15236) 2021-04-11 15:31:43 +02:00
Chris Bamford
cd89f0dc55
[RLLib] Episode media logging support (#14767) 2021-03-19 09:17:09 +01:00
Sven Mika
c3a15ecc0f
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. (#14033) 2021-03-18 20:27:41 +01:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) 2021-03-08 15:41:27 +01:00
Sven Mika
7718ec70fb
[RLlib] Remove old SegmentTree from tests dir and unflake respective segment tree test. (#14450) 2021-03-03 14:31:30 +01:00
Sven Mika
8000258333
[RLlib] R2D2 Implementation. (#13933) 2021-02-25 12:18:11 +01:00
Sven Mika
775e685531
[RLlib] Issue #13824: compress_observations=True crashes for all algos not using a replay buffer. (#14034) 2021-02-18 21:36:32 +01:00
Sven Mika
eb0038612f
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584) 2021-02-08 15:02:19 +01:00
Sven Mika
d001af3e59
[RLlib] Allow rllib rollout to run distributed via evaluation workers. (#13718) 2021-02-08 12:05:16 +01:00
Michael Luo
a2d1215200
[RLlib] Execution Annotation (#13036) 2020-12-24 09:30:33 -05:00
Edward Oakes
cde711aaf1
Revert "[RLLib] Execution-Folder Type Annotations (#12760)" (#12886)
This reverts commit becca1424d.
2020-12-15 11:03:02 -08:00
Michael Luo
becca1424d
[RLLib] Execution-Folder Type Annotations (#12760) 2020-12-14 19:16:44 +01:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika
fb318addcb
[RLlib] Curiosity exploration module: tf/tf2.x/tf-eager support. (#11945) 2020-11-29 12:31:24 +01:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Eric Liang
8f79b4e45e
[rllib] Replay buffer size inaccurate with replay_seq_len option (#10988)
* support replay seq len

* update

* fix warn

* add test

* test
2020-09-25 13:47:23 -07:00
Eric Liang
ecdaaffc67
add large data warning (#10957) 2020-09-23 15:46:06 -07:00
Eric Liang
daa03ba6e6
[rllib] Add execution module to package ref (#10941)
* add init

* add

* update
2020-09-21 23:03:06 -07:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420) 2020-09-02 14:03:01 +02:00
Sven Mika
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114) 2020-08-15 13:24:22 +02:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
Sven Mika
b0b0463161
[RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678) 2020-07-29 21:15:09 +02:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. (#8752) 2020-07-11 22:06:35 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Eric Liang
1e0e1a45e6
[rllib] Add type annotations for evaluation/, env/ packages (#9003) 2020-06-19 13:09:05 -07:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. (#8932) 2020-06-16 08:52:20 +02:00
Eric Liang
34bae27ac7
[rllib] Flexible multi-agent replay modes and replay_sequence_length (#8893) 2020-06-12 20:17:27 -07:00
mehrdadn
f93bb008bb
Change os.uname()[1] and socket.gethostname() to the portable and faster platform.node() (#8839)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-08 21:29:46 -07:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Eric Liang
aa7a58e92f
[rllib] Support training intensity for dqn / apex (#8396) 2020-05-20 11:22:30 -07:00
Sven Mika
c9435cad43
WIP. (#8456)
Fix multi-GPU histogram metrics for > 0D tensors.
2020-05-15 21:43:27 +02:00
Eric Liang
6bf1dc0888
[rllib] [hotfix] Build broken due to merge conflict: MixInReplay has no attribute buffer 2020-05-13 12:21:04 -07:00
Eric Liang
96f4d82cc3
[rllib] Qmix replay ratio is wrong 2020-05-12 13:07:19 -07:00
Eric Liang
9d012626e5
[rllib] Distributed exec workflow for impala (#8321) 2020-05-11 20:24:43 -07:00