365 commits

Author SHA1 Message Date
Sven Mika
6f8b754d58
[RLlib] DDPG torch GPU bug. (#16133) 2021-05-30 11:06:11 -07:00
Sven Mika
c74c5038d2
[RLlib] Issue 16117: DQN/APEX torch not working on GPU. (#16118) 2021-05-28 06:51:15 -07:00
Sven Mika
e80095591c
[RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937) 2021-05-20 18:15:10 +02:00
Sven Mika
2d34216660
[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762) 2021-05-20 09:27:03 +02:00
Sven Mika
eaa7f6696d
[RLlib] Issue 15887: MARWIL adv norm update mismatch for tf (static-graph) vs torch versions. (#15898) 2021-05-19 15:44:11 -07:00
Michael Luo
474f04e322
[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup (#14707) 2021-05-19 16:32:29 +02:00
Chris Bamford
0be83d9a95
[RLlib] Fixing Memory Leak In Multi-Agent environments. Adding tooling for finding memory leaks in workers. (#15815) 2021-05-18 13:23:00 +02:00
Sven Mika
2303851c3c
[RLlib] Torch multi-GPU + LSTM/RNN bug fix. (#15492) 2021-05-18 11:51:05 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support (#15841) 2021-05-18 11:10:46 +02:00
Sven Mika
d89fb82bfb
[RLlib] Add simple curriculum learning API and example script. (#15740) 2021-05-16 17:35:10 +02:00
Sven Mika
469f5227da
[RLlib] CQL bug fix: Normalize actions for atanh in BC part of the CQL loss. (#15814) 2021-05-16 15:21:06 +02:00
Sven Mika
bc09e75b78
[RLlib] Fix 3 flakey test cases. (#15785) 2021-05-16 12:20:33 +02:00
Sven Mika
c4a3e1589b
[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. (#15761) 2021-05-13 09:17:23 +02:00
Sven Mika
16ddab49f5
[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. (#15591) 2021-05-12 12:16:00 +02:00
Sven Mika
a495759f06
[RLlib] Discussion 2022: PPO should auto-adjust rollout_fragment_length if other settings do not align with train_batch_size. (#15611) 2021-05-10 16:16:02 +02:00
Sven Mika
461d73ddf1
[RLlib] simple_optimizer should not be used by default for tf+MA. (#15365) 2021-05-10 16:10:44 +02:00
Sven Mika
c7563a32ed
[RLlib] DD-PPO not supported on Win (add meaningful error message). (#15631) 2021-05-04 19:26:17 +02:00
Michael Luo
4cbe13cdfd
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603)
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-05-04 19:06:19 +02:00
Sven Mika
4b3add0066
[RLlib] Discussion 2021: PPO does not learn vf, iff use_gae=False (ignores use_critic setting). (#15610) 2021-05-04 14:17:00 +02:00
Antoine Galataud
ce1c001b1d
[RLlib] DQN: Place LearningRateSchedule mixin at the right moment (#15558) 2021-05-04 13:21:40 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 (#15527)
* formatting
* format util
* format release
* format rllib/agents
* format rllib/env
* format rllib/execution
* format rllib/evaluation
* format rllib/examples
* format rllib/policy
* format rllib utils and tests
* format streaming
* more formatting
* update requirements files
* fix rllib type checking
* updates
* update
* fix circular import
* Update python/ray/tests/test_runtime_env.py
* noqa
2021-05-03 14:23:28 -07:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273) 2021-04-30 19:26:30 +02:00
Sven Mika
78b776942f
[RLlib] Discussion 1928: Initial lr wrong if schedule used that includes ts=0 (both tf and torch). (#15538) 2021-04-27 17:19:52 +02:00
SebastianBo1995
f5be8d8f74
[RLlib] Offline Learning Bug, different shapes (#15132) 2021-04-27 17:18:17 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684) 2021-04-27 10:44:54 +02:00
Kai Fricke
2c11a1aff1
[RLlib] Evaluation parallel to training check, key-error hotfix (#15345) 2021-04-27 08:38:10 +02:00
mvindiola1
9330403200
[RLlib] Mask out padded values for A3C loss with recurrent policy (#15525) 2021-04-27 08:36:04 +02:00
Sven Mika
354c960fff
[RLlib] Fix test_dependency_torch and fix custom logger support for RLlib. (#15120) 2021-04-24 08:13:41 +02:00
Sven Mika
bdda73e2dd
[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421)
Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!
2021-04-22 11:29:42 +02:00
Sven Mika
7318439c3d
[RLlib] DQN native_ratio (for training intensity) incorrect (discussion 1763). (#15436)
Thanks @Manuscrit !
2021-04-22 11:06:29 +02:00
Fabien Couthouis
fe06642df0
[RLlib] Report mean losses instead of sum in IMPALA (discussion 1709) (#15427) 2021-04-21 10:59:06 +02:00
Sven Mika
7ff27dfe07
[RLlib] Remove atari dependency for RLlib (in favor of detailed error message). (#15292) 2021-04-20 08:46:58 +02:00
Sven Mika
41968512ca
[RLlib] Partial GPU examples (for learner and workers). (#15334) 2021-04-20 08:46:05 +02:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. (#14709) 2021-04-16 09:16:24 +02:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. (#15335) 2021-04-15 19:19:51 +02:00
Sven Mika
c90de315e5
[RLlib] APEX returns incorrect default resources (PlacementGroupFactory) colocated missing replay actors. (#15295) 2021-04-15 16:50:42 +01:00
Sven Mika
bbfa8ffec9
[RLlib] Minor release 1.3 warnings cleanups. (#15272) 2021-04-14 14:03:15 +02:00
Sven Mika
ef0f163d16
[RLlib] Discussion 1709: IMPALA (tf and torch) reports sum of entropy (over batch) in stats. Should report mean instead. (#15290) 2021-04-14 11:44:25 +02:00
Sven Mika
5254d2fb36
[RLlib] Support parallelizing evaluation and training (optional). (#15040) 2021-04-13 09:53:35 +02:00
Sven Mika
9c5a0cfd7a
[RLlib] Issue 14385: Policy.compute_actions_from_input_dict does not properly track accessed fields for Policy's view requirements. (#14386) 2021-04-11 18:20:04 +02:00
Sven Mika
b267f1f1ba
[RLlib] Add support for Int-Box action spaces. (#15012) 2021-04-11 13:16:01 +02:00
Sven Mika
1bb70e4907
[RLlib] Issue 14523: Torch + py3.8 leads to GPU device error. (#15014) 2021-03-30 21:43:11 +02:00
Raphael CHEN
93d4244d9c
[RLlib] Correctly get bytes size of SampleBatch (#14801) 2021-03-30 19:24:58 +02:00
Sven Mika
4f66309e19
[RLlib] Redo issue 14533 tf enable eager exec (#14984) 2021-03-29 20:07:44 +02:00
SangBin Cho
fa5f961d5e
Revert "[RLlib] Issue 14533: tf.enable_eager_execution() must be called at beginning. (#14737)" (#14918)
This reverts commit 3e389d5812.
2021-03-25 00:42:01 -07:00
mvindiola1
5e350ceaa2
[RLlib] Issue 14119: Fix TD3 policy delay for torch. (#14840) 2021-03-24 16:26:22 +01:00
astronauti
8874ccec2d
[RLlib] Update sac_tf_policy.py (add tf.cast to float32 for rewards) (#14843) 2021-03-24 16:12:55 +01:00
Sven Mika
6708211b59
[RLlib] JSONReader: Mix files if > 1 at beginning (each worker should start with different file). (#14865) 2021-03-24 16:07:40 +01:00
Sven Mika
3e389d5812
[RLlib] Issue 14533: tf.enable_eager_execution() must be called at beginning. (#14737) 2021-03-24 12:54:27 +01:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. (#14860) 2021-03-23 09:50:18 -07:00