mvindiola1
|
9330403200
|
[RLlib] Mask out padded values for A3C loss with recurrent policy (#15525)
|
2021-04-27 08:36:04 +02:00 |
|
Sven Mika
|
354c960fff
|
[RLlib] Fix test_dependency_torch and fix custom logger support for RLlib. (#15120)
|
2021-04-24 08:13:41 +02:00 |
|
Sven Mika
|
bdda73e2dd
|
[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421)
Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!
|
2021-04-22 11:29:42 +02:00 |
|
Sven Mika
|
7318439c3d
|
[RLlib] DQN native_ratio (for training intensity) incorrect (discussion 1763). (#15436)
Thanks @Manuscrit!
|
2021-04-22 11:06:29 +02:00 |
|
Fabien Couthouis
|
fe06642df0
|
[RLlib] Report mean losses instead of sum in IMPALA (discussion 1709) (#15427)
|
2021-04-21 10:59:06 +02:00 |
|
Sven Mika
|
7ff27dfe07
|
[RLlib] Remove atari dependency for RLlib (in favor of detailed error message). (#15292)
|
2021-04-20 08:46:58 +02:00 |
|
Sven Mika
|
41968512ca
|
[RLlib] Partial GPU examples (for learner and workers). (#15334)
|
2021-04-20 08:46:05 +02:00 |
|
Sven Mika
|
cecfc3b43b
|
[RLlib] Multi-GPU support for Torch algorithms. (#14709)
|
2021-04-16 09:16:24 +02:00 |
|
Sven Mika
|
8b3554e37e
|
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. (#15335)
|
2021-04-15 19:19:51 +02:00 |
|
Sven Mika
|
c90de315e5
|
[RLlib] APEX returns incorrect default resources (PlacementGroupFactory) colocated missing replay actors. (#15295)
|
2021-04-15 16:50:42 +01:00 |
|
Sven Mika
|
bbfa8ffec9
|
[RLlib] Minor release 1.3 warnings cleanups. (#15272)
|
2021-04-14 14:03:15 +02:00 |
|
Sven Mika
|
ef0f163d16
|
[RLlib] Discussion 1709: IMPALA (tf and torch) reports sum of entropy (over batch) in stats. Should report mean instead. (#15290)
|
2021-04-14 11:44:25 +02:00 |
|
Sven Mika
|
5254d2fb36
|
[RLlib] Support parallelizing evaluation and training (optional). (#15040)
|
2021-04-13 09:53:35 +02:00 |
|
Sven Mika
|
9c5a0cfd7a
|
[RLlib] Issue 14385: Policy.compute_actions_from_input_dict does not properly track accessed fields for Policy's view requirements. (#14386)
|
2021-04-11 18:20:04 +02:00 |
|
Sven Mika
|
b267f1f1ba
|
[RLlib] Add support for Int-Box action spaces. (#15012)
|
2021-04-11 13:16:01 +02:00 |
|
Sven Mika
|
1bb70e4907
|
[RLlib] Issue 14523: Torch + py3.8 leads to GPU device error. (#15014)
|
2021-03-30 21:43:11 +02:00 |
|
Raphael CHEN
|
93d4244d9c
|
[RLlib] Correctly get bytes size of SampleBatch (#14801)
|
2021-03-30 19:24:58 +02:00 |
|
Sven Mika
|
4f66309e19
|
[RLlib] Redo issue 14533 tf enable eager exec (#14984)
|
2021-03-29 20:07:44 +02:00 |
|
SangBin Cho
|
fa5f961d5e
|
Revert "[RLlib] Issue 14533: tf.enable_eager_execution() must be called at beginning. (#14737)" (#14918)
This reverts commit 3e389d5812.
|
2021-03-25 00:42:01 -07:00 |
|
mvindiola1
|
5e350ceaa2
|
[RLlib] Issue 14119: Fix TD3 policy delay for torch. (#14840)
|
2021-03-24 16:26:22 +01:00 |
|
astronauti
|
8874ccec2d
|
[RLlib] Update sac_tf_policy.py (add tf.cast to float32 for rewards) (#14843)
|
2021-03-24 16:12:55 +01:00 |
|
Sven Mika
|
6708211b59
|
[RLlib] JSONReader: Mix files if > 1 at beginning (each worker should start with different file). (#14865)
|
2021-03-24 16:07:40 +01:00 |
|
Sven Mika
|
3e389d5812
|
[RLlib] Issue 14533: tf.enable_eager_execution() must be called at beginning. (#14737)
|
2021-03-24 12:54:27 +01:00 |
|
Sven Mika
|
04bc0a9828
|
[RLlib] Remove all non-trajectory view API code. (#14860)
|
2021-03-23 09:50:18 -07:00 |
|
Sven Mika
|
f859ebb99f
|
[RLlib] Fix env rendering and recording options (for non-local mode; >0 workers; +evaluation-workers). (#14796)
|
2021-03-23 10:06:06 +01:00 |
|
Sven Mika
|
e7557ae433
|
[RLlib] Issue 13132: DQN does not update target net after restore (#14838)
|
2021-03-23 08:30:37 +01:00 |
|
Sven Mika
|
c3a15ecc0f
|
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. (#14033)
|
2021-03-18 20:27:41 +01:00 |
|
Sven Mika
|
69202c6a7d
|
[RLlib] Obsolete usage tracking dict via sample batch. (#13065)
|
2021-03-17 08:18:15 +01:00 |
|
Sven Mika
|
ee4b6e7e3b
|
[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569)
|
2021-03-12 18:27:25 +01:00 |
|
Michael Luo
|
020c9439dd
|
[RLlib] CQL Documentation + Tests (#14531)
|
2021-03-11 18:51:39 +01:00 |
|
Sven Mika
|
732197e23a
|
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393)
|
2021-03-08 15:41:27 +01:00 |
|
Sven Mika
|
ef944bc5f0
|
[RLlib] Re-enable placement group support for RLlib. (#14384)
|
2021-03-05 08:16:24 +01:00 |
|
Eric Liang
|
9db000ff2c
|
Auto report object store memory usage; remove some deprecated code (#14260)
|
2021-03-01 13:19:44 -08:00 |
|
Richard Liaw
|
a2d2275ee1
|
Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360)
This reverts commit 6cd0cd3bd9.
|
2021-02-25 14:27:35 -08:00 |
|
Sven Mika
|
6cd0cd3bd9
|
[RLlib + Tune] Add placement group support to RLlib. (#14289)
|
2021-02-25 16:01:31 +01:00 |
|
Sven Mika
|
8000258333
|
[RLlib] R2D2 Implementation. (#13933)
|
2021-02-25 12:18:11 +01:00 |
|
Kai Fricke
|
d9e5d5f47a
|
[RLlib] Cast fcnet_hiddens to list for DQN models (list vs tuple mismatch error) (#14308)
|
2021-02-25 08:06:08 +01:00 |
|
Michael Luo
|
ec2c10309b
|
[RLlib] CQL for HalfCheetah-Random-v0 + Hopper-Random-v0 + CQL Bug Fixes (#14243)
|
2021-02-22 17:30:18 +01:00 |
|
Sven Mika
|
775e685531
|
[RLlib] Issue #13824: compress_observations=True crashes for all algos not using a replay buffer. (#14034)
|
2021-02-18 21:36:32 +01:00 |
|
Sven Mika
|
4db86404ad
|
[RLlib] Issue #13507: Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. (#14037)
|
2021-02-11 18:58:46 +01:00 |
|
Sven Mika
|
a2f7998026
|
[RLlib] Issue #13342: Add validate_spaces to MB-MPO. (#14038)
|
2021-02-11 11:36:53 +01:00 |
|
Sven Mika
|
37c7daa3c0
|
[RLlib] DDPG: Support simplex action space. (#14011)
|
2021-02-10 15:10:01 +01:00 |
|
Sven Mika
|
eb0038612f
|
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584)
|
2021-02-08 15:02:19 +01:00 |
|
Chace Ashcraft
|
ebeee1d59a
|
[RLlib] Pytorch MAML fix for more than two workers with discrete actions (#13835)
|
2021-02-08 12:06:02 +01:00 |
|
Sven Mika
|
d001af3e59
|
[RLlib] Allow rllib rollout to run distributed via evaluation workers. (#13718)
|
2021-02-08 12:05:16 +01:00 |
|
Raoul Khouri
|
714c367b9d
|
[RLlib] Trainer._validate_config idempotency correction (issue 13427) (#13556)
|
2021-02-02 13:11:57 +01:00 |
|
Sven Mika
|
52c94b7ee9
|
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522)
|
2021-02-02 13:05:58 +01:00 |
|
Maltimore
|
b4702de1c2
|
[RLlib] move evaluation to trainer.step() such that the result is properly logged (#12708)
|
2021-01-25 12:56:00 +01:00 |
|
Sven Mika
|
9423930bcc
|
[RLlib] MAML: Add cartpole mass test for PyTorch. (#13679)
|
2021-01-25 12:32:41 +01:00 |
|
Sven Mika
|
d629292d63
|
[RLlib] Add grad_clip config option to MARWIL and stabilize grad clipping against inf global_norms. (#13634)
|
2021-01-22 19:36:02 +01:00 |
|