500 commits

Author SHA1 Message Date
desktable
5af745c90d
[RLlib] Implement the SlateQ algorithm (#11450) 2020-11-03 09:52:04 +01:00
Lara Codeca
e735add268
[RLlib] Integration with SUMO Simulator (#11710) 2020-11-03 09:45:03 +01:00
dHannasch
8346dedc3a
Fix the linter failure. (#11755) 2020-11-02 18:02:15 +01:00
bcahlit
26176ec570
[RLlib] Fix epsilon_greedy on nested_action_spaces only in pytorch (#11453)
* [RLlib] Fix epsilon_greedy on nested_action_spaces only in pytorch

* epsilon_greedy on Continuous action

* formatt

* Fix error

* fix format

* fix bug

* increase speed

* Update rllib/utils/exploration/epsilon_greedy.py

* Update rllib/utils/exploration/epsilon_greedy.py

* Update rllib/utils/exploration/epsilon_greedy.py

Co-authored-by: Sven Mika <sven@anyscale.io>
2020-11-02 12:22:33 +01:00
Sven Mika
54d85a6c2a
[RLlib] Fix RNN learning for tf-eager/tf2.x. (#11720) 2020-11-02 11:18:41 +01:00
Sven Mika
bfc4f95e01
[RLlib] Fix test_bc.py test case. (#11722)
* Fix large json test file.

* Fix large json test file.

* WIP.
2020-10-31 00:16:09 -07:00
Jiajie Xiao
0b07af374a
allow tuple action space (#11429)
Co-authored-by: Jiajie Xiao <jj@Jiajies-MBP-2.attlocal.net>
2020-10-29 16:05:38 +01:00
mvindiola1
9e68b77796
[RLLIB] Wait for remote_workers to finish closing environments before terminating (#11476) 2020-10-28 14:23:06 -07:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 2020-10-27 10:00:24 +01:00
Kingsley Kuan
d1dd5d578e
[RLlib] Fix PyTorch A3C / A2C loss function using mixed reduced sum / mean (#11449) 2020-10-22 12:39:34 -07:00
Philsik Chang
ede9347127
[rllib] Add torch_distributed_backend flag for DDPPO (#11362) (#11425) 2020-10-21 18:30:42 -07:00
Eric Liang
e8c77e2847
Remove memory quota enforcement from actors (#11480)
* wip

* fix

* deprecate
2020-10-21 14:29:03 -07:00
Sven Mika
2aec77e305
[RLlib] Fix two test cases that only fail on Travis. (#11435) 2020-10-16 13:53:30 -05:00
Sven Mika
414041c6dd
[RLlib] Do not create env on driver iff num_workers > 0. (#11307) 2020-10-15 18:21:30 +02:00
Sven Mika
a6a94d3206
[RLlib] Fix test_env_with_subprocess.py. (#11356) 2020-10-13 12:42:20 -07:00
Sven Mika
1ebcdf236f
[RLlib] Add support for custom MultiActionDistributions. (#11311) 2020-10-12 13:50:43 -07:00
Sven Mika
0c0f67c14d
[RLlib] ARS/ES eval workers not working: Issue 9933. (#11308) 2020-10-12 13:49:48 -07:00
Sven Mika
8ea1bc5ff9
[RLlib] Allow for more than 2^31 policy timesteps. (#11301) 2020-10-12 13:49:11 -07:00
Sven Mika
f5e2cda68a
[RLlib] SAC: log_alpha not being learnt when on GPU. (#11298) 2020-10-12 13:48:44 -07:00
Julius Frost
7dcfd258cd
[RLlib] Assert LongTensor in SAC Discrete PyTorch (#11245) 2020-10-12 13:47:21 -07:00
Sven Mika
d3bc20b727
[RLlib] ConvTranspose2D module (#11231) 2020-10-12 15:00:42 +02:00
Sven Mika
957877ad3f
Tf version of VisionNet (ray/rllib/model/tf/vision_net.py) crashes iff len(conv-filters)=1. (#11330) 2020-10-11 12:49:47 +02:00
Thomas Tumiel
587319debc
[tune] move _SCHEDULERS to tune.schedulers and add all available schedulers (#11218)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-08 16:10:23 -07:00
desktable
8af9ff6dc2
[RLlib] Add MultiAgentEnv wrapper for Kaggle's football environment (#11249)
* [RLlib] Add MultiAgentEnv wrapper for Kaggle's football environment

* Add unit tests to BUILD

* Add gfootball dependency

* Revert the last two commits
2020-10-08 10:57:58 -07:00
desktable
f9621ce23c
[RLlib] Add recsim_wrapper unit test to BUILD (#11225) 2020-10-08 08:23:27 +02:00
Sumanth Ratna
14d8826e43
Fix overriden typo (#11227) 2020-10-07 19:11:07 -07:00
Anes Benmerzoug
ff3e411ea2
[rllib] Fix VectorEnv's check for the info object's type (#10982) 2020-10-07 15:00:37 -07:00
Edward Oakes
cd6936e60b
Deflake test_env_with_subprocess.py (#11257) 2020-10-07 16:19:40 -05:00
Sven Mika
199e5d0f75
[RLlib] Exploration class type annotations. (#11251) 2020-10-07 21:59:14 +02:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Philsik Chang
2b26d2ca1b
[rllib] Fix for Torch checkpoint taken on GPU fails to deserialize on CPU (#11071) (#11208) 2020-10-05 22:01:55 -07:00
desktable
56b56cf7a1
[RLlib] Add RecSim environment wrapper (#11205) 2020-10-05 09:05:02 +02:00
Sven Mika
c17169dc11
[RLlib] Fix all example scripts to run on GPUs. (#11105) 2020-10-02 23:07:44 +02:00
Michael Luo
47b499d899
Cartpole MAML + Discrete (#11028) 2020-10-02 12:56:34 +02:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Benjamin Black
4445f32798
[rllib] Fixed pettingzoo wrapper (#11060)
* fixed PettingZooEnv, relevant docs, examples

* fixed linting error

* pettingzoo wrapper fix

* fixed linting issue
2020-09-30 15:34:47 -07:00
Sven Mika
47eb6613b5
[RLlib] Remove unnecessary copies in compute_advantages. (#10897) 2020-09-29 12:25:20 +02:00
Sven Mika
f91c455527
[RLlib] Curiosity documentation. (#11066) 2020-09-29 09:39:22 +02:00
Eric Liang
8f79b4e45e
[rllib] Replay buffer size inaccurate with replay_seq_len option (#10988)
* support replay seq len

* update

* fix warn

* add test

* test
2020-09-25 13:47:23 -07:00
Eric Liang
609c1b8acd
Start moving ray internal files to _private module (#10994) 2020-09-24 22:46:35 -07:00
Eric Liang
ecdaaffc67
add large data warning (#10957) 2020-09-23 15:46:06 -07:00
Michael Luo
ba5a3ae9e2
Enable vtrace by default (#10962) 2020-09-22 22:18:21 -07:00
Eric Liang
daa03ba6e6
[rllib] Add execution module to package ref (#10941)
* add init

* add

* update
2020-09-21 23:03:06 -07:00
internetcoffeephone
840fb5543b
Change get_action_shape so that it uses the dtype of the Discrete object, rather than overwriting it with tf.int64. (#8424) 2020-09-21 17:08:31 -07:00
mvindiola1
2b893d1bb5
fix incorrect critic loss in TD3 (#10775)
Co-authored-by: Manny Vindiola <manuel.m.vindiola.civ@mail.mil>
2020-09-20 20:01:51 -07:00
Sven Mika
d7c42d6d92
[RLlib] Unity blogpost final fixes. (#10894) 2020-09-20 14:13:20 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sumanth Ratna
9da7bdcc8e
Use master for links to docs in source (#10866) 2020-09-19 00:30:45 -07:00
Benjamin Black
f2408b719c
Fixed PettingZooEnv (#10847) 2020-09-17 11:28:42 -07:00
Sven Mika
5c7b35d694
[RLlib] Issue 10833 TorchPolicy GPU. (#10834) 2020-09-17 09:04:46 +02:00