Commit graph

917 commits

Author SHA1 Message Date
Julius Frost
d7a5ec1830
[RLlib] SAC tuple observation space fix (#17356) 2021-07-28 12:39:28 -04:00
Sven Mika
0d8fce8fd8
[RLlib] Discussion 2294: Custom vector env example and fix. (#16083) 2021-07-28 10:40:04 -04:00
Rohan138
f30b444bac
[Rllib] set self._allow_unknown_config (#17335)
Co-authored-by: Sven Mika <sven@anyscale.io>
2021-07-28 11:48:41 +01:00
Sven Mika
58da5c1c9b
[RLlib] Discussion 3001: Fix comment on internal state shape (must be [B x S=state dim]). (#17341) 2021-07-27 21:41:53 -04:00
Sven Mika
90b21ce27e
[RLlib] De-flake 3 test cases; Fix config.simple_optimizer and SampleBatch.is_training warnings. (#17321) 2021-07-27 14:39:06 -04:00
Stefan Schneider
489febc6b2
[RLlib] Better example scripts: Description --no-tune and --local-mode CLI options (#17038) 2021-07-26 22:25:48 -04:00
Julius Frost
16be091702
[RLlib] Refactor if __name__ == "__main__" into main() method in rollout/train.py for better reusability (#17315) 2021-07-26 11:12:59 -04:00
Sven Mika
5231fdd996
[Testing] Split RLlib example scripts CI tests into 4 jobs (from 2). (#17331) 2021-07-26 10:52:55 -04:00
Sven Mika
0c5c70b584
[RLlib] Discussion 247: Allow remote sub-envs (within vectorized) to be used with custom APIs. (#17118) 2021-07-25 16:55:51 -04:00
Chris Bamford
29768a7c01
[RLLib] (P1 regression) Fixing view requirements in compute actions (#15856) 2021-07-25 14:25:07 -04:00
ddworak94
fba8461663
[RLlib] Add RNN-SAC agent (#16577)
Shoutout to @ddworak94 :)
2021-07-25 10:04:52 -04:00
Sven Mika
7bc4376466
[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). (#17077) 2021-07-22 10:59:13 -04:00
Richard Liaw
a78a2263e5
[RLlib] Fix reverted RockPaperScissors Pettingzoo example (#16896) 2021-07-22 10:55:07 -04:00
Vince Jankovics
05c9dfbbda
[RLlib] CV2 to Skimage dependency change (#16841) 2021-07-21 22:24:18 -04:00
Julius Frost
0b1b6222bc
[rllib] Add merge_trainer_config arguments to trainer template (#17160) 2021-07-21 15:43:06 -07:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 2021-07-20 14:58:13 -04:00
Amog Kamsetty
cb74053ee5
Retry remove gpustat dependency (#17115)
* remove gpustat

* move psutil imports
2021-07-19 11:14:10 -07:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031) 2021-07-19 13:16:03 -04:00
Sven Mika
e0640ad0dc
[RLlib] Fix seeding for ES and ARS. (#16744) 2021-07-19 13:13:05 -04:00
Sven Mika
649580d735
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046) 2021-07-15 05:51:24 -04:00
kk-55
13094a3f1c
AttributeError: 'numpy.ndarray' object has no attribute 'get_shape' when running with framework config tf2 or tfe (#16868) 2021-07-15 10:47:16 +01:00
Sven Mika
ce6dfc9b2d
[RLlib] Update tf1.x vs tf2.x documentation and eager example script. (#17030) 2021-07-13 20:02:17 -04:00
Grzegorz Bartyzel
d553d4da6c
[RLlib] DQN (Rainbow): Fix torch noisy layer support and loss (#16716) 2021-07-13 16:48:06 -04:00
Sven Mika
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014) 2021-07-13 14:01:30 -04:00
Antoine Galataud
16f1011c07
[RLlib] Issue 15910: APEX current learning rate not updated on local worker (#15911) 2021-07-13 14:01:00 -04:00
Amog Kamsetty
38b5b6d24c
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036)
This reverts commit e4123fff27.
2021-07-13 09:57:15 -07:00
Kai Fricke
27d80c4c88
[RLlib] ONNX export for tensorflow (1.x) and torch (#16805) 2021-07-13 12:38:11 -04:00
Kai Fricke
3380b68b54
[RLlib] Issue 16683: Fix last infos dict (#16999). 2021-07-13 11:33:48 -04:00
Sven Mika
e4123fff27
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565) 2021-07-13 06:38:14 -04:00
Amog Kamsetty
df3dd81348
[rllib] skip highly flaky tests (#17010) 2021-07-12 11:18:28 -07:00
Amog Kamsetty
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action." (#17002)
This reverts commit 7862dd64ea.
2021-07-12 11:09:14 -07:00
Sven Mika
55a90e670a
[RLlib] Trainer.add_policy() not working for tf, if added policy is trained afterwards. (#16927) 2021-07-11 23:41:38 +02:00
Julius Frost
a88b217d3f
[rllib] Enhancements to Input API for customizing offline datasets (#16957)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-10 15:05:25 -07:00
Francesco Stranieri
01c533c171
[rlib] Independent bound for each dimension AssertionError #16845 (#16860)
* Fix AssertionError for Box space type

Restored support for Box space type with independent bound for each dimension.

* Removed unnecessary assertion for Box space type
2021-07-10 14:48:35 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. (#16774) 2021-07-08 17:31:34 +02:00
Sven Mika
9f6a92163b
[RLlib] Remove old UsageTrackingDict code. (#16867) 2021-07-08 17:27:52 +02:00
Kai Fricke
10fd7111b3
[rllib] Improve test learning check, fix flaky two step qmix (#16843) 2021-07-06 19:39:12 +01:00
Amog Kamsetty
ecb632140f
Revert "RockPaperScissors Pettingzoo" (#16886)
This reverts commit bf3e3225b6.
2021-07-06 09:43:47 -07:00
Rodrigo de Lazcano
bf3e3225b6
RockPaperScissors Pettingzoo (#16725) 2021-07-05 09:52:08 -07:00
Julius Frost
7842bda50a
[rllib] Fix to allow input strings that are not file paths (#16830) 2021-07-03 01:12:47 -07:00
Amog Kamsetty
33f31f53c8
[Rllib] Torch Backwards Compatibility (#16813) 2021-07-01 19:17:54 -07:00
Rodrigo de Lazcano
5072d86323
[rllib] parallel pettingzoo import (#16722) 2021-07-01 18:37:59 -07:00
Julius Frost
ada0552f16
[rllib] d4rl: fix for paths with multiple periods (#16721) 2021-07-01 18:35:50 -07:00
Sven Mika
7eb1a29426
[RLlib] Fix ModelV2 custom metrics for torch. (#16734) 2021-07-01 13:01:40 +02:00
Sven Mika
ce3e550c43
[RLlib] Enhance comment in example script multi_agent_custom_policy. (#16740) 2021-07-01 10:28:38 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
Amog Kamsetty
abd16a8438
[RLlib] Skip two_step_game_qmix test (#16758) 2021-06-29 14:27:48 -07:00
Travis Addair
e5dfa4cfb9
[tune] Only use TBXLoggerCallback when torch is installed (#16695)
* [tune] Only use TBXLoggerCallback when torch is installed

* Fix lint

* fix

* Update python/ray/tune/utils/callback.py

Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2021-06-28 16:34:20 -07:00
Amog Kamsetty
be1f6d59fa
[CI] Re-try Tag rllib flaky tests (#16680) 2021-06-28 18:42:54 +02:00
AnnaKosiorek
1e709771b2
[rllib][minor] clarification of the softmax axis in dqn_torch_policy (#16311)
pytorch nn.functional.softmax (unlike tf.nn.softmax) calculates softmax along zeroth dimension by default
2021-06-26 11:19:54 -07:00