Sven Mika
|
16ad46a654
|
[RLlib] Fix broken test_r2d2.py. (#19017)
|
2021-09-30 21:19:37 +02:00 |
|
Sven Mika
|
ac3371a148
|
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. (#18917)
|
2021-09-30 16:39:38 +02:00 |
|
Sven Mika
|
ed85f59194
|
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879)
|
2021-09-30 16:39:05 +02:00 |
|
Sven Mika
|
828f5d26b7
|
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. (#18921)
|
2021-09-30 15:03:37 +02:00 |
|
Sven Mika
|
61a1274619
|
[RLlib] No Preprocessors (part 2). (#18468)
|
2021-09-23 12:56:45 +02:00 |
|
Sven Mika
|
a96dbd885b
|
[RLlib] Reinstate trajectory view API tests. (#18809)
|
2021-09-23 08:31:51 +02:00 |
|
Sven Mika
|
e6aae61487
|
[RLlib; testing] Fix bug in stress tests not handling >1 trials per experiment (due to grid-search in IMPALA stress tests). (#18705)
|
2021-09-20 15:31:57 +02:00 |
|
Sven Mika
|
ba1c489b79
|
[RLlib Testing] Lower --smoke-test "time_total_s" to make sure it doesn't time out. (#18670)
|
2021-09-16 18:22:23 +02:00 |
|
Sven Mika
|
8a72824c63
|
[RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591)
|
2021-09-15 22:16:48 +02:00 |
|
Sven Mika
|
3f89f35e52
|
[RLlib] Better error messages and hints; + failure-mode tests; (#18466)
|
2021-09-10 16:52:47 +02:00 |
|
Sven Mika
|
8a066474d4
|
[RLlib] No Preprocessors; preparatory PR #1 (#18367)
|
2021-09-09 08:10:42 +02:00 |
|
Sven Mika
|
45f60e51a9
|
[RLlib] DDPPO fixes and benchmarks. (#18390)
|
2021-09-08 19:39:01 +02:00 |
|
Sven Mika
|
cabaa3b3c6
|
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. (#18381)
|
2021-09-07 11:48:41 +02:00 |
|
Sven Mika
|
5292b70fc6
|
[RLlib] Add multi-GPU attention net tests to nightly test suite (+ R2D2 tests for LSTM and attention nets). (#18368)
|
2021-09-06 17:48:05 +02:00 |
|
Sven Mika
|
e3e6ed7aaa
|
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358)
|
2021-09-06 12:14:20 +02:00 |
|
Sven Mika
|
59f796edf3
|
[RLlib] Fix crash when using StochasticSampling exploration (most PG-style algos) w/ tf and numpy > 1.19.5 (#18366)
|
2021-09-06 12:14:00 +02:00 |
|
Sven Mika
|
a772c775cd
|
[RLlib] Set random seed (if provided) to Trainer process as well. (#18307)
|
2021-09-04 11:02:30 +02:00 |
|
Kai Fricke
|
ac5d255c9c
|
[rllib/docker] silent unzip of atari roms (#18340)
|
2021-09-03 17:55:03 +01:00 |
|
Sven Mika
|
9a8ca6a69d
|
[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306)
|
2021-09-03 13:29:57 +02:00 |
|
Kai Fricke
|
fb38d06cfb
|
Move RLLib GPU release test dependencies to ml docker (#18208)
|
2021-09-03 09:35:18 +01:00 |
|
gjoliver
|
336e79956a
|
[RLlib] Make MultiAgentEnv inherit gym.Env to avoid direct class type manipulation (#18156)
|
2021-09-03 08:02:05 +02:00 |
|
Sven Mika
|
2357bbc0c8
|
[RLlib] Issue 18231: Better (earlier) env validation and error message improvement. (#18249)
|
2021-09-02 09:28:16 +02:00 |
|
gjoliver
|
6621bb5611
|
[RLlib] Minor renaming and cleanups related to last rollout worker seed fix. (#18155)
|
2021-09-02 06:57:46 +02:00 |
|
Sven Mika
|
a7670d9fab
|
[RLlib; Testing] Fix smoke-test settings for nightly learning_tests and stress_test; Add pybullet_envs to app-config. (#18274)
|
2021-09-01 21:46:06 +02:00 |
|
Sven Mika
|
82465f9342
|
[RLlib] Better PolicyServer example (w/ or w/o tune) and add printing out actual listen port address in log-level=INFO. (#18254)
|
2021-08-31 22:03:23 +02:00 |
|
Sven Mika
|
4888d7c9af
|
[RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999)
|
2021-08-31 12:21:49 +02:00 |
|
Sven Mika
|
a428f10ebe
|
[RLlib] Add multi-GPU learning tests to nightly. (#17778)
|
2021-08-18 17:21:01 +02:00 |
|
Julius Frost
|
9322f6aab5
|
[rllib] Fix classes decorated with @Deprecated to be classes instead of methods (#17666)
* fix deprecated classes from being methods
* format
|
2021-08-10 18:25:31 -07:00 |
|
Sven Mika
|
3013d9b341
|
[RLlib] Fix "Cannot convert a symbolic Tensor (default_policy/strided_slice_3:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported" (#17587)
|
2021-08-05 11:39:15 -04:00 |
|
Sven Mika
|
5107d16ae5
|
[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530)
|
2021-08-03 18:30:02 -04:00 |
|
Sven Mika
|
924f11cd45
|
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371)
|
2021-08-03 11:35:49 -04:00 |
|
Sven Mika
|
8a844ff840
|
[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use ray.get_gpu_ids() (b/c no GPUs assigned by ray). (#17444)
|
2021-08-02 17:29:59 -04:00 |
|
Sven Mika
|
90b21ce27e
|
[RLlib] De-flake 3 test cases; Fix config.simple_optimizer and SampleBatch.is_training warnings. (#17321)
|
2021-07-27 14:39:06 -04:00 |
|
Vince Jankovics
|
05c9dfbbda
|
[RLlib] CV2 to Skimage dependency change (#16841)
|
2021-07-21 22:24:18 -04:00 |
|
Sven Mika
|
5a313ba3d6
|
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)
|
2021-07-20 14:58:13 -04:00 |
|
Sven Mika
|
18d173b172
|
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031)
|
2021-07-19 13:16:03 -04:00 |
|
Sven Mika
|
e0640ad0dc
|
[RLlib] Fix seeding for ES and ARS. (#16744)
|
2021-07-19 13:13:05 -04:00 |
|
Sven Mika
|
649580d735
|
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046)
|
2021-07-15 05:51:24 -04:00 |
|
Sven Mika
|
1fd0eb805e
|
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014)
|
2021-07-13 14:01:30 -04:00 |
|
Amog Kamsetty
|
38b5b6d24c
|
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036)
This reverts commit e4123fff27.
|
2021-07-13 09:57:15 -07:00 |
|
Sven Mika
|
e4123fff27
|
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)
|
2021-07-13 06:38:14 -04:00 |
|
Amog Kamsetty
|
bc33dc7e96
|
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action." (#17002)
This reverts commit 7862dd64ea.
|
2021-07-12 11:09:14 -07:00 |
|
Sven Mika
|
7862dd64ea
|
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. (#16774)
|
2021-07-08 17:31:34 +02:00 |
|
Sven Mika
|
9f6a92163b
|
[RLlib] Remove old UsageTrackingDict code. (#16867)
|
2021-07-08 17:27:52 +02:00 |
|
Kai Fricke
|
10fd7111b3
|
[rllib] Improve test learning check, fix flaky two step qmix (#16843)
|
2021-07-06 19:39:12 +01:00 |
|
Sven Mika
|
7eb1a29426
|
[RLlib] Fix ModelV2 custom metrics for torch. (#16734)
|
2021-07-01 13:01:40 +02:00 |
|
Sven Mika
|
53206dd440
|
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531)
|
2021-06-30 12:32:11 +02:00 |
|
mvindiola1
|
82a3ff795c
|
[RLlib] ensure curiosity exploration actions are passed in as tf tens… (#15704)
|
2021-06-21 10:03:17 -07:00 |
|
Sven Mika
|
d0014cd351
|
[RLlib] Policies get/set_state fixes and enhancements. (#16354)
|
2021-06-15 13:08:43 +02:00 |
|
Sven Mika
|
2d34216660
|
[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762)
|
2021-05-20 09:27:03 +02:00 |
|