kourosh hakhamaneshi
8d848890f1
[RLlib] Fix default view_requirement in policy.py ( #27255 )
2022-08-02 10:44:07 -07:00
Steven Morad
d0a8e3c36f
[RLlib] User-friendly RNN sequencing. ( #27087 )
2022-08-01 15:32:22 +02:00
Steven Morad
77318abfaf
[RLlib] Warn on PPO infinite KL loss term. ( #26629 )
2022-08-01 12:55:26 +02:00
Jun Gong
acf2bf9b2f
[RLlib] Get rid of all these deprecation warnings. ( #27085 )
2022-07-27 10:48:54 -07:00
Jun Gong
54df8bfe42
[RLlib] Try to checkpoint a durable policy name ( #27016 )
2022-07-27 00:01:14 -07:00
kourosh hakhamaneshi
8ddcf89096
[RLlib] Implemented ViewRequirementConnector ( #26998 )
2022-07-26 21:52:14 -07:00
kourosh hakhamaneshi
5030a4c1d3
[RLlib] Simplify agent collector ( #26803 )
2022-07-25 13:17:17 -07:00
Artur Niederfahrenhorst
e9a8f7d9ae
[RLlib] Unify gnorm mixin for tf and torch policies. ( #26102 )
2022-07-24 15:31:09 +02:00
Rohan Potdar
2f22262d39
[RLlib]: Fix SampleBatch.split_by_episode to use dones if episode id is not available ( #26492 )
2022-07-22 16:46:05 -07:00
Jun Gong
6b6d3017ba
[RLlib] more connector polishes and fixes. ( #26645 )
2022-07-19 08:50:28 -07:00
Ishant Mrinal
57244aeee3
[RLlib] Make DQN update_target use only trainable variables. ( #25226 )
2022-07-15 09:17:06 +02:00
Jun Gong
b383d987d1
[RLlib] Fix a bunch of issues related to connectors. ( #26510 )
2022-07-13 18:55:20 +02:00
Jun Gong
0c469e490e
[RLlib] Checkpoint and restore connectors. ( #26253 )
2022-07-09 01:06:24 -07:00
Steven Morad
0bc465f687
[RLlib] Fix docstring and add unit tests for rnn sequencing. ( #26197 )
2022-07-06 14:32:57 +02:00
Jun Gong
d83bbda281
[RLlib] Save serialized PolicySpec. Extract num_gpus related logics into a util function. ( #25954 )
2022-06-30 11:38:21 +02:00
Jun Gong
52bb8e47d4
[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. ( #25922 )
2022-06-30 08:44:10 +02:00
Charles Sun
70f94e6d63
[RLlib] Migrating DDPG to PolicyV2. ( #26054 )
2022-06-28 15:52:56 +02:00
Sven Mika
3d6df50258
[RLlib] Fix get_num_samples_loaded_into_buffer in TorchPolicyV2. ( #25956 )
2022-06-22 13:11:41 +02:00
Eric Liang
43aa2299e6
[api] Annotate as public / move ray-core APIs to _private and add enforcement rule ( #25695 )
Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.
2022-06-21 15:13:29 -07:00
Avnish Narayan
d859b84058
[RLlib] Add compute log likelihoods test for CRR. ( #25905 )
2022-06-21 16:06:10 +02:00
Artur Niederfahrenhorst
e10876604d
[RLlib] Include SampleBatch.T column in all collected batches. ( #25926 )
2022-06-21 13:20:22 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. ( #25869 )
2022-06-20 15:54:00 +02:00
Sven Mika
d90c6cfbd6
[RLlib] SimpleQ PolicyV2 (sub-classing). ( #25871 )
2022-06-17 20:12:16 +02:00
Avnish Narayan
393cf4d8f7
[RLlib] Fix action_sampler_fn call in TorchPolicyV2 (obs_batch instead of input_dict arg). ( #25877 )
2022-06-17 08:39:39 +02:00
kourosh hakhamaneshi
f597e21ac8
[RLlib] Fix sample batch concat samples. ( #25572 )
2022-06-14 12:47:29 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. ( #24683 )
2022-06-10 16:47:51 +02:00
Artur Niederfahrenhorst
7495e9c89c
[RLlib] Dreamer Policy sub-classing schema. ( #25585 )
2022-06-09 17:14:15 +02:00
Artur Niederfahrenhorst
5133978adc
[RLlib] PG policy subclassing conversion. ( #25288 )
2022-06-06 13:07:47 +02:00
kourosh hakhamaneshi
d49d0efbaf
[RLlib] Bug fix: when on GPU, sample_batch.to_device() only converts the device and does not convert float64 to float32. ( #25460 )
2022-06-06 12:43:11 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )" ( #25420 )
This reverts commit e4ceae19ef.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never failed before this PR.
It also fails in the CI of the reverted PR (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128), so the failure is very likely caused by that PR.
The failing test output also appears to be related (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b).
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )
2022-06-02 16:47:05 +02:00
Artur Niederfahrenhorst
71a8a443ce
[RLlib] Fix Policy global timesteps being off by init sample batch size. ( #25349 )
2022-06-02 10:19:21 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule ( #25272 )
2022-06-01 11:27:54 -07:00
Sven Mika
d95009a3ac
[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). ( #24967 )
2022-05-28 10:50:03 +02:00
Sven Mika
ab6c3027e5
[RLlib] A2/3C policy sub-classing schema. ( #25078 )
2022-05-28 09:54:47 +02:00
kourosh hakhamaneshi
9684ea3af6
[RLlib] Fix TorchPolicyV2 bug. ( #25203 )
2022-05-26 20:49:26 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. ( #25117 )
2022-05-25 14:38:03 +02:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT ( #25060 )
2022-05-24 22:14:25 -07:00
Jun Gong
93ff0beb4e
[RLlib] Introduce utils to serialize gym Spaces (and thus ViewRequirements). ( #25007 )
2022-05-24 21:12:20 +02:00
Steven Morad
501d932449
[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects ( #25059 )
2022-05-22 19:58:47 +02:00
Jun Gong
d5a6d46049
[RLlib] Migrate MAML, MB-MPO, MARWIL, and BC to use Policy sub-classing implementation. ( #24914 )
2022-05-20 14:10:59 +02:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits ( #24896 )
2022-05-19 18:30:42 +02:00
Jun Gong
dea134a472
[RLlib] Clean up Policy mixins. ( #24746 )
2022-05-17 17:16:08 +02:00
Sven Mika
25001f6d8d
[RLlib] APPO Training iteration fn. ( #24545 )
2022-05-17 10:31:07 +02:00
Jun Gong
bc3a1d35cf
[RLlib] Introduce new policy base classes. ( #24742 )
2022-05-13 21:48:30 +02:00
Artur Niederfahrenhorst
bd2fdf4752
[RLlib] Automate sequences in timeslice_along_seq_lens_with_overlap(). ( #24561 )
2022-05-09 11:55:06 +02:00
Daewoo Lee
fee35444ab
[RLlib] Issue 24530: Fix add_time_dimension ( #24531 )
Co-authored-by: Daewoo Lee <dwlee@rtst.co.kr>
2022-05-06 15:21:42 +02:00
Edward Oakes
11954e6798
Issue 24143: Fix a few f-strings missing the f. ( #24232 )
2022-05-02 16:11:33 +02:00