Sven Mika
e2be41b407
[RLlib] MARWIL + BC: Various fixes and enhancements. ( #16218 )
2021-06-03 22:29:00 +02:00
Sven Mika
f6302d81be
[RLlib] Discussion 2210: BC algo broken, if "advantages" missing in offline data. ( #16019 )
2021-05-25 08:47:17 +02:00
Sven Mika
eaa7f6696d
[RLlib] Issue 15887: MARWIL adv norm update mismatch for tf (static-graph) vs torch versions. ( #15898 )
2021-05-19 15:44:11 -07:00
Michael Luo
474f04e322
[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup ( #14707 )
2021-05-19 16:32:29 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support ( #15841 )
2021-05-18 11:10:46 +02:00
Sven Mika
c4a3e1589b
[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. ( #15761 )
2021-05-13 09:17:23 +02:00
Michael Luo
4cbe13cdfd
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. ( #15603 )
...
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-05-04 19:06:19 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). ( #14684 )
2021-04-27 10:44:54 +02:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data
from code. ( #15335 )
2021-04-15 19:19:51 +02:00
Sven Mika
1bb70e4907
[RLlib] Issue 14523: Torch + py3.8 leads to GPU device error. ( #15014 )
2021-03-30 21:43:11 +02:00
Sven Mika
6708211b59
[RLlib] JSONReader: Mix files if > 1 at beginning (each worker should start with different file). ( #14865 )
2021-03-24 16:07:40 +01:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. ( #14860 )
2021-03-23 09:50:18 -07:00
Sven Mika
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. ( #13065 )
2021-03-17 08:18:15 +01:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. ( #13393 )
2021-03-08 15:41:27 +01:00
Sven Mika
eb0038612f
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. ( #13584 )
2021-02-08 15:02:19 +01:00
Sven Mika
d629292d63
[RLlib] Add grad_clip config option to MARWIL and stabilize grad clipping against inf global_norms. ( #13634 )
2021-01-22 19:36:02 +01:00
Sven Mika
a65ee92b69
[RLlib] MARWIL loss function test case and cleanup. ( #13455 )
2021-01-19 09:51:05 +01:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. ( #13064 )
2020-12-27 09:46:03 -05:00
Sven Mika
99ae7bae05
[RLlib] JAXPolicy prep. PR #1 . ( #13077 )
2020-12-26 20:14:18 -05:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be confgurable in agent-steps (rather than env-steps), if needed. ( #12420 )
2020-12-08 16:41:45 -08:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1 : Smaller cleanups. ( #12447 )
...
* WIP.
* Fix.
* Fix.
* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
bfc4f95e01
[RLlib] Fix test_bc.py test case. ( #11722 )
...
* Fix large json test file.
* Fix large json test file.
* WIP.
2020-10-31 00:16:09 -07:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). ( #11033 )
2020-10-06 20:28:16 +02:00
Eric Liang
ecdaaffc67
add large data warning ( #10957 )
2020-09-23 15:46:06 -07:00
Julius Frost
e72838c03d
[RLLib] Add missing .to() for MARWIL on PyTorch ( #10685 )
...
There was a missing .to() that caused a device mismatch error on PyTorch with MARWIL.
2020-09-09 18:52:55 -07:00
Sven Mika
4b278c36fc
[RLlib] Behavioral Cloning (from MARWIL). ( #10619 )
2020-09-09 17:33:21 +02:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all
script & backfill lint fixes ( #9956 )
2020-08-07 16:49:49 -07:00
Sven Mika
617eb8f279
[RLlib] Issue 9402 MARWIL producing nan rewards. ( #9429 )
2020-07-14 05:07:16 +02:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. ( #8752 )
2020-07-11 22:06:35 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()
). ( #9136 )
...
* WIP.
* Fixes.
* LINT.
* WIP.
* WIP.
* Fixes.
* Fixes.
* Fixes.
* Fixes.
* WIP.
* Fixes.
* Test
* Fix.
* Fixes and LINT.
* Fixes and LINT.
* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
4fd8977eaf
[RLlib] Minor cleanup in preparation to tf2.x support. ( #9130 )
...
* WIP.
* Fixes.
* LINT.
* Fixes.
* Fixes and LINT.
* WIP.
2020-06-25 19:01:32 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils
cleanup. ( #8932 )
2020-06-16 08:52:20 +02:00
Sven Mika
4ed796a7d6
[RLlib] Add testing Policy.compute_single_action()
for all agents. ( #8903 )
2020-06-13 17:51:50 +02:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch
in favor of framework=...
( #8520 )
2020-05-27 16:19:13 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers ( #8345 )
2020-05-21 10:16:18 -07:00
Sven Mika
754290daad
[RLlib] Add light-weight Trainer.compute_action()
tests for all Algos. ( #8356 )
2020-05-08 16:31:31 +02:00
Sven Mika
c2cb5c2214
[RLlib] MARWIL torch. ( #7836 )
...
* WIP.
* WIP.
* LINT.
* Fix MARWIL so it can run with eager-mode.
* LINT.
2020-04-06 16:38:50 -07:00
roireshef
3c60caa448
[rllib] implemented compute_advantages without gae ( #6941 )
2020-01-31 22:25:45 -08:00
Jaroslaw Rzepecki
67319bc887
[RLlib] Update MARWIL to use tf policy template ( #6975 )
...
* update MARWIL to use tf policy template
* formatting fixes
2020-01-31 12:57:52 -08:00
Sven Mika
303547f119
[RLlib] Policy-classes cleanup and torch/tf unification. ( #6770 )
2020-01-17 22:26:28 -08:00
Sven Mika
e6227082bd
[RLlib] Add torch
flag to train.py ( #6807 )
2020-01-17 18:48:44 -08:00
Sven
60d4d5e1aa
Remove future imports ( #6724 )
...
* Remove all __future__ imports from RLlib.
* Remove (object) again from tf_run_builder.py::TFRunBuilder.
* Fix 2xLINT warnings.
* Fix broken appo_policy import (must be appo_tf_policy)
* Remove future imports from all other ray files (not just RLlib).
* Remove future imports from all other ray files (not just RLlib).
* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).
* Add two empty lines before Schedule class.
* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. ( #6658 )
2020-01-02 17:42:13 -08:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions ( #5304 )
2019-08-10 14:05:12 -07:00
Matthew A. Wright
e3c9f7e83a
Custom action distributions ( #5164 )
...
* custom action dist wip
* Test case for custom action dist
* ActionDistribution.get_parameter_shape_for_action_space pattern
* Edit exception message to also suggest using a custom action distribution
* Clean up ModelCatalog.get_action_dist
* Pass model config to ActionDistribution constructors
* Update custom action distribution test case
* Name fix
* Autoformatter
* parameter shape static methods for torch distributions
* Fix docstring
* Generalize fake array for graph initialization
* Fix action dist constructors
* Correct parameter shape static methods for multicategorical and gaussian
* Make suggested changes to custom action dist's
* Correct instances of not passing model config to action dist
* Autoformatter
* fix tuple distribution constructor
* bugfix
2019-08-06 11:13:16 -07:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir ( #5324 )
2019-08-05 23:25:49 -07:00