Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. ( #10825 )
2020-09-20 11:27:02 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. ( #10420 )
2020-09-02 14:03:01 +02:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes ( #9956 )
2020-08-07 16:49:49 -07:00
Eric Liang
5acd3e66dd
[rllib] Fix torch TD error, IMPALA LR updates ( #9477 )
* update
* add test
* lint
* fix super call
* speed es test up
2020-07-23 12:50:25 -07:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. ( #8752 )
2020-07-11 22:06:35 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). ( #9136 )
* WIP.
* Fixes.
* LINT.
* WIP.
* WIP.
* Fixes.
* Fixes.
* Fixes.
* Fixes.
* WIP.
* Fixes.
* Test
* Fix.
* Fixes and LINT.
* Fixes and LINT.
* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
5c6d5d4ab1
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). ( #8970 )
2020-06-27 20:50:01 +02:00
Sven Mika
4fd8977eaf
[RLlib] Minor cleanup in preparation to tf2.x support. ( #9130 )
* WIP.
* Fixes.
* LINT.
* Fixes.
* Fixes and LINT.
* WIP.
2020-06-25 19:01:32 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. ( #8932 )
2020-06-16 08:52:20 +02:00
Sven Mika
4ed796a7d6
[RLlib] Add testing Policy.compute_single_action() for all agents. ( #8903 )
2020-06-13 17:51:50 +02:00
Sven Mika
c74dc58f8b
[RLlib] Fix use_lstm flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. ( #8734 )
2020-06-05 15:40:30 +02:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... ( #8520 )
2020-05-27 16:19:13 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers ( #8345 )
2020-05-21 10:16:18 -07:00
Eric Liang
9d012626e5
[rllib] Distributed exec workflow for impala ( #8321 )
2020-05-11 20:24:43 -07:00
Sven Mika
c7cb2f5416
[RLlib] IMPALA PyTorch GPU fixes ( #8397 )
2020-05-11 22:03:27 +02:00
Sven Mika
754290daad
[RLlib] Add light-weight Trainer.compute_action() tests for all Algos. ( #8356 )
2020-05-08 16:31:31 +02:00
Sven Mika
166bb5d690
[RLlib] IMPALA PyTorch ( #8287 )
This PR adds an IMPALA PyTorch implementation.
- adds compilation tests for LSTM and w/o LSTM.
- adds learning test for CartPole.
2020-05-03 13:44:25 +02:00
Sven Mika
499ad5fbe4
[RLlib] PyTorch version of APPO. ( #8120 )
- Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases.
- Add learning test cases for APPO torch (both w/ and w/o v-trace).
- Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).
2020-04-23 09:11:12 +02:00
Sven Mika
e153e3179f
[RLlib] Exploration API: Policy changes needed for forward pass noisifications. ( #7798 )
* Rollback.
* WIP.
* WIP.
* LINT.
* WIP.
* Fix.
* Fix.
* Fix.
* LINT.
* Fix (SAC does currently not support eager).
* Fix.
* WIP.
* LINT.
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/evaluation/sampler.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/utils/exploration/exploration.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* WIP.
* Fix.
* LINT.
* LINT.
* Fix and LINT.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Fix.
* Fix and LINT.
* Update rllib/utils/exploration/exploration.py
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Update rllib/policy/dynamic_tf_policy.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Fixes.
* LINT.
* WIP.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-01 00:43:21 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length ( #7503 )
* bulk rename
* deprecation warn
* update doc
* update fig
* line length
* rename
* make pytest compatible
* fix test
* fi sys
* rename
* wip
* fix more
* lint
* update svg
* comments
* lint
* fix use of batch steps
2020-03-14 12:05:04 -07:00
Sven Mika
20ef4a8603
[RLlib] Cleanup/unify all test cases. ( #7533 )
2020-03-11 20:39:47 -07:00
Sven Mika
d537e9f0d8
[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). ( #7155 )
2020-02-19 12:18:45 -08:00
Eric Liang
399424c418
[rllib] Fix broken check in eval mode for IMPALA #7217
2020-02-19 11:54:30 -08:00
Sven Mika
2e60f0d4d8
[RLlib] Move all jenkins RLlib-tests into bazel (rllib/BUILD). ( #7178 )
* commit
* comment
2020-02-15 14:50:44 -08:00
Sven Mika
e6227082bd
[RLlib] Add torch flag to train.py ( #6807 )
2020-01-17 18:48:44 -08:00
Sven
60d4d5e1aa
Remove future imports ( #6724 )
* Remove all __future__ imports from RLlib.
* Remove (object) again from tf_run_builder.py::TFRunBuilder.
* Fix 2xLINT warnings.
* Fix broken appo_policy import (must be appo_tf_policy)
* Remove future imports from all other ray files (not just RLlib).
* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).
* Add two empty lines before Schedule class.
* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. ( #6658 )
2020-01-02 17:42:13 -08:00
gehring
b520f6141e
[rllib] Adds eager support with a generic TFEagerPolicy class ( #5436 )
2019-08-23 14:21:11 +08:00
Eric Liang
e2e30ca507
Ray, Tune, and RLlib support for memory, object_store_memory options ( #5226 )
2019-08-21 23:01:10 -07:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions ( #5304 )
2019-08-10 14:05:12 -07:00
Eric Liang
592f313210
[rllib] Centralized critic / PPO example on TwoStepGame ( #5392 )
2019-08-08 14:03:28 -07:00
Matthew A. Wright
e3c9f7e83a
Custom action distributions ( #5164 )
* custom action dist wip
* Test case for custom action dist
* ActionDistribution.get_parameter_shape_for_action_space pattern
* Edit exception message to also suggest using a custom action distribution
* Clean up ModelCatalog.get_action_dist
* Pass model config to ActionDistribution constructors
* Update custom action distribution test case
* Name fix
* Autoformatter
* parameter shape static methods for torch distributions
* Fix docstring
* Generalize fake array for graph initialization
* Fix action dist constructors
* Correct parameter shape static methods for multicategorical and gaussian
* Make suggested changes to custom action dist's
* Correct instances of not passing model config to action dist
* Autoformatter
* fix tuple distribution constructor
* bugfix
2019-08-06 11:13:16 -07:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir ( #5324 )
2019-08-05 23:25:49 -07:00