30 commits

Author SHA1 Message Date
gjoliver
9226f9bddc
[RLlib] Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. (#19264)
* Report timesteps_this_iter to Tune, so it can track/checkpoint/restore
total timesteps trained.

* Trigger Build

* lint
2021-10-12 16:03:41 +02:00
Sven Mika
698b4eeed3
[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) 2021-09-21 22:00:14 +02:00
Sven Mika
3803e796ff
[RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. (#18540) 2021-09-13 19:27:53 +02:00
Sven Mika
ea4a22249c
[RLlib] Add simple action-masking example script/env/model (tf and torch). (#18494) 2021-09-11 23:08:09 +02:00
Sven Mika
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 2021-08-31 14:56:53 +02:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 2021-07-20 14:58:13 -04:00
Sven Mika
169ddabae7
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429) 2021-06-19 22:42:00 +02:00
Sven Mika
bdda73e2dd
[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421)
Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!
2021-04-22 11:29:42 +02:00
Sven Mika
c90de315e5
[RLlib] APEX returns incorrect default resources (PlacementGroupFactory) colocated missing replay actors. (#15295) 2021-04-15 16:50:42 +01:00
Sven Mika
ef944bc5f0
[RLlib] Re-enable placement group support for RLlib. (#14384) 2021-03-05 08:16:24 +01:00
Eric Liang
9db000ff2c
Auto report object store memory usage; remove some deprecated code (#14260) 2021-03-01 13:19:44 -08:00
Richard Liaw
a2d2275ee1
Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360)
This reverts commit 6cd0cd3bd9.
2021-02-25 14:27:35 -08:00
Sven Mika
6cd0cd3bd9
[RLlib + Tune] Add placement group support to RLlib. (#14289) 2021-02-25 16:01:31 +01:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238) 2021-01-19 14:22:36 +01:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* remove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Eric Liang
5acd3e66dd
[rllib] Fix torch TD error, IMPALA LR updates (#9477)
* update

* add test

* lint

* fix super call

* speed es test up
2020-07-23 12:50:25 -07:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Eric Liang
9d012626e5
[rllib] Distributed exec workflow for impala (#8321) 2020-05-11 20:24:43 -07:00
Sven Mika
166bb5d690
[RLlib] IMPALA PyTorch (#8287)
This PR adds an IMPALA PyTorch implementation.

- adds compilation tests for LSTM and w/o LSTM.
- adds learning test for CartPole.
2020-05-03 13:44:25 +02:00
Sven Mika
499ad5fbe4
[RLlib] PyTorch version of APPO. (#8120)
- Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases.
- Add learning test cases for APPO torch (both w/ and w/o v-trace).
- Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).
2020-04-23 09:11:12 +02:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length (#7503)
* bulk rename

* deprecation warn

* update doc

* update fig

* line length

* rename

* make pytest compatible

* fix test

* fi sys

* rename

* wip

* fix more

* lint

* update svg

* comments

* lint

* fix use of batch steps
2020-03-14 12:05:04 -07:00
Sven Mika
d537e9f0d8
[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155) 2020-02-19 12:18:45 -08:00
Sven Mika
e6227082bd
[RLlib] Add torch flag to train.py (#6807) 2020-01-17 18:48:44 -08:00
Sven
60d4d5e1aa
Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Eric Liang
e2e30ca507
Ray, Tune, and RLlib support for memory, object_store_memory options (#5226) 2019-08-21 23:01:10 -07:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
Renamed from python/ray/rllib/agents/impala/impala.py