Commit graph

83 commits

Author SHA1 Message Date
Alex Wu
197dab0e2f
[docs] Deploying Ray (#16538)
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-19 10:07:15 -07:00
Chris Bamford
fd1a97e39f
[RLlib] Memory leak docs (#15908) 2021-06-10 18:10:21 +02:00
Sven Mika
e61922c4ac
[RLlib] Add one-liner to docs on internship/RL-engineer position. (#16050)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-05-25 12:58:54 -07:00
Sven Mika
4e9555cad3
[RLlib] Issue 15724: Breaking example script in docs due to outdated eager config flag (use framework='tf2|tfe' instead). (#15736) 2021-05-18 11:34:46 +02:00
Sven Mika
d89fb82bfb
[RLlib] Add simple curriculum learning API and example script. (#15740) 2021-05-16 17:35:10 +02:00
Stefan Schneider
49ba51979e
Functions for restoring from last or best checkpoint (#14735)
Adds a helper function to retrieve the latest checkpoint after selecting the best trial according to a metric.
2021-04-06 12:19:09 +02:00
Kai Fricke
757866ec01
[tune] enable placement groups per default (#13906)
* Refactor placement group factory object to accept placement_group arguments instead of callables

* Convert resources to pgf

* Enable placement groups per default

* Fix tests WIP

* Fix stop/resume with placement groups

* Fix progress reporter test

* Fix trial executor tests

* Check resource for trial, not resource object

* Move ENV vars into class

* Fix tests

* Sphinx

* Wait for trial start in PBT

* Revert merge errors

* Support trial reuse with placement groups

* Better check for just staged trials

* Fix trial queuing

* Wait for pg after trial termination

* Clean up PGs before tune run

* No PG settings in pbt scheduler

* Fix buffering tests

* Skip test if ray reports erroneous available resources

* Disable PG for cluster resource counting test

* Debug output for tests

* Output in-use resources for placement groups

* Don't start new trial on trial start failure

* Add docs

* Cleanup PGs once futures returned

* Fix placement group shutdown

* Use updated_queue flag

* Apply suggestions from code review

* Apply suggestions from code review

* Update docs

* Reuse placement groups independently from actors

* Do not remove placement groups for paused trials

* Only continue enqueueing trials if it didn't fail the first time

* Rename parameter

* Fix pause trial

* Code review + try_recover

* Update python/ray/tune/utils/placement_groups.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Move placement group lifecycle management

* Move total used resources to pg manager

* Update FAQ example

* Requeue trial if start was unsuccessful

* Do not cleanup pgs at start of run

* Revert "Do not cleanup pgs at start of run"

This reverts commit 933d9c4c

* Delayed PG removal

* Fix trial requeue test

* Trigger pg cleanup on status update

* Fix tests

* Fix docs

* fix-test

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-23 18:46:02 +01:00
Sven Mika
f91c455527
[RLlib] Curiosity documentation. (#11066) 2020-09-29 09:39:22 +02:00
Eric Liang
e5d089384b
[1.0] Ray whitepaper link and tagline update (#10455) 2020-09-01 09:48:35 -07:00
Eric Liang
bd245a1c18
[api] Clean up and document Actor name / lifetime API (#10332) 2020-08-27 13:38:39 -07:00
Bill Chambers
067c2752f8
[TUNE] Tune Docs re-organization (#9600)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-07-29 11:22:44 -07:00
Stefan Schneider
6db55ca8db
[docs][rllib] Recommended workflow for training, saving, and testing (#9319) 2020-07-09 15:47:10 -07:00
Eric Liang
34bae27ac7
[rllib] Flexible multi-agent replay modes and replay_sequence_length (#8893) 2020-06-12 20:17:27 -07:00
internetcoffeephone
9166e22085
Add doc explanation about synchronous algorithm shared GPU utilization between workers and driver. (#8400) 2020-06-11 01:06:04 -07:00
Edward Oakes
860eb6f13a
Update named actor API (#8559) 2020-05-24 20:08:03 -05:00
Richard Liaw
b506f87117
[tune] New Doc edits, add Concepts page (#8083)
Co-Authored-By: Sven Mika <sven@anyscale.io>
2020-04-25 18:25:56 -07:00
Sven Mika
165a86f1ab
[RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063)
SAC (both torch and tf versions) are showing issues (crashes) due to numeric instabilities in the SquashedGaussian distribution (sampling + logp after extreme NN outputs).
This PR fixes these. Stable MuJoCo learning (HalfCheetah) has been confirmed on both tf and torch versions. A Distribution stability test (using extreme NN outputs) has been added for SquashedGaussian (can be used for any other type of distribution as well).
2020-04-19 10:20:23 +02:00
roireshef
dbcad35022
[RLlib] Added DefaultCallbacks which replaces old callbacks dict interface (#6972) 2020-04-16 16:06:42 -07:00
Eric Liang
5cebee68d6
[rllib] Add scaling guide to documentation, improve bandit docs (#7780)
* update

* reword

* update

* ms

* multi node sgd

* reorder

* improve bandit docs

* contrib

* update

* ref

* improve refs

* fix build

* add pillow dep

* add pil

* update pil

* pillow

* remove false
2020-03-27 22:05:43 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. (#7482) 2020-03-23 12:19:30 -07:00
Eric Liang
9392cdbf74
[rllib] Add high-performance external application connector (#7641) 2020-03-20 12:43:57 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length (#7503)
* bulk rename

* deprecation warn

* update doc

* update fig

* line length

* rename

* make pytest compatible

* fix test

* fi sys

* rename

* wip

* fix more

* lint

* update svg

* comments

* lint

* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang
52cf77f5a9
[rllib] SAC no_done_at_end should default to False (#7594)
* update

* update doc

* stochastic

* cleanup
2020-03-14 11:16:54 -07:00
Sven Mika
2d97650b1e
[RLlib] Add Exploration API documentation. (#7373)
* Add Exploration API documentation.

* Add Exploration API documentation.

* Add Exploration API documentation.

* Update exploration docs.
2020-03-01 16:55:41 -08:00
Eric Liang
5df801605e
Add ray.util package and move libraries from experimental (#7100) 2020-02-18 13:43:19 -08:00
Eric Liang
fbc545c03b
[rllib] Support parallel, parameterized evaluation (#6981)
* eval api

* update

* sync eval filters

* sync fix

* docs

* update

* docs

* update

* link

* nit

* doc updates

* format
2020-02-01 22:12:12 -08:00
Eric Liang
e659699ca9
[tune] Fix directory naming regression (#6839) 2020-01-27 15:53:40 -08:00
Sven Mika
e6227082bd
[RLlib] Add torch flag to train.py (#6807) 2020-01-17 18:48:44 -08:00
Maltimore
0ec613c95a
[rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj (#6438) 2019-12-11 15:00:53 -08:00
Eric Liang
bc5e259264
[rllib] Add a doc section on computing actions (#6326)
* options doc

* add note

* hint shr

* doc update
2019-12-03 00:10:50 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity (#6154) 2019-11-13 18:50:45 -08:00
David Bignell
3f83b2daa9
[rllib] Rollout extensions (#6065)
* Rollout improvements

* Make info-saving optional, to avoid breaking change.

* Store generating ray version in checkpoint metadata

* Keep the linter happy

* Add small rollout test

* Terse.

* Update test_io.py
2019-11-05 20:34:18 -08:00
gehring
8903bcd0c3
[rllib] Tracing for eager tensorflow policies with tf.function (#5705)
* Added tracing of eager policies with `tf.function`

* lint

* add config option

* add docs

* wip

* tracing now works with a3c

* typo

* none

* file doc

* returns

* syntax error

* syntax error
2019-09-17 01:44:20 -07:00
Eric Liang
74abeab057
[rllib] Improve accessing model state docs (#5656)
* [rllib] better model docs

* fix

* s
2019-09-08 23:01:26 -07:00
Eric Liang
1455a19c85
Consolidate and clean up documentation (#5645) 2019-09-07 11:50:18 -07:00
Richard Liaw
34f6d2fc5c
[tune] Update trainable docs and support hparams (#5558) 2019-09-04 12:44:42 -07:00
Eric Liang
daf38c8723
[tune] Deprecate tune.function (#5601)
* remove tune function

* remove examples

* Update tune-usage.rst
2019-08-31 16:00:10 -07:00
Eric Liang
550c96b965
[rllib] Add docs on policy.model (#5597) 2019-08-30 21:10:42 -07:00
Eric Liang
7d28bbbdbb
[rllib] Document on traj postprocess (#5532)
* document on traj postprocess

* shorten it
2019-08-24 20:37:45 -07:00
gehring
b520f6141e
[rllib] Adds eager support with a generic TFEagerPolicy class (#5436) 2019-08-23 14:21:11 +08:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
Richard Liaw
1eaa57c98f
[tune] Distributed example + walkthrough (#5157) 2019-08-02 09:17:20 -07:00
Kristian Hartikainen
13fb9fe3db
[rllib] Feature/soft actor critic v2 (#5328)
* Add base for Soft Actor-Critic

* Pick changes from old SAC branch

* Update sac.py

* First implementation of sac model

* Remove unnecessary SAC imports

* Prune unnecessary noise and exploration code

* Implement SAC model and use that in SAC policy

* runs but doesn't learn

* clear state

* fix batch size

* Add missing alpha grads and vars

* -200 by 2k timesteps

* doc

* lazy squash

* one file

* ignore tfp

* revert done
2019-08-01 23:37:36 -07:00
Eric Liang
20450a4e82
[rllib] Add rock paper scissors multi-agent example (#5336) 2019-08-01 13:03:59 -07:00
Eric Liang
9e328fbe6f
[rllib] Add docs on how to use TF eager execution (#4927) 2019-06-07 16:42:37 -07:00
Eric Liang
7501ee51db
[rllib] Rename PolicyEvaluator => RolloutWorker (#4820) 2019-06-03 06:49:24 +08:00
Eric Liang
4f46d3e9bf
[rllib] Add multi-agent examples for hand-coded policy, centralized VF (#4554) 2019-04-09 00:36:49 -07:00
Eric Liang
37208216ae
[rllib] Rename Agent to Trainer (#4556) 2019-04-07 00:36:18 -07:00
Eric Liang
fce0062380
[rllib] Switch to tune.run() instead of run_experiments() (#4515) 2019-03-30 14:07:50 -07:00
Eric Liang
cff08e19ff
[rllib] Print out intermediate data shapes on the first iteration (#4426) 2019-03-26 00:27:59 -07:00