Commit graph

1388 commits

Author SHA1 Message Date
Eric Liang
27cd6ea401
[rllib] Flip sign of A2C, IMPALA entropy coefficient; raise DeprecationWarning if negative (#4374) 2019-03-17 18:07:37 -07:00
Richard Liaw
ea5a6f8455
[tune] Simplify API (#4234)
Uses `tune.run` to execute experiments as preferred API.

@noahgolmant

This does not break backwards compat, but will slowly internalize `Experiment`. 

In a separate PR, Tune schedulers should only support 1 running experiment at a time.
2019-03-17 13:03:32 -07:00
markgoodhead
20a155d03d [Tune] Support initial parameters for SkOpt search algorithm (#4341)
Similar to the recent change to HyperOpt (https://github.com/ray-project/ray/pull/3944), this implements both:
1. The ability to pass in initial parameter suggestion(s) to be run through Tune first, before using the Optimiser's suggestions. This is for when you already know good parameters and want the Optimiser to be aware of them when it makes future parameter suggestions.
2. The same as 1., but if you already know the reward values for those parameters, you can pass them in as well to avoid re-running the experiments. In the future it would be nice for Tune to support this directly by loading previously run Tune experiments and initialising the Optimiser with them (a kind of top-level checkpointing), but this feature lets users do it manually for now.
2019-03-16 23:11:30 -07:00
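
The two behaviours described in the commit above (#4341) can be illustrated with a small standard-library sketch. This is not Tune's or scikit-optimize's implementation: the class `WarmStartSearcher`, its methods, and the argument names `points_to_evaluate` and `evaluated_rewards` are assumptions chosen for illustration, and plain random sampling stands in for the real optimiser.

```python
import random
from collections import deque


class WarmStartSearcher:
    """Toy searcher (hypothetical; random sampling stands in for the optimiser)."""

    def __init__(self, bounds, points_to_evaluate=None,
                 evaluated_rewards=None, seed=0):
        self._bounds = bounds
        self._rng = random.Random(seed)
        self._history = []  # (params, reward) pairs known to the searcher
        points = list(points_to_evaluate or [])
        if evaluated_rewards is not None:
            if len(evaluated_rewards) != len(points):
                raise ValueError("need one reward per initial point")
            # Case 2: rewards are already known, so record them directly
            # instead of scheduling the points to be run again.
            self._history.extend(zip(points, evaluated_rewards))
            points = []
        # Case 1: initial points are handed out first, before any
        # suggestion generated by the searcher itself.
        self._pending = deque(points)

    def suggest(self):
        """Return the next parameter list to try."""
        if self._pending:
            return self._pending.popleft()
        return [self._rng.uniform(lo, hi) for lo, hi in self._bounds]

    def tell(self, params, reward):
        """Report the reward observed for a suggested parameter list."""
        self._history.append((params, reward))

    def best(self):
        """Return the best (params, reward) pair seen so far, or None."""
        if not self._history:
            return None
        return max(self._history, key=lambda pair: pair[1])
```

With only `points_to_evaluate` set, the first calls to `suggest()` return the supplied points in order. With `evaluated_rewards` also set, those points are never suggested again, but `best()` already reflects them.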
Eric Liang
b513c0f498
[autoscaler] Restore error message for setup 2019-03-16 18:00:37 -07:00
Richard Liaw
5e95abe63e
[tune] Fix performance issue and fix reuse tests (#4379)
* fix tests

* better name

* reduce warnings

* better resource tracking

* oops

* revert message

* fix_executor
2019-03-16 13:52:02 -07:00
Eric Liang
a45019d98c
[rllib] Add option to proceed even if some workers crashed (#4376) 2019-03-16 13:34:09 -07:00
Philipp Moritz
c5e2c9af4d Build wheels for macOS with Bazel (#4280) 2019-03-15 10:37:57 -07:00
Leon Sievers
6b93ec3034 Fixed calculation of num_steps_trained for multi_gpu_optimizer (#4364) 2019-03-14 19:46:02 -07:00
Eric Liang
2c1131e8b2
[tune] Add warnings if tune event loop gets clogged (#4353)
* add guards

* comments
2019-03-14 19:44:01 -07:00
Yuhong Guo
becffc6cef
Fix checkpoint crash for actor creation task. (#4327)
* Fix checkpoint crash for actor creation task.

* Lint

* Move test to test_actor.py

* Revert unused code in test_failure.py

* Refine test according to Raul's suggestion.
2019-03-14 23:42:57 +08:00
Philipp Moritz
2f37cd7e27 fix wheel building doc (#4360) 2019-03-13 23:11:30 -07:00
Philipp Moritz
b0c4e60ffb Build wheels for Linux with Bazel (#4281) 2019-03-13 15:57:33 -07:00
Ameer Haj Ali
8a6403c26e [rllib] bug fix: merging --config params with params.pkl (#4336) 2019-03-13 11:26:55 -07:00
Andrew Tan
87bfa1cf82 [tune] add output flag for Tune CLI (#4322) 2019-03-12 23:56:59 -07:00
Eric Liang
d5f4698305
[tune] Avoid scheduler blocking, add reuse_actors optimization (#4218) 2019-03-12 23:49:31 -07:00
Stefan Pantic
2202a81773 Fix multi discrete (#4338)
* Revert "Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)"

This reverts commit 3c41cb9b60.

* Fix a bug with log rhos for vtrace

* Reformat

* lint
2019-03-12 20:32:11 -07:00
Eric Liang
3c41cb9b60
Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)
This reverts commit 962b17f567.
2019-03-11 22:51:26 -07:00
Kai Yang
7ff56ce826 Introduce set data structure in GCS (#4199)
* Introduce set data structure in GCS. Change object table to Set instance.

* Fix a logic bug. Update python code.

* lint

* lint again

* Remove CURRENT_VALUE mode

* Remove 'CURRENT_VALUE'

* Add more test cases

* rename has_been_created to subscribed.

* Make `changed` parameter type of `bool *`

* Rename mode to notification_mode

* fix build

* RAY.SET_REMOVE return error if entry doesn't exist

* lint

* Address comments

* lint and fix build
2019-03-11 14:42:58 -07:00
Andrew Tan
c435013b27 [tune] add-note command for Tune CLI (#4321)
Co-Authored-By: andrewztan <andrewztan12@gmail.com>
2019-03-11 14:16:44 -07:00
Stefan Pantic
36cbde651a Add action space to model (#4210) 2019-03-09 19:23:12 -08:00
justinwyang
5adb4a6941 Set _remote() function args and kwargs as optional (#4305) 2019-03-09 16:40:14 -08:00
Richard Liaw
6630a35353
[tune] Initial Commit for Tune CLI (#3983)
This introduces a light CLI for Tune.
2019-03-08 16:46:05 -08:00
Simon Mo
3064fad96b Add ray.experimental.serve Module (#4095) 2019-03-08 16:22:05 -08:00
Eric Liang
c7f74dbdc7
[rllib] Add async remote workers (#4253) 2019-03-08 15:39:48 -08:00
Robert Nishihara
fd2d8c2c06 Remove Jenkins backend tests and add new long running stress test. (#4288) 2019-03-08 15:29:39 -08:00
Richard Liaw
c3a3360a4a
[tune] Add custom field for serializations (#4237) 2019-03-08 11:00:25 -08:00
Kristian Hartikainen
7e4b4822cf [tune] Fix worker recovery by setting force=False when calling logger sync_now (#4302)
## What do these changes do?
Fixes a tune autoscaling problem where worker recovery causes things to stall.
2019-03-08 10:59:31 -08:00
Philipp Moritz
95254b3d71 Remove the old web UI (#4301) 2019-03-07 23:15:11 -08:00
Yuhong Guo
b9ea821d16
Use strongly typed IDs in C++. (#4185)
* Use strongly typed IDs for C++.

* Avoid heap allocation in cython.

* Fix JNI part

* Fix rebase conflict

* Refine

* Remove type check from __init__

* Remove unused constructor declarations.
2019-03-07 21:43:01 +08:00
Eric Liang
b0332551dd
[rllib] Fix APPO + continuous spaces, feed prev_rew/act to A3C properly (#4286) 2019-03-06 21:36:26 -08:00
Hao Chen
f0465bc68c
[Java] Refine tests and fix single-process mode (#4265) 2019-03-07 09:59:13 +08:00
Philipp Moritz
39eed24d47 update version from 0.7.0.dev0 to 0.7.0.dev1 (#4282) 2019-03-06 14:43:09 -08:00
Eric Liang
2781d74680
[rllib] Reserve CPUs for replay actors in apex (#4217) 2019-03-06 10:22:12 -08:00
Eric Liang
6d705036f3
[rllib] Add callback accessor for raw observation, fix prev actions (#4212) 2019-03-06 10:21:05 -08:00
Eric Liang
0e77a8f8c0
[rllib] Add end-to-end tests for RNN sequencing (#4258) 2019-03-06 09:55:07 -08:00
Philipp Moritz
ff5e3384ce Update version to 0.7.0.dev1 and update docs 0.6.3 -> 0.6.4 (#4276) 2019-03-05 22:22:29 -08:00
Stephanie Wang
b7ebf17650 Fix test (#4264) 2019-03-05 18:37:00 -08:00
Eric Liang
78ad9c4cbb Add "ray timeline" command to auto-dump Chrome trace for the current Ray instance (#4239) 2019-03-05 16:28:00 -08:00
Richard Liaw
a5441a3381
[tune] Fix testTrialNoSave (#4262)
Left a `last_result == None` after changing last_result to always be a dict.

Fixes https://github.com/ray-project/ray/issues/4259.
2019-03-05 09:28:33 -08:00
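
Why a leftover `== None` comparison stops working once a value is always a dict can be shown in a few lines. This is a minimal sketch, not the code from the fix (#4262): it assumes, for illustration only, that "no result yet" is represented by an empty dict.

```python
# Assumed for illustration: an empty dict means "no result reported yet".
last_result = {}

# The stale check: a dict never equals None, so this branch can no longer fire.
stale_check_fires = last_result == None  # noqa: E711 (deliberate, shows the bug)

# A truthiness check still detects the "no result yet" state.
truthiness_check_fires = not last_result
```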
Robert Nishihara
fa8c07dd19 Sleep for half a second at exit in order to avoid losing log messages… (#4254) 2019-03-04 20:39:09 -08:00
Eric Liang
30bf8e46c7
[rllib] Use nested scope in custom loss example 2019-03-04 18:29:22 -08:00
Kristian Hartikainen
df9beb7123 [tune] Fix trial result fetching (#4219)
* Fix trial results wait in RayTrialExecutor.get_next_available_trial

* Add comment for the results shuffling

* Remove timeout from the wait

* Change random.sample to random.shuffle
2019-03-04 14:26:10 -08:00
Eric Liang
6e3384a719
[rllib] Add three new long-running stress tests {APEX, IMPALA, PBT} (#4215) 2019-03-04 14:05:42 -08:00
Stephanie Wang
8b871af555
Fix ray.wait bug for tasks on remote nodes and timeout=0 (#4242)
* Regression test

* Fix

* cleaner code
2019-03-04 11:46:06 -08:00
Hao Chen
a22d6ef955
Fix RemoteFunction._last_export_session (#4243) 2019-03-04 19:57:42 +08:00
Yuhong Guo
5866fd7005
Add type check in free and change Exception to TypeError (#4221) 2019-03-04 16:40:04 +08:00
Philipp Moritz
e96e06e031 bump version to 0.6.4 (#4226) 2019-03-03 14:39:05 -08:00
Adi Zimmerman
9551f2a92e [tune] Properly handle closing files in Trainable (#4232)
Fixes #3965.

Using the `with` keyword/block will close the file immediately after the block ends.
2019-03-03 14:23:05 -08:00
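
The behaviour the commit above (#4232) relies on can be checked with a short standard-library sketch. The file name `result.txt` is an arbitrary placeholder and is not taken from the Trainable code.

```python
import os
import tempfile

# Placeholder path in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "result.txt")

with open(path, "w") as f:
    f.write("done\n")
    open_inside = not f.closed  # the file is still open inside the block

# Leaving the block closes the file, even if the body had raised.
closed_after = f.closed
```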
Richard Liaw
3483282254
[tune] Local Mode support (#4138) 2019-03-03 14:05:59 -08:00
Peiren Yang
e2e6ef198b [autoscaler] Make commands bash -i to support newer bash (#4181)
The generated command in autoscaler/updater.py throws non-zero exit status 127 on Ubuntu 18.04.

## Related issue number
Closes #4155, Closes #1444.
2019-03-03 13:46:07 -08:00