Commit graph

2655 commits

Author SHA1 Message Date
William Ma
f423909aec Temporary fix for many_actor_task.py (#4315) 2019-03-09 00:07:45 -08:00
Richard Liaw
6630a35353
[tune] Initial Commit for Tune CLI (#3983)
This introduces a light CLI for Tune.
2019-03-08 16:46:05 -08:00
Simon Mo
3064fad96b Add ray.experimental.serve Module (#4095) 2019-03-08 16:22:05 -08:00
Eric Liang
c7f74dbdc7
[rllib] Add async remote workers (#4253) 2019-03-08 15:39:48 -08:00
Robert Nishihara
fd2d8c2c06 Remove Jenkins backend tests and add new long running stress test. (#4288) 2019-03-08 15:29:39 -08:00
Richard Liaw
c3a3360a4a
[tune] Add custom field for serializations (#4237) 2019-03-08 11:00:25 -08:00
Kristian Hartikainen
7e4b4822cf [tune] Fix worker recovery by setting force=False when calling logger sync_now (#4302)
## What do these changes do?
Fixes a tune autoscaling problem where worker recovery causes things to stall.
2019-03-08 10:59:31 -08:00
Yuhong Guo
d5fb7b70a9
Update arrow version to fix plasma bugs (#4127)
* Update arrow

* Change to 2c511979b13b230e73a179dab1d55b03cd81ec02 which is rebased on Arrow 46f75d7

* Update to fix comment

* disable tests which use python/ray/rllib/tests/data/cartpole_small

* Fix get order of meta and data in MockObjectStore.java
2019-03-08 18:03:58 +08:00
Philipp Moritz
95254b3d71 Remove the old web UI (#4301) 2019-03-07 23:15:11 -08:00
Robert Nishihara
4c80177d6f Unpin gym in Python 2 since gym 0.12 was released. (#4291) 2019-03-07 15:59:30 -08:00
Philipp Moritz
dec7c3f8f5 [build] Add debug info to Bazel (#4278) 2019-03-07 15:21:13 -08:00
Eric Liang
437459f40a
[build] Make travis logs not as long (#4213)
* clean it up

* Update .travis.yml

* Update .travis.yml

* update

* fix example

* suppress

* timeout

* print periodic progress

* Update suppress_output

* Update run_silent.sh

* Update suppress_output

* Update suppress_output

* manually do timeout

* sleep 300

* fix test

* Update run_silent.sh

* Update suppress_output

* Update .travis.yml
2019-03-07 12:09:03 -08:00
Yuhong Guo
b9ea821d16
Use strongly typed IDs in C++. (#4185)
* Use strongly typed IDs for C++.

* Avoid heap allocation in cython.

* Fix JNI part

* Fix rebase conflict

* Refine

* Remove type check from __init__

* Remove unused constructor declarations.
2019-03-07 21:43:01 +08:00
Eric Liang
b0332551dd
[rllib] Fix APPO + continuous spaces, feed prev_rew/act to A3C properly (#4286) 2019-03-06 21:36:26 -08:00
Hao Chen
f0465bc68c
[Java] Refine tests and fix single-process mode (#4265) 2019-03-07 09:59:13 +08:00
Philipp Moritz
39eed24d47 update version from 0.7.0.dev0 to 0.7.0.dev1 (#4282) 2019-03-06 14:43:09 -08:00
Robert Nishihara
f151aa8723 Update long running stress tests and add actor death test. (#4275) 2019-03-06 14:26:45 -08:00
Philipp Moritz
4bea25076f [build] Fix bazel glog error (#4279) 2019-03-06 11:03:59 -08:00
Eric Liang
2781d74680
[rllib] Reserve CPUs for replay actors in apex (#4217) 2019-03-06 10:22:12 -08:00
Eric Liang
6d705036f3
[rllib] Add callback accessor for raw observation, fix prev actions (#4212) 2019-03-06 10:21:05 -08:00
Eric Liang
0e77a8f8c0
[rllib] Add end-to-end tests for RNN sequencing (#4258) 2019-03-06 09:55:07 -08:00
Philipp Moritz
ff5e3384ce Update version to 0.7.0.dev1 and update docs 0.6.3 -> 0.6.4 (#4276) 2019-03-05 22:22:29 -08:00
Stephanie Wang
0ccaf118a2
Disconnect object manager clients if receiving an object fails (#4141)
* Disconnect object manager clients if ReadBuffer fails

* unused

* put back EINTR handling
2019-03-05 22:08:26 -08:00
Stephanie Wang
b7ebf17650 Fix test (#4264) 2019-03-05 18:37:00 -08:00
Eric Liang
78ad9c4cbb Add "ray timeline" command to auto-dump Chrome trace for the current Ray instance (#4239) 2019-03-05 16:28:00 -08:00
Adi Zimmerman
4cf2c9ecb8 [tune] Doc fixes (#4207)
Co-Authored-By: adizim <adizim@berkeley.edu>
2019-03-05 14:11:53 -08:00
Richard Liaw
a5441a3381
[tune] Fix testTrialNoSave (#4262)
Left a `last_result == None` after changing last_result to always be a dict.

Fixes https://github.com/ray-project/ray/issues/4259.
2019-03-05 09:28:33 -08:00
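The leftover comparison described in the commit above can be illustrated with a minimal sketch. It assumes only what the message states, namely that last_result is always a dict. The helper name `has_result` is hypothetical and is not taken from the Tune code.

```python
def has_result(last_result):
    # Assumption: last_result is always a dict, and {} means "no result yet".
    # An empty dict is falsy, so truthiness is the check that still works.
    return bool(last_result)


empty = {}

# The stale check: a dict never compares equal to None, so this is always False.
stale_check = (empty == None)  # noqa: E711

print(stale_check)
print(has_result(empty))
print(has_result({"episode_reward_mean": 1.0}))
```

Once the default becomes `{}`, any branch guarded by `== None` silently stops firing, which is the kind of failure a test such as testTrialNoSave would surface.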
Wang Qing
a116b7f646 [Java] Add runtime context (#4194) 2019-03-05 20:25:29 +08:00
bibabolynn
c73d5086f3 [Java] Single-process mode (#4245) 2019-03-05 13:50:20 +08:00
Robert Nishihara
fa8c07dd19 Sleep for half a second at exit in order to avoid losing log messages… (#4254) 2019-03-04 20:39:09 -08:00
Eric Liang
30bf8e46c7
[rllib] Use nested scope in custom loss example 2019-03-04 18:29:22 -08:00
Kristian Hartikainen
df9beb7123 [tune] Fix trial result fetching (#4219)
* Fix trial results wait in RayTrialExecutor.get_next_available_trial

* Add comment for the results shuffling

* Remove timeout from the wait

* Change random.sample to random.shuffle
2019-03-04 14:26:10 -08:00
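The switch from random.sample to random.shuffle mentioned in the commit above can be sketched as follows. This is not the code from RayTrialExecutor; the list name `trials` and its contents are hypothetical placeholders.

```python
import random

trials = ["trial_1", "trial_2", "trial_3", "trial_4"]

# random.sample returns a new list in random order and leaves the input untouched.
sampled = random.sample(trials, len(trials))

# random.shuffle reorders the list in place and returns None,
# so shuffle a copy when the original order must be kept.
shuffled = list(trials)
shuffle_return = random.shuffle(shuffled)

print(sampled)
print(shuffled)
```

Both calls produce a random permutation; the practical difference is that shuffle mutates its argument and returns nothing.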
Eric Liang
6e3384a719
[rllib] Add three new long-running stress tests {APEX, IMPALA, PBT} (#4215) 2019-03-04 14:05:42 -08:00
Stephanie Wang
8b871af555
Fix ray.wait bug for tasks on remote nodes and timeout=0 (#4242)
* Regression test

* Fix

* cleaner code
2019-03-04 11:46:06 -08:00
Hao Chen
a22d6ef955
Fix RemoteFunction._last_export_session (#4243) 2019-03-04 19:57:42 +08:00
Yuhong Guo
5866fd7005
Add type check in free and change Exception to TypeError (#4221) 2019-03-04 16:40:04 +08:00
Philipp Moritz
e96e06e031 bump version to 0.6.4 (#4226) 2019-03-03 14:39:05 -08:00
Adi Zimmerman
9551f2a92e [tune] Properly handle closing files in Trainable (#4232)
Fixes #3965.

Using the with keyword/block will close the file immediately after the block ends.
2019-03-03 14:23:05 -08:00
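The behavior of the with block described in the commit above can be shown with a minimal sketch. The helper `write_result` and the file name are hypothetical and do not come from Trainable.

```python
import os
import tempfile


def write_result(path, text):
    # The with block closes the file as soon as the block exits,
    # including when the body raises an exception.
    with open(path, "w") as f:
        f.write(text)
    # The handle is returned only so the caller can inspect f.closed.
    return f


tmp_dir = tempfile.mkdtemp()
result_path = os.path.join(tmp_dir, "result.txt")
handle = write_result(result_path, "done")

print(handle.closed)
```

Without the with block, the file stays open until close() is called explicitly or the object is garbage collected, which can leak file descriptors in long-running processes.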
Richard Liaw
3483282254
[tune] Local Mode support (#4138) 2019-03-03 14:05:59 -08:00
Peiren Yang
e2e6ef198b [autoscaler] Make commands bash -i to support newer bash (#4181)
The generated command in autoscaler/updater.py throws non-zero exit status 127 on Ubuntu 18.04.

## Related issue number
Closes #4155, Closes #1444.
2019-03-03 13:46:07 -08:00
Richard Liaw
fb1369d96f
[tune] Dynamic Resources for Trials (#3974)
## What do these changes do?

Provides a small helper function for modifying the resource requirements of a trial.

Also implements the following:
 - setting the last_result to be {} instead of None
 - Adding a shuffle to the BasicVariantGenerator
2019-03-03 11:38:36 -08:00
Eric Liang
ba03048254
[rllib] TF model custom_loss() should actually allow access to full rollout data (#4220) 2019-03-02 22:57:51 -08:00
Eric Liang
ff6dd8459a
[autoscaler] Timeout ssh master connection after 5 minutes 2019-03-02 22:57:22 -08:00
Philipp Moritz
295099b863 Fix release docs (#4225) 2019-03-02 22:01:43 -08:00
Philipp Moritz
fbdd5da9c1 Fix application stress tests (#4228)
Fixes https://github.com/ray-project/ray/issues/4227
2019-03-02 21:57:27 -08:00
Richard Liaw
a27cb225b6
Modularize Tune tests from multi-node tests (#4204) 2019-03-02 19:21:08 -08:00
Philipp Moritz
180414710e Make sure Bazel generated files get overwritten (#4205) 2019-03-02 13:38:37 -08:00
Robert Nishihara
4b89eebfc7 Move test folders under rllib/tune from test -> tests. (#4214) 2019-03-02 13:37:16 -08:00
Yuhong Guo
6f46edca51 Skip dead nodes to avoid connection timeout. (#4154) 2019-03-02 13:11:19 -08:00
Eric Liang
9950f63e8c Send task error instead of raw exception for signal (#4150) 2019-03-01 23:59:29 -08:00