1838 commits

Author SHA1 Message Date
Eric Liang
b3eb374817
[tune] Really disable retries by default 2019-12-11 13:12:28 -08:00
Edward Oakes
82f7dbc7a7
Increase TaskID size by 2 bytes, taken from JobID (#6425)
* Increase TaskID size by 2 bytes, taken from JobID

* comments

* check max job id

* fix doc

* fix local mode
2019-12-11 10:45:14 -08:00
Yuhao Yang
3db8faab0d [tune] fix log dir race condition (#6420) 2019-12-10 21:00:19 -08:00
Simon Mo
c61db84b8d Bump dev6->dev7 for two files not changed yet. (#6428) 2019-12-10 20:58:14 -08:00
Edward Oakes
044527adb8
Remove ref counting dependencies on ray.get() (#6412)
* Remove ref counting dependencies on Get()

* comment

* don't send IDs when disabled

* pass through internal config

* fix

* allow reinit

* remove flag
2019-12-10 18:11:34 -08:00
Ujval Misra
4e1d1ed00d [tune] Report trials by state fairly (#6395)
* Fairly represented trial states.

* filter test

* Indent

* Add test to BUILD

* Address Eric's comments (show truncation by state).

* Sort trials, only show 20.

* Fix lint
2019-12-10 14:56:54 -08:00
Philipp Moritz
16be483af7
[Projects] Return parameters for a command (#6409) 2019-12-10 10:25:01 -08:00
Chaokun Yang
6272907a57 [Streaming] Streaming data transfer and python integration (#6185) 2019-12-10 20:33:24 +08:00
Rong Rong
c1d4ab8bb4 Move top level RayletClient to ray::raylet::RayletClient (#6404) 2019-12-09 21:08:59 -08:00
Eric Liang
304b4f0d3d
Shard unit tests into medium sized files for test stability (#6398) 2019-12-09 13:15:29 -08:00
Eric Liang
a6bc2b1842
Misc direct call fixes from unit tests (#6394) 2019-12-08 19:34:02 -08:00
visatish
e2ba8c1898 [tune] Fixed bug in PBT where initial trial result is empty. (#6351)
* Fixed bug in tune pbt where initial result is empty.

* Updated mock trial executor in test suite.

* Added comment.
2019-12-06 15:30:27 -08:00
Zhijun Fu
b88b8202cc fix java build failure (#6062) 2019-12-06 14:38:43 +08:00
Ion
1c638a11a7 Refactor helper methods for new scheduler integration (#6354) 2019-12-05 18:49:25 -08:00
Edward Oakes
f63b64310a
Bump version to 0.8.0.dev7 (#6303) 2019-12-05 18:33:54 -08:00
Philipp Moritz
dd27bfbb75
Rename .rayproject to ray-project (#6278) 2019-12-05 16:15:42 -08:00
Eric Liang
6223d2ed0b
[direct call] Assign resource ids for direct call tasks (#6364) 2019-12-05 10:16:04 -08:00
Eric Liang
4c6739476b
[rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365) 2019-12-05 10:13:54 -08:00
micafan
668ce47360 [GCS] Add abstract interface of actor to GCS Client (#6269) 2019-12-05 13:38:29 +08:00
Eric Liang
1a3b83abf8
[direct call] Fix hang when caller id changes for actor task submission (#6338) 2019-12-04 12:01:35 -08:00
Ujval Misra
fa5d62e8ba [tune] Retry restore on timeout (#6284)
* Retry recovery on timeout

* fix bug, revert some code

* Add test for restore timeouts.

* Fix lint

* Address comments

* Don't timeout restores.
2019-12-02 20:01:47 -08:00
Edward Oakes
dff6017272
Fix "failed to create head node" issue (#6304)
* Fix failed to create head node issue

* comments
2019-12-02 15:22:00 -08:00
Mitchell Stern
43d20fff62 Refactor dashboard codebase to improve modularity (#6330)
* Refactor dashboard codebase to improve modularity

* Simplify feature interface

* Use arrow notation in makeFeature argument types

* Use separate components for node and worker features rather than a single conditionally-rendered component

* Add comments about Ray worker process titles

* Add comments to non-obvious fields in node info API response
2019-12-02 11:05:40 -08:00
Stephanie Wang
da41180dc0
[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py (#6306)
* multinode failures direct

* Add number of retries allowed for tasks

* Retry tasks

* Add failing test for object reconstruction

* Handle return status and debug

* update

* Retry task unit test

* update

* update

* todo

* Fix max_retries decorator, fix test

* Fix test that flaked

* lint

* comments
2019-12-02 10:20:57 -08:00
Philipp Moritz
22fa9b564b fix linting (#6322) 2019-12-01 14:06:35 -08:00
Simon Mo
4033d65e4f
Fix redis-server stopping in Linux (#6296)
* Cleanup test_calling_start_ray_head

* Kill redis-server with args instead of comm

On Linux, ps -o pid,comm outputs just redis-server instead of the
full executable path
2019-11-29 22:50:05 -08:00
Yuhao Yang
ffa043d4b7 [tune] replace self.config (#6313) 2019-11-29 11:09:30 -08:00
Stephanie Wang
724a5e3909
Turn on direct calls for test_failure.py (#6291) 2019-11-28 12:28:30 -08:00
Eric Liang
b7b655c851
Also use NotifyDirectCallTaskBlock/Unblocked for plasma store accesses (#6249)
* wip

* fix it

* lint

* wip

* fix

* unblock

* flaky

* use fetch only flag

* Revert "use fetch only flag"

This reverts commit 56e938a0ee2024f5c99c9ab2d55fd35558fb15e1.

* restore error resolution

* use worker task id

* proto comments

* fix if
2019-11-27 22:46:15 -08:00
Simon Mo
22b305223a
Build Docker Containers for Linux Wheels (#6233) 2019-11-27 17:05:36 -08:00
Stephanie Wang
2797c11b69
[direct task] For serialized object IDs, check with owner before declaring object unreconstructable (#6286)
* Track borrowed vs owned objects

* Serialize owner address with object ID

* serialize owner task id

* Deserialize object IDs

* Pass direct task ID instead of plasma ID

* it works

* Fix ref count test

* Add unit test

* update warning

* we own ray.put objects

* missing file

* doc

* Fix unit test

* comments

* Fix py2

* lint

* update
2019-11-27 15:31:44 -08:00
Edward Oakes
e4f9b3b7d9
Use process reaper for cleanup (#6253) 2019-11-26 22:00:08 -06:00
Eric Liang
30b2fc1d81
Fix actor creation hang due to race in SWAP queue (#6280) 2019-11-26 15:21:03 -08:00
Simon Mo
1ca8c427e3 Consistent Name for Process Title (#6276)
* Consistent naming for setproctitle

* Address comments

* Add debug/verbose mode

* Fix test
2019-11-26 11:56:28 -08:00
Robert Nishihara
ffb9c0ecae Fix bug in which remote function redefinition doesn't happen. (#6175) 2019-11-26 11:19:19 -06:00
Edward Oakes
7f8de61441 [hotfix] Remove python/ray/tests/__init__.py (#6279)
* Remove python/ray/tests/__init__.py for bazel

* Comment out checks
2019-11-25 17:04:20 -08:00
Eric Liang
64a3a7239e
Set RAY_FORCE_DIRECT=1 for run_rllib_tests, test_basic (#6171) 2019-11-25 14:12:11 -08:00
Edward Oakes
e72aef2ba6
[hotfix] Fix building linux wheels 2019-11-25 12:45:31 -07:00
Simon Mo
c8b69727cd
ray stop only kills process with ray keyword (#6257)
* Use psutil to kill processes

* Psutil as core requirement

* Revert "Psutil as core requirement"

This reverts commit d3235ce3d994d2bb7db39e3ad4a46049703898bb.

* Revert "Use psutil to kill processes"

This reverts commit de0ed874fed673f5e98715950688f418bbcc415c.

* Revert back to subproc

* Add comments, grep for ray as well

* SIGTERM
2019-11-24 16:32:07 -08:00
Eric Liang
e5b5c98558
Fix python PATH for build (#6260) 2019-11-24 15:32:06 -08:00
Eric Liang
53641f1f74
Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* separate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00
Simon Mo
aa8d5d2f6c
Rate limit asyncio actor (#6242) 2019-11-24 11:39:28 -08:00
Yuhao Yang
f6a5baf844 [tune] minor doc fix (#6248) 2019-11-23 21:54:41 -08:00
Stephanie Wang
d2662fecea
Miscellaneous bug fixes to throw unreconstructable errors for direct calls (#6245)
* Test cases

* Fix InPlasmaError

* raylet fixes to force errors for direct calls

* Disable lineage logging and task pending checks for direct calls

* move todo

* Clean up tests

* Fix bugs in object store for Contains and Delete

* Use direct call in tests

* Fixes, separate actor creation direct call from normal direct call spec
2019-11-23 15:05:49 -08:00
Eric Liang
b052bcf1fc
Bazelify tune tests in travis (#6219) 2019-11-22 13:58:50 -08:00
Simon Mo
eb6a93c0f0
[hotfix] fix lint (#6236) 2019-11-21 18:30:57 -08:00
Eric Liang
7559fdb141 [rllib/tune] Cache get_preprocessor() calls, default max_failur… (#6211) 2019-11-21 15:55:56 -08:00
Stephanie Wang
d3227f2f2d
Fix bug in direct task calls for objects that were evicted (#6216)
* Fix bug and add some checks

* rename
2019-11-21 15:38:31 -08:00
Simon Mo
29ba6bfc64
Basic Async Actor Call (#6183)
* Start trying to figure out where to put fibers

* Pass is_async flag from python to context

* Just running things in fiber works

* Yield implemented, need some debugging to make it work

* It worked!

* Remove debug prints

* Lint

* Revert the clang-format

* Remove unnecessary log

* Remove unnecessary import

* Add attribution

* Address comment

* Add test

* Missed a merge conflict

* Make test pass and compile

* Address comment

* Rename async -> asyncio

* Move async test to py3 only

* Fix ignore path
2019-11-21 11:56:46 -08:00
Philipp Moritz
a4437813eb
[Projects] Unify hyphen vs underscore handling for arguments (#6208) 2019-11-20 23:52:41 -08:00