Commit graph

3757 commits

Author SHA1 Message Date
Eric Liang
a6bc2b1842 Misc direct call fixes from unit tests (#6394) 2019-12-08 19:34:02 -08:00
Ameer Haj Ali
1a9948eef9 Update rllib-examples.rst (#6396) 2019-12-08 16:21:50 -08:00
Dean Wampler
65694cdc4c [bug] Attempt to fix links not working. (#6390) 2019-12-07 14:31:50 -08:00
Victor Le
4e24c805ee AlphaZero and Ranked reward implementation (#6385) 2019-12-07 12:08:40 -08:00
Yuhao Yang
c327ae152f [doc] Update the test command in getting-involved. (#6347) 2019-12-07 11:03:52 -08:00
Kai Yang
eb912b68b1 [Java] Fix instanceof RayPyActor (#6377) 2019-12-07 16:28:29 +08:00
Kai Yang
7e9fddf3ed [Java] Add java exception check in JNI (#6378) 2019-12-07 16:25:17 +08:00
visatish
e2ba8c1898 [tune] Fixed bug in PBT where initial trial result is empty. (#6351)
* Fixed bug in tune pbt where initial result is empty.

* Updated mock trial executor in test suite.

* Added comment.
2019-12-06 15:30:27 -08:00
Dean Wampler
53d62d3eec Expanded with new pages for getting started, etc. Blog links unchanged. (#6388) 2019-12-06 15:18:47 -08:00
Kai Yang
2003d2d952 explicit delete local reference in task_execution_callback for garbage collection (#6379) 2019-12-06 18:53:24 +08:00
Qstar
ed294f4c23 Ray Kubernetes Operator Part 1: readme, structure, config and CRD related file (#6332)
* Ray-Operator first PR
1.RayCluster CRD and CR, structure code in golang
2.config file in Kubernetes

* Delete go.sum

* Ray-Operator first PR
1.add directory structure
2.add guide for submitting RayCluster

* Delete ray_v1_raycluster.bk.yaml

* Ray-Operator first PR
1.delete file bk
2.add more description about kubernetes and ray-operator features

* Ray-Operator first PR: adjust grammar

* Ray-Operator first PR: add More Information about proposal

* Ray-Operator first PR:
1.add heterogeneous version of CR
2.add reference to key words, and reference links to the props in yaml
3.file structure to yaml level and function description

* Ray-Operator first PR: add ray operator proposal doc

* Ray-Operator first PR: add More Information about proposal

* Ray-Operator first PR: add command to start

* Ray-Operator first PR: add More Information about proposal

* Update deploy/ray-operator/README.md

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update deploy/ray-operator/api/v1/raycluster_types.go

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update deploy/ray-operator/api/v1/raycluster_types.go

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Ray-Operator first PR: add More Information about proposal

* Ray-Operator first PR: remove License

* Ray-Operator first PR: rename version from v1 to v1alpha1

* Ray-Operator first PR: use replicas instead of numNodes

* Ray-Operator first PR: update replicas in CR yaml file

* Ray-Operator first PR: add More Information about proposal
2019-12-05 22:45:03 -08:00
Zhijun Fu
b88b8202cc fix java build failure (#6062) 2019-12-06 14:38:43 +08:00
Ion
1c638a11a7 Refactor helper methods for new scheduler integration (#6354) 2019-12-05 18:49:25 -08:00
Edward Oakes
f63b64310a Bump version to 0.8.0.dev7 (#6303) 2019-12-05 18:33:54 -08:00
Philipp Moritz
a454c815f1 Fix long running stress tests (#6374) 2019-12-05 18:29:41 -08:00
Philipp Moritz
dd27bfbb75 Rename .rayproject to ray-project (#6278) 2019-12-05 16:15:42 -08:00
mehrdadn
17103376c5 Patch arrow for Windows (#6363) 2019-12-05 16:09:21 -08:00
Eric Liang
6223d2ed0b [direct call] Assign resource ids for direct call tasks (#6364) 2019-12-05 10:16:04 -08:00
Eric Liang
4c6739476b [rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365) 2019-12-05 10:13:54 -08:00
micafan
668ce47360 [GCS]Add abstract interface of actor to GCS Client (#6269) 2019-12-05 13:38:29 +08:00
Zhijun Fu
7611e484ec properly handle a forwarded task that gets forwarded back (#6271) 2019-12-05 13:37:52 +08:00
Zhijun Fu
fa98694dd0 Fix raylet crash during cluster shutdown (#6272) 2019-12-05 11:08:58 +08:00
Simon Mo
ac6aa21411 Fix the autoscaler docker file to use rayproject (#6357) 2019-12-04 16:20:04 -08:00
Edward Oakes
f65d65f5de Add WorkerID check to AssignTask (#6355) 2019-12-04 12:38:29 -08:00
Eric Liang
1a3b83abf8 [direct call] Fix hang when caller id changes for actor task submission (#6338) 2019-12-04 12:01:35 -08:00
Simon Mo
31113aeded Use rayproject repo (#6353) 2019-12-03 22:36:40 -08:00
Stephanie Wang
a82fb5585d [direct task] Remove timeout for resolving futures that were deserialized (#6337)
* Reply GetObjectStatus once the task completes

* Remove timeout-based future resolution

* fix

* Update core_worker.h
2019-12-03 12:04:59 -08:00
Stephanie Wang
d5720779b3 Set the actor ID as the assigned task ID for direct actor workers (#6335)
* Fix

* rename
2019-12-03 10:54:26 -08:00
Kai Yang
d51583dbd6 Add test listener to show the test progress of java UT (#6341) 2019-12-03 16:34:07 +08:00
Eric Liang
bc5e259264 [rllib] Add a doc section on computing actions (#6326)
* options doc

* add note

* hint shr

* doc update
2019-12-03 00:10:50 -08:00
Shital Shah
670cb6374e Doc enhancement: use build.sh for ray, clarification on how rllib selects VisionNetwork, note on setup-dev.py for rllib. (#6092) 2019-12-02 22:19:01 -08:00
Ujval Misra
fa5d62e8ba [tune] Retry restore on timeout (#6284)
* Retry recovery on timeout

* fix bug, revert some code

* Add test for restore time outs.

* Fix lint

* Address comments

* Don't timeout restores.
2019-12-02 20:01:47 -08:00
Richard Liaw
0b3d5d989b [docs] Add public materials (#6331)
* startup

* update tune readme

* usingrah
2019-12-02 19:59:23 -08:00
Simon Mo
216ef8e41a Remove the encrypted docker password. Use web UI. (#6333) 2019-12-02 17:22:59 -08:00
Edward Oakes
d2c66ba795 Don't add assigned tasks to SWAP queue (#6325) 2019-12-02 16:39:02 -08:00
Edward Oakes
dff6017272 Fix "failed to create head node" issue (#6304)
* Fix failed to create head node issue

* comments
2019-12-02 15:22:00 -08:00
Ion
2a3adf2d70 New scheduler integration (#6321) 2019-12-02 14:42:16 -08:00
Mitchell Stern
43d20fff62 Refactor dashboard codebase to improve modularity (#6330)
* Refactor dashboard codebase to improve modularity

* Simplify feature interface

* Use arrow notation in makeFeature argument types

* Use separate components for node and worker features rather than a single conditionally-rendered component

* Add comments about Ray worker process titles

* Add comments to non-obvious fields in node info API response
2019-12-02 11:05:40 -08:00
Stephanie Wang
69dd5c9319 [direct task] Fix bug that starts duplicate connections from the worker to the local raylet (#6307)
* Fix bug and add unit test

* rename
2019-12-02 10:25:05 -08:00
Stephanie Wang
da41180dc0 [direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py (#6306)
* multinode failures direct

* Add number of retries allowed for tasks

* Retry tasks

* Add failing test for object reconstruction

* Handle return status and debug

* update

* Retry task unit test

* update

* update

* todo

* Fix max_retries decorator, fix test

* Fix test that flaked

* lint

* comments
2019-12-02 10:20:57 -08:00
Eric Liang
0b0a16982a [doc] Use .options() (#6323)
* options doc

* add note

* hint shr
2019-12-01 17:24:00 -08:00
mehrdadn
75cc994e0a Update various build options relating to Windows (#6315)
* Update .bazelrc for Windows compatibility

* Block inclusion of (legacy) WinSock.h to avoid errors

* Suppress warnings for Windows code

* Include boost::asio in includes so that it is passed as -isystem to avoid warnings

* Link with -lpthread only on non-Windows

* Undefine BOOST_FALLTHROUGH, which is unnecessary and causes macro redefinition warnings

* Define RAY_STATIC and ARROW_STATIC to compile for Windows

* Add WinSock import library for Arrow
2019-12-01 15:05:50 -08:00
Philipp Moritz
22fa9b564b fix linting (#6322) 2019-12-01 14:06:35 -08:00
mehrdadn
10d49a3f6f Use Boost's socket_holder instead of manually managing the socket (#6314)
* Use Boost's socket_holder instead of manually managing sockets.

Socket types are not ints on Windows, and we need to use wrapper for proper lifetime management regardless.
2019-12-01 13:27:52 -08:00
fangfengbin
7275556365 Reconstruct local dead actors immediately instead of waiting for initial_reconstruction_timeout_ms (#6243) 2019-11-30 18:03:48 +08:00
Simon Mo
4033d65e4f Fix redis-server stopping in linux (#6296)
* Cleanup test_calling_start_ray_head

* Kill redis-server with args instead of comm

In linux, ps -o pid,comm output just redis-server instead of the
full executable path
2019-11-29 22:50:05 -08:00
mehrdadn
e28e464158 Convert io_service_ from reference to smart pointer (#6285) 2019-11-29 16:09:46 -08:00
mehrdadn
b8cfdba752 Bazelify hiredis (#6203) 2019-11-29 15:32:45 -08:00
Yuhao Yang
ffa043d4b7 [tune] replace self.config (#6313) 2019-11-29 11:09:30 -08:00
Stephanie Wang
724a5e3909 Turn on direct calls for test_failure.py (#6291) 2019-11-28 12:28:30 -08:00