mirror of https://github.com/vale981/ray synced 2025-03-17 08:36:38 -04:00
Commit graph

8911 commits

Author SHA1 Message Date
SangBin Cho
539c51a003
[Core] Support GCS server port assignment. () 2020-07-14 11:49:56 -05:00
SangBin Cho
f6eb47fc1f
[Stats] metrics agent exporter () 2020-07-14 11:49:16 -05:00
Siyuan (Ryans) Zhuang
5b192842b5
Fix ObjectRef and ActorHandle serialization () 2020-07-14 09:42:32 -07:00
Siyuan (Ryans) Zhuang
d57ff5e2af
Remove legacy C++ code () 2020-07-14 00:57:42 -07:00
krfricke
deba082cb4
[tune] PyTorch CIFAR10 example ()
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-07-13 23:16:05 -07:00
kisuke95
276fe109c5
change error code name of boost timer () 2020-07-14 11:50:58 +08:00
fangfengbin
3c90f960fb
Fix gcs_pubsub_test bug()
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-07-14 11:34:50 +08:00
Sven Mika
617eb8f279
[RLlib] Issue 9402 MARWIL producing nan rewards. () 2020-07-14 05:07:16 +02:00
Sven Mika
03ab86567f
[RLlib] Layout of Trajectory View API (new class: Trajectory; not used yet). () 2020-07-14 04:27:49 +02:00
Max Fitton
222635b63f
Machine View Sorting / Grouping ()
* Convert NodeInfo.tsx to a functional component

* Update NodeRowGroup to be a functional component

* lint

* Convert TotalRow to functional component.

* lint

* move node info over to using the sortable table head component. spacing is still a little wonky.

* Factor a NoewWorkerRow class out of NodeRowGroup that will be usable when grouping / ungrouping

* Compilation checkpoint, I factored the worker filtering logic out of node info into the reducer

* Add sort accessors for CPU

* Add sort accessors for Disk

* Add sort accessors for RAM

* add a table sort util for function based accessors (rather than flat attribute-based accessor)

* wip refactor node info features

* wip

* Rendering Checkpoint. I've refactored the features and how they are called to add sorting support. Also reworks the way error counts and log counts are passed to the front-end to remove some ugly logic

* wip

* wip

* wip

* Finish adding sorting and grouping of machine view

* lint

* fix bug in filtration of logs and errors by worker from recent refactor.

* Add export of Cluster Disk feature

* fix some merge issues

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-07-13 20:45:17 -05:00
SangBin Cho
22b2e51152
Fix test-multi-node () 2020-07-13 20:44:27 -05:00
Richard Liaw
a567f7977c
[tune] Put examples under proper version control ()
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
2020-07-13 18:01:10 -07:00
Richard Liaw
7abf7a0109
[docs] Render ActorPool documentation, etc () 2020-07-13 17:59:22 -07:00
Vasily Litvinov
6ad13e0da8
Add ability to specify SOCKS proxy for SSH connections () 2020-07-13 16:10:07 -07:00
Richard Liaw
dfe3ebe4a2
[tune] sklearn comment out () 2020-07-13 16:06:44 -07:00
Siyuan (Ryans) Zhuang
4da97a7c99
[Core] Build raylet client as an independent component () 2020-07-13 16:00:32 -07:00
mehrdadn
3d65682e62
Bazel selects compiler flags based on compiler ()
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-13 15:31:46 -07:00
Henk Tillman
c7714ca575
GCP authentication using oauth tokens () 2020-07-13 14:36:40 -07:00
Ian Rodney
0085cf75d0
Allow --lru-evict to be passed into ray start () 2020-07-13 14:09:39 -07:00
Amog Kamsetty
4454d05bcf
[Tune] Trainable documentation fix () 2020-07-13 13:15:01 -07:00
Hao Chen
e6225bdfa1
[GCS] Fix the bug about raylet receiving duplicate actor creation tasks () 2020-07-13 11:34:02 -07:00
mehrdadn
5291bf235b
TRAVIS_PULL_REQUEST is false for non-PRs, not empty ()
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-13 14:52:40 +02:00
Nicolaus93
b5a6c57295
[tune] handling nan values () 2020-07-12 17:08:36 -07:00
Tanay Wakhare
15aa08a3d1
[RLLib] WindowStat bug fix ()
* WindowStat error catching, which processes NaNs properly instead of erroring. This ought to resolve issue .
https://github.com/ray-project/ray/issues/7910
2020-07-12 23:01:32 +02:00
Tanay Wakhare
3536d8e4b3
Masking error. With t*valid_mask, we get the error np.inf*0 = np.inf () 2020-07-12 22:59:35 +02:00
Siyuan (Ryans) Zhuang
381c242f6b
[Core] Simplify Raylet Client () 2020-07-12 12:42:54 -07:00
Henk Tillman
8c985dc797
Update conda and ray wheel on GCP images () 2020-07-12 12:12:27 -07:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. () 2020-07-11 22:06:35 +02:00
mehrdadn
5c853eaa6a
Fix copy to workspace () 2020-07-11 14:27:56 +02:00
Ian Rodney
26fcda50e7
Pass run args to DockerCommandRunner () 2020-07-10 18:09:01 -07:00
Simon Mo
d4a5d09dab
[Serve] Merge router with HTTPProxy () 2020-07-10 13:52:48 -07:00
Siyuan (Ryans) Zhuang
1798deae94
[Core] Plasma RAII support () 2020-07-10 09:22:29 -07:00
SangBin Cho
d8a0d76d02
Fix macos compilation bug ()
* Fix.
2020-07-10 09:18:09 -07:00
Sven Mika
14160ca58c
[RLlib] Issue (DQN w/o dueling produces invalid actions). () 2020-07-10 12:43:03 +02:00
Kai Yang
c89b59cf48
Remove the RAY_CHECK in Worker::Port() () 2020-07-10 18:06:25 +08:00
Kai Yang
a98cd0670e
[Java] Improve JNI performance when submitting and executing tasks () 2020-07-10 17:51:07 +08:00
Hao Chen
d49dadf891
Change Python's ObjectID to ObjectRef () 2020-07-10 17:49:04 +08:00
Tao Wang
6311e5a947
[HOTFIX] Fix compile direct_actor_transport_test on mac () 2020-07-10 17:19:34 +08:00
fangfengbin
35861f17a3
Fix gcs_table_storage testcase bug ()
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-07-10 16:16:28 +08:00
Hao Chen
bed1be611e
Fix flaky test_dynres.py () 2020-07-10 10:34:23 +08:00
mehrdadn
dd2cc6eb48
Update hiredis and remove Windows patches ()
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-09 18:45:44 -07:00
Patrick Ames
dc51b08c36
[autoscaler] Allow users to disable the cluster config cache ()
* [autoscaler] Remove autoscaler config cache.

* [autoscaler] Add flag allowing users to explicitly disable the config cache.
2020-07-09 15:47:58 -07:00
Stefan Schneider
6db55ca8db
[docs][rllib] Recommended workflow for training, saving, and testing () 2020-07-09 15:47:10 -07:00
Eric Liang
09b9b81ea4
[autoscaler] Move command runners into separate file and clean up interface. ()
* cleanup

* wip

* fix imports

* fix lint
2020-07-09 15:40:56 -07:00
Zhuohan Li
8a76f4cbb5
[Core] put small objects in memory store ()
* remove the put in memory store

* put small objects directly in memory store

* cast data type

* fix another place that uses Put to spill to plasma store

* fix multiple tests related to memory limits

* partially fix test_metrics

* remove not functioning codes

* fix core_worker_test

* refactor put to plasma codes

* add a flag for the new feature

* add flag to more places

* do a warmup round for the plasma store

* lint

* lint again

* fix warmup store

* Update _raylet.pyx

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-07-09 15:39:40 -07:00
Alex Wu
34b85659d4
[Core] New scheduler fixes ()
* .

* test_args passes

* .

* test_basic.py::test_many_fractional_resources causes ray to hang

* test_basic.py::test_many_fractional_resources causes ray to hang

* .

* .

* useful

* test_many_fractional_resources fails instead of hanging now :)

* Passes test_fractional_resources

* .

* .

* Some cleanup

* git is hard

* cleanup

* Fixed scheduling tests

* .

* .
2020-07-09 15:37:51 -07:00
Alisa
f0a72ad985
[Core] Add placement group scheduler and some api in resource scheduler ()
* Add placement group scheduler and some api of resource scheduler.
Merge fix cv hang in multithread variables race ().

* change the bundle id and delete unit count in bundle

change vector<bundle_spec> to vector<shared_ptr<bundle_spec>>

Add placement group scheduler and some api of resource scheduler.
Merge fix cv hang in multithread variables race ().

change the bundle id and delete unit count in bundle

remove CheckIfSchedulable()

add comments and fix the bug in resource

* fix placement group schedule

* add placement group scheduler and change some api in resource scheduler

* fix by the comments

* fix conflict

* fix lint

* fix lint

* fix bug in merge

* fix lint

Co-authored-by: Lingxuan Zuo <skyzlxuan@gmail.com>
2020-07-09 15:37:18 -07:00
Stephanie Wang
0389735d7a
[core] Pass owner address from the workers to the raylet ()
* Add intended worker ID to GetObjectStatus, tests

* Remove TaskID owner_id

* lint

* Add owner address to task args

* Make TaskArg a virtual class, remove multi args

* Set owner address for task args

* merge

* Fix tests

* Add ObjectRefs to task dependency manager, pass from task spec args

* tmp

* tmp

* Fix

* Add ownership info for task arguments

* Convert WaitForDirectActorCallArgs

* lint

* build

* update

* build

* java

* Move code

* build

* Revert "Fix Google log directory again ()"

This reverts commit 275da2e400.

* Fix free

* fix tests

* Fix tests

* build

* build

* fix

* Change assertion to warning to fix java
2020-07-09 14:35:54 -07:00
mehrdadn
4687b807c4
Combine different severities into the same log files ()
* Combine different severities into the same log files

Co-authored-by: Mehrdad <noreply@github.com>
2020-07-09 14:14:28 -07:00
Richard Liaw
b5103bacd1
[tune] Fix github readme ()
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-07-09 12:37:24 -07:00