Tao Wang
25ac8f9aa5
[GCS]Use new flag to indicate whether resources are updated and update realtime resources view ( #10906 )
...
* Handle resources turning empty and update realtime view
* add up missing flag
* per comments
* use flag instead of special key to represent if resource changed
* Update src/ray/protobuf/gcs.proto
Co-authored-by: fangfengbin <869218239a@zju.edu.cn>
* fix lint in gcs.proto
* fix embarrassing mistake
Co-authored-by: fangfengbin <869218239a@zju.edu.cn>
2020-09-28 01:57:27 -07:00
fangfengbin
2e41a29c8f
[Placement Group]Support placement group request processing idempotent in raylet ( #10998 )
...
* add part code
* fix review comment
* fix review comment
* fix review comment
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-28 01:56:43 -07:00
fang yeqing
0765d989ae
[Core] Simplify logic in node manager class. ( #11063 )
...
Co-authored-by: 逗角 <yeqing.fyq@antfin.com>
2020-09-28 01:54:06 -07:00
fangfengbin
142234cbcb
[GCS]Fix ServiceBasedGcsClientTest bug ( #11031 )
...
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-28 14:50:36 +08:00
fangfengbin
86e5db4d59
[GCS]Fix GCS actor manager idempotent bug ( #11003 )
...
* [GCS]Fix GCS actor manager idempotent bug
* fix review comment
* fix review comment
* fix review comments
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-27 21:12:42 -07:00
SangBin Cho
1e39c40370
[Placement Group] Capture child tasks by default. ( #11025 )
...
* In progress.
* Finished up.
* Improve comment.
* Addressed code review.
* Fix test failure.
* Fix ci failures.
* Fix CI issues.
2020-09-27 19:33:00 -07:00
fangfengbin
f0787a63da
[GCS] fix rpc port bug ( #11055 )
...
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-27 19:08:48 -07:00
Kai Fricke
e7315b0856
[tune] Callbacks for tune runs ( #11001 )
2020-09-27 16:50:07 -07:00
Christian Kasim Loan
4285bee517
[Doc] Added John Snow Lab's NLU reference to community integration page ( #10985 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-27 13:07:12 -07:00
DK.Pino
db7097fb1f
[Refactor] Rename ClientId to NodeId ( #10992 )
...
* rename ClientId to NodeId
* format lint
* format lint
* fix conflicts
* rename new ClientId to NodeId
* update lint
* make same version of clang-format with travis ci
2020-09-27 10:24:21 -07:00
Tao Wang
f69b390755
[BUILD]ignore pyenv version file in git ( #11053 )
2020-09-27 09:26:26 -07:00
Eric Liang
b5ecdd088b
Deflake object manager test ( #11052 )
2020-09-26 18:33:02 -07:00
Richard Liaw
51038f2197
confirm ( #11049 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-26 13:48:13 -07:00
Stephanie Wang
552ebdbeda
[Core] Announce worker port at end of constructor ( #11036 )
2020-09-25 21:56:00 -07:00
SangBin Cho
29663d89f1
[Placement Group] Remove warning msg for placement groups. ( #11034 )
...
* Done.
* Addressed code review.
* Fixed typo.
* Addressed code review.
2020-09-25 20:53:42 -07:00
Eric Liang
8f79b4e45e
[rllib] Replay buffer size inaccurate with replay_seq_len option ( #10988 )
...
* support replay seq len
* update
* fix warn
* add test
* test
2020-09-25 13:47:23 -07:00
SangBin Cho
8abe13023f
[Metric] Fix issue 10634 ( #10940 )
...
* Fix.
* Revert "Fix."
This reverts commit 52c9c1ee646b551a4dd2b639c78be67683db2b1c.
* Addressed code review.
* Addressed code review.
2020-09-25 09:11:05 -07:00
SangBin Cho
109481afd9
[Metric] custom metrics refinement ( #10861 )
...
* In progress
* In Progress.
* Addressed code review.
* Add unit tests.
* Add a simple doc.
* Fixed test failure.
* Fix all test failures from serve.
* Addressed code review.
2020-09-25 09:10:28 -07:00
Eric Liang
609c1b8acd
Start moving ray internal files to _private module ( #10994 )
2020-09-24 22:46:35 -07:00
Ameer Haj Ali
3b6fe72029
Make the command runner interface public ( #11022 )
2020-09-24 22:45:17 -07:00
Amog Kamsetty
ee85cb31a5
[Tune] Fix Memory Leak ( #10989 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-24 20:26:55 -07:00
Eric Liang
a26394d184
[tests] Blacklist failing win32 tests ( #10935 )
2020-09-24 20:15:28 -07:00
Richard Liaw
a563344bc2
[docs] remove ref to google groups -> github discussions ( #11019 )
2020-09-24 18:09:51 -07:00
Simon Mo
4f6e218a3d
Add a new _available_resources_per_node for state API ( #11014 )
2020-09-24 17:25:15 -07:00
Alex Wu
0f168bf2ef
[hotfix] Use ref in WorkerPool::TryKillingIdleWorkers ( #11017 )
2020-09-24 17:23:56 -07:00
Amog Kamsetty
3716fb1ca9
[Ray SGD] Fix hotfix test ( #11015 )
2020-09-24 14:14:03 -07:00
Amog Kamsetty
07bdf062b9
[Ray SGD] [Hotfix] Worker group hotfix ( #11008 )
2020-09-24 12:21:30 -07:00
SangBin Cho
8c241d5f1d
[Core] Use node ip address properly in ray.init ( #10829 )
...
* Fix.
* Addressed code review.
* Addressed code review.
2020-09-24 11:44:52 -07:00
Ameer Haj Ali
4ac58d54d6
prepare for head node ( #10997 )
...
Co-authored-by: Ameer Haj Ali <ameerhajali@ameers-mbp.lan>
2020-09-24 11:18:38 -07:00
Kai Fricke
d9c4dea7cf
[tune] strict metric checking ( #10972 )
2020-09-24 10:00:48 -07:00
SangBin Cho
5e6b887f2d
[Placement Group] Capture Child Task Part 1 ( #10968 )
...
* In progress.
* In progress.
* Done.
* Addressed code review.
* Increase timeout to make a test less flaky.
* Addressed code review.
* Addressed code review.
2020-09-24 09:02:03 -07:00
Keqiu Hu
46a560e876
[cli][ray] ray start should error by default if there's already an instance running ( #10826 )
2020-09-24 10:29:59 -05:00
chaokunyang
842861b4fc
[Streaming] refine streaming tests sleep condition ( #10991 )
2020-09-24 17:06:55 +08:00
Amog Kamsetty
f42ab54112
[Docs] [Tune] Fix Tune Quick Start docs ( #10996 )
2020-09-24 00:28:01 -07:00
SongGuyang
f9b040db52
add log-dir to new dashboard ( #10885 )
2020-09-24 13:40:37 +08:00
DK.Pino
4fa6523e4e
[Core] Remove unnecessary if judgment ( #10971 )
...
* Remove unnecessary if judgment
* format code style
2020-09-23 21:24:11 -07:00
fangfengbin
2a79571c29
[Placement Group] Optimize log ( #10974 )
...
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-23 20:28:08 -07:00
Kai Yang
b251a445dd
[Core] Fix maximum_startup_concurrency caused by AnnounceWorkerPort ( #10853 )
...
* Fix maximum_startup_concurrency caused by AnnounceWorkerPort
* Address comment
* Update src/ray/raylet/worker_pool.h
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-09-23 20:27:44 -07:00
Amog Kamsetty
52e1495e30
[Ray SGD] TorchTrainable pre 0.8.7 deprecation warning ( #10984 )
...
* torch trainable add pre 0.8.7 backwards compat
* raise instead
* Update python/ray/util/sgd/torch/torch_trainer.py
2020-09-23 18:19:43 -07:00
Ian Rodney
4c3f09094a
[docs] redis-port -> port ( #10937 )
2020-09-23 17:04:13 -07:00
SangBin Cho
cebab8886e
[Placement group] Refine doc ( #10922 )
2020-09-23 16:26:32 -07:00
Alex Wu
295782d411
[New Scheduler] Refactor cluster resource scheduler ( #10938 )
2020-09-23 15:46:31 -07:00
Eric Liang
ecdaaffc67
add large data warning ( #10957 )
2020-09-23 15:46:06 -07:00
Allen
567009d5fd
[Autoscaler] Fix k8s command runner when command fails ( #10966 )
...
Co-authored-by: Allen Yin <allenyin@anyscale.io>
2020-09-23 13:25:49 -07:00
SangBin Cho
7931b6ce2e
Fix placement group bug failing in release test ( #10944 )
2020-09-23 12:37:28 -07:00
Kai Fricke
5921e87ecd
[tune] Only add new trial when there is no pending trial ( #10979 )
2020-09-23 11:08:12 -07:00
Amog Kamsetty
7dbd0ff824
fix example ( #10964 )
2020-09-23 10:33:19 -07:00
fangfengbin
a260e66016
[Placement Group]Fix CommitResources crash bug ( #10951 )
2020-09-23 17:24:53 +08:00
SangBin Cho
390107b6cb
[Core] Allow to pass node ip address to gcs server. ( #10946 )
...
* Allow to pass node ip address to gcs server.
* Fix.
* Addressed code review.
* Fixed an error.
* Addressed code review.
2020-09-23 01:52:26 -07:00
Michael Luo
ba5a3ae9e2
Enable vtrace by default ( #10962 )
2020-09-22 22:18:21 -07:00