SangBin Cho
cef6286f63
[Pubsub] Batch messages ( #15084 )
...
* batch pubsub 1
* Logic done. Tests left.
* done.
2021-04-02 16:42:18 -07:00
Alex Wu
aea28c53ce
. ( #15093 )
...
Co-authored-by: Alex <alex@anyscale.com>
2021-04-02 16:39:38 -07:00
Yi Cheng
4caf7a511d
Deflaky test failure in win32 ( #15090 )
...
* deflake win32 failure
* skip dask on w32
2021-04-02 14:56:24 -07:00
SangBin Cho
015369db34
[Core] Fix plasma store segfault ( #15071 )
...
* Use shared pointer instead of a raw pointer
* Lint.
* Addressed code review.
* Addressed code review.
2021-04-02 14:54:20 -07:00
architkulkarni
d8f8583e80
Revert "[Serve] Set controller and HTTP proxy num_cpus=0 by default ( #15000 )" ( #15091 )
...
This reverts commit 39aa01fc2c.
2021-04-02 13:01:57 -07:00
Yi Cheng
ecb94b3fe9
Add test case to check job conf compatible issue ( #15082 )
2021-04-02 12:03:21 -07:00
Dmitri Gekhtman
42565d5bbe
[autoscaler] Fix update/terminate race condition ( #15019 )
...
Co-authored-by: AmeerHajAli <ameerh@berkeley.edu>
2021-04-02 11:57:02 -07:00
SangBin Cho
3578d4e9d8
[Object Spilling] Limit number of objects to fuse ( #15034 )
...
* ready to go.
* Done.
* done.
* Done.
* Addressed code review.
* Fix a build issue.
2021-04-02 10:49:15 -07:00
Edward Oakes
96cc7897f7
[serve] Use longest prefix matching for path routing ( #15041 )
2021-04-02 12:01:47 -05:00
architkulkarni
39aa01fc2c
[Serve] Set controller and HTTP proxy num_cpus=0 by default ( #15000 )
2021-04-02 12:01:22 -05:00
Dmitri Gekhtman
6f81ec1998
[kubernetes][test] Operator test tweaks. ( #15074 )
2021-04-02 09:20:52 -07:00
Kai Fricke
8de66fce3d
[tune] Improve BOHB/ConfigSpace dependency check ( #15064 )
2021-04-02 10:19:49 +02:00
SangBin Cho
3965310f93
[Core] Fix the check failure from object manager ( #15070 )
2021-04-01 21:21:42 -07:00
Alex Wu
f52c855704
[core] Fix placement group GPU assignment bug ( #15049 )
2021-04-01 17:46:09 -07:00
Yi Cheng
d4c20c970b
[core] Fix UTIL worker issue ( #14925 )
...
* Fix
* format
* more
* format
* fix
* fix
* fix comment
* fix test failure
2021-04-01 17:36:45 -07:00
Simon Mo
c9dac9328e
[Serve] Fix serializing nested fields in Pydantic ( #15069 )
2021-04-01 17:20:34 -07:00
Siyuan (Ryans) Zhuang
6ad379864e
[doc] Fix inconsistent doc about ObjectID bytes ( #15072 )
2021-04-01 17:14:30 -07:00
Alex Wu
4fba05ae4d
[core] Hybrid scheduling policy. ( #14790 )
2021-04-01 16:59:59 -07:00
Dmitri Gekhtman
474fb6bf0c
[kubernetes][client][docs] Note requirement for matching Ray versions ( #15068 )
2021-04-01 15:08:25 -07:00
Edward Oakes
346994745a
[serve] Get handle in starlette endpoint constructor instead of lazily ( #15066 )
2021-04-01 16:07:28 -05:00
Ian Rodney
22c1aeb240
[Tests] Skip autoscaler tests on Windows ( #15033 )
2021-04-01 10:16:42 -07:00
fangfengbin
18728b2b7e
Fix c++ gcs test bug ( #15063 )
...
* fix ut bug
* fix bug
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2021-04-01 09:19:24 -07:00
SangBin Cho
005cff0092
Revert "Revert "[Core] Implement long polling-based pubsub to reduce … ( #14909 )
2021-04-01 09:03:15 -07:00
Eric Liang
b2c5093054
Disable flaky windows://python/ray/tests:test_gcs_fault_tolerance ( #15052 )
2021-04-01 10:54:43 -05:00
Tomas Babej
bc42e69503
ci: Fix broken symlink detection ( #15054 )
2021-04-01 08:33:51 -07:00
Kai Fricke
d33b0e4bc3
[tune] Reconcile placement groups every N seconds to avoid bottlenecks when running many short trials ( #15011 )
...
Closes a release blocking issue
2021-04-01 17:04:44 +02:00
Hao Chen
3e1a0439b7
Fix concurrent actor starting too many threads. ( #14927 )
2021-04-01 19:58:18 +08:00
Michael Schock
12b4560afa
Minor doc fix ( #15055 )
2021-03-31 22:58:31 -07:00
Ameer Haj Ali
e02bd990d8
Move monitor.py to autoscaler/_private directory ( #15050 )
...
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2021-04-01 07:41:47 +03:00
SangBin Cho
463e9b2ef9
Try fixing it. ( #15046 )
2021-03-31 16:31:47 -07:00
Stephanie Wang
a86a7a6a98
[core] Cap total memory used by executing tasks' arguments ( #15027 )
...
* Task dependency map
* Pinned args threshold
* Unit test and fix
* no leaks
* update
* update
* remove assertion
2021-03-31 15:38:40 -07:00
Edward Oakes
126b9a6c14
[serve] Add basic upscaling test using cluster_utils ( #15044 )
2021-03-31 17:18:02 -05:00
Alex Wu
70f45af541
Deflake test_failure ( #15026 )
2021-03-31 14:59:38 -07:00
Edward Oakes
4061b72f2e
[serve] Add serve.get_deployment() API ( #14953 )
2021-03-31 14:57:39 -05:00
Simon Mo
57256b456a
[serve] Make sure test_imported_backend is ran ( #15043 )
2021-03-31 14:32:45 -05:00
SangBin Cho
79a6aa97b7
[Core] Optimize get core worker Stats ( #15008 )
...
* in progress.
* Optimize get core worker stats.
* Fix a segfault.
* Addressed code review.
* Update comments.
* Addressed code review.
2021-03-31 12:21:53 -07:00
Yi Cheng
4480132229
[core] Integration runtime_env with ray client ( #14881 )
...
* server side ready
* client side
* py
* fix
* up
* format
* add files
* add pyx
* up
* up
* up
* add keys
* format
* update
* format
* add unittests
* add files
* up
* up
* fix
* up
* fix thread issue
* format
* fix
* update proto
* Fix
* format
* fix
* more
* fix conflict
* fix
* fix order
* format
* add
* up
* compiling fix
* lint
* fix
* format
* fix some
* some fix
* fix comment
* test cases
* add test
* comments
* fix name
* format
* fix
* revert gcs-kv
* fix comments
* fix failure
* fix test
* format
* fix timeout
* fix
* fix
* fix
* format
* format
* fix flaky test
Co-authored-by: Yi Cheng <singye888@gmail.com>
2021-03-31 11:39:34 -07:00
Clark Zinzow
91cf272c2e
[Core] Exit autoscaler with a non-zero exit code upon handling SIGINT/SIGTERM ( #14518 )
2021-03-31 10:08:02 -07:00
Ian Rodney
32e50b8c67
[Docker] Run docker stop in parallel ( #14901 )
...
* first pass at parallel docker stop
* real impl
* use env var variable
* lint fix
2021-03-31 08:41:52 -07:00
Edward Oakes
107effb370
[serve] Add tests for reconnecting to cluster with ray client ( #15029 )
2021-03-31 10:08:12 -05:00
Edward Oakes
12f5e5ab62
[serve] Small cleanup in HTTP proxy ( #15028 )
2021-03-31 09:18:11 -05:00
Kai Yang
b0ea947fa3
[Java] Support getCurrentActorId in local mode ( #14890 )
2021-03-31 21:39:39 +08:00
Kai Yang
6278df8604
[Java] refine generation of jvm options ( #14931 )
2021-03-31 21:04:52 +08:00
Ian Rodney
73fb5d6022
[Autoscaler][Docker] Make disable_shm_size_detection more usable ( #14913 )
2021-03-30 18:10:09 -07:00
Siyuan (Ryans) Zhuang
3aa39142db
[Core] Remove code paths that run plasma store as a process ( #14924 )
...
* enable plasma store as thread by default
remove unused code path that runs plasma store as a process
2021-03-30 16:19:03 -07:00
Sven Mika
1bb70e4907
[RLlib] Issue 14523: Torch + py3.8 leads to GPU device error. ( #15014 )
2021-03-30 21:43:11 +02:00
Adam Lee
b643f4fc6d
fix paper link for ICM docs ( #14973 )
...
fix broken arXiv https link for ICM (intrinsic curiosity module)
2021-03-30 12:27:34 -07:00
Sven Mika
95686a8fdd
[RLlib] Issue 14533: Tf-eager properly use tree.map_struct on value of type Repeated (RLlib-specific space) ( #15015 )
2021-03-30 19:28:45 +02:00
Sven Mika
c8ca4d03ad
[RLlib] Issue with agent-id -> pol-id mapping not required to be fixed across different episodes. ( #15020 )
2021-03-30 19:25:52 +02:00
Raphael CHEN
93d4244d9c
[RLlib] Correctly get bytes size of SampleBatch ( #14801 )
2021-03-30 19:24:58 +02:00