Edward Oakes
c9314098b9
Implement direct task worker lease timeouts ( #6188 )
2019-11-25 14:48:19 -07:00
Eric Liang
7917bbef78
Set progress report interval for bazel explicitly ( #6262 )
...
* set progress interval
* add keep alive
* add keepalive
* remove cat
* smaller time
* squash error
* reduce log spam
2019-11-24 22:37:59 -08:00
Simon Mo
aa8d5d2f6c
Rate limit asyncio actor ( #6242 )
2019-11-24 11:39:28 -08:00
Stephanie Wang
d2662fecea
Miscellaneous bug fixes to throw unreconstructable errors for direct calls ( #6245 )
...
* Test cases
* Fix InPlasmaError
* raylet fixes to force errors for direct calls
* Disable lineage logging and task pending checks for direct calls
* move todo
* Clean up tests
* Fix bugs in object store for Contains and Delete
* Use direct call in tests
* Fixes, separate actor creation direct call from normal direct call spec
2019-11-23 15:05:49 -08:00
Stephanie Wang
c4fa3b3afb
fix ( #6251 )
2019-11-23 15:04:48 -08:00
Eric Liang
ea270495a1
Remove stray change ( #6247 )
2019-11-23 00:07:45 -08:00
Edward Oakes
ae5abc48a9
Fix race condition in redis_async_context.cc ( #6231 )
...
* dispatch callback to backend thread
* tmp: test in loop
* compiling
* Works using shared_ptrs
* Revert "tmp: test in loop"
This reverts commit faf1f8f74b34a99396906f56827d2691472ae7d4.
* Copy into CallbackReply
* fix comment
* warning
* add nil case
2019-11-22 15:51:40 -08:00
Ion
68ac08332b
Initial commit of new cluster resource scheduler ( #6178 )
2019-11-22 11:14:46 -08:00
Stephanie Wang
d3227f2f2d
Fix bug in direct task calls for objects that were evicted ( #6216 )
...
* Fix bug and add some checks
* rename
2019-11-21 15:38:31 -08:00
Stephanie Wang
eb7b73d731
Disconnect direct task workers that died ( #6213 )
...
* Disconnect workers that died so that we push the worker died error to redis
* Push error if actor is non nil
* fix test
2019-11-21 15:37:15 -08:00
Simon Mo
29ba6bfc64
Basic Async Actor Call ( #6183 )
...
* Start trying to figure out where to put fibers
* Pass is_async flag from python to context
* Just running things in fiber works
* Yield implemented, need some debugging to make it work
* It worked!
* Remove debug prints
* Lint
* Revert the clang-format
* Remove unnecessary log
* Remove unnecessary import
* Add attribution
* Address comment
* Add test
* Missed a merge conflict
* Make test pass and compile
* Address comment
* Rename async -> asyncio
* Move async test to py3 only
* Fix ignore path
2019-11-21 11:56:46 -08:00
Eric Liang
7f52d019ca
Inline memory_store_provider into memory_store ( #6217 )
2019-11-21 10:13:53 -08:00
Eric Liang
1f9ab74293
Fix hang on Ray shutdown ( #6201 )
2019-11-20 23:30:35 -08:00
Eric Liang
425edb5cd9
Support NotifyBlocked/UnBlocked for direct call tasks ( #6177 )
2019-11-20 22:07:12 -08:00
mehrdadn
95bf977839
Rename UpdateResource due to conflict with Windows ( #6205 )
...
* Rename UpdateResource due to conflict with Windows
* Rename UpdateResource_ to UpdateResourceCapacity
2019-11-20 20:44:13 -08:00
Stephanie Wang
c0be9e6738
Resolve dependencies locally before submitting direct actor tasks ( #6191 )
...
* Priority queue in direct actor transport by task number
* Move LocalDependencyResolver out to separate file, share with direct actor transport
* works
* Test case for ordering
* Cleanups
* Remove priority queue
* comment
* Share ClientFactoryFn with direct actor transport
* Unit test
* fix
2019-11-20 16:45:19 -08:00
micafan
e7dbafa000
Fix gcs::RedisAsioClient not being thread-safe ( #5946 )
2019-11-20 10:18:35 -08:00
Eric Liang
23ef58716d
Fix crash on sys.exit of direct task calls ( #6202 )
2019-11-19 21:30:48 -08:00
ashione
a1744f67fe
Add hostname to nodeinfo ( #6156 )
2019-11-19 15:03:46 +08:00
Danyang Zhuo
4f583ec784
Improve Object Transfer Performance ( #6067 )
2019-11-18 14:40:34 -08:00
Stephanie Wang
66edebce3a
Spillback scheduling for direct task calls ( #6164 )
...
* add dac
* remove caching
* rename return buffer
* cleanup
* add tests
* add perf
* fix
* flip
* remove
* remove it
* lint
* remove fork safety
* lint
* comments
* s/core/client
* wip
* remove
* fmt
* consistently return direct naming
* basic pass by ref
* fix bugs
* wip
* wip
* wip
* wip
* add test
* works now
* fix constructor
* fix merge
* add todo for perf
* fix single client test
* use lower n
* bazel
* faster
* fix core worker test
* init
* fix tests
* no plasma for direct call
* Update worker.py
* add order test
* fixes
* comments
* remove old assert
* lint
* add test
* Very wip
* wip
* add options for tasks
* add test
* fmt
* add backpressure
* remove idle prof event
* lint
* Fix 0 returns
* Set memcopy threads globally
* add benchmark
* Fix object exists
* Fix reference
* Remove return_buffer
* Add check
* add exit handler
* update benchmarks
* Fix compile error
* Fix NoReturn
* Use is instead of == for NoReturn
* fix
* Remove list comprehension
* Fix core worker test
* comment
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* fix merge error
* lint
* wip
* fix merge
* wip
* finish
* lint
* task interface
* add file
* add
* wip
* now works!
* updated
* wip
* dep resolution
* remove remote dep handling
* comments
* fix test_multithreading
* fix merge
* fix exit handling
* fix merge
* comments
* get fallback fetch working
* handle contains
* fix typo
* Skeleton for SubmitTask proto
* Update src/ray/common/id.h
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* comments
* rename to core worker service
* lint
* fix compile
* wip
* update
* error code
* fix up and rename
* clean up call manager
* comments
* add test and cleanup deserialization
* fix pickle
* fix comments, lint
* test todo
* comments
* use shared ptr
* rename
* Update src/ray/protobuf/gcs.proto
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* require transport type for ids; lint
* cleanup
* comments 1
* use worker available for real
* wip
* fix test
* resolve local dependencies test
* add num pending metric
* client factory
* unit test task submission
* wip
* fix bug
* rename
* Pass through node manager port, connect in raylet client
* finish rename
* Switch submit task to grpc
* fix crash
* Check port in use
* fix merge
* comments more
* doc
* Remove default port, set port randomly from driver
* add unique_ptr comment about TaskSpec
* lint
* fix test
* update
* fix lint
* GetMessageMutable should not be const
* iwyu
* fix const
* Update direct_task_transport_test.cc
* fix segfault
* Fix test
* Add RpcAddress, set in actor table data
* fix serialization
* fix lint
* Pass through task caller address
* Fix object manager test
* RpcAddress -> Address
* merge
* Port WorkerLease to grpc
* wip
* fix test
* add mem test
* update
* comments
* fix core worker tests
* fix
* remove old worker lease code
* First pass on spillback
* lint
* crash?
* Debug
* Fix task spec copy, extend test basic
* lint
* Port return worker to grpc
* lint
* Return worker to the correct raylet
* Only request worker if queued tasks
* A bit better failure handling
* Fix unit test
* Add unit test for spillback
* fix
* python test multinode
* update
* updates
* fix
2019-11-17 20:29:32 -08:00
Ion
1b80675206
Scheduling ids ( #6137 )
2019-11-15 16:04:16 -08:00
Edward Oakes
33040d734f
Disable stopgap GC by default ( #6165 )
...
* disable stopgap gc by default
* fix gc tests
2019-11-15 15:42:59 -08:00
Eric Liang
7d33e9949b
Integrate ref count module into local memory store ( #6122 )
2019-11-15 10:52:19 -08:00
Eric Liang
8ff393a7bd
Handle exchange of direct call objects between tasks and actors ( #6147 )
2019-11-14 17:32:04 -08:00
Edward Oakes
2758cd0b34
Make log message debug ( #6166 )
2019-11-14 15:05:36 -08:00
Eric Liang
0a3623ded6
Fix memory store wait ( #6152 )
2019-11-14 10:17:30 -08:00
Stephanie Wang
bbadde57e0
Pass through caller address when submitting a task ( #6143 )
...
* Add RpcAddress, set in actor table data
* Pass through task caller address
* RpcAddress -> Address
* update
* fix
* lint
* fix cc tests
2019-11-14 09:14:08 -08:00
Ujval Misra
e3e3ad4b25
Add timeout param to ray.get ( #6107 )
2019-11-14 00:50:04 -08:00
Edward Oakes
51e76151d6
Use shared_ptr for gcs client in profiler ( #6150 )
2019-11-13 15:24:01 -08:00
Eric Liang
f3f86385d6
Minimal implementation of direct task calls ( #6075 )
2019-11-12 11:45:28 -08:00
Stephanie Wang
35d177f459
Use grpc for communication from worker to local raylet (task submission and direct actor args only) ( #6118 )
...
* Skeleton for SubmitTask proto
* Pass through node manager port, connect in raylet client
* Switch submit task to grpc
* Check port in use
* doc
* Remove default port, set port randomly from driver
* update
* Fix test
* Fix object manager test
2019-11-11 21:17:25 -08:00
Edward Oakes
5780ec1b62
Refresh ObjectIDs in raylet for stopgap GC ( #6109 )
2019-11-10 23:12:59 -08:00
Philipp Moritz
ccbcc4bafa
Use gRPC and Bazel 1.0 ( #6002 )
2019-11-08 15:58:28 -08:00
Philipp Moritz
5a05eaaa54
Fix compilation on master ( #6116 )
2019-11-07 22:38:42 -08:00
Eric Liang
4a28306186
Allow large returns from direct actor calls ( #6088 )
2019-11-07 21:28:55 -08:00
Edward Oakes
ca53af4d0f
Add pending task dependencies to ObjectID ref counting ( #6054 )
2019-11-07 18:37:10 -08:00
Edward Oakes
9820c10a09
Simplify gRPC service definition for the worker ( #6095 )
2019-11-06 13:00:39 -08:00
mehrdadn
e312f3d282
Compatibility issues ( #6071 )
...
* Pass -f - to tar to force stdin on Windows
* Quote paths that may contain spaces (causes issues on Windows)
* Copy over Windows code from Arrow for glog signal handle uninstall
* Add missing COPTS to build rules since we'll need them for Windows compatibility
* Begin adding COPTS for Windows compatibility
* Disable glog on Arrow until we change WIN32 to _WIN32 there
* Missing header files that cause problems on Windows
* WORD typedef conflicts with Windows; remove it
* uint -> unsigned int wherever we're dealing with milliseconds (signed version is already int)
* uint -> unsigned int for enums
* uint -> size_t, wherever we're dealing with sizes or indices into arrays
* Work around Boost 1.68 bug in detecting clang-cl (revert this after upgrading)
* Missing #include <unistd.h>
* Add check for signal handler uninstallation failure
* Linting issue
2019-11-05 00:08:14 -08:00
Edward Oakes
043d1f4094
Return RayObjects to core worker ( #6052 )
2019-11-04 20:27:57 -08:00
Eric Liang
8485304e83
Support concurrent Actor calls in Ray ( #6053 )
2019-11-04 01:14:35 -08:00
Philipp Moritz
1c5446851a
Use Plasma with LRU refreshing integrated ( #6050 )
2019-11-03 16:19:05 -08:00
Eric Liang
fb34928a2a
[minor] Perf optimizations for direct actor task submission ( #6044 )
...
* merge optimizations
* fix
* fix memory err
* optimize
* fix tests
* fix serialization of method handles
* document weakref
* fix check
* bazel format
* disable on 2
2019-11-01 14:41:14 -07:00
Eric Liang
eef4ad3bba
Report census view data as part of raylet node stats ( #6060 )
2019-11-01 14:26:09 -07:00
Simon Mo
7f5b3502da
Implement Detached Actor ( #6036 )
...
* Arg propagation works
* Implement persistent actor
* Add doc
* Initialize is_persistent_
* Rename persistent->detached
* Address comment
* Make test pass
* Address comment
* Python2 compatibility
* Fix naming, py2
* Lint
2019-11-01 10:28:23 -07:00
Eric Liang
c86f945520
Support pass by ref args for direct actor calls ( #6040 )
2019-10-31 16:55:10 -07:00
Edward Oakes
16e9dfd2e1
Exit workers when raylet dies unexpectedly ( #6014 )
2019-10-30 20:29:25 -07:00
Eric Liang
8ebba202df
[minor] Reduce perf overhead of object ref tracking ( #6041 )
2019-10-29 18:14:51 -07:00
Eric Liang
b89cac976a
Basic direct actor call support in Python ( #5991 )
2019-10-28 22:09:04 -07:00
Edward Oakes
c1418b04df
Remove CoreWorkerObjectInterface ( #6023 )
2019-10-28 10:48:41 -07:00