Amog Kamsetty
bd19ed31e7
[Tune] Fix PBT Transformers Example ( #13174 )
2021-01-05 16:31:11 -08:00
Hao Zhang
7e52351ae5
[Collective] Some necessary abstraction of collective calls before introducing stream management ( #13162 )
2021-01-05 16:20:12 -08:00
Edward Oakes
dc101fd087
[serve] Move controller state into separate files ( #13204 )
2021-01-05 14:37:16 -06:00
Edward Oakes
d738610dc9
Disable atexit test on windows ( #13207 )
2021-01-05 14:33:51 -06:00
Kai Fricke
96c2d3d2b5
[tune] better signature check for tune.sample_from ( #13171 )
...
* [tune] better signature check for `tune.sample_from`
* Update python/ray/tune/sample.py
Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
2021-01-05 08:04:18 -08:00
Edward Oakes
e8162f1b1f
[serve] Merge ActorReconciler and BackendState ( #13139 )
2021-01-05 09:56:22 -06:00
Hao Zhang
4150970226
[Collective][PR 2/6] Driver program declarative interfaces ( #12874 )
...
* scaffold of the code
* some scratch and options change
* NCCL mostly done, supporting API#1
* interface 2.1 2.2 scratch
* put code into ray and fix some importing issues
* add an additional Rendezvous class to safely meet at named actor
* fix some small bugs in nccl_util
* some small fix
* scaffold of the code
* some scratch and options change
* NCCL mostly done, supporting API#1
* interface 2.1 2.2 scratch
* put code into ray and fix some importing issues
* add an additional Rendezvous class to safely meet at named actor
* fix some small bugs in nccl_util
* some small fix
* add a Backend class to make Backend string more robust
* add several useful APIs
* add some tests
* added allreduce test
* fix typos
* fix several bugs found via unittests
* fix and update torch test
* changed back actor
* rearrange a bit before importing distributed test
* add distributed test
* remove scratch code
* auto-linting
* linting 2
* linting 2
* linting 3
* linting 4
* linting 5
* linting 6
* 2.1 2.2
* fix small bugs
* minor updates
* linting again
* auto linting
* linting 2
* final linting
* Update python/ray/util/collective_utils.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update python/ray/util/collective_utils.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update python/ray/util/collective_utils.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* added actor test
* lint
* remove local sh
* address most of richard's comments
* minor update
* remove the actor.option() interface to avoid changes in ray core
* minor updates
Co-authored-by: YLJALDC <dal177@ucsd.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-04 20:57:37 -08:00
Tao Wang
c617291b27
[build] Update description and add some keywords ( #13163 )
2021-01-05 11:34:03 +08:00
Barak Michener
9643e44af6
[ray_client]: Move from experimental to util ( #13176 )
...
Change-Id: I9f054881f0429092d265cd6944d89804cce9d946
2021-01-04 17:51:56 -08:00
Eric Liang
dfb326d4b5
Surface object store spilling statistics in ray memory ( #13124 )
2021-01-04 17:35:39 -08:00
Amog Kamsetty
e181515dff
[SGD] Fix Docstring for as_trainable ( #13173 )
2021-01-04 17:21:24 -08:00
Amog Kamsetty
15e86581bd
[XGBoost] Update Documentation ( #13017 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-04 17:21:04 -08:00
Raed Shabbir
d632b0f0f7
[Serve] Bug in Serve node memory-related resources calculation #11198 ( #13061 )
2021-01-04 11:04:59 -08:00
Clark Zinzow
c2bff64699
[Core] Locality-aware leasing: Milestone 1 - Owned refs, pinned location ( #12817 )
...
* Locality-aware leasing for owned refs (pinned locations).
* LessorPicker --> LeasePolicy.
* Consolidate GetBestNodeIdForTask and GetBestNodeIdForObjects.
* Update comments.
* Turn on locality-aware leasing feature flag by default.
* Move local fallback logic to LeasePolicy, move feature flag check to CoreWorker constructor, add local-only lease policy.
* Add lease policy consulting assertions to the direct task submitter tests.
* Add lease policy tests.
* LocalityLeasePolicy --> LocalityAwareLeasePolicy.
* Add missing const declarations.
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* Add RAY_CHECK for raylet address nullptr when creating lease client.
* Make the fact that LocalLeasePolicy always returns the local node more explicit.
* Flatten GetLocalityData conditionals to make it more readable.
* Add ReferenceCounter::GetLocalityData() unit test.
* Add data-intensive microbenchmarks for single-node perf testing.
* Add data-intensive microbenchmarks for simulated cluster perf testing.
* Remove redundant comment.
* Remove data-intensive benchmarks.
* Add locality-aware leasing Python test.
* Formatting changes in ray_perf.py.
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2021-01-04 09:49:08 -08:00
Dmitri Gekhtman
31453621ef
[kubernetes][docs][minor] Kubernetes version warning ( #13161 )
2021-01-04 10:29:17 -06:00
architkulkarni
a95275bdd9
[Serve] [Doc] Add existing web server integration ServeHandle tutorial ( #13127 )
2021-01-04 10:28:34 -06:00
Simon Mo
fece8db70d
[Serve] Use a small object to track requests ( #13125 )
2020-12-31 11:43:03 -08:00
Ian Rodney
acb082fc47
[serve] Async controller ( #13111 )
2020-12-31 10:51:33 -06:00
Amog Kamsetty
7120f3a6ab
[Tune] Update URL to fix 403 not found error in PBT transformers test case ( #13131 )
2020-12-31 10:45:57 -05:00
chaokunyang
33089c44e2
Fix streaming ci failure ( #12830 )
2020-12-30 10:45:52 +08:00
architkulkarni
032a6546d5
Serve metrics docs ( #13096 )
2020-12-29 14:03:34 -06:00
Ameer Haj Ali
44483f465c
[autoscaler] Make placement groups bypass max launch limit ( #13089 )
2020-12-29 10:06:11 -08:00
Ian Rodney
7ad56826db
[docker] Fix restart behavior with Docker ( #12898 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: ijrsvt <ilr@anyscale.com>
2020-12-28 18:56:28 -08:00
architkulkarni
cc1c2c3dc9
[Serve] Use ServeHandle in HTTP proxy ( #12523 )
2020-12-28 18:33:42 -08:00
Simon Mo
30c22921d9
[Serve] Implement Graceful Shutdown ( #13028 )
2020-12-28 17:53:53 -08:00
Lavanya Shukla
350917958c
[docs] fix wandb url ( #13094 )
2020-12-28 17:19:17 -08:00
Eric Liang
836c5d5a91
Deprecate experimental / dynamic resources ( #13019 )
2020-12-28 11:52:36 -08:00
architkulkarni
9a0218fb89
[Serve] [Doc] Front page update ( #13032 )
2020-12-28 10:19:36 -08:00
Hao Zhang
18f5743416
[Collective][PR 3.5/6] Send/Recv calls and some initial code for communicator caching ( #12935 )
...
* other collectives all work
* auto-linting
* manual linting #1
* manual linting 2
* bugfix
* add send/recv point-to-point calls
* add some initial code for communicator caching
* auto linting
* optimize imports
* minor fix
* fix unpassed tests
* support more dtypes
* rerun some distributed tests for send/recv
* linting
2020-12-28 09:48:07 -08:00
Sumanth Ratna
b11bd22111
[docs] Fix args + kwargs instead of docstrings ( #13068 )
...
* functools wraps
* Fix typo (functoools -> functools)
2020-12-23 19:09:23 -08:00
Edward Oakes
3cc213ddf6
[serve] Centralize HTTP-related logic in HTTPState ( #13020 )
2020-12-23 18:00:02 -06:00
Alex Wu
8df94e33e0
[Autoscaler] New output log format ( #12772 )
2020-12-23 12:02:55 -08:00
Antoni Baum
a4f2dd2138
[Tune] Add integer loguniform support ( #12994 )
...
* Add integer quantization and loguniform support
* Fix hyperopt qloguniform not being np.log'd first
* Add tests, __init__
* Try to fix tests, better exceptions
* Tweak docstrings
* Type checks in SearchSpaceTest
* Update docs
* Lint, tests
* Update doc/source/tune/api_docs/search_space.rst
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2020-12-23 09:27:16 -08:00
Ameer Haj Ali
d37e2c3a20
[joblib] Fix flaky joblib test. ( #13046 )
2020-12-23 10:43:34 -06:00
Barak Michener
c4e273920f
[ray_client]: Insert decorators into the real ray module to allow for client mode ( #13031 )
2020-12-22 22:51:45 -08:00
Simon Mo
bc68260144
[Serve] Handle Bug Fixes ( #12971 )
2020-12-22 19:13:16 -08:00
Edward Oakes
b52cce6632
[serve] Refactor SystemState into EndpointState and BackendState ( #13018 )
2020-12-21 20:39:13 -06:00
Eric Liang
8068041006
Don't release resources during plasma fetch ( #13025 )
2020-12-21 18:32:40 -08:00
Edward Oakes
015a0f9935
[serve] Rename replica_tag -> replica in metrics for consistency ( #13022 )
2020-12-21 17:19:39 -06:00
Eric Liang
03a5b90ed6
Revert "Revert "Increase the number of unique bits for actors to avoi… ( #12990 )
2020-12-21 15:16:42 -08:00
architkulkarni
8b4b4bf0a2
[Serve] Migrate from Flask.Request to Starlette Request ( #12852 )
2020-12-21 15:34:15 -06:00
Hao Zhang
5b48480e29
[Collective][PR 3/6] Other collectives ( #12864 )
2020-12-21 12:48:00 -08:00
Barak Michener
43b9c7811e
[ray_client] add client microbenchmarks ( #13007 )
2020-12-21 12:17:44 -08:00
Ameer Haj Ali
5e2b850836
[autoscaler] Fixes max_workers bug. ( #13008 )
2020-12-21 10:30:03 -08:00
Kai Yang
5a6801dde7
[Core] Remove delete_creating_tasks ( #12962 )
2020-12-22 00:01:27 +08:00
Barak Michener
c576f0b073
[ray_client] Implement a gRPC streaming logs API for the client ( #13001 )
2020-12-20 19:35:34 -08:00
Barak Michener
e715ade2d1
Support retrieval of named actor handles ( #13000 )
...
Change-Id: I05d31c9c67943d2a0230782cbdaa98341584cbc7
2020-12-20 16:34:50 -08:00
Barak Michener
80f6dd16b2
[ray_client] Implement optional arguments to ray.remote() and f.options() ( #12985 )
2020-12-20 15:43:48 -08:00
Ameer Haj Ali
11f34f72d8
[autoscaler] Do not count head node with min_workers constraint. ( #12980 )
2020-12-20 14:54:46 -08:00
Barak Michener
7ab9164f1b
[ray_client] Integrate with test_basic, test_basic_2 and test_actor ( #12964 )
2020-12-20 14:54:18 -08:00