Clark Zinzow
c2bff64699
[Core] Locality-aware leasing: Milestone 1 - Owned refs, pinned location ( #12817 )
...
* Locality-aware leasing for owned refs (pinned locations).
* LessorPicker --> LeasePolicy.
* Consolidate GetBestNodeIdForTask and GetBestNodeIdForObjects.
* Update comments.
* Turn on locality-aware leasing feature flag by default.
* Move local fallback logic to LeasePolicy, move feature flag check to CoreWorker constructor, add local-only lease policy.
* Add lease policy consulting assertions to the direct task submitter tests.
* Add lease policy tests.
* LocalityLeasePolicy --> LocalityAwareLeasePolicy.
* Add missing const declarations.
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* Add RAY_CHECK for raylet address nullptr when creating lease client.
* Make the fact that LocalLeasePolicy always returns the local node more explicit.
* Flatten GetLocalityData conditionals to make it more readable.
* Add ReferenceCounter::GetLocalityData() unit test.
* Add data-intensive microbenchmarks for single-node perf testing.
* Add data-intensive microbenchmarks for simulated cluster perf testing.
* Remove redundant comment.
* Remove data-intensive benchmarks.
* Add locality-aware leasing Python test.
* Formatting changes in ray_perf.py.
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2021-01-04 09:49:08 -08:00
Dmitri Gekhtman
31453621ef
[kubernetes][docs][minor] Kubernetes version warning ( #13161 )
2021-01-04 10:29:17 -06:00
architkulkarni
a95275bdd9
[Serve] [Doc] Add existing web server integration ServeHandle tutorial ( #13127 )
2021-01-04 10:28:34 -06:00
Ameer Haj Ali
61c3b6d3bf
[docs] Small fix in C++ documentation. ( #13154 )
...
* prepare for head node
* move command runner interface outside _private
* remove space
* Eric
* flake
* min_workers in multi node type
* fixing edge cases
* eric not idle
* fix target_workers to consider min_workers of node types
* idle timeout
* minor
* minor fix
* test
* lint
* eric v2
* eric 3
* min_workers constraint before bin packing
* Update resource_demand_scheduler.py
* Revert "Update resource_demand_scheduler.py"
This reverts commit 818a63a2c86d8437b3ef21c5035d701c1d1127b5.
* reducing diff
* make get_nodes_to_launch return a dict
* merge
* weird merge fix
* auto fill instance types for AWS
* Alex/Eric
* Update doc/source/cluster/autoscaling.rst
* merge autofill and input from user
* logger.exception
* make the yaml use the default autofill
* docs Eric
* remove test_autoscaler_yaml from windows tests
* lets try changing the test a bit
* return test
* lets see
* edward
* Limit max launch concurrency
* commenting frac TODO
* move to resource demand scheduler
* use STATUS UP TO DATE
* Eric
* make logger of gc freed refs debug instead of info
* add cluster name to docker mount prefix directory
* grrR
* fix tests
* moving docker directory to sdk
* move the import to prevent circular dependency
* small fix
* ian
* fix max launch concurrency bug to assume failing nodes as pending and consider only load_metric's connected nodes as running
* small fix
* deflake test_joblib
* lint
* placement groups bypass
* remove space
* Eric
* first commit
* lint
* example
* documentation
* hmm
* file path fix
* fix test
* some format issue in docs
* modified docs
Co-authored-by: Ameer Haj Ali <ameerhajali@ameers-mbp.lan>
Co-authored-by: Alex Wu <alex@anyscale.io>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
Co-authored-by: root <root@ip-172-31-56-188.us-west-2.compute.internal>
2021-01-02 11:47:06 -08:00
fangfengbin
456d08ad40
Deprecate setResource java api ( #13117 )
2021-01-02 12:17:45 +08:00
Ameer Haj Ali
27cbac576d
[docs] Minor change to formatting C++ docs. ( #13151 )
2021-01-01 19:43:59 -08:00
Qing Wang
d3dd5b87ce
[Java] Support wasCurrentActorRestarted in actor task. ( #13120 )
...
* Remove check.
* Add test
* fix lint
* lint
* Fix spotless lint
* Address comments.
* Fix lint
Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
2021-01-02 11:31:08 +08:00
Ameer Haj Ali
710615c228
[docs] Documentation + example for the C++ language API ( #13138 )
2021-01-01 18:18:41 -08:00
Sven Mika
9eba1871bb
[RLlib] Support easy use_attention=True flag for using the GTrXL model. ( #11698 )
2021-01-01 14:06:23 -05:00
Dmitri Gekhtman
4ca64549e2
[docs][kubernetes][minor] Update K8s examples in docs ( #13129 )
2020-12-31 16:25:38 -06:00
Simon Mo
fece8db70d
[Serve] Use a small object to track requests ( #13125 )
2020-12-31 11:43:03 -08:00
Edward Oakes
ef6d859e9b
[dashboard] Fix RAY_RAYLET_PID KeyError on Windows ( #12948 )
2020-12-31 10:54:40 -06:00
Ian Rodney
acb082fc47
[serve] Async controller ( #13111 )
2020-12-31 10:51:33 -06:00
Amog Kamsetty
7120f3a6ab
[Tune] Update URL to fix 403 not found error in PBT transformers test case ( #13131 )
2020-12-31 10:45:57 -05:00
Qing Wang
f5412c0417
[Java] Avoid failure of serializing a user-defined unserializable exception. ( #13119 )
2020-12-31 19:47:35 +08:00
Sven Mika
8726521604
[RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). ( #13091 )
2020-12-30 22:30:52 -05:00
fyrestone
6a54897577
Job module without submission ( #13081 )
...
Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-12-31 11:12:17 +08:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. ( #12718 )
2020-12-30 17:32:21 -08:00
Sven Mika
28ac4243f4
[RLlib] Deflake test case: 2-step game MADDPG. ( #13121 )
2020-12-30 18:37:37 -05:00
Max Fitton
25f7bdc0d8
[Bugfix][Dashboard] Fix undefined logCount, errorCount UI crash ( #13113 )
2020-12-30 14:19:56 -06:00
Michael Luo
42cd414e5b
[RLlib] New Offline RL Algorithm: CQL (based on SAC) ( #13118 )
2020-12-30 10:11:57 -05:00
chaokunyang
33089c44e2
Fix streaming ci failure ( #12830 )
2020-12-30 10:45:52 +08:00
Sumanth Ratna
59e9b80903
[Doc] Fix Sphinx.add_stylesheet deprecation ( #13067 )
2020-12-29 16:35:40 -08:00
Michael Luo
eae7a1f433
[RLlib] Readme.md Documentation for Almost All Algorithms in rllib/agents ( #13035 )
2020-12-29 18:45:55 -05:00
Sven Mika
d811d65920
[RLlib] run_regression_tests.py: --framework flag (instead of --torch). ( #13097 )
2020-12-29 15:27:59 -05:00
architkulkarni
032a6546d5
Serve metrics docs ( #13096 )
2020-12-29 14:03:34 -06:00
Ameer Haj Ali
44483f465c
[autoscaler] Make placement groups bypass max launch limit ( #13089 )
2020-12-29 10:06:11 -08:00
Eric Liang
5a4e50c9d9
Disable broken streaming tests ( #13095 )
2020-12-29 00:58:18 -08:00
Ian Rodney
7ad56826db
[docker] Fix restart behavior with Docker ( #12898 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: ijrsvt <ilr@anyscale.com>
2020-12-28 18:56:28 -08:00
chaokunyang
d1dd3410c8
[Java] Format ray java code ( #13056 )
2020-12-29 10:36:16 +08:00
architkulkarni
cc1c2c3dc9
[Serve] Use ServeHandle in HTTP proxy ( #12523 )
2020-12-28 18:33:42 -08:00
Simon Mo
30c22921d9
[Serve] Implement Graceful Shutdown ( #13028 )
2020-12-28 17:53:53 -08:00
Lavanya Shukla
350917958c
[docs] fix wandb url ( #13094 )
2020-12-28 17:19:17 -08:00
Eric Liang
836c5d5a91
Deprecate experimental / dynamic resources ( #13019 )
2020-12-28 11:52:36 -08:00
architkulkarni
9a0218fb89
[Serve] [Doc] Front page update ( #13032 )
2020-12-28 10:19:36 -08:00
Hao Zhang
18f5743416
[Collective][PR 3.5/6] Send/Recv calls and some initial code for communicator caching ( #12935 )
...
* other collectives all work
* auto-linting
* manual linting #1
* manual linting 2
* bugfix
* add send/recv point-to-point calls
* add some initial code for communicator caching
* auto linting
* optimize imports
* minor fix
* fix unpassed tests
* support more dtypes
* rerun some distributed tests for send/recv
* linting
2020-12-28 09:48:07 -08:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. ( #13064 )
2020-12-27 09:46:03 -05:00
Sven Mika
a5318961de
[RLlib] Preprocessor fixes (multi-discrete) and tests. ( #13083 )
2020-12-26 20:14:36 -05:00
Sven Mika
99ae7bae05
[RLlib] JAXPolicy prep. PR #1 . ( #13077 )
2020-12-26 20:14:18 -05:00
fangfengbin
25f9f0d781
[GCS] Move resource usage info to gcs resource manager ( #13059 )
2020-12-25 15:17:45 +08:00
Siyuan (Ryans) Zhuang
cf9952a028
[Core] Remove outdated external store ( #13080 )
...
* remove outdated external store
2020-12-24 17:30:06 -08:00
Siyuan (Ryans) Zhuang
bf7f6a7de3
[Core] Remove cuda support in plasma store ( #13070 )
...
* remove cuda support in plasma store
2020-12-24 13:24:56 -08:00
Alind Khare
2059a2090d
[C++ API] Added reference counting to ObjectRef ( #13058 )
...
* Added reference counting to ObjectRef
* Addressed the comments
2020-12-24 09:32:52 -08:00
Michael Luo
4bcd475671
[RLlib] Improved Documentation for PPO, DDPG, and SAC ( #12943 )
2020-12-24 09:31:35 -05:00
Michael Luo
a2d1215200
[RLlib] Execution Annotation ( #13036 )
2020-12-24 09:30:33 -05:00
ZhuSenlin
85f1716a1f
speed up local mode object store get ( #13052 )
...
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
2020-12-24 14:59:14 +08:00
Max Fitton
81bfee79bc
Fix OS X Wheel Build - Update brew cask install ( #13062 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-23 20:51:50 -08:00
Sumanth Ratna
b11bd22111
[docs] Fix args + kwargs instead of docstrings ( #13068 )
...
* functools wraps
* Fix typo (functoools -> functools)
2020-12-23 19:09:23 -08:00
Stephanie Wang
4461f9980a
Refactor TaskDependencyManager, allow passing bundles of objects to ObjectManager ( #13006 )
...
* New dependency manager
* Switch raylet to new DependencyManager
* PullManager accepts bundles
* Cleanup, remove old task dependency manager
* x
* PullManager unit tests
* lint
* Unit tests
* Rename
* lint
* test
* Update src/ray/raylet/dependency_manager.cc
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* Update src/ray/raylet/dependency_manager.cc
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* x
* lint
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2020-12-23 18:36:00 -08:00
Edward Oakes
3cc213ddf6
[serve] Centralize HTTP-related logic in HTTPState ( #13020 )
2020-12-23 18:00:02 -06:00