architkulkarni
496dd297e5
skip test_basic_reconstruction_actor_task on win ( #14110 )
2021-02-15 10:17:33 -08:00
architkulkarni
0fb96a61fc
[Serve] Add support for variable routes ( #13968 )
2021-02-15 11:42:42 -06:00
Richard Liaw
4d727e4cdf
[tune] enable more tests ( #13969 )
...
* try-this
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* test
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix-tests
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* address
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* real-ray
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix-client
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix-race-condition
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* revert-new-tune-tests
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* Revert "revert-new-tune-tests"
This reverts commit 3866b920bc47ac4b5cb9dab8f7b9d50e4acdb27a.
* format
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* update
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* build
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-15 09:19:55 -08:00
architkulkarni
bcb51a27c6
[Serve] [Doc] Add version warning ( #14001 )
2021-02-15 11:16:01 -06:00
javi-redondo
b8b2d6410d
[docs] new Ray Cluster documentation ( #13839 )
...
Co-authored-by: Javier Redondo <javier@anyscale.com>
Co-authored-by: AmeerHajAli <ameerh@berkeley.edu>
2021-02-15 00:47:14 -08:00
Kathryn Zhou
82539f2da4
Export additional metrics to Prometheus ( #14061 )
2021-02-14 23:16:26 -08:00
SangBin Cho
b45ae76765
Revert "Unhandled exception handler based on local ref counting ( #14049 )" ( #14099 )
...
This reverts commit 9dc671ae02.
2021-02-14 22:08:32 -08:00
architkulkarni
75568f856c
skip restart and multi restart test on win ( #14084 )
2021-02-14 15:17:54 -08:00
Alex Wu
5636af8084
[hotfix] Fix mac build ( #14075 )
...
* .
* done?
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-02-14 14:26:51 -08:00
Eric Liang
9dc671ae02
Unhandled exception handler based on local ref counting ( #14049 )
2021-02-12 22:58:38 -08:00
Erik Erlandson
ff1b26274e
[operator] expose RAY_CONFIG_DIR env var ( fix #14074 ) ( #14076 )
2021-02-12 17:47:00 -08:00
architkulkarni
20f6cc2cb2
skip test_basic_reconstruction_put on win ( #14082 )
2021-02-12 15:47:00 -08:00
Clark Zinzow
c9a9d422c7
[OBOD] Disable the ownership-based object directory for all tests that use ray.objects(). ( #14065 )
2021-02-12 12:12:57 -08:00
Clark Zinzow
c7ff69f4bf
[OBOD] Add support for ownership-based object directory object recovery. ( #14066 )
2021-02-12 11:58:31 -08:00
Sven Mika
936cb5929c
[RLlib] Issue #13646 : Rewards still not available in loss/json-output in certain situations when using the traj. view API. ( #14036 )
2021-02-12 10:07:44 +01:00
Dmitri Gekhtman
6644a0fe50
[autoscaler][kubernetes][docs] Updated Kubernetes Documentation ( #14016 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-11 23:00:25 -08:00
Alex Wu
02938f3a21
[hotfix] Disable dashboard agent windows ( #14062 )
2021-02-11 17:54:55 -08:00
Amog Kamsetty
24e020b062
[Doc] Add PTL and RAG to community integrations ( #14064 )
2021-02-11 15:48:19 -08:00
Amog Kamsetty
a430ac2334
[Tune] Revert Pinning Tune Dependencies ( #14059 )
...
* remove lockfiles
* docker
* remove constraint file
* fix
2021-02-11 15:43:09 -08:00
Jeroen Boeye
2af1f0616d
Fix broken link to Flow docs ( #14058 )
2021-02-11 13:20:34 -08:00
SangBin Cho
cb8523a5e6
Fix the wrong spark on ray link. ( #14057 )
2021-02-11 12:31:18 -08:00
Clark Zinzow
cd7e567a57
[Core] Ownership-based Object Directory - Added support for object spilling in the ownership-based object directory. ( #13948 )
...
* Add support for object spilling in the ownership-based object directory.
* Move owner address hashmap into pinned_objects_ and objects_pending_spill_.
* Update local object manager tests.
* Feedback and misc. fixes.
* Move spilled unpin callback lambda to std::binded private method.
* Skip test_delete_objects_multi_node test on MacOS for now.
2021-02-11 10:36:22 -08:00
Sven Mika
4db86404ad
[RLlib] Issue #13507 : Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. ( #14037 )
2021-02-11 18:58:46 +01:00
Sven Mika
a2f7998026
[RLlib] Issue #13342 : Add validate_spaces to MB-MPO. ( #14038 )
2021-02-11 11:36:53 +01:00
Ian Rodney
f6cfc44dbd
[autoscaler] run setup commands with restart_only=True ( #13836 )
2021-02-10 20:17:20 -08:00
Ameer Haj Ali
d87a82e891
Revert "Revert "[Autoscaler] Monitor refactor for backward compatability. ( #13970 )" ( #14046 )" ( #14050 )
...
* prepare for head node
* move command runner interface outside _private
* remove space
* Eric
* flake
* min_workers in multi node type
* fixing edge cases
* eric not idle
* fix target_workers to consider min_workers of node types
* idle timeout
* minor
* minor fix
* test
* lint
* eric v2
* eric 3
* min_workers constraint before bin packing
* Update resource_demand_scheduler.py
* Revert "Update resource_demand_scheduler.py"
This reverts commit 818a63a2c86d8437b3ef21c5035d701c1d1127b5.
* reducing diff
* make get_nodes_to_launch return a dict
* merge
* weird merge fix
* auto fill instance types for AWS
* Alex/Eric
* Update doc/source/cluster/autoscaling.rst
* merge autofill and input from user
* logger.exception
* make the yaml use the default autofill
* docs Eric
* remove test_autoscaler_yaml from windows tests
* lets try changing the test a bit
* return test
* lets see
* edward
* Limit max launch concurrency
* commenting frac TODO
* move to resource demand scheduler
* use STATUS UP TO DATE
* Eric
* make logger of gc freed refs debug instead of info
* add cluster name to docker mount prefix directory
* grrR
* fix tests
* moving docker directory to sdk
* move the import to prevent circular dependency
* smallf fix
* ian
* fix max launch concurrency bug to assume failing nodes as pending and consider only load_metric's connected nodes as running
* small fix
* Revert "Revert "[Autoscaler] Monitor refactor for backward compatability. (#13970 )" (#14046 )"
This reverts commit 6f9d39fb3e.
* fake news
Co-authored-by: Ameer Haj Ali <ameerhajali@ameers-mbp.lan>
Co-authored-by: Alex Wu <alex@anyscale.io>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2021-02-10 17:59:08 -08:00
Clark Zinzow
c5574a33e4
[dask-on-ray] Add better Dask-on-Ray example, and detail custom shuffle optimization. ( #13950 )
...
* Add better Dask-on-Ray example, and detail custom shuffle optimization.
* Misc. updates and feedback.
* Update doc/source/dask-on-ray.rst
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
* Set max_branch to infinity in shuffle optimization example.
* Feedback
* Apply suggestions from code review
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* 80 col width
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-10 14:24:09 -08:00
Crissman Loomis
05ab75fbe1
[docs] Add mode to Ray Tune quick start ( #14023 )
2021-02-10 12:41:45 -08:00
Thomas J. Fan
75fbd48edd
[doc] Minor fix to indentation ( #14040 )
2021-02-10 12:31:47 -08:00
Stephanie Wang
fc89984162
Subtract from num bytes in use ( #13944 )
2021-02-10 12:22:08 -08:00
architkulkarni
6f9d39fb3e
Revert "[Autoscaler] Monitor refactor for backward compatability. ( #13970 )" ( #14046 )
...
This reverts commit 7a6f8054d1.
2021-02-10 12:16:52 -08:00
Alex Wu
68e985ddcd
[hotfix][docs] RayDP tensorflow != pytorch ( #14044 )
2021-02-10 11:23:02 -08:00
Kai Fricke
1ef2a6790c
[tune] add scalability release tests ( #13986 )
...
* Add scalability tests
* Network overhead cluster
* Update xgboost tests
* Document release tests
* Don't raise on failed trial
* Update to multi node yamls
* Update yamls
* Revert xgboost test changes
* Fix import
* Update release/tune_tests/scalability_tests/workloads/test_bookkeeping_overhead.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Pass aws credentials (WIP)
* Update durable trainable example
* Update xgboost sweep
* Change xgboost scope, fix durable trainable stop condition
* Fix max depth to limit total test length
* Add cluster information to test descriptions. Update release checklist/process docs
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-10 17:16:31 +01:00
Sven Mika
81e7434091
[RLlib] TFPolicy.export_model: Add timestep placeholder to model's signature, if needed. ( #13988 )
2021-02-10 15:21:46 +01:00
Sven Mika
37c7daa3c0
[RLlib] DDPG: Support simplex action space. ( #14011 )
2021-02-10 15:10:01 +01:00
fangfengbin
1754359281
[Core]Fix ray.kill doesn't cancel pending actor bug ( #14025 )
2021-02-10 15:30:21 +08:00
Alex Wu
ce80ef5aee
[Docs] RayDP Documentation ( #14018 )
...
* .
* done?
* Docs
* Docs
* Update raydp.rst
* Update raydp.rst
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-02-09 23:05:18 -08:00
Dmitri Gekhtman
8ca0a32819
HotFix k8s autoscaling ( #14024 )
2021-02-09 22:34:24 -08:00
Eric Liang
8b7cf7cab9
Add tip on how to disable Ray OOM handler ( #14017 )
2021-02-09 21:52:22 -08:00
Ameer Haj Ali
7a6f8054d1
[Autoscaler] Monitor refactor for backward compatability. ( #13970 )
2021-02-09 21:41:50 -08:00
Eric Liang
7f342eb371
Update example shuffle script ( #14021 )
2021-02-09 20:47:41 -08:00
Clark Zinzow
79c7c181f3
[dask-on-ray] Add multiple return DataFrame shuffle optimization. ( #13951 )
2021-02-09 15:39:48 -08:00
Kai Yang
e0b81796c5
Revert "Revert "[Java] fix test hang occasionally when running FailureTest ( #13934 )" ( #13992 )" ( #14008 )
2021-02-09 12:43:26 -08:00
Simon Mo
f51c26bae6
Revert "[Core]Fix ray.kill doesn't cancel pending actor bug ( #13254 )" ( #14013 )
...
This reverts commit 2092b097ea.
2021-02-09 11:36:38 -08:00
Alex Wu
1dcdfe9101
[autoscaler/dashboard] Publish resource usage in units of bytes ( #14002 )
2021-02-09 10:27:26 -08:00
Crissman Loomis
43083b9653
[docs] optuna variable typo ( #14006 )
...
* fix variable name typo
* align
2021-02-09 09:51:29 -08:00
Kai Fricke
3c8b164882
[tune] pass trainable function name when using tune.with_parameters ( #14009 )
2021-02-09 08:51:14 -08:00
Sven Mika
d7301a51f4
[RLlib]: Trajectory View API: Keep env infos (e.g. for postprocessing callbacks), no matter what. ( #13555 )
2021-02-09 17:05:26 +01:00
fangfengbin
2092b097ea
[Core]Fix ray.kill doesn't cancel pending actor bug ( #13254 )
2021-02-09 10:59:14 +08:00
Simon Mo
914696ac3f
Skip placement tests on Windows ( #14000 )
2021-02-08 18:27:11 -08:00