Kai Fricke
a3dc92ead6
[tune] fix specifying nested metrics in progress reporter ( #14189 )
2021-02-18 22:26:03 +01:00
Barak Michener
50ccd41cbf
fix and test the errors, limited to pickling ( #14174 )
...
Change-Id: I95c4715c0f54b1d5909aeb8eb96403db22aa0f07
2021-02-18 11:13:15 -08:00
SangBin Cho
3ad05337f7
[Shuffle] Use progress bar for experimental.shuffle ( #14179 )
...
* done.
* Add time.
2021-02-18 11:05:54 -08:00
architkulkarni
6d88036340
[ray_client]: Skip flaky test_cancel_chain on Windows ( #14167 )
...
* skip test_cancel_chain on windows
* lint
* lint
2021-02-18 10:43:15 -08:00
SangBin Cho
66f93a3d63
Revert "Fix OSX error and re-merge unhandled exceptions handling ( #14138 )" ( #14180 )
...
This reverts commit ee584e8328.
2021-02-18 10:35:38 -08:00
Qing Wang
b579186791
Fix reset load_code_from_local in 2nd session. ( #13985 )
...
Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
2021-02-18 13:52:36 +08:00
Siyuan (Ryans) Zhuang
af8c0c1add
fix numpy ufunc serialization failures ( #14143 )
2021-02-17 21:28:21 -08:00
dependabot[bot]
323c7da70c
[tune](deps): Bump matplotlib from 3.3.3 to 3.3.4 in /python/requirements ( #14087 )
...
Bumps [matplotlib](https://github.com/matplotlib/matplotlib ) from 3.3.3 to 3.3.4.
- [Release notes](https://github.com/matplotlib/matplotlib/releases )
- [Commits](https://github.com/matplotlib/matplotlib/compare/v3.3.3...v3.3.4 )
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-02-17 19:31:07 -08:00
Amog Kamsetty
be7114639d
[Tune] Update Transformers Example ( #14150 )
...
Co-authored-by: Ubuntu <ubuntu@ip-172-31-6-151.us-west-2.compute.internal>
2021-02-17 18:37:27 -08:00
EscapeReality846089495
5ce1d262a3
[tune] Fixed atomic_save w/ os.replace ( #14089 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-17 15:48:39 -08:00
Antoni Baum
58d7398246
[Tune] Add HEBOSearch Searcher ( #13863 )
...
* HEBO first pass
* Fix bad quotes
* Fixes
* Reproductibility
* Update python/ray/tune/suggest/hebo.py
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
* Add hebo_example.py to BUILD
* Nit
* Update to pypi package
* Alphabetical HEBO requirement
* Fix syntax error
* Fix wrong space in hebo example
* Move validate_warmstart to utils
* Space assertion in HEBO
* Comment
* Apply suggestions from code review
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
* Formatting
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-02-17 22:53:10 +01:00
Eric Liang
ee584e8328
Fix OSX error and re-merge unhandled exceptions handling ( #14138 )
2021-02-17 13:35:07 -08:00
dependabot[bot]
67bdccca41
[tune](deps): Bump smart-open from 4.0.1 to 4.2.0 in /python/requirements ( #14158 )
...
Bumps [smart-open](https://github.com/piskvorky/smart_open ) from 4.0.1 to 4.2.0.
- [Release notes](https://github.com/piskvorky/smart_open/releases )
- [Changelog](https://github.com/RaRe-Technologies/smart_open/blob/develop/CHANGELOG.md )
- [Commits](https://github.com/piskvorky/smart_open/compare/4.0.1...v4.2.0 )
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-02-17 12:03:22 -08:00
architkulkarni
d9124e9329
Revert "[Core]Fix ray.kill doesn't cancel pending actor bug ( #14025 )" ( #14146 )
...
This reverts commit 1754359281.
2021-02-16 17:22:25 -08:00
SangBin Cho
1b1420e069
[Scheduler] Fix spillback is done deterministically. ( #14096 )
...
* update.
* Fix comments.
* Addressed code review.
* fix a test.
* Addressed last code review.
* d.
* done.
2021-02-16 16:46:16 -08:00
SangBin Cho
4d7ab3c886
[Doc] Ray logging document. ( #14102 )
...
* Initial draft done.
* Addressed code review.
2021-02-16 15:27:30 -08:00
Barak Michener
edf24580a6
[ray_client]: Set gRPC max message size to 4GiB ( #14063 )
...
* [ray_client]: Set gRPC max message size to 4GiB
Change-Id: Id4d6887cdd90dd761dd25248f10f104701462667
* reduce size
Change-Id: I71625ed3cffd9d8b3d7d3d7a981bb4dda00ed0a1
* Update test_basic_2.py
* Update test_advanced.py
* Update test_basic.py
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-02-16 14:32:23 -08:00
architkulkarni
3ce03a52bc
Revert "Revert "Revert "Unhandled exception handler based on local ref counti… ( #14113 )" ( #14136 )
...
This reverts commit e457872fe1.
2021-02-16 11:47:09 -08:00
SangBin Cho
b05f87d7b2
[Object Spilling] Share the same S3 session for smart_open spilling. ( #13904 )
2021-02-16 10:40:55 -08:00
Barak Michener
c43a64230e
[ray_client]: Fix mutual recursion ( #14122 )
2021-02-16 10:37:58 -08:00
SangBin Cho
684bb32cdf
Fix assert get_outer_ref None failed + Support better traceback. ( #14126 )
...
* in progress.
* Better exception handling & stacktrace.
* done.
2021-02-16 10:09:01 -08:00
Richard Liaw
864956f817
fix-skopt ( #14116 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-16 14:36:19 +01:00
Eric Liang
e434ffe06c
[tune] Avoid crash in client mode when return results creating logdir ( #14115 )
2021-02-15 19:25:14 -08:00
Ian Rodney
350fb5b9d1
[autoscaler] Remove Hardcoded 8265 ( #14112 )
2021-02-15 18:04:00 -08:00
Patrick Ames
da0c2c99a0
[autoscaler] Fix bad reference error when specifying IamInstanceProfile by name in config. ( #14083 )
2021-02-15 16:29:36 -08:00
Jack Parker-Holder
ebb6e552d2
[tune] PB2 - add small constant ( #14118 )
2021-02-15 16:04:10 -08:00
Edward Oakes
5e763893ea
[serve] Don't overwrite self.handle in StarletteEndpoint ( #14111 )
2021-02-15 17:51:54 -06:00
SangBin Cho
4ad79ca963
[Object Spilling] Remove LRU eviction ( #13977 )
...
* done.
* formatting.
* done.
* done.
2021-02-15 14:24:53 -08:00
Eric Liang
e457872fe1
Revert "Revert "Unhandled exception handler based on local ref counti… ( #14113 )
...
* Revert "Revert "Unhandled exception handler based on local ref counting (#14049 )" (#14099 )"
This reverts commit b45ae76765.
* reomve test
* fix
* fix
2021-02-15 14:11:11 -08:00
architkulkarni
496dd297e5
skip test_basic_reconstruction_actor_task on win ( #14110 )
2021-02-15 10:17:33 -08:00
architkulkarni
0fb96a61fc
[Serve] Add support for variable routes ( #13968 )
2021-02-15 11:42:42 -06:00
Richard Liaw
4d727e4cdf
[tune] enable more tests ( #13969 )
...
* try-this
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* test
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix-tests
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* address
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* real-ray
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix-client
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix-race-condition
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* revert-new-tune-tests
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* Revert "revert-new-tune-tests"
This reverts commit 3866b920bc47ac4b5cb9dab8f7b9d50e4acdb27a.
* format
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* update
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* build
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-15 09:19:55 -08:00
SangBin Cho
b45ae76765
Revert "Unhandled exception handler based on local ref counting ( #14049 )" ( #14099 )
...
This reverts commit 9dc671ae02.
2021-02-14 22:08:32 -08:00
architkulkarni
75568f856c
skip restart and multi restart test on win ( #14084 )
2021-02-14 15:17:54 -08:00
Eric Liang
9dc671ae02
Unhandled exception handler based on local ref counting ( #14049 )
2021-02-12 22:58:38 -08:00
Erik Erlandson
ff1b26274e
[operator] expose RAY_CONFIG_DIR env var ( fix #14074 ) ( #14076 )
2021-02-12 17:47:00 -08:00
architkulkarni
20f6cc2cb2
skip test_basic_reconstruction_put on win ( #14082 )
2021-02-12 15:47:00 -08:00
Clark Zinzow
c9a9d422c7
[OBOD] Disable the ownership-based object directory for all tests that use ray.objects(). ( #14065 )
2021-02-12 12:12:57 -08:00
Amog Kamsetty
a430ac2334
[Tune] Revert Pinning Tune Dependencies ( #14059 )
...
* remove lockfiles
* docker
* remove constraint file
* fix
2021-02-11 15:43:09 -08:00
Clark Zinzow
cd7e567a57
[Core] Ownership-based Object Directory - Added support for object spilling in the ownership-based object directory. ( #13948 )
...
* Add support for object spilling in the ownership-based object directory.
* Move owner address hashmap into pinned_objects_ and objects_pending_spill_.
* Update local object manager tests.
* Feedback and misc. fixes.
* Move spilled unpin callback lambda to std::binded private method.
* Skip test_delete_objects_multi_node test on MacOS for now.
2021-02-11 10:36:22 -08:00
Ian Rodney
f6cfc44dbd
[autoscaler] run setup commands with restart_only=True ( #13836 )
2021-02-10 20:17:20 -08:00
Ameer Haj Ali
d87a82e891
Revert "Revert "[Autoscaler] Monitor refactor for backward compatability. ( #13970 )" ( #14046 )" ( #14050 )
...
* prepare for head node
* move command runner interface outside _private
* remove space
* Eric
* flake
* min_workers in multi node type
* fixing edge cases
* eric not idle
* fix target_workers to consider min_workers of node types
* idle timeout
* minor
* minor fix
* test
* lint
* eric v2
* eric 3
* min_workers constraint before bin packing
* Update resource_demand_scheduler.py
* Revert "Update resource_demand_scheduler.py"
This reverts commit 818a63a2c86d8437b3ef21c5035d701c1d1127b5.
* reducing diff
* make get_nodes_to_launch return a dict
* merge
* weird merge fix
* auto fill instance types for AWS
* Alex/Eric
* Update doc/source/cluster/autoscaling.rst
* merge autofill and input from user
* logger.exception
* make the yaml use the default autofill
* docs Eric
* remove test_autoscaler_yaml from windows tests
* lets try changing the test a bit
* return test
* lets see
* edward
* Limit max launch concurrency
* commenting frac TODO
* move to resource demand scheduler
* use STATUS UP TO DATE
* Eric
* make logger of gc freed refs debug instead of info
* add cluster name to docker mount prefix directory
* grrR
* fix tests
* moving docker directory to sdk
* move the import to prevent circular dependency
* smallf fix
* ian
* fix max launch concurrency bug to assume failing nodes as pending and consider only load_metric's connected nodes as running
* small fix
* Revert "Revert "[Autoscaler] Monitor refactor for backward compatability. (#13970 )" (#14046 )"
This reverts commit 6f9d39fb3e.
* fake news
Co-authored-by: Ameer Haj Ali <ameerhajali@ameers-mbp.lan>
Co-authored-by: Alex Wu <alex@anyscale.io>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2021-02-10 17:59:08 -08:00
architkulkarni
6f9d39fb3e
Revert "[Autoscaler] Monitor refactor for backward compatability. ( #13970 )" ( #14046 )
...
This reverts commit 7a6f8054d1.
2021-02-10 12:16:52 -08:00
fangfengbin
1754359281
[Core]Fix ray.kill doesn't cancel pending actor bug ( #14025 )
2021-02-10 15:30:21 +08:00
Dmitri Gekhtman
8ca0a32819
HotFix k8s autoscaling ( #14024 )
2021-02-09 22:34:24 -08:00
Eric Liang
8b7cf7cab9
Add tip on how to disable Ray OOM handler ( #14017 )
2021-02-09 21:52:22 -08:00
Ameer Haj Ali
7a6f8054d1
[Autoscaler] Monitor refactor for backward compatability. ( #13970 )
2021-02-09 21:41:50 -08:00
Eric Liang
7f342eb371
Update example shuffle script ( #14021 )
2021-02-09 20:47:41 -08:00
Clark Zinzow
79c7c181f3
[dask-on-ray] Add multiple return DataFrame shuffle optimization. ( #13951 )
2021-02-09 15:39:48 -08:00
Simon Mo
f51c26bae6
Revert "[Core]Fix ray.kill doesn't cancel pending actor bug ( #13254 )" ( #14013 )
...
This reverts commit 2092b097ea.
2021-02-09 11:36:38 -08:00