Gekho457
|
bff50cfc37
|
[k8s] Read gpu resources properly (#12942)
* Read gpu resources properly
* Comments and docstrings
* Comment formatting
|
2020-12-18 01:32:12 -08:00 |
|
Kai Fricke
|
426f8a8d15
|
[tune] Fix tutorial training on GPU (#12914)
|
2020-12-18 01:31:40 -08:00 |
|
DK.Pino
|
6404f1e609
|
[Placement Group][New scheduler] New scheduler pg implementation (#12910)
|
2020-12-18 11:56:45 +08:00 |
|
Farzan Taj
|
53378170e0
|
[tune] Change pickle to ray.cloudpickle -- support large models (#12958)
* Change pickle to ray.cloudpickle
* Change pickle import to ray.cloudpickle
|
2020-12-17 19:17:08 -08:00 |
|
Kai Fricke
|
3d72000826
|
[tune] Add points_to_evaluate to BasicVariantGenerator (#12916)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-17 19:16:03 -08:00 |
|
Edward Oakes
|
c7a59b239f
|
Remove unused endpoints_to_remove (#12946)
|
2020-12-17 15:04:11 -06:00 |
|
Gekho457
|
82f9c7014e
|
[K8s] Retry getting home directory in command runner. (#12925)
|
2020-12-17 09:41:48 -08:00 |
|
Yi Cheng
|
40032541dc
|
[core] Introduce fetch_local to ray.wait (#12526)
|
2020-12-16 23:44:28 -08:00 |
|
SangBin Cho
|
057687e534
|
[New Scheduler] Fix test_failure.py by supporting infeasible tasks (#12738)
* Fix the first issue.
* ip
* In Progress.
* In progress.
* done.
* Remove unnecessary logs.
* Addressed code review + fix some test failures.
* Try fixing issues.
* Fix issues.
* Fix test issues.
* Fix issues.
* done.
|
2020-12-16 21:27:50 -08:00 |
|
Philipp Moritz
|
ad036fd564
|
Fix continue for debugger (#12862)
|
2020-12-16 16:09:13 -08:00 |
|
Amog Kamsetty
|
dd522a71a1
|
[SGD] Disable Elastic Training by default when using with Tune (#12927)
|
2020-12-16 15:37:44 -08:00 |
|
Alex Wu
|
8b783ecafa
|
Fix pull manager retry (#12907)
|
2020-12-16 14:18:43 -08:00 |
|
Ameer Haj Ali
|
c677b9e201
|
[autoscaler] Fix flaky autoscaler test (#12918)
|
2020-12-16 14:18:27 -08:00 |
|
Edward Oakes
|
fdb4c6eb1c
|
Better message for too little /dev/shm memory (#12896)
|
2020-12-16 10:30:20 -06:00 |
|
fangfengbin
|
91878d18b5
|
[PlacementGroup]Fix placement group wait api disorder bug (#12827)
* [PlacementGroup]Fix placement group wait api disorder bug
* fix review comment
* fix review comment
* fix review comment
* fix review comments
* increase num_heartbeats_timeout
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
|
2020-12-16 18:45:53 +08:00 |
|
Richard Liaw
|
a7caa14d3d
|
[k8s] avoid bad error messages (#12871)
|
2020-12-15 15:00:02 -08:00 |
|
Edward Oakes
|
f4b5a8b2f7
|
[serve] Re-enable test_failure.py (#12891)
|
2020-12-15 16:02:04 -06:00 |
|
Richard Liaw
|
87cf1a97e5
|
[core] recover startup logs (#12876)
|
2020-12-15 13:49:45 -08:00 |
|
Edward Oakes
|
6795d7c75c
|
[serve] Fix flaky test_api.py::test_backend_user_config (#12892)
|
2020-12-15 15:35:30 -06:00 |
|
Kai Fricke
|
ea1228074d
|
[tune] enable points_to_eval for all search algorithms (#12790)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-15 11:51:53 -08:00 |
|
Simon Mo
|
fdd85e3af4
|
[Serve] Add benchmark for async handles (#12858)
|
2020-12-15 11:21:51 -08:00 |
|
Alex Wu
|
0031723ace
|
[New scheduler] Object spilling (#12857)
|
2020-12-15 11:05:38 -08:00 |
|
architkulkarni
|
ba12fb1451
|
Fix for RLIMIT patch (#12882)
Implement new soft limit introduced by https://github.com/ray-project/ray/pull/12853.
|
2020-12-15 10:38:46 -08:00 |
|
Max Fitton
|
e077bc4206
|
[Release] Bump master to 1.2.0 for 1.1.0 release (#12856)
|
2020-12-15 09:40:26 -08:00 |
|
Simon Mo
|
b291dd4486
|
[Metrics] Call GetMeasureDoubleByName to prevent override (#12860)
|
2020-12-15 09:39:39 -08:00 |
|
Gekho457
|
5a142d5bd6
|
Use nightly images in all kubernetes examples. (#12868)
|
2020-12-14 20:49:41 -08:00 |
|
Simon Mo
|
b56db5a22f
|
[Serve] Wait for actor name to be cleaned up (#12215)
|
2020-12-14 15:09:43 -08:00 |
|
architkulkarni
|
231518e86f
|
[Serve] Support basic Starlette response types (#12811)
|
2020-12-14 17:03:56 -06:00 |
|
Eric Liang
|
1eb4ac12b1
|
Clip RLIMIT_NOFILE increase to avoid redis failing to start on Big Sur
|
2020-12-14 14:05:19 -08:00 |
|
SangBin Cho
|
69b0bc2132
|
[Logging] Use file handle temporarily (#12839)
|
2020-12-14 11:42:44 -08:00 |
|
Gekho457
|
11ce1dc743
|
Ray cluster CRD and example CR + multi-ray-cluster operator (#12098)
|
2020-12-14 10:26:01 -06:00 |
|
Tao Wang
|
35f7d84dbe
|
Revert heartbeat interval to keep ci stable (#12836)
* Revert heartbeat interval to keep ci stable
* fix missing one
|
2020-12-14 16:58:40 +08:00 |
|
Eric Squires
|
22c1968d62
|
Runing -> Running (#12826)
|
2020-12-13 22:23:48 -08:00 |
|
Ameer Haj Ali
|
aaa11941f6
|
[autoscaler] Fix flaky autoscaler test (#12829)
|
2020-12-13 17:09:30 -08:00 |
|
DK.Pino
|
153b24746c
|
[Placement Group] Refactor pg resource constrain in node manager (#12538)
* first version by pointer
* second version reference
* clean up
* add cpp ut
* lint
* extract LocalPlacementGroupManagerInterface
* lint
* fix comment
* add idempotency test
* lint
* fix pg ut
* fix pg ut
* python lint
* fix pg ut timeout
* python lint
* fix comment
* lint
* lint
|
2020-12-12 23:32:15 -08:00 |
|
Eric Liang
|
bdc6624da8
|
Revert "[PlacementGroup]Add PlacementGroup wait python api (#12601)" (#12825)
This reverts commit 401d342602.
|
2020-12-12 12:13:48 -08:00 |
|
Richard Liaw
|
2f2bd884a3
|
[tune] upgrade gpytorch, bump default pytorch to 1.7.0 (#12776)
* upgrade gpytorch
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* pin
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* version-torch
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix-build
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-12 10:35:33 -08:00 |
|
Richard Liaw
|
7e09f1d934
|
remove-xgboost-build (#12822)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-12 10:34:56 -08:00 |
|
Kai Fricke
|
5f04ade6ef
|
[tune] add more stoppers and stopper documentation (#12750)
* Add new stoppers & docs
* Add tests for maximum iteration stopper and trial plateau stopper
* Update python/ray/tune/stopper.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/tune/api_docs/stoppers.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/tune/api_docs/stoppers.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Apply suggestions from code review
* Apply suggestions from code review
* Update python/ray/tune/stopper.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-12 01:47:19 -08:00 |
|
Kai Fricke
|
905652cdd6
|
[tune] migrate xgboost callback api (#12745)
* Migrate to new-style xgboost callbacks
* Fix flaky progress reporter test
* Fix import error
* Take last value (not first)
|
2020-12-12 01:42:20 -08:00 |
|
Kai Fricke
|
42c70be073
|
[tune] Hyperopt: Directly accept category variables instead of indices (#12715)
* [tune] Hyperopt: Directly accept category variables instead of indices
* Fix interrupt test
* Update python/ray/tune/suggest/hyperopt.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Apply suggestions from code review
* Update python/ray/tune/suggest/hyperopt.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* lint
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-12 01:40:53 -08:00 |
|
Hao Zhang
|
0b1fbc5e83
|
[PR 1/6] Collective in Ray (#12637)
Co-authored-by: YLJALDC <dal177@ucsd.edu>
|
2020-12-12 01:26:36 -08:00 |
|
Alex Wu
|
aa64cd4534
|
[New scheduler] Fix test_global_state (#12586)
|
2020-12-11 21:47:01 -08:00 |
|
Edward Oakes
|
03d869d51c
|
Hold GIL while submitting (actor) tasks (#12803)
|
2020-12-11 21:47:16 -06:00 |
|
Edward Oakes
|
aec5c9879e
|
Add tests for atexit handler behavior (#12808)
|
2020-12-11 21:47:05 -06:00 |
|
Edward Oakes
|
6262ee1f76
|
Clarify docs for atexit behavior when using ray.kill (#12807)
|
2020-12-11 21:45:39 -06:00 |
|
Eric Liang
|
1ce745cf44
|
Add automatic local GC and plasma debug logs every 10 minutes by default (#12804)
|
2020-12-11 17:09:58 -08:00 |
|
Simon Mo
|
3d8c1cbae6
|
[Serve] Fix Serve Release Tests (#12777)
|
2020-12-11 11:53:47 -08:00 |
|
fangfengbin
|
9ded69fdaa
|
[Hotfix] Fix python client lint error (#12783)
|
2020-12-11 10:15:53 -08:00 |
|
Simon Mo
|
68d7fa2137
|
Fix exit_actor in asyncio mode (#12693)
|
2020-12-11 09:35:17 -08:00 |
|