Amog Kamsetty
900a48c19c
[Tune] Better warnings/exceptions for fail_fast='raise' ( #11842 )
2020-11-06 15:01:55 -08:00
Aaron Miller
045fed5cd2
[examples] comment out rsync_ settings for K8S ( #11862 )
2020-11-06 14:35:21 -08:00
Simon Mo
871cde989a
Re-Revert: [Serialization] Update CloudPickle to 1.6.0 ( #9694 ) ( #11837 )
2020-11-06 12:24:36 -08:00
Kishan Sagathiya
c5e6c90e1e
[Core] Add name of actor in the result of ray.actors() ( #11828 )
...
Added name field to `actor_info`
Fixes #11112
2020-11-06 10:45:44 -08:00
Philipp Moritz
28e7439cf0
[doc] Add documentation for Ray debugger ( #11815 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-05 16:25:27 -08:00
Barak Michener
27c810a97e
Basic protos for ray client ( #11762 )
2020-11-05 16:23:54 -08:00
Eric Liang
f86c4f992c
Fix RAY_ENABLE_NEW_SCHEDULER=1 pytest test_advanced_2.py::test_zero_cpus_actor ( #11817 )
2020-11-05 16:02:04 -08:00
architkulkarni
347e871409
[Serve] Add dependency management ( #11743 )
2020-11-05 16:39:37 -06:00
Kai Yang
ffc267f94b
[Test] Ignore setproctitle for local mode ( #11819 )
2020-11-05 11:07:34 -08:00
Kai Fricke
603accf1c2
[tune] logger refactor part 3: Add ExperimentLogger class ( #11749 )
2020-11-05 08:55:38 -08:00
Richard Liaw
f6717b8b03
[autoscaler] Support empty node list for kill node ( #11810 )
2020-11-04 22:40:07 -08:00
Richard Liaw
efa07d5403
Revert "Revert "[tune] PB2 ( #11466 )" ( #11795 )" ( #11812 )
2020-11-04 20:47:12 -08:00
Eric Liang
69145d6215
[hotfix] Bazel candidates not found due to raising too early
2020-11-04 16:08:51 -08:00
Ian Rodney
22bbbc3171
[wheel] Fix Manylinux2014 Build ( #11811 )
2020-11-04 14:50:38 -08:00
Amog Kamsetty
92718de40c
[SGD] Better support for custom DDP ( #11771 )
2020-11-04 13:58:51 -08:00
Ameer Haj Ali
ebdf8ba3fa
[autoscaler] Support legacy cluster configs with the new resource demand scheduler ( #11751 )
2020-11-04 12:05:48 -08:00
Kai Yang
31598338b3
[Core] Fix ray start failure due to bug in redis address detection ( #11735 )
...
* Fix ray start failure due to redis address detection bug
* Address comment
2020-11-04 12:04:44 -08:00
Alex Wu
53aac55739
[autoscaler] Autoscaler simulator ( #11690 )
2020-11-04 12:04:11 -08:00
Akash Patel
b7531fb4f5
[redis-py] change redis-py deprecated hmset usage to hset ( #11776 )
2020-11-03 22:23:02 -08:00
Amog Kamsetty
7248d5f4ae
Revert "[tune] PB2 ( #11466 )" ( #11795 )
...
This reverts commit e7aafd7d24.
2020-11-03 21:05:00 -08:00
Kai Fricke
007634fd1b
[tune] logger refactor part 2: Add SyncerCallback ( #11748 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-03 21:04:40 -08:00
Barak Michener
05c4e3fb2a
[build] Build wheels with manylinux2014 ( #11621 )
...
* necessary changes
* Split bazel install
* manylinux2014
* change references to manylinux2014
* Fix lint
* port alex's docker build changes
* fix config issue
* remove extra manylinux2010 requirement script
* revert SHA overwrite
* wip
* incompatible_linklibs
* fix nits
2020-11-03 19:36:32 -08:00
Ian Rodney
9527220a86
[serve] Fix Controller Crashes on Win ( #11792 )
2020-11-03 16:54:16 -08:00
Ian Rodney
c3074f559c
[serve] Split out metadata for checkpointing ( #11533 )
2020-11-03 12:41:24 -08:00
Philipp Moritz
39ce0eadbe
Ray PDB support ( #11739 )
2020-11-03 09:49:23 -08:00
Ameer Haj Ali
08e0e8311a
[autoscaler] Fixing AWS instance types autofill ( #11758 )
2020-11-03 09:34:14 -08:00
Kai Fricke
f7b19c41e3
[tune] logger refactor part 1: move classes and utilities to own files ( #11746 )
...
* [tune] logger refactor part 1: move classes and utilities to own files
* Fix circular dependency
* Remove unneeded pretty print copy
* Apply suggestions from code review
2020-11-03 07:48:09 -08:00
Maksim Smolin
0a6d24a727
[cli] Remove the deprecated old_style logging calls ( #10776 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-02 23:40:18 -08:00
Stephanie Wang
0ba777af99
[Object spilling] Add policy to automatically spill objects on OutOfMemory ( #11673 )
2020-11-02 12:42:02 -08:00
Ameer Haj Ali
8d74a04a42
[autoscaler] Flag flip for resource_demand_scheduler should take into account queue ( #11615 )
2020-11-02 12:41:22 -08:00
Ian Rodney
171e02c684
[serve] re-enable serve-controller-crash test ( #11579 )
2020-11-02 11:22:09 -08:00
Eric Liang
48dee789b3
Add random actor placement; fix cancellation callback; update test skips ( #11684 )
2020-10-30 18:36:35 -07:00
DK.Pino
b10871a1f5
[Core] Fix get worker table bug ( #11516 )
...
* fix get_worker_table bug
* fix lint
* fix comment
* remove actor table
* fix comment
* fix get alive worker
* remove unused python import
2020-10-30 14:48:29 -07:00
SangBin Cho
71c5089854
[Object Spilling] Initial Iteration of S3 adapter. ( #11379 )
...
* Finished the first iteration.
* Removed unnecessary code.
* Smartopen impl.
* Make sure tests passed.
* Addressed code review.
* Addressed code review.
* Fix issues.
* Fix issues.
2020-10-30 14:47:07 -07:00
Ameer Haj Ali
7aade469d0
[autoscaler] fix the autoscaling bug for continuously launching failed nodes ( #11714 )
2020-10-30 14:12:06 -07:00
Gekho457
8816d34541
Kubernetes rsync verbosity fixed ( #11716 )
2020-10-30 14:03:42 -07:00
Alan Guo
3c109b45aa
Disable validation of cluster config on the cluster to allow for cluster configs with new properties. ( #11693 )
2020-10-30 14:02:00 -07:00
Eric Liang
f9f372c327
[autoscaler] Clean up monitoring loop code ( #11677 )
2020-10-30 13:48:43 -07:00
SangBin Cho
6e2a1eac36
[Placement Group] Placement group automatic cleanup. ( #11546 )
...
* In progress. Done with all placement group manager code.
* It is working with job.
* Finished detached actor implementation.
* Fix minor issue.
* In progress.
* Addressed code review.
* Addressed code review.
* Addressed code review.
* Fix a build error.
2020-10-30 10:55:43 -07:00
architkulkarni
4175569d96
[Core] Add option to override environment variables for tasks and actors ( #11619 )
2020-10-29 14:22:44 -05:00
Simon Mo
e82ff08b0c
Fix asyncio plasma integration in cluster mode ( #11665 )
2020-10-29 11:53:10 -07:00
Simon Mo
46afec5660
Mute asyncio warning for Serve ( #11682 )
2020-10-28 17:05:42 -07:00
Kai Fricke
ba63ded311
[tune] better error when metric or mode unset in search algorithms ( #11646 )
2020-10-28 13:17:59 -07:00
Richard Liaw
58891551d3
[tune] make tests faster + fix flaky test ( #10264 )
2020-10-28 13:14:54 -07:00
Gekho457
9e63f7ccc3
[autoscaler/k8s] ray up 409 error fix ( #11660 )
2020-10-28 14:19:57 -05:00
Tao Wang
1d5694ddea
[GCS] Use direct getting instead of pub-sub to update load metrics in monitor.py ( #11339 )
2020-10-28 11:23:18 -07:00
Eric Liang
c933477915
[new scheduler] Pass test_basic and add CI builds with flag on ( #11635 )
2020-10-28 11:02:43 -07:00
Richard Liaw
70ea1fbe30
[sgd] pin ptl to 1.0.3 ( #11664 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-28 00:29:01 -07:00
fyrestone
05ad4c7499
[Dashboard] Optimize dashboard datacenter ( #11391 )
...
* Optimize dashboard datacenter
* Fix tests
* Fix tests
* Fix
* Fix CI
* python/build-wheel-macos.sh
Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Max Fitton <maxfitton@anyscale.com>
2020-10-27 23:49:31 -07:00
yncxcw
c3e246818a
[Core] Fix doc string for ray.init() ( #11657 )
2020-10-27 18:27:22 -07:00