Alex Wu
31d89be926
[Workflow] Basic event support ( #19239 )
...
* basics
* .
* .
* a test
* a test
* tests
* cleanup
* concepts page
* docs
* polish
* fix sleep
* fix yi things
* lint
* fix
* .
* .
* .
* fix?
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-10-22 15:27:33 -07:00
Edward Oakes
c9258aff0f
Revert "[ActorGroup] Add ActorGroup ( #18960 )" ( #19655 )
...
This reverts commit 4f05bac8fb.
2021-10-22 14:55:17 -07:00
shrekris-anyscale
cfae64ebe8
[multiprocessing] Modify Ray's map_async() to match Multiprocessing's map_async() behavior ( #19403 )
2021-10-22 16:31:34 -05:00
Gagandeep Singh
2f8da8f8c8
Bumped timeout due to slow test times in Windows ( #19595 )
2021-10-22 13:48:15 -07:00
Jiao
f0be4cb390
[jobs] Add job manager class for simple jobs python APIs ( #19567 )
2021-10-22 14:18:11 -05:00
Jiajun Yao
43b8f8e522
Revert "Revert "[Test] Fix flaky test_gpu test ( #19524 )" ( #19562 )" ( #19643 )
...
This reverts commit 7daf28f348.
2021-10-22 11:48:57 -07:00
Edward Oakes
0760fe869d
[runtime_env] Clean up working dir tests, add more test cases ( #19597 )
2021-10-22 12:35:27 -05:00
Amog Kamsetty
4f05bac8fb
[ActorGroup] Add ActorGroup ( #18960 )
...
* move
* fix
* Revert "fix"
This reverts commit 532660fc334ae96a0ff34c8ab1288488312300a3.
* Revert "move"
This reverts commit 54321f4a539c2ee873f17d988da5627588aeff97.
* add
* wip
* wip
* wip
* wip
* address comments
* wip
* add to build
* fix
* fix
* fix
2021-10-22 10:22:31 -07:00
Simon Mo
1eb142b57c
[Serve] Fix shutdown protocol again ( #19609 )
2021-10-22 09:27:32 -07:00
Jiajun Yao
256bf0bf3a
[Release] Bump up dask to latest compatible version 2021.9.1 ( #19592 )
...
* Bump up dask to latest compatible version 2021.9.1
* Bump up dask to latest compatible version 2021.9.1
2021-10-22 09:16:28 -07:00
architkulkarni
030acf3857
[Serve] [Serve Autoscaler] Add upscale and downscale delay ( #19290 )
2021-10-22 10:33:28 -05:00
xwjiang2010
a632cb439f
[Tune] Remove queue_trials. ( #19472 )
2021-10-22 09:24:54 +01:00
Stephanie Wang
499d6e9fc1
Turn on reconstruction tests in CI ( #19497 )
2021-10-21 22:34:44 -07:00
Eric Liang
50e305e799
[data] Add take_all() and raise error if to_pandas() drops records ( #19619 )
2021-10-21 22:23:50 -07:00
SangBin Cho
9a050c666d
[Test] Add a stronger resource leak check to pg unit tests. ( #19586 )
...
* Add a stronger check to unit tests.
* .
2021-10-21 21:40:00 -07:00
Edward Oakes
11b6019fb5
[ray client] Fix connecting to a cluster without available CPUs ( #19604 )
2021-10-21 21:21:50 -05:00
Jiajun Yao
920384f34e
[Doc] Fix Dataset __annotations__ ( #19599 )
2021-10-21 17:33:55 -07:00
SangBin Cho
cea7fda41a
Revert "Revert "[Dashboard] Disable unnecessary event messages. ( #19490 )" ( #19574 )" ( #19577 )
...
This reverts commit 699c5aeac6.
2021-10-21 15:36:22 -07:00
SangBin Cho
19e3280824
[Core] Fix shutdown Core worker crash when pg is removed. ( #19549 )
...
* fix core worker crash
* remove file
* done
2021-10-21 14:30:54 -07:00
Simon Mo
30d9f8fbae
[Doc] [Serve] Fix code cutoff and broken links in deployment.rst ( #19573 )
2021-10-21 13:47:55 -07:00
Simon Mo
03805d4064
[Serve] Good error message when Serve not installed and ensure Serve installs ray[default] ( #19570 )
2021-10-21 13:47:29 -07:00
xwjiang2010
3e31526445
[tune] Print warning msg when TrialExecutor is directly inherited. ( #17654 )
2021-10-21 21:25:38 +01:00
Ian Rodney
0cdf4ae8d0
[AWS] Stop Round Robining AZs ( #19051 )
...
* round robin on failure to launch
* still round-robin spot instances
* prioritize first AZ
* no more round-robining
* doc updates
* Order subnets by AZ
* add spot instance advisor link
* ensure we try all AZs
* fix typos
2021-10-21 12:06:44 -07:00
Kai Fricke
7d8ea5e724
[tune] Remove magic results (e.g. config) before calculating trial result metrics ( #19583 )
2021-10-21 19:36:14 +01:00
Kai Fricke
15cdffe0ff
[tune] Only try to sync driver if sync_to_driver is actually enabled ( #19589 )
2021-10-21 19:35:35 +01:00
Oscar Knagg
15ca575078
Account for Windows return characters ( #19590 )
2021-10-21 10:05:20 -07:00
Travis Addair
c6e2161dbc
[Train] Fixed HorovodBackend to automatically detect network interfaces ( #19533 )
...
* Moved Horovod into package
* Move in Ludwig fix
* Undo git mv
* Cleanup
* Cleanup
* flake8
* Update python/ray/train/backends/horovod.py
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
* Whitespace
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2021-10-21 09:13:11 -07:00
Amog Kamsetty
f1f334c348
[Train] Backwards Compatibility for TrainingCallback ( #19566 )
2021-10-21 09:11:34 -07:00
SangBin Cho
9000f41aa6
[Nightly Test] Support memory profiling on Ray + implement memory monitor for nightly tests ( #19539 )
...
* random fixes
* Done
* done
* update the doc
* doc lint fix
* .
* .
2021-10-21 07:37:05 -07:00
Tobias Kaymak
0e50701bbe
[Serve] Typo in kv_store.py ( #19454 )
...
Fixing typo in init of RayS3KVStore class
2021-10-21 07:24:34 -07:00
Qing Wang
048e7f7d5d
[Core] Port concurrency groups with asyncio ( #18567 )
...
## Why are these changes needed?
This PR ports the concurrency groups functionality to asyncio actors in Python.
### API
```python
@ray.remote(concurrency_groups={"io": 2, "compute": 4})
class AsyncActor:
    def __init__(self):
        pass

    @ray.method(concurrency_group="io")
    async def f1(self):
        pass

    @ray.method(concurrency_group="io")
    def f2(self):
        pass

    @ray.method(concurrency_group="compute")
    def f3(self):
        pass

    @ray.method(concurrency_group="compute")
    def f4(self):
        pass

    def f5(self):
        pass
```
The `concurrency_groups` argument on the actor class `AsyncActor` declares two concurrency groups, `io` and `compute`, with max concurrencies of 2 and 4 respectively. The actor also has a default concurrency group. Every concurrency group has its own async eventloop and its own pythread, which execute the methods assigned to that group.
Methods `f1` and `f2` are invoked in the `io` concurrency group, and `f3` and `f4` in the `compute` concurrency group.
Note that `f5` and `__init__` are invoked in the default concurrency group.
In the following call, `f2` is invoked in the concurrency group `compute` instead of `io`, because a group specified dynamically at call time has higher priority than the one declared on the method.
```python
a.f2.options(concurrency_group="compute").remote()
```
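The priority rule described above can be sketched in plain Python. This is only an illustration of the lookup order (call-time option, then the group declared on the method, then the default group), not Ray's actual dispatch code. The function `resolve_group`, the constant `DEFAULT_GROUP`, and the `declared_groups` table are hypothetical names introduced for this sketch.

```python
from typing import Dict, Optional

# Hypothetical label for the default concurrency group (an assumption of this sketch).
DEFAULT_GROUP = "default"


def resolve_group(call_time: Optional[str], declared: Optional[str]) -> str:
    """Pick the group for one call: call-time option first, then the
    group declared on the method, then the default group."""
    if call_time is not None:
        return call_time
    if declared is not None:
        return declared
    return DEFAULT_GROUP


# Groups declared on the methods of the AsyncActor example; None means undeclared.
declared_groups: Dict[str, Optional[str]] = {
    "__init__": None,
    "f1": "io",
    "f2": "io",
    "f3": "compute",
    "f4": "compute",
    "f5": None,
}

# A plain call to f2 uses its declared group.
plain_f2 = resolve_group(None, declared_groups["f2"])
# A call with a call-time override uses the override instead.
overridden_f2 = resolve_group("compute", declared_groups["f2"])
# f5 declares nothing, so it falls back to the default group.
plain_f5 = resolve_group(None, declared_groups["f5"])
```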
### Implementation
The implementation is straightforward:
- Previously, an asyncio actor had only 1 eventloop bound to 1 pythread. Now we create 1 eventloop bound to 1 pythread for every concurrency group of the asyncio actor.
- Previously, there was 1 fiber state for every caller in the asyncio actor. Now we create a FiberStateManager for every caller in the asyncio actor, and the FiberStateManager manages the fiber states for the concurrency groups.
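
As a rough illustration of the first point, the following standard-library sketch runs one asyncio event loop on one dedicated thread per group and caps in-flight coroutines per group with a semaphore. It does not use Ray and does not model the FiberStateManager; the class `GroupLoops`, its methods, and the thread names are hypothetical and chosen only for this example.

```python
import asyncio
import threading
from concurrent.futures import Future
from typing import Any, Coroutine, Dict, List


class GroupLoops:
    """Hypothetical helper: one event loop on one thread per concurrency group."""

    def __init__(self, groups: Dict[str, int]) -> None:
        self._max = dict(groups)
        self._loops: Dict[str, asyncio.AbstractEventLoop] = {}
        self._limits: Dict[str, asyncio.Semaphore] = {}
        self._threads: List[threading.Thread] = []
        for name in groups:
            loop = asyncio.new_event_loop()
            thread = threading.Thread(
                target=loop.run_forever, name=f"group-{name}", daemon=True
            )
            thread.start()
            self._loops[name] = loop
            self._threads.append(thread)

    async def _run(self, name: str, coro: Coroutine[Any, Any, Any]) -> Any:
        # Create the semaphore lazily on the group's own loop thread so it
        # is always associated with the loop that uses it.
        sem = self._limits.get(name)
        if sem is None:
            sem = self._limits[name] = asyncio.Semaphore(self._max[name])
        async with sem:  # enforce the group's max concurrency
            return await coro

    def submit(self, name: str, coro: Coroutine[Any, Any, Any]) -> Future:
        """Schedule a coroutine on the loop that belongs to the given group."""
        return asyncio.run_coroutine_threadsafe(
            self._run(name, coro), self._loops[name]
        )

    def shutdown(self) -> None:
        """Stop every loop, wait for its thread, then close the loop."""
        for loop in self._loops.values():
            loop.call_soon_threadsafe(loop.stop)
        for thread in self._threads:
            thread.join()
        for loop in self._loops.values():
            loop.close()


async def which_thread(delay: float) -> str:
    """Report the name of the thread whose loop ran this coroutine."""
    await asyncio.sleep(delay)
    return threading.current_thread().name


loops = GroupLoops({"io": 2, "compute": 4})
io_thread = loops.submit("io", which_thread(0.01)).result(timeout=5)
compute_thread = loops.submit("compute", which_thread(0.01)).result(timeout=5)
loops.shutdown()
```

Each group's coroutines run on that group's own thread, so a slow method in one group does not block the event loop of another.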
## Related issue number
#16047
2021-10-21 21:46:56 +08:00
Antoni Baum
a04b02e2e8
[tune] Better bad Stopper type message ( #19496 )
2021-10-21 14:31:27 +01:00
Kai Fricke
44fb7d09df
[tune] sync_client: Fix delete template formatting ( #19553 )
2021-10-21 10:59:54 +01:00
Patrick Ames
20d47873c9
[data] Add pickle support for PyArrow CSV WriteOptions ( #19378 )
2021-10-21 00:46:52 -07:00
Matti Picus
bacd5f92e2
MAINT: cleanups for windows ( #19430 )
...
* dead processes should increment total_stopped
* use psutil in testing to check pid
* remove unneeded repetitions
2021-10-20 23:32:35 -07:00
Oscar Knagg
5a05e89267
[Core] Add TLS/SSL support to gRPC channels ( #18631 )
2021-10-20 22:39:11 -07:00
heng2j
6d23fb1ff1
[Tune] Support custom tags in MLflow logger callback ( #19532 )
...
* Added Food Collector support to rllib/env/unity3d_env.py
* feat(mlflow): added parameter tags to MLflowLoggerCallback
* fix(unit_test): added tags tests in test_integration_mlflow.MLflowTest()
* chore: lint the changes in this PR
* update
* Update python/ray/tune/integration/mlflow.py
* fix
* copy
* fix
Co-authored-by: zla0368 <zhongheng.li@stresearch.com>
Co-authored-by: Li, Zhongheng <zhongheng.li@str.us>
Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2021-10-20 22:31:33 -07:00
SangBin Cho
085162c68e
[Log] Print actor & task name upon stderr messages. ( #19542 )
2021-10-20 22:16:32 -07:00
Eric Liang
699c5aeac6
Revert "[Dashboard] Disable unnecessary event messages. ( #19490 )" ( #19574 )
...
This reverts commit 7fb681a35d.
2021-10-20 20:17:57 -07:00
Eric Liang
48ecb1f88a
[data] Fix O(n^2) issues in simple_block sort ( #19543 )
2021-10-20 18:26:20 -07:00
SangBin Cho
7fb681a35d
[Dashboard] Disable unnecessary event messages. ( #19490 )
...
* Disable unnecessary event messages.
* use warning
* Fix tests
2021-10-20 17:40:25 -07:00
Edward Oakes
bcf584294f
[runtime_env] Refactor working dir packaging code into runtime_env.packaging module ( #19112 )
2021-10-20 18:38:50 -05:00
Eric Liang
7daf28f348
Revert "[Test] Fix flaky test_gpu test ( #19524 )" ( #19562 )
...
This reverts commit 39e54cd276.
2021-10-20 12:21:19 -07:00
Clark Zinzow
88c5fcde8c
[Datasets] Unrevert Arrow table copy method change. ( #19534 )
2021-10-20 11:57:36 -07:00
Jiao
c51f79bca6
[runtime_env] Support remote s3 package in runtime env ( #19315 )
2021-10-20 10:41:54 -05:00
Jiajun Yao
39e54cd276
[Test] Fix flaky test_gpu test ( #19524 )
2021-10-19 22:36:34 -07:00
Simon Mo
59eef6521b
[Serve] Use regular dict for handle caching ( #19162 )
2021-10-19 21:27:01 -07:00
Jiajun Yao
4fc5b11c68
Simple block dataset groupBy ( #19435 )
2021-10-19 19:53:13 -07:00
Eric Liang
eacfbf8be2
[data] Don't shuffle during repartition by default ( #19379 )
2021-10-19 19:46:22 -07:00
SangBin Cho
3222d39fb8
[Dashboard] Dashboard memory improvement ( #19385 )
...
* many ppo profiling
* completed
* improve memory usage lint
* revert temporarily
* Addressed code review
* Fix a test
2021-10-19 19:34:42 -07:00