1652 commits

Author SHA1 Message Date
Alex Wu
31d89be926
[Workflow] Basic event support (#19239)
* basics

* .

* .

* a test

* a test

* tests

* cleanup

* concepts page

* docs

* polish

* fix sleep

* fix yi things

* lint

* fix

* .

* .

* .

* fix?

* .

Co-authored-by: Alex Wu <alex@anyscale.com>
2021-10-22 15:27:33 -07:00
Simon Mo
30d9f8fbae
[Doc] [Serve] Fix code cutoff and broken linkes in deployment.rst (#19573) 2021-10-21 13:47:55 -07:00
Simon Mo
32e648e5fa
[Serve][Doc] Add Failure Recovery Doc (#19166) 2021-10-21 13:32:42 -07:00
Ameer Haj Ali
923adb6512
Update docs to make sure user does ssh port forwarding from another terminal (#19367) 2021-10-21 13:17:08 -07:00
Simon Mo
03406706b3
[Serve] [Doc] Add Autoscaling Documentation (#19559) 2021-10-21 13:11:29 -07:00
Ian Rodney
0cdf4ae8d0
[AWS] Stop Round Robining AZs (#19051)
* round robin on failure to launch

* still round-robin spot instances

* prioritize first AZ

* no more round-robining

* doc updates

* Order subnets by AZ

* add spot instance advisor link

* ensure we try all AZs

* fix typos
2021-10-21 12:06:44 -07:00
SangBin Cho
9000f41aa6
[Nightly Test] Support memory profiling on Ray + implement memory monitor for nightly tests (#19539)
* random fixes

* Done

* done

* update the doc

* doc lint fix

* .

* .
2021-10-21 07:37:05 -07:00
matthewdeng
b3b739266e
[docs] add dask compatibility for 1.8.0 (#19578) 2021-10-21 07:26:07 -07:00
Qing Wang
048e7f7d5d
[Core] Port concurrency groups with asyncio (#18567)
## Why are these changes needed?
This PR aims to port concurrency groups functionality with asyncio for Python.

### API
```python
import ray


@ray.remote(concurrency_groups={"io": 2, "compute": 4})
class AsyncActor:
    def __init__(self):
        pass

    @ray.method(concurrency_group="io")
    async def f1(self):
        pass

    @ray.method(concurrency_group="io")
    def f2(self):
        pass

    @ray.method(concurrency_group="compute")
    def f3(self):
        pass

    @ray.method(concurrency_group="compute")
    def f4(self):
        pass

    def f5(self):
        pass
```
The decorator on the actor class `AsyncActor` declares two concurrency groups, `io` and `compute`, with max concurrencies of 2 and 4 respectively. The actor also has a default concurrency group. Every concurrency group has its own async event loop and Python thread to execute the methods assigned to it.

Methods `f1` and `f2` are invoked in the `io` concurrency group, and `f3` and `f4` in `compute`.
Note that `f5` and `__init__` are invoked in the default concurrency group.

The following call invokes `f2` in the `compute` concurrency group, because a group specified dynamically at call time takes priority over the one set in the method definition.
```python
a = AsyncActor.remote()
a.f2.options(concurrency_group="compute").remote()
```
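
The priority rule described above can be sketched in plain Python. This is only an illustration of the lookup order (call-time option, then method-level setting, then the default group), not Ray's code; `resolve_group` and the `"default"` group name are hypothetical.

```python
from typing import Optional

# Hypothetical name for the default concurrency group.
DEFAULT_GROUP = "default"


def resolve_group(call_option: Optional[str], method_group: Optional[str]) -> str:
    """Pick the group for one call: call-time option first, then the
    method-level setting, then the default group."""
    if call_option is not None:
        return call_option
    if method_group is not None:
        return method_group
    return DEFAULT_GROUP


# f2 is declared in "io"; a call-time option overrides that.
f2_plain = resolve_group(None, "io")
f2_overridden = resolve_group("compute", "io")
# f5 has no declared group and no call-time option.
f5_plain = resolve_group(None, None)
```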

### Implementation
In outline, the implementation changes are:
- Previously, an asyncio actor had a single event loop bound to a single Python thread. Now one event loop bound to one Python thread is created for every concurrency group of the asyncio actor.
- Previously, there was one fiber state per caller in the asyncio actor. Now a FiberStateManager is created per caller, and it manages the fiber states for the concurrency groups.
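
The "one event loop bound to one thread per concurrency group" idea can be sketched with the standard library alone. This is a minimal illustration, not Ray's implementation, and it does not model fiber states. `GroupLoops`, its methods, and the `group-<name>` thread names are hypothetical.

```python
import asyncio
import threading


async def _make_semaphore(limit):
    # Created inside the target loop so it is tied to that loop.
    return asyncio.Semaphore(limit)


class GroupLoops:
    """Hypothetical helper: one asyncio event loop on one thread per group."""

    def __init__(self, group_limits):
        self._loops = {}
        self._threads = {}
        self._semaphores = {}
        for name, limit in group_limits.items():
            loop = asyncio.new_event_loop()
            thread = threading.Thread(
                target=loop.run_forever, name=f"group-{name}", daemon=True
            )
            thread.start()
            self._loops[name] = loop
            self._threads[name] = thread
            # The semaphore caps how many coroutines run at once in this group.
            self._semaphores[name] = asyncio.run_coroutine_threadsafe(
                _make_semaphore(limit), loop
            ).result()

    def submit(self, group, coro_fn, *args):
        """Schedule coro_fn(*args) on the group's loop; returns a Future."""
        sem = self._semaphores[group]

        async def runner():
            async with sem:
                return await coro_fn(*args)

        return asyncio.run_coroutine_threadsafe(runner(), self._loops[group])

    def shutdown(self):
        for name, loop in self._loops.items():
            loop.call_soon_threadsafe(loop.stop)
            self._threads[name].join()
            loop.close()


async def which_thread():
    return threading.current_thread().name


pools = GroupLoops({"io": 2, "compute": 4})
io_thread = pools.submit("io", which_thread).result()
compute_thread = pools.submit("compute", which_thread).result()
pools.shutdown()
```

Each group's coroutines run on that group's own thread, so a slow method in one group does not block the event loop of another.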


## Related issue number
#16047
2021-10-21 21:46:56 +08:00
Oscar Knagg
5a05e89267
[Core] Add TLS/SSL support to gRPC channels (#18631) 2021-10-20 22:39:11 -07:00
Jiajun Yao
4fc5b11c68
Simple block dataset groupBy (#19435) 2021-10-19 19:53:13 -07:00
Simon Mo
30c8c073a2
[Doc] Generate sitemap (#19375) 2021-10-19 14:14:17 -07:00
Edward Oakes
a596d59863
[serve] Modify serve debugger example to use current APIs (#19513) 2021-10-19 13:21:56 -07:00
Duarte OC
5af6152e76
[Serve] [Doc] Update docs with import missing (#19469) 2021-10-19 11:23:50 -07:00
Alex Wu
a819e417ac
Revert "[Hotfix] Revert "[Workflow] workflow.delete"" (#19248)
* Revert "Revert "[Workflow] workflow.delete (#19178)" (#19247)"

This reverts commit b59317520d.

* fix

* .

* .

* .

* Revert "."

This reverts commit 423b9b8e7e83f07cb0942b04e568e37ea0c62ba8.

* .

* .

* done?

* 4real

Co-authored-by: Alex <alex@anyscale.com>
2021-10-19 09:47:56 -07:00
matthewdeng
4674c78050
[Train] Rename Ray SGD v2 to Ray Train (#19436) 2021-10-18 22:27:46 -07:00
Guyang Song
46b4c7464d
runtime env eager install by default (#19449) 2021-10-19 11:31:14 +08:00
Jiajun Yao
4d9585773f
[Release] Remove release process doc (#19312) 2021-10-18 11:24:03 -07:00
Eric Liang
13d4ad6100
[data] Preserve epoch by default when using rewindow() (#19359) 2021-10-14 09:17:36 -07:00
Antoni Baum
e9df253f5d
[CI/docs] Remove [default] from xgboost-ray (#19186)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-10-14 16:29:55 +01:00
Eric Liang
430a5f4a21
[doc] Bump dataset to beta for 1.8 and add backlink to SGD (#19332) 2021-10-12 18:32:29 -07:00
Jasha10
53e791d136
[Docs] Fix Typo in walkthrough (#19335)
There is one backtick too many in walkthrough.rst, it's causing a formatting issue.
2021-10-12 17:47:28 -07:00
Eric Liang
9f1cd9e867
[docs] Document fake multi-node autoscaler (#19329) 2021-10-12 15:59:07 -07:00
Amog Kamsetty
f6f2435b91
[SGD] Sgd v2 Dataset Integration (#17626)
* wip

* wip

* wip

* draft

* disable tf autosharding

* wip

* wip

* wip

* wip

* add example

* wip

* wip

* wip

* use dataset.split

* add unit tests

* add linear example

* concatenate tensors and fix example

* WIP tune example

* add tensorflow example

* wip

* random_shuffle_each_window

* fault tolerance test

* GPU, examples, CI

* formatting

* fix

* Update python/ray/util/sgd/v2/tests/test_trainer.py

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* wip

* type hints

* wip

* update user guide

* fix

* fix immediate issues

* update example

* update

* fix tune gpu test

* fix resources for smoke test - 1 CPU for dataset tasks

* update tests, docs, examples

* Apply suggestions from code review

Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>

* address comments

* add warning

* fix tests

* minor doc updates

* update example in doc

* configure tests

* Update doc/source/raysgd/v2/user_guide.rst

Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>

* Update python/ray/data/dataset.py

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* fix docstring

Co-authored-by: Matthew Deng <matthew.j.deng@gmail.com>
Co-authored-by: matthewdeng <matt@anyscale.com>
Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>
2021-10-12 14:03:10 -07:00
Eric Liang
0ab6749602
Support iter_epochs for Datasets (#19217) 2021-10-12 11:05:00 -07:00
Wansoo Kim
0f6d4661d7
[tune] Port all MNIST examples to specify data_dir (#19033)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-10-12 15:36:06 +01:00
architkulkarni
1ee3b4136c
[Serve] [Doc] Serve fix tracing snippet (#19296) 2021-10-11 16:59:04 -07:00
Chen Shen
c740aae54c
[Core][Dataset] adding example for large scale data ingestion (#18998) 2021-10-11 15:37:09 -07:00
Matti Picus
9ca34c7192
add dependencies to BUILD.bazel and update windows bazel to 4.2.1 (#19132)
* add dependencies to BUILD.bazel and update windows bazel to 4.2.1

* fixes from review
2021-10-11 10:25:19 -07:00
Guyang Song
bae543c956
[runtime env] support eager_install in runtime env (#17949) 2021-10-09 17:59:57 +08:00
Eric Liang
b59317520d
Revert "[Workflow] workflow.delete (#19178)" (#19247)
This reverts commit 7ea512f343.
2021-10-08 19:12:55 -07:00
Alex Wu
7ea512f343
[Workflow] workflow.delete (#19178)
Why are these changes needed?
This PR implements workflow.delete which allows users to delete the information in storage related to a workflow. (This assumes the workflow isn't currently running).

Related issue number
Closes #18848
2021-10-08 16:11:59 -07:00
Antoni Baum
c7d6f838f6
[tune] Optional forcible trial cleanup, return default autofilled metrics even if Trainable doesn't report at least once (#19144) 2021-10-08 18:16:26 +01:00
xwjiang2010
7ffd9cbed1
[Tune] Fix column width in doc. (#19159) 2021-10-07 18:16:21 +01:00
Antoni Baum
27b8633198
[docs] Remove outdated note in Tune docs (#19110) 2021-10-07 15:42:11 +01:00
Edward Oakes
0f33aaf933
Revert "[Doc] Document existing runtime env's container support (#19076)" (#19160)
This reverts commit 4beba3f727.
2021-10-07 08:55:30 -05:00
Eric Liang
86cbe3e833
[data] Add support for repeating and re-windowing a DatasetPipeline (#19091) 2021-10-06 20:13:43 -07:00
Simon Mo
4beba3f727
[Doc] Document existing runtime env's container support (#19076) 2021-10-06 10:25:57 -05:00
architkulkarni
281fcaa91a
[Serve] [Doc] Add note about serving multiple deployments defined by the same class (#19118) 2021-10-06 10:24:42 -05:00
Amog Kamsetty
db0483a29a
[SGD] SGD Namespace Consistency (#19048)
* wip

* update

* add callbacks

* fix

* fix

* update

* add

* address comments
2021-10-05 15:56:42 -07:00
matthewdeng
3fbe135a24
[docs] add modin_xgboost and dask_xgboost notebook tutorials (#18775)
* Add xgboost-dask golden notebook

* [examples] add modin-xgboost Jupyter notebook

* Add xgboost dast gn

* update modin notebook to sphinx-gallery compatible python file

* fix build file

* fix test

* fix test

* Add modin notebook anyscale connect test

* Add missing file

* add dask_xgboost notebook

* Add the new modin golden notebook to CI

* fix lint and filter out tests with py37

* Update release/golden_notebook_tests_new/golden_notebook_tests.yaml

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Add dask, wait for cluster client, remove pytest

* Replace folder

* Fix

* Update dask_xgboost_app_config.yaml

* Update modin_xgboost_app_config.yaml

* comment on filtered out tests

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2021-10-05 09:17:33 -07:00
Dmitri Gekhtman
beaba4782a
[k8s][doc] Fix service name in K8s static deployment example (#19065) 2021-10-04 20:23:54 -05:00
Jiajun Yao
7ccf737f97
Add compatible dask version for ray 1.6.0 and 1.7.0 (#19080) 2021-10-05 10:23:06 +09:00
Eric Liang
032a420ee6
Rename Dataset.pipeline to Dataset.window (#19050) 2021-10-01 19:55:29 -07:00
Clark Zinzow
d22f838795
[Datasets] Delineate between ref and raw APIs for the Pandas/Arrow integrations. (#18992) 2021-10-01 13:08:25 -07:00
Antoni Baum
cc3199b814
[docs] Provide information about resource deadlocks, early stopping in Tune docs (#18947)
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-10-01 13:52:47 +01:00
architkulkarni
8af9646cb0
[Doc] [runtime env] Remove delta caching remark and state Client+@remote limitation (#19010) 2021-09-30 13:29:50 -05:00
architkulkarni
0f0b161ea1
Revert "Revert "[Serve] [doc] Improve runtime env doc"" (#18943)
* Revert "Revert "[Serve] [doc] Improve runtime env doc (#18782)" (#18935)"

This reverts commit e4f4c79252.
2021-09-30 13:28:44 -05:00
Amog Kamsetty
98ac3f601c
[SGD] v1 to v2 Migration Guide (#18887)
* wip

* add guide

* fix test

* address comments

* add to docs

* fix

* remove markdown

* add warning to all pages

* formatting

* fix

* links

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update doc/source/raysgd/v2/migration-guide.rst

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* address comments

* address comments

* fix

* address comments

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2021-09-30 09:15:21 -07:00
architkulkarni
bf6e50813c
[runtime env] Parse local pip/conda requirements files locally upon task/actor definition (#18988) 2021-09-30 09:47:15 -05:00