5804 commits

Author SHA1 Message Date
Balaji Veeramani
43a9e95dc0
[CI] Add support for Black formatting (#21281) 2022-01-03 10:06:41 -08:00
Balaji Veeramani
4e8f90aca2
[Train] Replace abc.ABCMeta with abc.ABC in callbacks (#21262)
Inheriting from `abc.ABC` is more readable than setting the meta class to `abc.ABCMeta`.

Relevant snippet from the Python 3.4 release notes:
> New class ABC has ABCMeta as its meta class. Using ABC as a base class has essentially the same effect as specifying metaclass=abc.ABCMeta, but is simpler to type and easier to read. (Contributed by Bruno Dupuis in bpo-16049.)
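
The equivalence described in the release note can be sketched as follows. The class names here are hypothetical and are not the actual Ray Train callback classes; the point is only that both spellings produce a class whose metaclass is `abc.ABCMeta`.

```python
import abc


# Older spelling: set the metaclass explicitly.
class CallbackWithMeta(metaclass=abc.ABCMeta):
    @abc.abstractmethod
    def handle_result(self, results):
        ...


# Newer spelling (Python 3.4+): inherit from abc.ABC.
class Callback(abc.ABC):
    @abc.abstractmethod
    def handle_result(self, results):
        ...


# A concrete subclass must implement every abstract method.
class PrintingCallback(Callback):
    def handle_result(self, results):
        return len(results)
```

In both cases, instantiating the abstract class directly raises `TypeError`, so switching to `abc.ABC` changes readability rather than behaviour.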

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Matthew Deng <matthew.j.deng@gmail.com>
2022-01-03 09:25:44 -08:00
Balaji Veeramani
fa4e41c5b2
[Train] Monkeypatch environment variables in test_json (#21260)
If we use `os.environ` to set environment variables in tests, then our tests become coupled. By using `monkeypatch`, we can safely set environment variables while ensuring our tests remain decoupled. 

For more information, see the [monkeypatching documentation](https://docs.pytest.org/en/6.2.x/monkeypatch.html#monkeypatching-environment-variables).
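
A minimal sketch of the pattern, assuming pytest's built-in `monkeypatch` fixture. The variable name `EXAMPLE_FLAG` and the helper `read_flag` are hypothetical and do not come from `test_json`.

```python
import os


def read_flag():
    # Hypothetical code under test: reads a setting from the environment.
    return os.environ.get("EXAMPLE_FLAG", "0")


def test_flag_enabled(monkeypatch):
    # setenv is undone automatically when the test finishes,
    # so the value does not leak into other tests.
    monkeypatch.setenv("EXAMPLE_FLAG", "1")
    assert read_flag() == "1"


def test_flag_default(monkeypatch):
    # raising=False avoids an error if the variable is already unset.
    monkeypatch.delenv("EXAMPLE_FLAG", raising=False)
    assert read_flag() == "0"
```

With plain `os.environ["EXAMPLE_FLAG"] = "1"`, the second test would pass or fail depending on whether the first one ran before it.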
2022-01-03 09:12:44 -08:00
Antoni Baum
7ce22b72ed
[datasets] Expand to_torch's functionality (#21117)
Expands the `to_torch` method for Datasets with:
* An ability to choose to output a list/dict of feature tensors instead of just one (through setting `feature_columns` to be a list of lists or a dict of lists)
* An ability to choose whether the label should be unsqueezed or not
* An ability to pass `None` as the label (for prediction).

Furthermore, this changes how the `feature_column_dtypes` argument works. Previously, it took a list of dtypes for each feature. However, as the tensor was concatenated in the end, only one dtype mattered (the biggest one). Now, this argument expects a single dtype which will be applied to the features tensor (or a list/dict if `feature_columns` is a list of list/dict of lists).

Unit tests for all cases are included.

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2022-01-03 09:03:50 -08:00
xwjiang2010
c18caa4db3
[tune] remove TrialExecutor.resume_trial. (#21225)
This removes unused code.
2022-01-03 16:38:40 +00:00
Antoni Baum
6a2dedb41d
[tune] Fix dtype coercion in tune.choice (#21270)
When a list with mixed types is passed to `tune.choice`, its elements are coerced to a single dtype during sampling (because `numpy.random.choice` converts its input to an array internally). This behaviour is unintentional and surprising. This PR fixes the issue.
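
One way to avoid this kind of coercion is to sample an index and look the element up in the original list, so the list is never converted to an array. The sketch below uses only the standard library; `sample_category` is a hypothetical helper, and whether the PR takes exactly this approach is an assumption.

```python
import random


def sample_category(categories, rng):
    # Pick a position rather than an element, then index the original list.
    # The returned object is the one the caller passed in, type unchanged.
    index = rng.randrange(len(categories))
    return categories[index]


rng = random.Random(0)
mixed = [1, "two", 3.0, None]
samples = [sample_category(mixed, rng) for _ in range(20)]
```

Every entry in `samples` is one of the original objects, so an `int` stays an `int` and `None` stays `None`.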
2022-01-03 16:32:30 +00:00
Kai Fricke
489e6945a6
Revert "[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113)" (#21338)
This reverts commit 327eb84154.
2022-01-03 10:21:25 +00:00
Benjamin Black
327eb84154
[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113) 2022-01-02 21:29:09 +01:00
Balaji Veeramani
fae5b9b1af
[Core] Disable formatting in test_add_min_workers_nodes (#21322)
Black errors while formatting `test_resource_demand_scheduler.py`. The issue is caused by the [assertions](https://github.com/ray-project/ray/blob/master/python/ray/tests/test_resource_demand_scheduler.py#L383-L428) at the end of `test_add_min_workers_nodes`.

To prevent `format.sh` from erroring once we switch to Black, I've disabled formatting around the assertions.
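
Black skips any region enclosed in `# fmt: off` and `# fmt: on` comments. The snippet below illustrates the mechanism with made-up data; it is not the code from `test_add_min_workers_nodes`.

```python
def expected_counts():
    # fmt: off
    # Black leaves everything up to "fmt: on" exactly as written,
    # including the manual column alignment below.
    counts = {
        "small_node":  2,
        "gpu_node":    1,
    }
    # fmt: on
    return counts
```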
2022-01-01 18:16:33 -08:00
WanXing Wang
412cd6be76
[Core]Add RAY_REDIS_ADDRESS environment to specify external address. (#20966)
Support the `RAY_REDIS_ADDRESS` environment variable as an option for `ray start`.
2021-12-31 16:12:56 +08:00
mwtian
20ca1d85c2
[GCS][Bootstrap 2/n] Fix tests to enable using GCS address for bootstrapping (#21288)
This PR contains most of the fixes @iycheng made in #21232, which make tests pass with GCS bootstrapping by supporting both Redis and GCS addresses as the bootstrap address. The main change is to use address_info["address"] to obtain the bootstrap address to pass to ray.init(), instead of using address_info["redis_address"]. In a subsequent PR, address_info["address"] will return the Redis or GCS address depending on whether GCS is used to bootstrap.
2021-12-29 19:25:51 -07:00
Jiajun Yao
9776e21842
Revert "Round robin during spread scheduling (#19968)" (#21293)
This reverts commit 60388b2834.
2021-12-30 10:33:06 +09:00
mwtian
5377832383
[GCS][Bootstrap 1/n] Support bootstrapping with GCS in node.py (#21267) 2021-12-28 08:14:38 -07:00
Philipp Moritz
4b9e865fd7
Unskip remaining tests in test_basic.py on Windows (#21273) 2021-12-27 21:20:45 -08:00
Matti Picus
3de18d2ada
WINDOWS: enable passing/skipping tests (#21136) 2021-12-27 11:59:00 -08:00
Israël Hallé
59209d695b
Includes .pyi files in package data. (#21247) 2021-12-27 11:50:02 -08:00
Matti Picus
fcb952e1bc
WINDOWS: unskip passing runtime_env tests (#21252) 2021-12-26 20:49:02 -08:00
Akash Patel
cbcd03b779
Upgrade cython to 0.29.26 for py310 (#21244) 2021-12-26 20:26:08 -08:00
xwjiang2010
0b9cdb1eae
[tune] Have one canonical way of stopping trial. (#21021)
This PR introduces a canonical implementation for stopping trials by collecting the logic scattered across process_trial_result back into stop_trial. This way, we know what to expect (e.g. which callbacks are invoked and when).
This PR also corrects the current wrong behaviour in which the on_trial_complete callback is invoked before on_trial_checkpoint, which is the source of Syncer cleanup issues.
2021-12-25 10:13:30 +01:00
Gagandeep Singh
c5c5fec22b
Unskip test_standalone from ci.sh (#21235) 2021-12-25 00:21:58 -08:00
Yi Cheng
0d537c5d70
[5/gcs] Bootstrap default worker and update pubsub unit test (#21211)
This PR passes the GCS address to the worker and also updates the pubsub unit test.

Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>
Co-authored-by: Mingwei Tian <mwtian@anyscale.com>
2021-12-23 07:57:14 -07:00
Jiajun Yao
60388b2834
Round robin during spread scheduling (#19968) 2021-12-22 20:27:34 -08:00
Yi Cheng
11ab412db1
[4/gcs] Bootstrap global accessor from gcs (#21195)
This is part of the Redis removal. This PR enables the global accessor to start from GCS.

Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>
Co-authored-by: Mingwei Tian <mwtian@anyscale.com>
2021-12-22 01:27:25 -08:00
Gagandeep Singh
92bf609a08
Unskip tests in `test_basic_3.py` (#20433) 2021-12-22 00:09:32 -08:00
Yi Cheng
0c786b1109
[3/gcs] Bootstrap log monitor and monitor from gcs (#21194)
This is part of the Redis removal. This PR enables the log monitor and monitor to bootstrap from GCS.

Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>
Co-authored-by: Mingwei Tian <mwtian@anyscale.com>
2021-12-21 23:15:55 -08:00
Sidhartha Parhi
5d6409fe2e
[Train] Remove run_dir param from BackendExecutor (#21231)
The `run_dir` argument in `ray.train.backend.BackendExecutor.start_training` isn't used, but it causes the following error: if your host computer and job cluster use different operating systems, you get a pathlib error because, for example, you can't instantiate a `pathlib.WindowsPath` on a Linux system.
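
The underlying pathlib restriction can be reproduced with the standard library alone. This is a sketch of the general behaviour, not of the Ray Train code path, and the path used is arbitrary.

```python
import os
import pathlib


def can_instantiate_windows_path() -> bool:
    # Concrete WindowsPath objects can only be created on Windows;
    # elsewhere pathlib raises NotImplementedError (or a subclass of it).
    try:
        pathlib.WindowsPath("C:/tmp/run")
        return True
    except NotImplementedError:
        return False


# Pure paths do no filesystem access and work on every platform.
pure = pathlib.PureWindowsPath("C:/tmp/run")
```

On Linux or macOS, `can_instantiate_windows_path()` returns `False`, while `pure` can still be created and inspected.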

Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
2021-12-21 19:54:43 -08:00
Amog Kamsetty
57db4640ca
[Train] [Tune] Refactor MLflow (#20802)
Pulls out Tune's MLflow logging logic to a shared MLflow util.
Adds an MLflow logger callback to Ray Train.

Closes #20642
2021-12-21 17:17:52 -08:00
Yi Cheng
09421a4ca6
[2/gcs] Bootstrap dashboard for gcs ha (#21179)
This is part of the GCS HA project. This PR bootstraps the dashboard with the GCS address instead of Redis.

Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>
2021-12-21 16:58:03 -08:00
Eric Liang
1db03862a7
Isolate function exports by job in separate queues (#20882) 2021-12-21 16:19:00 -08:00
Gagandeep Singh
5dc0f90ada
[Windows] Unskipped tests in test_standalone.py (#21213) 2021-12-21 11:37:23 -08:00
Yi Cheng
f62faca04c
[1/gcs] gcs ha bootstrap for raylet (#21174)
This is part of #21129

This PR tries to cover the cpp/ray part of the bootstrap. Some updates there:

* Remove the unused function/tests
* Some API updates

Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>
2021-12-21 08:50:42 -08:00
SangBin Cho
5d3042ed9d
[Internal Observability] Record Raylet Gauge (#21049)
* Revert "[Please revert] Remove new metrics temporarily"

This reverts commit baf7846daa3d1dad50dbedac19b7afbae3e197fc.

* Addressed code review.

* [Please revert] Revert plasma stats for the next PR

* improve grammar

* Addressed code review v1.

* Addressed code review.

* Add code owner.

* Fix tests.

* Add code owner to metric_defs.cc
2021-12-21 00:34:48 -08:00
Dmitri Gekhtman
c9cf912a15
[autoscaler] Pass on provider.internal_ip() exceptions during scale down (#21204)
Treats failures of provider.internal_ip during node drain as non-fatal.
For example, if a node is deleted by a third party between the time it's scheduled for termination and drained, there will now be no error on GCP.
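
The non-fatal handling described above can be sketched as follows. `FakeProvider` and `ips_to_drain` are hypothetical stand-ins written for illustration; they are not the autoscaler's actual classes or functions.

```python
import logging

logger = logging.getLogger(__name__)


class FakeProvider:
    """Hypothetical stand-in for a node provider."""

    def __init__(self, ips):
        self._ips = ips

    def internal_ip(self, node_id):
        # Raises KeyError if the node no longer exists.
        return self._ips[node_id]


def ips_to_drain(provider, node_ids):
    ips = []
    for node_id in node_ids:
        try:
            ips.append(provider.internal_ip(node_id))
        except Exception:
            # The node may have been deleted by a third party;
            # log the failure and continue with the remaining nodes.
            logger.warning("Could not get internal IP for %s", node_id)
    return ips
```

A lookup failure for one node is logged and skipped, so the remaining nodes are still drained.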

Closes #21151
2021-12-20 22:23:17 -08:00
Qing Wang
94251fbcc4
[Core] Fix invalid to specify concurrency group at runtime. (#21191)
This fixes an issue where the concurrency group name of an actor task could not be specified at runtime with the following usage:
```python
a.f2.options(concurrency_group="compute").remote()
```
2021-12-21 10:47:47 +08:00
Linsong Chu
61bbecdb7d
[Workflow]add doc for metadata (#20156)
This PR adds documentation for Workflow Metadata, support for which was recently added in https://github.com/ray-project/ray/pull/19372.

Co-authored-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
2021-12-20 17:24:07 -08:00
Hankpipi
ae5bb34f60
[Serve Autoscaler] Raise warning if max_concurrent_queries < target_num_ongoing_requests (#21184) 2021-12-20 16:07:19 -08:00
iasoon
1c93beb490
[serve] use true nulls in snapshot (#21062) 2021-12-20 16:07:09 -08:00
architkulkarni
5cc1308c66
[runtime env] [doc] [test] Add docs and tests for RAY_runtime_env_skip_local_gc environment variable (#21163) 2021-12-20 10:34:59 -08:00
SangBin Cho
5959669a70
[Core] Remove task table. (#21188)
Remove task table that's not used anymore.
2021-12-20 06:22:01 -08:00
mwtian
06ec07057c
Revert "[Core] Unrevert #21115, fix auto address env (#21158)" (#21189)
This reverts commit 968f08607b.

It is breaking e2e tests where worker nodes cannot start. e.g.

```
Traceback (most recent call last):
  File "/home/ray/anaconda3/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/scripts/scripts.py", line 1961, in main
    return cli()
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/autoscaler/_private/cli_logger.py", line 808, in wrapper
    return f(*args, **kwargs)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/scripts/scripts.py", line 733, in start
    address_ip, password=redis_password)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/services.py", line 593, in create_redis_client
    _, redis_ip_address, redis_port = validate_bootstrap_address(redis_address)
  File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/_private/services.py", line 494, in validate_bootstrap_address
    raise ValueError("Malformed address. Expected '<host>:<port>'.")
ValueError: Malformed address. Expected '<host>:<port>'.
```
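
The final `ValueError` in the traceback comes from a check that the address has the form `<host>:<port>`. A minimal sketch of such a check is shown below; `split_host_port` is a hypothetical helper and not the implementation of `validate_bootstrap_address` in `ray/_private/services.py`.

```python
def split_host_port(address: str):
    # rpartition splits on the last colon, so the port is the final field.
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("Malformed address. Expected '<host>:<port>'.")
    return host, int(port)
```

Any value without a numeric port, such as a bare hostname, is rejected by a check of this kind.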
2021-12-20 00:22:12 -08:00
Oliver Mannion
8d9e0fca61
fix: data not exported (#20887)
* fix: data not exported

* empty commit
2021-12-18 22:33:34 -08:00
Clark Zinzow
968f08607b
[Core] Unrevert #21115, fix auto address env (#21158)
This PR unreverts #21115, fixing the handling of an `"auto"` address in the `RAY_ADDRESS` environment variable.

Co-authored-by: Mingwei Tian <mwtian@anyscale.com>
2021-12-18 07:45:00 -08:00
Jun Gong
c98d4fe2f3
[ci] Change build-wheel-macos-arm64.sh to be executable. (#21164)
This allows the script to be executed directly. All the other build-wheels-xxx.sh scripts are already executable.
2021-12-17 17:23:10 -08:00
Clark Zinzow
c3d68fa0c1
[Dask-on-Ray] Add Dask config helper, set task-based shuffle by default. (#21114)
Dask defaults to a disk-based shuffle even though we're using a distributed scheduler, which appears to result in dropped data since the filesystem isn't shared across nodes. Dask Distributed manually sets the shuffle algorithm in the global config to the task-based shuffle, which the Dask-on-Ray scheduler should probably do as well.

This PR adds a Dask config helper, `enable_dask_on_ray`, that sets Dask-on-Ray as the default scheduler along with changing the default shuffle to a task-based shuffle. The shuffle method can still be overridden by the user by manually specifying `df.set_index(shuffle="disk")`.
2021-12-17 13:16:37 -08:00
Chen Shen
d99f699e3d
Revert "[Core][GCS] Use port and address flags to configure GCS server / client in GCS bootstrapping mode (#21115)" (#21157)
This reverts commit 0e7c0b491b.
2021-12-17 11:48:40 -08:00
xwjiang2010
ce81ad21f3
Revert "[tune] Elongate test_trial_scheduler_pbt timeout. (#21120)" (#21155) 2021-12-17 11:32:00 -08:00
Gagandeep Singh
14fc023cb6
Bump timeout value for test_worker_capping.py::test_zero_cpu_scheduling (#21035) 2021-12-17 10:51:54 -08:00
Hankpipi
04ecdee9db
[Serve] Fix serve metrics test (#21140) 2021-12-17 10:23:17 -08:00
shrekris-anyscale
7e15a8199e
[Serve] Reduce test_cluster flakiness by increasing timeout (#21146) 2021-12-17 10:22:56 -08:00
mwtian
0e7c0b491b
[Core][GCS] Use port and address flags to configure GCS server / client in GCS bootstrapping mode (#21115)
This change adds support for parsing `--address` as bootstrap address, and treating `--port` as GCS port, when using GCS for bootstrapping.

Not launching Redis in GCS bootstrapping mode, and using GCS to fetch initial cluster information, will be implemented in a subsequent change.

Also made some cleanups.
2021-12-16 15:11:05 -08:00