Sven Mika
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). ( #19693 )
2021-10-25 15:00:00 +02:00
Kai Fricke
6e455e59d8
[tune] Verbosely/gracefully handle empty experiment checkpoints ( #19641 )
2021-10-25 13:41:18 +01:00
Kai Fricke
0cfa267fde
[tune] Fix shim error message for scheduler ( #19642 )
2021-10-25 11:16:16 +01:00
gjoliver
89fbfc00f8
[RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). ( #19623 )
2021-10-25 09:42:39 +02:00
roireshef
9b0352f363
[RLlib] Added LearningRateSchedule and EntropyCoeffSchedule to TF and Torch versions of A3C and PPO ( #19276 )
2021-10-25 09:39:35 +02:00
gjoliver
c3c42278e4
[RLlib] clean up all the SampleBatch['is_training'] deprecation warnings ( #19652 )
...
* [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings.
* wip
2021-10-25 09:38:56 +02:00
Renos Zabounidis
41dd037ae9
[RLlib; Docs] Correcting documentation with respect to postprocess_trajectory ( #19672 )
...
postprocess_trajectory is referred to incorrectly in the rllib-environments documentation. When defining a custom policy, a user never directly modifies Policy.postprocess_trajectory; instead they define postprocess_fn, which is in turn called by postprocess_trajectory.
2021-10-25 09:37:58 +02:00
Jiajun Yao
f6a0165286
Add dependabot for data processing ( #19682 )
2021-10-24 20:49:43 -07:00
SangBin Cho
aa9eb6499c
[Test] skip pg restart test ( #19670 )
2021-10-24 16:53:29 -07:00
Philipp Moritz
22eef65134
[Windows] Suppress 'Windows fatal exception: access violation' ( #19561 )
...
* suppress 'Windows fatal exception: access violation'
* lint
* update
* Update python/ray/_private/log_monitor.py
Co-authored-by: Matti Picus <matti.picus@gmail.com>
* fixE
* re-introduce mattip's fix again
* update
* .
Co-authored-by: Matti Picus <matti.picus@gmail.com>
Co-authored-by: Alex <alex@anyscale.com>
2021-10-24 11:23:23 -07:00
dependabot[bot]
0cd05403b0
Bump pillow from 7.2.0 to 8.3.2 in /doc ( #18422 )
...
Bumps [pillow](https://github.com/python-pillow/Pillow ) from 7.2.0 to 8.3.2.
- [Release notes](https://github.com/python-pillow/Pillow/releases )
- [Changelog](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst )
- [Commits](https://github.com/python-pillow/Pillow/compare/7.2.0...8.3.2 )
---
updated-dependencies:
- dependency-name: pillow
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-23 18:36:14 -07:00
dependabot[bot]
5ed1530170
[tune](deps): Bump starlette in /python/requirements/ml ( #18691 )
...
Bumps [starlette](https://github.com/encode/starlette ) from 0.14.2 to 0.16.0.
- [Release notes](https://github.com/encode/starlette/releases )
- [Changelog](https://github.com/encode/starlette/blob/master/docs/release-notes.md )
- [Commits](https://github.com/encode/starlette/compare/0.14.2...0.16.0 )
---
updated-dependencies:
- dependency-name: starlette
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-23 18:34:14 -07:00
dependabot[bot]
55ab8da3c8
[tune](deps): Bump accelerate in /python/requirements/ml ( #19057 )
...
Bumps [accelerate](https://github.com/huggingface/accelerate ) from 0.3.0 to 0.5.1.
- [Release notes](https://github.com/huggingface/accelerate/releases )
- [Commits](https://github.com/huggingface/accelerate/compare/v0.3.0...v0.5.1 )
---
updated-dependencies:
- dependency-name: accelerate
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-23 18:33:01 -07:00
dependabot[bot]
9201687b34
[tune](deps): Bump pytorch-lightning in /python/requirements/ml ( #19059 )
...
Bumps [pytorch-lightning](https://github.com/PyTorchLightning/pytorch-lightning ) from 1.4.5 to 1.4.9.
- [Release notes](https://github.com/PyTorchLightning/pytorch-lightning/releases )
- [Changelog](https://github.com/PyTorchLightning/pytorch-lightning/blob/master/CHANGELOG.md )
- [Commits](https://github.com/PyTorchLightning/pytorch-lightning/compare/1.4.5...1.4.9 )
---
updated-dependencies:
- dependency-name: pytorch-lightning
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-23 18:30:59 -07:00
dependabot[bot]
bea802cb80
[tune](deps): Bump wandb in /python/requirements/ml ( #19646 )
...
Bumps [wandb](https://github.com/wandb/client ) from 0.10.29 to 0.12.5.
- [Release notes](https://github.com/wandb/client/releases )
- [Changelog](https://github.com/wandb/client/blob/master/CHANGELOG.md )
- [Commits](https://github.com/wandb/client/compare/v0.10.29...v0.12.5 )
---
updated-dependencies:
- dependency-name: wandb
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-10-23 18:23:52 -07:00
Amog Kamsetty
97878162f4
[ActorGroup] Retry ActorGroup ( #19658 )
2021-10-23 16:37:29 -05:00
Edward Oakes
445fb0ee99
[runtime_env] Deflake test_runtime_env_working_dir.py ( #19665 )
2021-10-23 16:35:42 -05:00
Eric Liang
875d19f838
[data] Fix inconsistent naming of to_refs() methods, remove to_arrow() ( #19620 )
2021-10-23 12:20:23 -07:00
Jiao
e53fecfbd5
[jobs] Initial http jobs server on head node ( #19657 )
2021-10-23 12:48:16 -05:00
mwtian
d656b3a6d7
[Doc] Update instruction on starting Ray cluster for Ray client ( #19653 )
2021-10-22 19:14:07 -07:00
Philipp Moritz
dbd61b9e6b
[Dashboard] Include the dashboard in Windows wheels ( #19575 )
2021-10-22 17:57:36 -07:00
Jiajun Yao
a7b219fea1
[Core] Don't unpickle and run functions exported by other jobs ( #19576 )
2021-10-22 17:13:20 -07:00
Gagandeep Singh
358aa57474
Fixed usage of `cv_.wait_for` ( #19582 )
...
* Fixed usage of cv.wait_for
* Changed method to calculate remaining time out
* Modify timeout_ms -> remaining_timeout_ms
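The bullets above describe waiting on a condition variable while tracking how much of the timeout is left. The original change is in C++ and its code is not shown here; the following is only a Python analogue of that general pattern, assuming a fixed deadline and a caller-supplied predicate. The names `wait_until`, `predicate`, and `remaining_timeout_s` are hypothetical and chosen for illustration.

```python
import threading
import time


def wait_until(cond, predicate, timeout_s):
    """Wait on `cond` until `predicate()` is true or `timeout_s` has elapsed.

    Hypothetical helper, not code from the commit. Returns the final
    value of `predicate()`.
    """
    # Fix the deadline once so repeated wake-ups cannot extend the total wait.
    deadline = time.monotonic() + timeout_s
    with cond:
        while not predicate():
            remaining_timeout_s = deadline - time.monotonic()
            if remaining_timeout_s <= 0:
                break
            # A wake-up may be unrelated to the predicate, so wait only
            # for the time that is still left, not the full timeout again.
            cond.wait(remaining_timeout_s)
        return predicate()
```

Passing the full timeout to every wait inside the loop would let each wake-up restart the clock; recomputing the remainder against a single deadline bounds the total wait.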
2021-10-22 16:23:13 -07:00
Edward Oakes
b4673daac6
[ray client] Add test that ray.init doesn't require resources to connect ( #19635 )
2021-10-22 18:21:53 -05:00
Alex Wu
31d89be926
[Workflow] Basic event support ( #19239 )
...
* basics
* .
* .
* a test
* a test
* tests
* cleanup
* concepts page
* docs
* polish
* fix sleep
* fix yi things
* lint
* fix
* .
* .
* .
* fix?
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-10-22 15:27:33 -07:00
Edward Oakes
c9258aff0f
Revert "[ActorGroup] Add ActorGroup ( #18960 )" ( #19655 )
...
This reverts commit 4f05bac8fb.
2021-10-22 14:55:17 -07:00
shrekris-anyscale
cfae64ebe8
[multiprocessing] Modify Ray's map_async() to match Multiprocessing's map_async() behavior ( #19403 )
2021-10-22 16:31:34 -05:00
Gagandeep Singh
2f8da8f8c8
Bumped timeout due to slow test times in Windows ( #19595 )
2021-10-22 13:48:15 -07:00
Jiao
f0be4cb390
[jobs] Add job manager class for simple jobs python APIs ( #19567 )
2021-10-22 14:18:11 -05:00
Jiajun Yao
43b8f8e522
Revert "Revert "[Test] Fix flaky test_gpu test ( #19524 )" ( #19562 )" ( #19643 )
...
This reverts commit 7daf28f348.
2021-10-22 11:48:57 -07:00
Yi Cheng
48fb86a978
[core] Fix the spilling back failure in case of node missing ( #19564 )
...
## Why are these changes needed?
When Ray spills back, it checks through the GCS whether the node exists, so there is a race condition and the raylet sometimes crashes because of it.
This PR filters out nodes that are not available when selecting the node.
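As a rough illustration of the idea described above (skip candidates that are no longer available instead of failing on them), here is a minimal Python sketch. It is not the raylet's C++ code; `select_available_node`, `candidate_node_ids`, and `alive_node_ids` are hypothetical names, and the real selection policy is not shown in this log.

```python
def select_available_node(candidate_node_ids, alive_node_ids):
    """Return the first candidate that is still alive, or None.

    Hypothetical helper: candidates are assumed to be ordered by preference,
    and `alive_node_ids` is assumed to be the current view of live nodes.
    """
    alive = set(alive_node_ids)
    for node_id in candidate_node_ids:
        # A node may have disappeared since the candidate list was built,
        # so filter it out here rather than crash later on a missing node.
        if node_id in alive:
            return node_id
    return None
```

Returning `None` leaves it to the caller to decide what to do when no candidate remains.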
## Related issue number
#19438
2021-10-22 11:22:07 -07:00
mwtian
530f2d7c5e
[Pubsub] Wrap Redis-based publisher in GCS to allow incrementally switching to the GCS-based publisher ( #19600 )
...
## Why are these changes needed?
The most significant change of the PR is the `GcsPublisher` wrapper added to `src/ray/gcs/pubsub/gcs_pub_sub.h`. It forwards publishing to the underlying `GcsPubSub` (Redis-based) or `pubsub::Publisher` (GCS-based) depending on the migration status, so it allows incremental migration by channel.
- Since it was decided that we want to use typed ID and messages for GCS-based publishing, each member function of `GcsPublisher` accepts a typed message.
Most of the modified files are from migrating publishing logic in GCS to use `GcsPublisher` instead of `GcsPubSub`.
Later on, `GcsPublisher` member functions will be migrated to use GCS-based publishing.
This change should make no functional difference. If this looks OK, a similar change will be made for subscribers in the GCS client.
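The wrapper described above can be pictured as a thin object that forwards each publish call to one of two backends, chosen per channel. The sketch below is a Python analogue under stated assumptions, not the C++ `GcsPublisher` interface: `RecordingPublisher`, `PublisherWrapper`, `migrate`, and the channel names are all hypothetical.

```python
class RecordingPublisher:
    """Hypothetical stand-in for a publisher backend; records what it is sent."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def publish(self, channel, key, message):
        self.sent.append((channel, key, message))


class PublisherWrapper:
    """Forwards to the legacy backend unless the channel has been migrated."""

    def __init__(self, legacy, replacement, migrated_channels=()):
        self._legacy = legacy
        self._replacement = replacement
        self._migrated = set(migrated_channels)

    def migrate(self, channel):
        # Switch a single channel to the new backend; others are untouched.
        self._migrated.add(channel)

    def publish(self, channel, key, message):
        backend = self._replacement if channel in self._migrated else self._legacy
        backend.publish(channel, key, message)


legacy = RecordingPublisher("legacy")
replacement = RecordingPublisher("replacement")
publisher = PublisherWrapper(legacy, replacement)
publisher.publish("actor", "a1", "created")   # goes to the legacy backend
publisher.migrate("actor")
publisher.publish("actor", "a1", "dead")      # goes to the replacement backend
```

Because callers only see the wrapper, channels can be moved one at a time without touching call sites.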
## Related issue number
2021-10-22 10:52:36 -07:00
Edward Oakes
0760fe869d
[runtime_env] Clean up working dir tests, add more test cases ( #19597 )
2021-10-22 12:35:27 -05:00
Amog Kamsetty
4f05bac8fb
[ActorGroup] Add ActorGroup ( #18960 )
...
* move
* fix
* Revert "fix"
This reverts commit 532660fc334ae96a0ff34c8ab1288488312300a3.
* Revert "move"
This reverts commit 54321f4a539c2ee873f17d988da5627588aeff97.
* add
* wip
* wip
* wip
* wip
* address comments
* wip
* add to build
* fix
* fix
* fix
2021-10-22 10:22:31 -07:00
Simon Mo
1eb142b57c
[Serve] Fix shutdown protocol again ( #19609 )
2021-10-22 09:27:32 -07:00
Jiajun Yao
256bf0bf3a
[Release] Bump up dask to latest compatible version 2021.9.1 ( #19592 )
...
* Bump up dask to latest compatible version 2021.9.1
2021-10-22 09:16:28 -07:00
architkulkarni
030acf3857
[Serve] [Serve Autoscaler] Add upscale and downscale delay ( #19290 )
2021-10-22 10:33:28 -05:00
xwjiang2010
a632cb439f
[Tune] Remove queue_trials. ( #19472 )
2021-10-22 09:24:54 +01:00
Qing Wang
580b58a68f
[Java] Update CodeOwners for Java worker. ( #19594 )
...
Some pom.xml files were removed earlier, so this updates CodeOwners accordingly.
2021-10-22 16:17:05 +08:00
Stephanie Wang
499d6e9fc1
Turn on reconstruction tests in CI ( #19497 )
2021-10-21 22:34:44 -07:00
Eric Liang
50e305e799
[data] Add take_all() and raise error if to_pandas() drops records ( #19619 )
2021-10-21 22:23:50 -07:00
Yi Cheng
59b2f1f3f2
[gcs] Update select nodes to save cpu utilization ( #19608 )
...
## Why are these changes needed?
Recently we found that the GCS uses a lot of CPU when scheduling actors, because the code is not well organized. This PR improves the SelectNodes function. Profiling of the many-nodes actor test shows that 50% of CPU is wasted and could be saved here.
## Related issue number
2021-10-21 22:15:17 -07:00
SangBin Cho
9a050c666d
[Test] Add a stronger resource leak check to pg unit tests. ( #19586 )
...
* Add a stronger check to unit tests.
* .
2021-10-21 21:40:00 -07:00
Edward Oakes
11b6019fb5
[ray client] Fix connecting to a cluster without available CPUs ( #19604 )
2021-10-21 21:21:50 -05:00
Jiajun Yao
920384f34e
[Doc] Fix Dataset __annotations__ ( #19599 )
2021-10-21 17:33:55 -07:00
SangBin Cho
cea7fda41a
Revert "Revert "[Dashboard] Disable unnecessary event messages. ( #19490 )" ( #19574 )" ( #19577 )
...
This reverts commit 699c5aeac6.
2021-10-21 15:36:22 -07:00
SangBin Cho
19e3280824
[Core] Fix shutdown Core worker crash when pg is removed. ( #19549 )
...
* fix core worker crash
* remove file
* done
2021-10-21 14:30:54 -07:00
Simon Mo
30d9f8fbae
[Doc] [Serve] Fix code cutoff and broken links in deployment.rst ( #19573 )
2021-10-21 13:47:55 -07:00
Simon Mo
03805d4064
[Serve] Good error message when Serve not installed and ensure Serve installs ray[default] ( #19570 )
2021-10-21 13:47:29 -07:00
Simon Mo
32e648e5fa
[Serve][Doc] Add Failure Recovery Doc ( #19166 )
2021-10-21 13:32:42 -07:00