Raphael CHEN
343ebf8ea7
[tune] Checkpoint according to nested metric ( #14379 )
2021-03-01 17:14:39 +01:00
Qing Wang
f7f64e90ed
[Minor] Remove unused field. ( #14382 )
...
Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
2021-03-01 19:35:28 +08:00
dependabot[bot]
cda4ad044a
[tune](deps): Bump mlflow from 1.13.1 to 1.14.0 in /python/requirements ( #14396 )
...
Bumps [mlflow](https://github.com/mlflow/mlflow ) from 1.13.1 to 1.14.0.
- [Release notes](https://github.com/mlflow/mlflow/releases )
- [Changelog](https://github.com/mlflow/mlflow/blob/master/CHANGELOG.rst )
- [Commits](https://github.com/mlflow/mlflow/compare/v1.13.1...v1.14.0 )
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-03-01 12:28:15 +01:00
dependabot[bot]
c925e8d14c
[tune](deps): Bump ax-platform in /python/requirements ( #14398 )
...
Bumps [ax-platform](https://github.com/facebook/Ax ) from 0.1.19 to 0.1.20.
- [Release notes](https://github.com/facebook/Ax/releases )
- [Changelog](https://github.com/facebook/Ax/blob/master/CHANGELOG.md )
- [Commits](https://github.com/facebook/Ax/compare/0.1.19...v0.1.20 )
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-03-01 12:27:45 +01:00
Kai Fricke
7f9340bb2f
[tune] Add leading zeros to checkpoint directory ( #14152 )
...
* [tune] Add leading zeros to checkpoint directory
* Fix exp analysis tests/support string indices
* Fix tests
* RLLib tests
2021-03-01 12:12:19 +01:00
Kai Fricke
8572774304
[tune] Lookup flat key first before trying to split ( #14388 )
2021-03-01 12:11:03 +01:00
qicosmos
277b6f5d3c
Support arbitrary arguments for c++ worker normal tasks and actor tasks ( #14233 )
2021-03-01 16:27:03 +08:00
niole
be9a584a94
[Docs] Remove version reference in dashboard proxy docs ( #14359 )
2021-02-27 21:06:25 -08:00
Kai Yang
e0e8918d60
[Core] Raylet to pick the node manager port ( #14349 )
2021-02-27 20:27:09 +08:00
Ian Rodney
8cfaea5fc5
[Docker] Make Docker Build Python file easier to use! ( #14223 )
2021-02-26 15:23:02 -08:00
Kai Fricke
b1d0aa9798
Add unit test for ray cluster-dump ( #14389 )
2021-02-26 14:40:09 -08:00
architkulkarni
f9364b1d5c
[Serve] Add logger with backend and replica tags ( #14251 )
2021-02-26 12:46:19 -08:00
SangBin Cho
2b5b0dd3fc
[Core] Fix the issue with duplicated args ( #14329 )
2021-02-26 12:42:58 -08:00
Clark Zinzow
17ae694405
Consolidate Bazel build and test action_env configs to prevent analysis cache discarding. ( #14362 )
2021-02-26 11:14:02 -08:00
Clark Zinzow
6b37720c6a
[Core] Locality-aware leasing: Milestone 4 - Borrowed refs. ( #14296 )
...
* Adds locality-aware leasing for borrowed refs.
* Added tests.
2021-02-26 10:36:12 -08:00
Simon Mo
af085ed8aa
[Serve] Add Perf Tuning Doc ( #14334 )
...
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: architkulkarni <architkulkarni@users.noreply.github.com>
2021-02-26 10:28:02 -08:00
Ian Rodney
e1117ebc8d
[Autoscaler] Fix GCP User Inconsistency ( #14364 )
2021-02-26 10:12:46 -08:00
Amog Kamsetty
09bfcb2a0a
make experiment name configurable ( #14373 )
2021-02-26 08:45:52 -08:00
Raphael CHEN
8cedd16f44
[tune] Correctly validate nested metrics ( #14375 )
...
* [tune] Correctly validate nested metrics
Before:
- Nested metrics couldn't pass validation process, since the nested result was used to validate metrics
After:
- Flattened result is used to validate metrics
* Fix BO test and lint
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-02-26 14:00:06 +01:00
Kai Fricke
4014168928
[tune] Introduce `durable()` wrapper to convert trainables into durable trainables ( #14306 )
...
* [tune] Introduce `durable()` wrapper to convert trainables into durable trainables
* Fix wrong check
* Improve docs, add FAQ for tackling overhead
* Fix bugs in `tune.with_parameters`
* Update doc/source/tune/api_docs/trainable.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/tune/_tutorials/_faq.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-26 13:59:28 +01:00
Simon Mo
f1c8c8d12f
Bump protobuf to the latest version ( #14365 )
2021-02-25 20:59:18 -08:00
Richard Liaw
3e9ff91218
Revert the reverted heartbeat factor PR (check windows build) ( #14341 )
2021-02-25 20:52:12 -08:00
Clark Zinzow
b844548b57
[dask-on-ray] Adds support for dask.persist() with inlined Ray futures. ( #14294 )
...
* Adds support for dask.persist() with inlined Ray futures.
* Update persist test.
* Add patched dask.persist() documentation.
2021-02-25 17:48:47 -08:00
Xianyang Liu
34a9714dda
[docker] Fix docker 'development' build failure ( #13289 )
2021-02-25 14:57:30 -08:00
Richard Liaw
a2d2275ee1
Revert "[RLlib + Tune] Add placement group support to RLlib. ( #14289 )" ( #14360 )
...
This reverts commit 6cd0cd3bd9.
2021-02-25 14:27:35 -08:00
Sven Mika
4cd5c1da2c
[RLlib] Remove flaky test case for mixed (tf+torch) policies trainer. ( #14357 )
2021-02-25 14:07:05 -08:00
architkulkarni
ba4b7ccfe8
[Serve] [Doc] Add basic Serve tutorial ( #14256 )
2021-02-25 14:10:08 -06:00
Guy Khazma
e3f3269b15
[doc] Fixes to RayDP docs ( #14309 )
...
* minor fix to raydp docs
* fix pytorch and tensorflow samples
* fix: minor fixes
2021-02-25 11:23:10 -08:00
Sven Mika
6cd0cd3bd9
[RLlib + Tune] Add placement group support to RLlib. ( #14289 )
2021-02-25 16:01:31 +01:00
Sven Mika
8000258333
[RLlib] R2D2 Implementation. ( #13933 )
2021-02-25 12:18:11 +01:00
SangBin Cho
4357055305
[Shuffle] Emulate multi node in shuffle.py ( #14331 )
...
* done.
* Formatting.
* done.
* Addressed code review.
* Addressed code review 2.
2021-02-24 23:49:29 -08:00
Kai Fricke
d9e5d5f47a
[RLlib] Cast fcnet_hiddens to list for DQN models (list vs tuple mismatch error) ( #14308 )
2021-02-25 08:06:08 +01:00
Eric Liang
adbdacae58
add more io workers ( #14330 )
2021-02-24 22:00:31 -08:00
Clark Zinzow
c1a1be1da6
[Core] Locality-aware leasing: Milestone 2 - Owned refs, cached locations ( #14282 )
...
* Adds locality-aware leasing for cached owned refs.
* Add tests for locality-aware leasing on cached owned refs.
2021-02-24 21:24:10 -08:00
Hao Zhang
11e721c9b3
[Collective] Address some comments and minor updates before merging multistream ( #14302 )
2021-02-24 20:43:42 -08:00
Kathryn Zhou
456d9aab47
Add Cypress test for Ray Dashboard ( #14253 )
2021-02-24 20:41:52 -08:00
Richard Liaw
80657e5dfe
Revert "[Core]Pull off timers out of heartbeat in raylet ( #13963 )" ( #14319 )
2021-02-24 19:44:31 -08:00
ZhuSenlin
be28e8fae4
use iterator to instead of operator[] to avoid garbage ( #14275 )
2021-02-25 11:37:36 +08:00
niole
488f63efe3
[Dashboard] Make requests sent by the dashboard reverse proxy compatible ( #14012 )
2021-02-24 18:31:59 -08:00
architkulkarni
ef96193b8b
fix servehandle docstring for sync/async ( #14312 )
2021-02-24 16:41:15 -08:00
Kai Fricke
021ed92e8a
Add debug_state.txt to cluster dump ( #14310 )
2021-02-24 22:47:26 +01:00
dependabot[bot]
aa36a6622d
[tune](deps): Bump xgboost in /python/requirements ( #14225 )
...
Bumps [xgboost](https://github.com/dmlc/xgboost ) from 1.3.0.post0 to 1.3.3.
- [Release notes](https://github.com/dmlc/xgboost/releases )
- [Changelog](https://github.com/dmlc/xgboost/blob/master/NEWS.md )
- [Commits](https://github.com/dmlc/xgboost/commits/v1.3.3 )
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-02-24 13:43:19 -08:00
Richard Liaw
4dd5c9e541
[tune] fix placement group timeout ( #14313 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-24 13:35:13 -08:00
Richard Liaw
fd128a4533
disable object-spilling test ( #14318 )
2021-02-24 12:22:25 -08:00
Clark Zinzow
c867054f0c
Skip GCS fault-tolerance test on Windows. ( #14311 )
2021-02-24 11:44:41 -08:00
Eric Liang
4bae0c9228
[client] Allow ignoring version mismatch with env var for debugging ( #14295 )
2021-02-24 11:36:16 -08:00
Ameer Haj Ali
5155673404
set STATUS_UNINITIALIZED TAG launching head node ( #14293 )
...
* prepare for head node
* move command runner interface outside _private
* remove space
* Eric
* flake
* min_workers in multi node type
* fixing edge cases
* eric not idle
* fix target_workers to consider min_workers of node types
* idle timeout
* minor
* minor fix
* test
* lint
* eric v2
* eric 3
* min_workers constraint before bin packing
* Update resource_demand_scheduler.py
* Revert "Update resource_demand_scheduler.py"
This reverts commit 818a63a2c86d8437b3ef21c5035d701c1d1127b5.
* reducing diff
* make get_nodes_to_launch return a dict
* merge
* weird merge fix
* auto fill instance types for AWS
* Alex/Eric
* Update doc/source/cluster/autoscaling.rst
* merge autofill and input from user
* logger.exception
* make the yaml use the default autofill
* docs Eric
* remove test_autoscaler_yaml from windows tests
* lets try changing the test a bit
* return test
* lets see
* edward
* Limit max launch concurrency
* commenting frac TODO
* move to resource demand scheduler
* use STATUS UP TO DATE
* Eric
* make logger of gc freed refs debug instead of info
* add cluster name to docker mount prefix directory
* grrR
* fix tests
* moving docker directory to sdk
* move the import to prevent circular dependency
* smallf fix
* ian
* fix max launch concurrency bug to assume failing nodes as pending and consider only load_metric's connected nodes as running
* small fix
* huh?
* set initialized status for head when launching head node
* test
* patch
* fix lint
Co-authored-by: Ameer Haj Ali <ameerhajali@ameers-mbp.lan>
Co-authored-by: Alex Wu <alex@anyscale.io>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2021-02-24 18:34:05 +02:00
dependabot[bot]
94d9e0f35d
[tune](deps): Bump torchvision from 0.8.1 to 0.8.2 in /python/requirements ( #14226 )
...
* [tune](deps): Bump torchvision in /python/requirements
Bumps [torchvision](https://github.com/pytorch/vision ) from 0.8.1 to 0.8.2.
- [Release notes](https://github.com/pytorch/vision/releases )
- [Commits](https://github.com/pytorch/vision/compare/v0.8.1...v0.8.2 )
Signed-off-by: dependabot[bot] <support@github.com>
* Update requirements_tune.txt
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2021-02-24 16:36:12 +01:00
fangfengbin
482a00278b
[GCS]Fix flaky testcase: ServiceBasedGcsClientTest ( #14248 )
2021-02-24 20:35:30 +08:00
Tao Wang
6af0291347
[Core]Pull off timers out of heartbeat in raylet ( #13963 )
2021-02-24 11:59:13 +08:00