Sven Mika
7318439c3d
[RLlib] DQN native_ratio (for training intensity) incorrect (discussion 1763). ( #15436 )
...
Thanks @Manuscrit !
2021-04-22 11:06:29 +02:00
Jialing He
5403021430
Fix incorrect call function WorkerID::FromBinary ( #15449 )
2021-04-22 15:44:49 +08:00
Ian Rodney
810a02b3f2
[Azure][Autoscaler] Allow current user to use Docker ( #15380 )
2021-04-22 00:30:30 -07:00
Ameer Haj Ali
978199ceba
[autoscaler] Update azure pip packages in the cluster yaml ( #15274 )
2021-04-22 08:23:05 +03:00
Alex Wu
ede377bc26
ray health-check ( #15429 )
...
* .
* done?
* .
* .
* less yelling
* fixed?
* lint
* skip on windows
* remove extra print
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-04-21 21:49:55 -07:00
Edward Oakes
71a670c471
[serve] Make fastapi wrapper a normal serve backend ( #15441 )
2021-04-21 16:06:33 -05:00
Yi Cheng
0fa6bae104
[dev] Enable gitpod ( #15420 )
2021-04-21 13:26:46 -07:00
Yi Cheng
b63e493c04
[runtime_env] Fix some bugs related to runtime_env ( #15286 )
2021-04-21 13:31:21 -05:00
lanlin
c7f6ffb70c
[Tune] Fix max len trial name ( #15293 )
...
* check TUNE_MAX_LEN_IDENTIFIER when use it
* fix format
2021-04-21 10:48:24 -07:00
Fabien Couthouis
fe06642df0
[RLlib] Report mean losses instead of sum in IMPALA (discussion 1709) ( #15427 )
2021-04-21 10:59:06 +02:00
Frank Luan
7ff436e1f3
Fix restore_spilled_objects() for external object spilling ( #15426 )
...
* Fix deserializer in metrics.Counter
* Fix restore_spilled_objects() for external object spilling
2021-04-20 16:33:44 -07:00
Yi Cheng
dbba3a456f
[core] Fixing of actor creation failure ( #15411 )
...
* Fix
* fix
* format
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* format
* fix comments
2021-04-20 15:27:45 -07:00
Kai Fricke
d7e31c0d13
[tune] Return normalized checkpoint path ( #15296 )
...
* Return normalized checkpoint path
* Lint
2021-04-20 13:36:40 -07:00
Yi Cheng
9b3ea7c32b
[core] Take care of object spilling failure ( #14703 )
...
* fix spilling failure
* format
* unittests added
* format
* format
* format
* fix
* add comment
* fix some comments
* add test cases
* format
* format
2021-04-20 10:28:48 -07:00
Eric Liang
a482034916
Flaky test builder for tests tagged "flaky" ( #15408 )
2021-04-20 00:19:07 -07:00
Sven Mika
7ff27dfe07
[RLlib] Remove atari dependency for RLlib (in favor of detailed error message). ( #15292 )
2021-04-20 08:46:58 +02:00
Sven Mika
41968512ca
[RLlib] Partial GPU examples (for learner and workers). ( #15334 )
2021-04-20 08:46:05 +02:00
architkulkarni
3bda2812fa
[Serve] Remove old ImportedBackend factory ( #15376 )
2021-04-19 16:25:59 -07:00
Edward Oakes
fbe510cd47
[serve] Clean up route prefixing behavior for deployments ( #15193 )
2021-04-19 12:50:46 -05:00
fangfengbin
ade684ac03
[Test] Fix gcs flaky testcase ( #15391 )
...
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2021-04-19 10:21:39 -07:00
Jiaxin Shan
86468ce59f
[kubernetes] Remove unrelated fields in manifest file ( #15243 )
2021-04-19 10:54:33 -05:00
DK.Pino
b0a813baad
[Placement Group] Fix PlacementGroup ready when specify memory resource ( #15189 )
...
* fix placement group ready when memory specified
* lint
* add memory resource check in suppressed
* fix lint
* update comment
* fix lint
* delete unrelated code
* update comment
* lint
* fix ut
2021-04-17 22:21:05 -07:00
Alex Wu
805b8a10a3
Move scalability envelope back down to 250 nodes ( #15381 )
...
* .
* done?
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-04-16 19:39:24 -07:00
SangBin Cho
5f74d0e40d
[Test] Fix flaky test failure ( #15326 )
...
* Fix trial.
* unskip test.
* Mock commit
2021-04-16 18:09:02 -07:00
Dmitri Gekhtman
e6864523cf
[autoscaler] Do not divide by zero in resource demand scheduler ( #15323 )
...
* Do not divide by zero
* Don't take min or mean of an empty list
* max workers 0 for head node in distributed benchmark
* test
* Correct the type annotation
* comment grammar tweak
* message
* docs
* test
* Move test cli to large tests.
2021-04-16 10:20:05 -07:00
Edward Oakes
822a83055e
[Buildkite] split up some tune and rllib tests ( #15343 )
2021-04-16 10:16:12 -07:00
Risto Vuorio
dcda4a3d60
[tune] escaping paths before globbing in TrainableUtil.get_checkpoints_paths ( #15368 )
...
* Fixes 15367 by escaping paths before globbing in TrainableUtil.get_checkpoints_paths
* Adds a test testGetTrialCheckpointsPathsByPathWithSpecialCharacters for fix_15367
2021-04-16 09:41:02 -07:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. ( #14709 )
2021-04-16 09:16:24 +02:00
fangfengbin
0e3bbbeba3
[Test] Try deflaking gcs server test by adding log ( #15332 )
...
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2021-04-15 21:16:09 -07:00
Richard Liaw
dc80d9f42a
[flaky] fix mnist ptl data cache ( #15344 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-04-15 16:24:17 -07:00
SangBin Cho
a54d69f535
[Test] Split long runtime env tests. ( #15340 )
...
* [Test] Split long runtime env tests.
* Addressed code review.
2021-04-15 14:28:28 -07:00
SangBin Cho
1d87e4447d
[Test] increase the test size of test io that consistently times out ( #15341 )
2021-04-15 14:02:41 -07:00
Siyuan (Ryans) Zhuang
4de1f35b3e
run_function_on_all_workers only once in the driver ( #15203 )
2021-04-15 13:58:36 -07:00
Richard Liaw
eaa3ce3f40
Fix release test -- client remote put ( #15325 )
...
* fix-test
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* Update python/ray/util/client/server/dataservicer.py
* Update python/ray/util/client/server/dataservicer.py
* Update python/ray/_private/ray_client_microbenchmark.py
2021-04-15 13:30:38 -07:00
Kai Fricke
1c783e2eeb
[tune] Allow 0 CPU head bundles in placement group factories ( #15338 )
2021-04-15 20:21:35 +01:00
SangBin Cho
4dd4756c09
[Test] skip flaky pg tests. ( #15337 )
2021-04-15 11:55:19 -07:00
SangBin Cho
df9329160e
[Tests] Dask on ray release test ( #15256 )
...
* done.
* Linting.
* Update readme
* Update.
* Fix issues.
2021-04-15 10:30:17 -07:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. ( #15335 )
2021-04-15 19:19:51 +02:00
Sven Mika
c90de315e5
[RLlib] APEX returns incorrect default resources (PlacementGroupFactory) colocated missing replay actors. ( #15295 )
2021-04-15 16:50:42 +01:00
Sven Mika
e961d2f4b2
[RLlib] Improve example scripts for attention nets, CartPole LSTM, and custom RNN-models. ( #15329 )
2021-04-15 16:11:34 +02:00
Sven Mika
45d6560759
[RLlib] Fix flaky custom_fast_model_torch/tf tests. ( #15330 )
2021-04-15 16:10:29 +02:00
Ameer Haj Ali
981fa5829a
[client] Enable ClientObjectRef Comparisons ( #15320 )
2021-04-15 16:46:44 +03:00
SangBin Cho
c2e240e866
[Doc] Update object spilling doc ( #15301 )
2021-04-14 23:38:04 -07:00
Simon Mo
57b6053cda
[Buildkite] Turn off Travis linux builds (except wheels) ( #15316 )
...
* [Buildkite] Turn off Travis linux builds (except wheels)
* naming
2021-04-14 20:37:37 -07:00
Siyuan (Ryans) Zhuang
b81d805f40
[Doc] fix ray client doc ( #15308 )
2021-04-14 20:35:15 -07:00
Yi Cheng
a9402c21e6
Revert "Revert "[runtime_env] Add support of exclusion ( #15241 )" ( #15303 )" with fixing ( #15310 )
...
* Revert "Revert "[runtime_env] Add support of exclusion (#15241 )" (#15303 )"
This reverts commit 775deca5ad.
* fix
2021-04-14 20:34:53 -07:00
Stephanie Wang
6b2da7eda8
[core] Log warning on bad max task args value ( #15314 )
2021-04-14 20:34:08 -07:00
SangBin Cho
27ab0c7633
[Test] Skip the failing rllib example test. ( #15321 )
2021-04-14 20:19:44 -07:00
Simon Mo
5f0be94989
[Buildkite] Use the build link for Travis Tracker ( #15317 )
2021-04-14 18:58:23 -07:00
SangBin Cho
d0e83c43ca
[Release Test] Modify parameter to reduce stress ( #15048 )
...
* Fix.
* Fix.
2021-04-14 18:27:20 -07:00