Alex Wu
d9cd3800c7
Dataset speed up read ( #17435 )
2021-08-01 18:03:46 -07:00
Ivorius
6703091cdc
[Docs] Update example-full.yaml for ulimits as supported by docker. ( #17408 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-08-01 01:36:16 -07:00
matthewdeng
3a1aed28b7
[torch] fix process group timedelta ( #17468 )
2021-07-30 15:47:33 -07:00
SangBin Cho
9a696cc66a
Pin aioredis version ( #17472 )
2021-07-30 12:04:14 -07:00
Kai Fricke
b0f00b1b4b
[default] pin aioredis < 2 ( #17465 )
2021-07-30 17:57:17 +01:00
Eric Liang
cd13059691
[dataset] Implement random_shuffle() and split(equal=True) ( #17448 )
2021-07-30 09:51:21 -07:00
Patrick Ames
131710f9f9
[autoscaler] Add support for EC2 launch templates. ( #17236 )
2021-07-30 08:05:59 -07:00
wanxing
705248f4ee
[CoreWorker]Remove plasma_objects_only parameter ( #17384 )
2021-07-30 14:48:36 +08:00
matthewdeng
58c4fe727c
[SGD] TrainerV2 API interface ( #17447 )
...
Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
2021-07-29 19:39:39 -07:00
Eric Liang
0373c54b3e
Add warning if get_gpu_ids() is called on the driver. ( #17436 )
2021-07-29 19:39:22 -07:00
Siyuan (Ryans) Zhuang
17c25345d0
[Workflow] Virtual actor writer - Part 2 ( #17336 )
...
* virtual actor writer
pass step_type around
simplify readonly actor
return different thing for a virtual actor
return state and output
WorkflowExecutionResult
simplify workflow execution
initial virtual actor writer
workflow_ref deeper integration
resume a step of a workflow
cache step output
Support dynamic workflow ref
* fix recovery tests
* fix
* fix get_output
* better error message
* pressure test
* fix
* verbose error message
* verbose error message
* fix get_cached_step issue
* update tests
* simplify readonly virtual actor
* fix storage tests
* workflow.resume returns state of an actor
* fix verbose
* fix comment
* make it more clear by renaming
* comment
* test init error in virtual actor
* update docs
* update docs
* update test_actor_manager/list_all
* fix comment
2021-07-29 19:29:28 -07:00
Amog Kamsetty
ff04a923ea
[SGD] v2 prototype: BackendExecutor and TorchBackend implementation ( #17357 )
...
* wip
* formatting
* increase timeouts
* wip
* address comments
* comments
* fix
* address comments
* Update python/ray/util/sgd/v2/worker_group.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update python/ray/util/sgd/v2/worker_group.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* address comments
* formatting
* fix
* wip
* finish
* fix
* formatting
* remove reporting
* split TorchBackend
* fix tests
* address comments
* add file
* more fixes
* remove default value
* update run method doc
* add comment
* minor doc fixes
* lint
* add args to BaseWorker.execute
* address comments
* remove extra parentheses
* properly instantiate backend
* fix some of the tests
* fix torch setup
* fix type hint
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2021-07-29 14:38:44 -07:00
Kai Fricke
44d209dd5f
[tune] re-enable tensorboardx without torch installed ( #17403 )
2021-07-29 10:39:38 +01:00
xwjiang2010
93d12b1b5e
[CLI] Fix ray submit when --stop is supplied. ( #17385 )
...
* [cli] Fix `ray submit` when --stop is supplied.
* syntax sugar.
2021-07-29 00:01:25 -07:00
Eric Liang
7ed62ea0ad
Initial implementation of Dataset pipelining and docs ( #17309 )
2021-07-28 21:12:01 -07:00
Eric Liang
9b4bcb3bc2
[hotfix] Fix merge conflict that caused test_dataset to fail.
2021-07-28 14:58:31 -07:00
Edward Oakes
7007c6271d
[runtime_env] Gracefully fail tasks when an environment fails to be set up ( #17249 )
2021-07-28 15:25:02 -05:00
Yi Cheng
72abf81900
[gcs] Fix GCS related issues: ByteSizeLong and redis connection ( #17373 )
2021-07-28 13:01:54 -07:00
Eric Liang
4ffa549041
Support schema on read for csv/json ( #17354 )
2021-07-28 10:59:52 -07:00
Simon Mo
db126b24b9
[Serve] Fix response_model for class based view routes as well ( #17376 )
2021-07-28 09:31:02 -07:00
Antoni Baum
1f35470560
[autoscaler] GCP TPU VM autoscaler ( #17278 )
2021-07-27 21:24:29 -07:00
Amog Kamsetty
d01e1c15c8
[SGD] v2 prototype: `WorkerGroup` implementation ( #17330 )
...
* wip
* formatting
* increase timeouts
* address comments
* comments
* fix
* address comments
* Update python/ray/util/sgd/v2/worker_group.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update python/ray/util/sgd/v2/worker_group.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* address comments
* formatting
* fix
* avoid race condition
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-27 17:36:38 -07:00
Simon Mo
4a4210a083
Support streaming output of runtime env setup to logger/driver ( #17306 )
2021-07-27 16:39:15 -07:00
Edward Oakes
7225f28fff
[serve] Add Ray API stability annotations ( #17295 )
2021-07-27 16:00:15 -05:00
DK.Pino
2699b0f3ab
[Placement Group] Fix resource index assignment between with bundle index and without bundle index pg ( #17318 )
2021-07-27 13:51:02 -07:00
Alex Wu
5879e3132e
[Dataset] Support compressed files ( #17355 )
...
* .
* lint
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-07-27 12:35:16 -07:00
Eric Liang
e70d84953e
[hotfix] Dataset tests accidentally disabled
2021-07-27 10:40:15 -07:00
Frank Luan
a6e8497dc9
[Dataset] Sort ( #17142 )
2021-07-27 01:53:53 -07:00
fyrestone
57b9b1bb0f
[Dashboard] Use a dedicated RPC to check the GCS is alive ( #16330 )
...
* Dashboard check gcs is alive
* Fix dashboard hangs at exit
* ray health-check call GCS CheckAlive
* Minor fixes
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-27 14:05:44 +08:00
Richard Liaw
597dc08dfe
Revert "Revert "[core] remove opencensus/prometheus_exporter dependencies"" ( #17254 )
...
* Revert "Revert "[core] remove opencensus/prometheus_exporter dependencies" (#17251 )"
This reverts commit 7b44dd8ecb.
* Lint
* Fix more imports
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-07-26 21:09:25 -07:00
Dmitri Gekhtman
d0e58af075
[autoscaler] Avoid race in no-updaters logic ( #17328 )
...
* Extra logic and test
* anglish
2021-07-26 16:05:33 -04:00
dependabot[bot]
4bf377ee4b
[tune](deps): Bump gym[atari] in /python/requirements/tune ( #17199 )
...
Bumps [gym[atari]](https://github.com/openai/gym ) from 0.18.0 to 0.18.3.
- [Release notes](https://github.com/openai/gym/releases )
- [Commits](https://github.com/openai/gym/compare/0.18.0...0.18.3 )
---
updated-dependencies:
- dependency-name: gym[atari]
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-07-26 10:53:41 -07:00
architkulkarni
756a4e7a90
[Core] [runtime env] update tests to use ray.init(runtime_env=...) and add e2e test ( #17232 )
2021-07-26 11:21:30 -05:00
Tao Wang
d98ec7fc4d
Remove libray_redis_module ( #17283 )
2021-07-25 23:15:29 -07:00
matthewdeng
fdbeef6046
[SGD] RaySGD v2 skeleton code ( #17300 )
...
* [SGD] RaySGD v2 skeleton code
* add build file
* move file
* empty
* rename
* address comments
* add method interfaces
* move BUILD file out of tests dir
Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
2021-07-25 17:39:24 -07:00
Yi Cheng
93be44eebf
[workflow] Fix more usability issues ( #17305 )
...
* up
* up
* up
* up
* up
* up
* fix test error
* up
2021-07-23 20:26:44 -07:00
Edward Oakes
2142abae57
[Serve] Properly support runtime_env working_dir ( #16480 )
2021-07-23 17:35:32 -07:00
Kai Fricke
8db61569f1
[tune] Fix HDFS sync down template ( #17291 )
2021-07-23 13:01:14 -07:00
Yi Cheng
29352e7fa3
[workflow] Fix some usability issues ( #17284 )
2021-07-23 11:39:49 -07:00
Eric Liang
df7fe8dd6d
[data] Cleanup Block type by dropping Generic[T] ( #17276 )
...
* wip
* update
* update
* quotes
2021-07-23 09:23:06 -07:00
Dmitri Gekhtman
e701ded54f
[autoscaler] Tweaks to support remote (K8s) operators ( #17194 )
...
* node provider hooks
* disable node updaters
* pending means not completed
* draft wip
* add flag to autoscaler initialization
* Explain
* terminate unhealthy nodes
* fix, add event summarizer message
* Revert node provider
* remove hooks from autoscaler.py
* avert indent apocalypse
* wip
* copy-node-termination-logic
* Added a test
* Finish tests
* test cleanup
* Move disable node updaters to config yaml
* fix
* Drop arg
2021-07-23 11:30:18 -04:00
Edward Oakes
811eb4b092
[debugger] Enable attaching to breakpoints on remote nodes (off by default) ( #17275 )
2021-07-23 09:37:40 -05:00
Siyuan (Ryans) Zhuang
57b2328e7b
[workflow] Virtual actor writer - Part I ( #17256 )
...
* update readonly virtual actor
use signature module
refactoring workflow
new execution interface
advance progress of a workflow
update storage
last_step_of_workflow
prevent setting dynamic output of "output.json" in workflow directory
use alternative exception
* fix
* fix comments
* better step names
* add TODO
* fix comments
* log errors when retry
* fix storage test
2021-07-22 22:53:04 -07:00
Clark Zinzow
1ab4f0def7
[Datasets] Port read_binary_files to Datasource API. ( #17225 )
2021-07-22 19:03:10 -07:00
Yi Cheng
5f4d9085d2
[workflow] workflow ci enable ( #17255 )
...
* Enable workflow tests
* update
* Fix one bug
2021-07-22 17:59:24 -07:00
Simon Mo
b9b79cd5f4
[Runtime Env] Support per task/actor uri override job_config ( #17252 )
2021-07-22 16:37:43 -07:00
Simon Mo
aaf8afb78d
[Runtime Env] Add a test for working_dir inheritance ( #17245 )
2021-07-22 10:48:25 -07:00
Yi Cheng
760b11263a
[workflow] Workflow manager API ( #17226 )
2021-07-22 09:30:52 -07:00
Richard Liaw
a78a2263e5
[RLlib] Fix reverted RockPaperScissors Pettingzoo example ( #16896 )
2021-07-22 10:55:07 -04:00
xwjiang2010
f3a31a3b94
[tune] Add test for flatten_dict. ( #17241 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-21 22:01:01 -07:00