* Enforce per-node-type max workers
* type annotation
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
* cleanup. comments. type annotations
* additional type annotation
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
* additional cleanup. comments. type annotations
* _get_nodes_needed_for_request_resources to use FrozenSet
* comments
* whitespace
* [Placement Group] Fix resource index assignment between with bundle index and without bundle index pg (#17318)
* [serve] Add Ray API stability annotations (#17295)
* Support streaming output of runtime env setup to logger/driver (#17306)
* [SGD] v2 prototype: ``WorkerGroup`` implementation (#17330)
* wip
* formatting
* increase timeouts
* address comments
* comments
* fix
* address comments
* Update python/ray/util/sgd/v2/worker_group.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update python/ray/util/sgd/v2/worker_group.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* address comments
* formatting
* fix
* avoid race condition
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* [RLlib] Discussion 3001: Fix comment on internal state shape (must be [B x S=state dim]). (#17341)
* [autoscaler] GCP TPU VM autoscaler (#17278)
* [Rllib] set self._allow_unknown_config (#17335)
Co-authored-by: Sven Mika <sven@anyscale.io>
* [RLlib] Discussion 2294: Custom vector env example and fix. (#16083)
* [docs] Link broken in Tune's page (#17394) (#17407)
* [Serve] Fix response_model for class based view routes as well (#17376)
* [serve] Fix single deployment nightly test (#17368)
* [RLlib] SAC tuple observation space fix (#17356)
* Support schema on read for csv/json (#17354)
* [RLlib] New and changed version of parametric actions cartpole example + small suggested update in policy_client.py (#15664)
* [gcs] Fix GCS related issues: ByteSizeLong and redis connection (#17373)
* [runtime_env] Gracefully fail tasks when an environment fails to be set up (#17249)
* [docs] update docs with pip requirements (#17317)
* removed nodes_to_keep. cleanup
* formatting
* +comment
* treat max_workers=0 as 0 workers (as opposed to unlimited)
* fix wrong comment
* warning for inconsistent config
* terminate nodes with no matching node type right away
* quotes
* special handling for head node when enforcing max_workers per type. tests. cleanup
* cleanup comments and prints
* comments
* cleanup. removed special handling of head node.
* adding an explicit non-None check in schedule_node_termination
* raise the exception
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
Co-authored-by: DK.Pino <loushang.ls@antfin.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Rohan138 <66227218+Rohan138@users.noreply.github.com>
Co-authored-by: amavilla <takashi.tameshige.jj@hitachi.com>
Co-authored-by: Jiao <sophchess@gmail.com>
Co-authored-by: Julius Frost <33183774+juliusfrost@users.noreply.github.com>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: kk-55 <63732956+kk-55@users.noreply.github.com>
Co-authored-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Co-authored-by: matthewdeng <matt@anyscale.com>
* virtual actor writer
pass step_type around
simplify readonly actor
return different thing for a virtual actor
return state and output
WorkflowExecutionResult
simplify workflow execution
initial virtual actor writer
workflow_ref deeper integration
resume a step of a workflow
cache step output
Support dynamic workflow ref
* fix recovery tests
* fix
* fix get_output
* better error message
* pressure test
* fix
* verbose error message
* verbose error message
* fix get_cached_step issue
* update tests
* simplify readonly virtual actor
* fix storage tests
* workflow.resume returns state of an actor
* fix verbose
* fix comment
* make it more clear by renaming
* comment
* test init error in virtual actor
* update docs
* update docs
* update test_actor_manager/list_all
* fix comment