hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 10:01:43 -05:00

Author	SHA1	Message	Date
Alan Guo	50b20809b8	[Dashboard] Stop caching logs in memory. Use state observability api to fetch on demand. (#26818) Signed-off-by: Alan Guo <aguo@anyscale.com> ## Why are these changes needed? Reduces memory footprint of the dashboard. Also adds some cleanup to the errors data. Also cleans up actor cache by removing dead actors from the cache. Dashboard UI no longer allows you to see logs for all workers in a node. You must click into each worker's logs individually. ## Related issue number fixes #23680 fixes #22027 fixes #24272	2022-07-26 03:10:57 -07:00
Eric Liang	43aa2299e6	[api] Annotate as public / move ray-core APIs to _private and add enforcement rule (#25695) Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.	2022-06-21 15:13:29 -07:00
Kai Yang	4a999777fa	[Core] Allow accepting gRPC HTTP proxy via env variable (#23526)	2022-05-10 11:30:46 +08:00
SangBin Cho	73ed67e9e6	[State API] State api limit + Removing unnecessary modules (#24098) This PR does the following: moves all routes into the same module, state_head.py; supports a limit feature.	2022-04-22 15:59:46 -07:00
SangBin Cho	30ab5458a7	[State Observability] Tasks and Objects API (#23912) This PR implements ray list tasks and ray list objects APIs. NOTE: You can ignore the merge conflict for now. It is because the first PR was reverted. There's a fix PR open now.	2022-04-21 18:45:03 -07:00
SangBin Cho	1c3329fa38	Revert "Revert "[State Observability] Basic functionality for centralized data (#23744)" (#23918)" (#23933) This reverts commit `fb14e82`.	2022-04-18 21:15:43 -07:00
Amog Kamsetty	fb14e82242	Revert "[State Observability] Basic functionality for centralized data (#23744)" (#23918) This reverts commit `51a4a1a802`, which was breaking the tune multinode tests and kuberay:test_autoscaling_e2e.	2022-04-14 14:28:42 -07:00
SangBin Cho	51a4a1a802	[State Observability] Basic functionality for centralized data (#23744) Support listing actor/pg/job/node/workers. Design doc: https://docs.google.com/document/d/1IeEsJOiurg-zctOcBjY-tQVbsCmURFSnUCTkx_4a7Cw/edit#heading=h.9ub9e6yvu9p2 Note that this PR doesn't contain any output except ids. I will update them in the follow-up PRs.	2022-04-14 07:33:18 -07:00
Tao Wang	6aefe9b36e	[Core] Save task spec in separate table (#22650) This is a rebased version of #11592. Task spec info is only needed when GCS creates or starts an actor, so we can remove it from the actor table and save the serialization time and memory/network cost when GCS clients get actor info from GCS. Because the internal repository differs considerably from the community one, this PR just adds some manual checks on top of a simple cherry-pick. Comments are welcome; in the meantime I'll check whether any test cases fail or any points were missed.	2022-04-12 12:24:26 -07:00
mwtian	72ef9f91aa	[Remove Redis Pubsub 1/n] Remove `enable_gcs_pubsub()` (#23189) GCS pubsub has been the default for a while. There is little chance that we would need to revert to Redis pubsub in the future. This is the first step in removing Redis pubsub: removing the `enable_gcs_pubsub()` feature guard.	2022-03-15 23:56:15 -07:00
Yi Cheng	11bbf00338	[dashboard] Remove redis in dashboard (#22788) As we are making redisless Ray the default, the dashboard no longer needs to talk to Redis. Instead it should talk to GCS, and GCS can talk to Redis.	2022-03-04 12:32:17 -08:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Yi Cheng	7d2237bc9f	[dashboard] Remove unused fields in dashboard actor table for better memory footprint (#21919)	2022-01-26 22:48:17 -08:00
SangBin Cho	e62c0052a0	[Dashboard] Agent in minimal ray installation (#21817) This is the second part of https://docs.google.com/document/d/12qP3x5uaqZSKS-A_kK0ylPOp0E02_l-deAbmm8YtdFw/edit#. After this PR, dashboard agents will fully work with the minimal ray installation. Note that this PR requires introducing "aioredis", "frozenlist", and "aiosignal" to the minimal installation. These dependencies are very small (or will be removed soon), and including them in the minimal installation makes things much easier. Please see below for the reasoning.	2022-01-26 04:03:54 -08:00
SangBin Cho	1ae14ec513	[Dashboard] Make dashboard / agent work in minimal ray installation 1/3. (#21774) This is the doc that explains how to achieve this: https://docs.google.com/document/d/12qP3x5uaqZSKS-A_kK0ylPOp0E02_l-deAbmm8YtdFw/edit?usp=sharing The fully working e2e prototype is here (it passes all tests): `cdad913883` This PR is pure refactoring. Basically it moves some of the util functions that require optional_deps to `optional_utils` so that optional deps' util functions are not used in the minimal installation. Look below to see the steps.	2022-01-23 21:11:32 -08:00
mwtian	e8ce01c525	[Dashboard] offload blocking work to a thread pool (#21762) Currently, the GCS KV client only has a blocking API. Calling it from the dashboard event loop can block other operations for many seconds, leading to failures such as taking too long (> 2min) to submit a job and making nightly tests fail (#21699). This PR offloads the blocking work to a separate thread. Implementing an async GCS KV API will be done in the future.	2022-01-21 17:55:11 -08:00
mwtian	20ca1d85c2	[GCS][Bootstrap 2/n] Fix tests to enable using GCS address for bootstrapping (#21288) This PR contains most of the fixes @iycheng made in #21232, to make tests pass with GCS bootstrapping by supporting both Redis and GCS address as the bootstrap address. The main change is to use address_info["address"] to obtain the bootstrap address to pass to ray.init(), instead of using address_info["redis_address"]. In a subsequent PR, address_info["address"] will return the Redis or GCS address depending on whether GCS is used to bootstrap.	2021-12-29 19:25:51 -07:00
Yi Cheng	09421a4ca6	[2/gcs] Bootstrap dashboard for gcs ha (#21179) This is part of the GCS HA project. This PR tries to bootstrap the dashboard with the GCS address instead of Redis. Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>	2021-12-21 16:58:03 -08:00
mwtian	6871a72a5c	[Core][Dashboard Pubsub 3/n] Migrate pubsub usages in dashboard to GCS pubsub (#20860) Add support for Ray pubsub in dashboard. https://github.com/ray-project/ray/pull/20954 is the prerequisite, and contains more complete changes under src/.	2021-12-10 14:36:57 -08:00
Lixin Wei	b7e35acf14	[RuntimeEnv] Raise RuntimeEnvSetupError when Actor Creation Failed due to It (#19888) * ray_pkg passed * fix * fix typo * fix test * fix test * fix test * fix * draft * compile OK * lint * fix * lint * fix ci * Update src/ray/gcs/gcs_server/gcs_actor_manager.cc Co-authored-by: SangBin Cho <rkooo567@gmail.com> * remove comment * rename * resolve conflict * use unique ownership * use DestroyActor instead of ReconstructActor * fix segmentation fault * fix crash in debug log * Revert "fix crash in debug log" This reverts commit 8f0e3d37f062b664d8d0e07c6c1a9a715b8ba1ee. Co-authored-by: SangBin Cho <rkooo567@gmail.com>	2021-11-15 07:43:35 -08:00
mwtian	875b0aea0a	fallback to grpc.experimental.aio when importing grpc.aio (#20287)	2021-11-13 15:59:57 +09:00
Oscar Knagg	5a05e89267	[Core] Add TLS/SSL support to gRPC channels (#18631)	2021-10-20 22:39:11 -07:00
Qing Wang	6f1d3f94db	Publish actor state PENDING_CREATION for dashboard showing. (#18666)	2021-09-18 15:44:58 +08:00
Edward Oakes	7736cdd91d	[dashboard] Rename "new_dashboard" -> "dashboard" (#18214)	2021-09-15 11:17:15 -05:00
Nikita Vemuri	a9c731edd3	[serve] Remove requirement to specify namespace for serve.start(detached=True) (#17470)	2021-08-25 10:39:32 -05:00
Clark Zinzow	d958457d07	[Core] Second pass at privatizing APIs. (#17885) * gcs_utils * resource_spec * profiling * ray_perf and ray_cluster_perf * test_utils	2021-08-18 20:56:33 -07:00
fyrestone	dfadf33a94	[Dashboard] Reorganize dashboard modules - node (#16217)	2021-06-07 19:50:46 -07:00
fyrestone	c53893cb13	[Dashboard] Reorganize dashboard modules - actor (#16170)	2021-06-02 06:58:30 -07:00

28 commits
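
Commit e8ce01c525 above describes offloading blocking GCS KV calls from the dashboard event loop to a thread pool. The general pattern can be sketched with the standard library alone. This is a minimal illustration, not the code from that commit: `blocking_kv_get` and `heartbeat` are hypothetical stand-ins, and the sleep simulates a slow blocking client call.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_kv_get(key: str) -> str:
    # Hypothetical stand-in for a blocking KV client call (assumption, not Ray's API).
    time.sleep(0.2)
    return f"value-for-{key}"


async def heartbeat(ticks: list) -> None:
    # Other event-loop work that should keep running while the KV call is in flight.
    for _ in range(5):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        # run_in_executor runs the blocking function on a worker thread and
        # returns an awaitable, so the event loop stays free for heartbeat().
        value, _ = await asyncio.gather(
            loop.run_in_executor(pool, blocking_kv_get, "job-1"),
            heartbeat(ticks),
        )
    return value, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `blocking_kv_get` directly inside a coroutine would stall every other task on the loop for the duration of the call; handing it to the executor keeps the loop responsive at the cost of one worker thread per in-flight call.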