hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Kai Fricke	8608b64885	[ci/release] Remove old OSS release test infrastructure (#23134) Now that we've migrated all OSS release tests to the new infrastructure, we can remove old config files and infra scripts.	2022-03-14 15:10:52 +00:00
SangBin Cho	2c2d96eeb1	[Nightly tests] Improve k8s testing (#23108) This PR fixes broken k8s tests. Use exponential backoff on the unstable HTTP path (getting job status sometimes hits a broken connection from the server; I couldn't find the relevant logs to figure out why this is happening, unfortunately). Fix the benchmark tests' resource leak check. The existing one was broken because job submission uses 0.001 of the node IP resource, which means cluster_resources can never equal the available resources. I fixed the issue by not checking node IP resources. K8s infra doesn't support instances with < 8 CPUs, so I used m5.2xlarge instead of xlarge. This will increase the cost a bit, but not by much.	2022-03-14 03:49:15 -07:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Yi Cheng	90093769df	[nightly] Add more many tasks tests (#21727) This PR adds four tests for many tasks: many short tasks sent from a single node; many short tasks sent from multiple nodes; many long tasks sent from multiple nodes; many long tasks sent from a single node. TODO: migrate the many nodes actor tests to this one. The scheduling envelope should contain: (tasks) scheduling_test_many_xx_tasks_yy_nodes; (actors) many_nodes_actor_test (to be combined with this one); (shuffle) pipelined_ingestion_1500_gb_15_windows; (shuffle) dask_on_ray_1tb_sort	2022-01-20 14:52:26 -08:00
SangBin Cho	44320aba3b	[Nightly Test] Fix broken scalability test #21201 I added a memory monitor to the scalability tests. This broke the tests because creating a memory monitor requires node resources (so that it is scheduled on the head node), and that broke the "resource leak" check. Ideally, this resource leak check should be more robust, but I fixed the issue the easier way for now. In the near future, the memory monitor will become a fixture, and at that point we should fix the resource leak function code.	2021-12-20 14:58:39 -08:00
SangBin Cho	1c1430ff5c	Add memory monitor to scalability tests. (#21102) This adds memory monitoring to the scalability envelope tests so that we can compare the peak memory usage for both nonHA & HA. NOTE: the current way of adding the memory monitor is not great, and we should implement a fixture to support this better, but that's not in progress yet.	2021-12-15 01:31:38 -08:00
Alex Wu	ca86098680	Revert "[core] Refactor test_many_tasks (#18169)" (#18216) This reverts commit `eb6fd20d53`.	2021-08-30 10:35:23 -07:00
Stephanie Wang	eb6fd20d53	[core] Refactor test_many_tasks (#18169) * Improve test test * lint	2021-08-30 10:33:23 -07:00
Kai Fricke	089dd9b949	[release] Add release logs for 1.6.0 (#18067)	2021-08-26 12:13:15 +02:00
Clark Zinzow	d958457d07	[Core] Second pass at privatizing APIs. (#17885) * gcs_utils * resource_spec * profiling * ray_perf and ray_cluster_perf * test_utils	2021-08-18 20:56:33 -07:00
Alex Wu	af880378da	Lower threshold on scalability envelope many tasks (#17511)	2021-08-02 11:50:08 -07:00
Alex Wu	9e79301d35	Split scalability envelope + smoke tests (#17455) * . * done? * done? * sang comments * . Co-authored-by: Alex Wu <alex@anyscale.com>	2021-07-30 10:20:19 -07:00
SangBin Cho	63ebfe2f2d	Revert back to ray.init (#17047)	2021-07-13 14:36:27 -07:00
Alex Wu	b08795582b	Disable runtime envs in scalability envelope (#16978) Co-authored-by: Alex Wu <alex@anyscale.com>	2021-07-11 09:53:15 -07:00
Alex Wu	ba9fd06f87	Integrate scalability envelope with releaser (#16417) * . * . * . * . * . * . * . * success Co-authored-by: Alex Wu <alex@anyscale.com>	2021-06-15 10:42:55 -07:00
Clark Zinzow	ca68bf1e93	[Release] Update release test configs for 1.4 release. (#16292) * Updated scalability envelope tests for 1.4. * Update data processing release test for 1.4.	2021-06-08 00:15:25 -07:00
Kai Fricke	1d52ab819f	[release] release 1.3.0 results and test updates (#15366) Convert a number of release tests and add logs for release 1.3.0	2021-05-04 22:10:04 +01:00
Alex Wu	805b8a10a3	Move scalability envelope back down to 250 nodes (#15381) * . * done? * . Co-authored-by: Alex Wu <alex@anyscale.com>	2021-04-16 19:39:24 -07:00
Dmitri Gekhtman	e6864523cf	[autoscaler] Do not divide by zero in resource demand scheduler (#15323) * Do not divide by zero * Don't take min or mean of an empty list * max workers 0 for head node in distributed benchmark * test * Correct the type annotation * comment grammar tweak * message * docs * test * Move test cli to large tests.	2021-04-16 10:20:05 -07:00
SangBin Cho	b1e0409447	[Test] Improve scalability envelope (#14406) * fixed. * fix. * Update the result. * Addressed code review.	2021-03-01 18:36:52 -08:00
Alex Wu	840987c7af	Scalability Envelope Tests (#13464)	2021-01-25 18:48:31 -08:00

21 commits
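
Commit 2c2d96eeb1 describes two fixes: retrying an unstable HTTP call with exponential backoff, and a resource leak check that skips node IP resources. The sketch below illustrates both ideas in plain Python; it is not the code from that commit. The function names (retry_with_backoff, without_node_ip, no_resource_leaks) are hypothetical, and it assumes node IP resources appear under keys of the form "node:<ip>" in the resource dictionaries.

```python
import time
from typing import Callable, Dict, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, doubling the wait after each ConnectionError (hypothetical helper)."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay_s * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")


def without_node_ip(resources: Dict[str, float]) -> Dict[str, float]:
    """Drop per-node entries (assumed to be keyed "node:<ip>") before comparing."""
    return {k: v for k, v in resources.items() if not k.startswith("node:")}


def no_resource_leaks(total: Dict[str, float], available: Dict[str, float]) -> bool:
    """True when every non-node-IP resource is fully available again."""
    return without_node_ip(total) == without_node_ip(available)
```

In a test against a live cluster, the two dictionaries would typically come from ray.cluster_resources() and ray.available_resources(); filtering the node IP keys means a job that holds 0.001 of a node IP resource no longer makes the comparison fail.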