hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Maksim Smolin	eace94d2dc	[cli] Fix issues with logging unescaped strings (#9960 )	2020-08-06 14:44:39 -07:00
Max Fitton	fc9fc342cf	Make dashboard disk stat not break on non-Unix machines (note it will not correctly report stat still) (#9962 ) Co-authored-by: Max Fitton <max@semprehealth.com>	2020-08-06 15:00:26 -05:00
Max Fitton	3f7eb64063	The required resources field is not always present, leading the logical view to crash when it is not present. (#9959 ) Co-authored-by: Max Fitton <max@semprehealth.com>	2020-08-06 14:59:52 -05:00
Simon Mo	c8da5555ab	[Serve] Add preliminary middleware support (#9940 )	2020-08-06 12:49:31 -07:00
Barak Michener	21994c594b	python/test: Faster tests and better BUILD (#9791 )	2020-08-06 10:58:42 -07:00
Max Fitton	6a1acce791	[Dashboard] Fix Bug in Machine View when Unsorted with Multiple Machines (#9938 ) * Patch issue where when the Machine view was unsorted and grouped, it would crash. * lint Co-authored-by: Max Fitton <max@semprehealth.com>	2020-08-06 10:38:01 -07:00
Eric Liang	7d4f204aa8	[Placement Group] Allow scheduling a task on any bundle (-1, default) (#9885 ) * wip * wip * fix tests * wip * wip * wip * wip * wip * add test * update * update * remove debug * comments	2020-08-06 00:05:21 -07:00
Ian Rodney	45597e3158	Fixing basic messup (#9947 )	2020-08-05 22:09:21 -07:00
Richard Liaw	51bad84423	[cli] Maintain "old-style" for abort (#9943 )	2020-08-05 20:37:29 -07:00
Edward Oakes	38408574c4	[serve] Basic autoscaling policy (#9845 )	2020-08-05 21:11:35 -05:00
Alex Wu	12d75784a4	[Core] test_advanced_3.py::test_logging_to_driver (round 2) (#9916 )	2020-08-05 15:04:36 -05:00
Ian Rodney	b6da7dcef9	Upgrade to v30 and 18.04 on default yamls (#9917 )	2020-08-05 13:01:14 -07:00
Max Fitton	538ad04e96	[Dashboard] Update ActorState in dashboard to support new actor states (#9855 ) * Update ActorState in dashboard to support new actor states * Update dashboard documentation for new states * Add missing state to doc Co-authored-by: Max Fitton <max@semprehealth.com>	2020-08-05 10:35:18 -07:00
Ian Rodney	dba999b6f6	[docker] Fix rsyncs & filesync for docker (#9920 )	2020-08-05 10:13:03 -07:00
SangBin Cho	685182923c	[Core] Fix detached actor local mode when gcs actor management is on. (#9839 ) * Fix local mode detached actor. * Revert changes.	2020-08-05 09:04:24 -07:00
Amog Kamsetty	5af7d24f66	[Tune] Transformer blog example (#9789 ) Co-authored-by: Kai Fricke <kai@anyscale.com>	2020-08-04 22:05:01 -07:00
Henk Tillman	ead8b86372	Create ~/.ssh if it doesn't already exist (#9880 )	2020-08-04 14:55:15 -07:00
Edward Oakes	55146d222f	[serve] Detect node updates (#9828 )	2020-08-04 15:57:21 -05:00
Max Fitton	ef190f358b	[Dashboard] Fix Memory Page Crash with Front-end Pagination (#9593 )	2020-08-04 14:16:07 -05:00
Clark Zinzow	6fded582ff	[Dask] Dask-Ray scheduler MVP. (#9857 )	2020-08-04 11:45:25 -07:00
krfricke	ef717ecda6	[tune] Prevent leak of magic keys in trial config (#9903 ) Co-authored-by: Kai Fricke <kai@anyscale.com> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-08-04 11:24:01 -07:00
chaokunyang	3323ad9d59	[HOTFIX] Fix master build with missing placement group argument (#9868 ) * fix common task submit default placement group * fix java_function	2020-08-04 11:19:15 -07:00
Ameer Haj Ali	6c9ec10540	Support not specifying an SSH key in local node provider (#9864 )	2020-08-04 10:23:43 -05:00
Kai Yang	27cd323ce1	[Core] Multi-tenancy: Job isolation & implement per job config (except for env variables) (#9500 )	2020-08-04 15:51:29 +08:00
kisuke95	28b1f7710c	[Core] Error info pubsub (Remove ray.errors API) (#9665 )	2020-08-04 14:04:29 +08:00
Richard Liaw	c6404e8cf6	[tune] Search alg checkpointing during training (#9803 ) Co-authored-by: krfricke <krfricke@users.noreply.github.com>	2020-08-03 15:07:31 -07:00
Alex Wu	20671bdc12	[Core] Fix test_logging_to_driver (#9829 ) * Fixed * . * . Co-authored-by: Ubuntu <ubuntu@ip-172-31-7-236.us-west-2.compute.internal>	2020-08-03 12:52:24 -07:00
Richard Liaw	b5068d08bf	[tune] Fix restoration for function API PBT (#9853 )	2020-08-03 12:36:17 -07:00
sanderland	323bc23c21	Fix copy-paste error in queue.empty (#9757 )	2020-08-03 14:14:18 -05:00
krfricke	c741d1cf9c	[tune] stdout/stderr logging redirection (#9817 ) * Add `log_to_file` parameter, pass to Trainable config, redirect stdout/stderr. * Add logging handler to root ray logger * Added test for `log_to_file` parameter * Added logs, reuse test * Revert debug change * Update logdir on reset, flush streams after each train() step * Remove magic keys from visible config Co-authored-by: Kai Fricke <kai@anyscale.com>	2020-08-03 11:18:34 -07:00
Ameer Haj Ali	9089fab0ef	[cluster] On Prem Server First PR (#9663 ) * on prem server first commit * minor fix * verify error on autoscaling in on prem mode * lint * lint * Tests complete * add tests to check for backward compatibility * Fixing comments and autoscaling * minor fixes * coordinating server mode * tests * lint * remove unnecessary import * Resolving Comments * separating coordinator and local node provider Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>	2020-08-03 10:38:44 -07:00
Alex Wu	5b96a88cd7	[Core] Gpu type detection (#9695 ) * . * . * . * . * . * . * . * . * Test cases * detection only * . * Done? * . * . * Done * added test case * . * . * . * . * . * . * Update python/ray/ray_constants.py Co-authored-by: Eric Liang <ekhliang@gmail.com> * . * . Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-08-01 11:43:56 -07:00
Stephanie Wang	37a9c5783c	[core] Report resource load by shape (#9806 ) * Report and aggregate resource load by shape * python test * python test * x * update	2020-07-31 16:57:30 -07:00
Alan Guo	3506910c5d	[autoscaler] Create worker_file_mounts config (#9762 )	2020-07-31 14:33:27 -07:00
Eric Liang	b73080c85f	Allow tasks to be used with placement groups (#9738 )	2020-07-31 10:51:37 -07:00
Richard Liaw	a47121476f	[tune] Remove accidentally added files (#9835 )	2020-07-30 21:47:27 -07:00
mehrdadn	a7b97b6f8a	Add shellcheck support (#8574 )	2020-07-30 18:39:28 -05:00
SangBin Cho	940617d092	Make test failure large. (#9822 )	2020-07-30 13:11:51 -07:00
krfricke	619e44e54a	[tune] Added WandbLogger (#9725 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Kai Fricke <kai@anyscale.com>	2020-07-30 13:09:03 -07:00
Barak Michener	68f3fec744	: Centralize requirements.txt and unify dependency versions (#9759 ) python_test: fix cython_examples in doc/ and tests/ * update setup.py to parse the bazel version string better * all: centralize all python deps into stackable requirements files in python/ * format * Move cython test into the proper package * Add cross-reference dependency comments for requirements and setup.py * re-enable version pinning on CI, fix formatting * fix up torchvision version * fix case in shell	2020-07-30 11:22:56 -07:00
Richard Liaw	0c3b9ebeef	[tune/sgd] Document func_trainable and add checkpoint context (#9739 ) Co-authored-by: krfricke <krfricke@users.noreply.github.com> Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>	2020-07-30 09:46:37 -07:00
SangBin Cho	826f14c824	[Stats] Fix harvestor threads + Fix flaky stats shutdown. (#9745 )	2020-07-29 18:57:59 -05:00
Alex Wu	6e294dd90f	[Core] Custom socket name (#9766 ) * fix issues * hot fixes * test * test * socket name change only	2020-07-29 13:19:41 -07:00
Alex Wu	e6696b2533	Fixed stderr logging (9765)	2020-07-29 13:19:04 -07:00
fangfengbin	a484947742	Fix leased worker leak bug if lease worker requests that are still waiting to be scheduled when GCS restarts (#9719 )	2020-07-29 14:16:03 +08:00
mehrdadn	fb5280f21b	Fix some Windows CI issues (#9708 ) Co-authored-by: Mehrdad <noreply@github.com>	2020-07-28 18:10:23 -07:00
Alex Wu	21af0ceb0c	Register function race (#9346 )	2020-07-28 13:51:34 -07:00
SangBin Cho	7e3ba289dc	[Stats] Basic Metrics Infrastructure (Metrics Agent + Prometheus Exporter) (#9607 )	2020-07-28 10:28:01 -07:00
Ian Rodney	b1c2983c97	Run _with_interactive in Docker (#9747 )	2020-07-28 08:57:04 -07:00
Alan Guo	5831737287	Introduce file_mounts_sync_continuously cluster option (#9544 ) * Separate out file_mounts contents hashing into its own separate hash Add an option to continuously sync file_mounts from head node to worker nodes: monitor.py will re-sync file mounts whenever contents change but will only run setup_commands if the config also changes * add test and default value for file_mounts_sync_continuously * format code * Update comments * Add param to skip setup commands when only file_mounts content changed during monitor.py's update tick Fixed so setup commands run when ray up is run and file_mounts content changes * Refactor so that runtime_hash retains previous behavior runtime_hash is almost identical as before this PR. It is used to determine if setup_commands need to run file_mounts_contents_hash is an additional hash of the file_mounts content that is used to detect when only file syncing has to occur. Note: runtime_hash value will have changed from before the PR because we hash the hash of the contents of the file_mounts as a performance optimization * fix issue with hashing a hash * fix bug where trying to set contents hash when it wasn't generated * Fix lint error Fix bug in command_runner where check_output was no longer returning the output of the command * clear out provider between tests to get rid of flakiness * reduce chance of race condition from node_launcher launching a node in the middle of an autoscaler.update call	2020-07-28 00:02:08 -07:00

2693 commits