hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-10 05:16:49 -04:00

Author	SHA1	Message	Date
Kai Fricke	976ece4bc4	[tune] Add test for heterogeneous resource request deadlocks (#21397 ) This adds a test for potential resource deadlocks in experiments with heterogeneous PGFs. If the PGF of a later trial becomes ready before that of a previous trial, we could run into a deadlock. This is currently avoided, but untested, flagging the code path for removal in #21387.	2022-01-06 10:44:30 +00:00
Qing Wang	132e2b2a96	[Core] Remove unused flag put_small_object_in_memory_store (#21284 ) Since the `put_small_object_in_memory_store` flag has not been used for a long time, it should be removed.	2022-01-06 14:46:58 +08:00
Archit Kulkarni	fd02065ce5	[CI] [docker] Fix docker image name regex matching (#21409 )	2022-01-05 18:59:10 -08:00
Qing Wang	3c68370fcf	[Core] Cache job_configs instead of ray_namespace. (#21279 ) We need more than the ray_namespace config of a job. This PR caches job_configs instead of ray_namespaces so that other PRs can use them (for example, #21249 needs the num_java_worker_pre_process item). Also, the ray_namespaces_ cache was never cleared before this PR; this PR clears it.	2022-01-05 17:48:06 -08:00
xwjiang2010	9528ac62cd	[tune] remove unused return_or_clean_cached_pg. (#21403 ) Unused code path.	2022-01-05 23:20:43 +00:00
Clark Zinzow	da4cc26449	[CI] Disable Java log rotation test. (#21394 )	2022-01-05 14:51:27 -08:00
Gagandeep Singh	62c9fc95ea	[CI] [Serve] Unskipped test and bumped wait time to avoid race condition in test_deploy.py (#21382 )	2022-01-05 14:28:42 -08:00
Ian Rodney	1b42a49e71	[CI] [Docker Build] Allow Branches with Double digits in regex matching(#21401 )	2022-01-05 14:19:19 -08:00
Simon Mo	f16b422062	[CI] Migrate Windows Wheels to Buildkite (#21388 )	2022-01-05 12:49:19 -08:00
Jiajun Yao	76b91efd9b	Fix wrong many_nodes_actor_test app config (#21404 ) RAY_GCS_ACTOR_SCHEDULING_ENABLED is wrong; it should be RAY_gcs_actor_scheduling_enabled. Since GCS-based actor scheduling is not enabled yet, I just removed this flag.	2022-01-05 11:52:13 -08:00
Yi Cheng	72c9fef5f3	[nightly] Enable GCS HA nightly test with bootstrap (#21389 ) After https://github.com/ray-project/ray/pull/21232 we are able to start ray without redis. We need to bake the test for a while before turning on the flag by default. This PR adds tests for this.	2022-01-05 10:53:07 -08:00
mwtian	24da654d90	[Test] Shard "Small & Large" tests (#21351 )	2022-01-05 10:49:14 -08:00
Sven Mika	853d10871c	[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376 )	2022-01-05 18:22:33 +01:00
Lixin Wei	64a2ba47d3	[Core] Rename PublisherService to SubscriberService (#20666 ) `PublisherClient` is a more reasonable name than `SubscriberClient` since XClient means ‘client used to access X’, like GcsClient. Besides, the current codebase already calls this client `publisher_client` (lines 329/333) while the actual class name is `SubscriberClient`, which is inconsistent. `a8d7897a56/src/ray/pubsub/subscriber.cc (L326-L339)`	2022-01-05 05:40:45 -08:00
Sven Mika	9e6b871739	[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330 )	2022-01-05 11:29:44 +01:00
SangBin Cho	94af7ccc92	[Actor exception message improvement] Unify the schema + improve error messages. (#21219 ) This PR addresses this comment: https://github.com/ray-project/ray/pull/20903#discussion_r772635662 The PR - Unifies the multiple actor-died errors into a single schema (the runtime env and creation task exceptions cannot be unified). - Improves each actor error message to include more metadata. - Includes actor information in the actor death cause.	2022-01-04 23:22:57 -08:00
mwtian	70db5c5592	[GCS][Bootstrap n/n] Do not start Redis in GCS bootstrapping mode (#21232 ) After this change in GCS bootstrapping mode, Redis no longer starts and `address` is treated as the GCS address of the Ray cluster. Co-authored-by: Yi Cheng <chengyidna@gmail.com> Co-authored-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>	2022-01-04 23:06:44 -08:00
Philip Pilgerstorfer	8884cf0f4f	[Java] Bump log4j 2.17.0 to 2.17.1 (#21373 ) New log4j version fixes vulnerability: * https://nvd.nist.gov/vuln/detail/CVE-2021-44832	2022-01-05 09:58:48 +08:00
Qing Wang	240e6efe21	[Java] Try to fix flaky NamespaceTest (#21370 )	2022-01-05 09:01:34 +08:00
Gagandeep Singh	819e034023	Unskipped `test_reconfigure_with_exception` & `test_deploy_handle_validation` (#21374 ) These two tests pass without issues on my Windows machine. The rest time out or fail.	2022-01-04 12:58:11 -08:00
Antoni Baum	3632494ce0	[train] Fix `start_training` in logging callbacks (#21357 ) Fixes outdated `start_training` definitions and calls in Train logging callbacks & abstract classes.	2022-01-04 12:46:39 -08:00
xwjiang2010	fc22200af8	[tune] deflake pbt. (#21366 ) We use `trial.checkpoint` to restore a perturbed trial. Currently `trial.checkpoint` looks at both in-memory and persistent checkpoints to find the most recent one, where "the most recent one" is defined by iteration. This may no longer be a valid assumption in the PBT case, since `trial_low_quantile` may have an iter=2_persistent_checkpoint as well as an iter=1_in_memory_checkpoint (perturbed from `trial_upper_quantile`).	2022-01-04 20:33:17 +00:00
shrekris-anyscale	e45383793f	[Serve] Clean up router.py (#21344 )	2022-01-04 09:46:33 -08:00
Sven Mika	c01245763e	[RLlib] Revert "Revert "updated pettingzoo wrappers, env versions, urls"" (#21339 )	2022-01-04 18:30:26 +01:00
Kai Fricke	94242e3e6e	[ci/repro] Add SYS_PTRACE to docker container, use unique name (#21377 ) This will start repro docker containers with SYS_PTRACE capabilities to enable debugging e.g. via py-spy. Additionally, default instance name tags for instance re-use will be generated using the buildkite build id and job id.	2022-01-04 16:59:12 +00:00
Jiajun Yao	5aa00ba5eb	[doc] Fix typos in serve documentation (#21379 )	2022-01-04 10:56:07 -06:00
Kai Fricke	aa35045b6f	[ci/release] Update to recent anyscale API changes (#21149 ) Recent changes in the anyscale API rendered the current e2e script incompatible. This PR resolves these subtle API changes.	2022-01-04 11:21:47 +00:00
Sven Mika	abd3bef63b	[RLlib] QMIX better defaults + added to CI learning tests (#21332 )	2022-01-04 08:54:41 +01:00
mwtian	8cc268096c	[GCS][Bootstrap 3/n] Refactor to support GCS bootstrap (#21295 ) This PR refactors several components to support switching to GCS address bootstrapping later: - Treat address from `ray.init()` and `ray` CLI as bootstrap address instead of assuming it is Redis address. - Ray client servers support `--address` flag instead of `--redis-address`. - A few other miscellaneous cleanups. Also, add a test for starting non-head node with `ray start`.	2022-01-03 23:52:12 -08:00
Jiao	6e77b3945d	[Serve] [nit] Remove unreachable line in `ActorReplicaWrapper`(#21361 )	2022-01-03 17:08:58 -08:00
Simon Mo	e60a5f52eb	[Serve] Fix iterator-and-mutate bug in FastAPI view (#21362 )	2022-01-03 17:02:31 -08:00
Tao Wang	b9106483af	[Core] Clear the unnecessary fields before broadcasting (#20965 ) Only `resource_available` and `resource_total` are used in raylet, so let's clear the rest before broadcasting.	2022-01-03 15:56:41 -08:00
Balaji Veeramani	7efe1bef11	[Train] Add `PrintCallback` (#21261 ) Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>	2022-01-03 14:03:04 -08:00
Chen Shen	704404d408	[BigDataTraining] Fix test script introduced by API change (#21347 ) * fix * fix test failure * Update release/nightly_tests/dataset/ray_sgd_training.py Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>	2022-01-03 12:14:36 -08:00
Archit Kulkarni	4581baa7dc	Revert "WINDOWS: unskip passing runtime_env tests (#21252 )" (#21352 ) This reverts commit `fcb952e1bc`.	2022-01-03 11:07:17 -08:00
Balaji Veeramani	43a9e95dc0	[CI] Add support for Black formatting (#21281 )	2022-01-03 10:06:41 -08:00
Balaji Veeramani	4e8f90aca2	[Train] Replace `abc.ABCMeta` with `abc.ABC` in callbacks (#21262 ) Inheriting from `abc.ABC` is more readable than setting the meta class to `abc.ABCMeta`. Relevant snippet from the Python 3.4 release notes: > New class ABC has ABCMeta as its meta class. Using ABC as a base class has essentially the same effect as specifying metaclass=abc.ABCMeta, but is simpler to type and easier to read. (Contributed by Bruno Dupuis in bpo-16049.) Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Matthew Deng <matthew.j.deng@gmail.com>	2022-01-03 09:25:44 -08:00
Balaji Veeramani	fa4e41c5b2	[Train] Monkeypatch environment variables in `test_json` (#21260 ) If we use `os.environ` to set environment variables in tests, then our tests become coupled. By using `monkeypatch`, we can safely set environment variables while ensuring our tests remain decoupled. For more information, see the [monkeypatching documentation](https://docs.pytest.org/en/6.2.x/monkeypatch.html#monkeypatching-environment-variables).	2022-01-03 09:12:44 -08:00
Antoni Baum	7ce22b72ed	[datasets] Expand `to_torch`'s functionality (#21117 ) Expands the `to_torch` method for Datasets with: * An ability to choose to output a list/dict of feature tensors instead of just one (through setting `feature_columns` to be a list of lists or a dict of lists) * An ability to choose whether the label should be unsqueezed or not * An ability to pass `None` as the label (for prediction). Furthermore, this changes how the `feature_column_dtypes` argument works. Previously, it took a list of dtypes for each feature. However, as the tensor was concatenated in the end, only one dtype mattered (the biggest one). Now, this argument expects a single dtype which will be applied to the features tensor (or a list/dict if `feature_columns` is a list of list/dict of lists). Unit tests for all cases are included. Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>	2022-01-03 09:03:50 -08:00
xwjiang2010	c18caa4db3	[tune] remove TrialExecutor.resume_trial. (#21225 ) This removes unused code.	2022-01-03 16:38:40 +00:00
Antoni Baum	6a2dedb41d	[tune] Fix dtype coercion in tune.choice (#21270 ) When a list with mixed types is passed to tune.choice, the values are coerced to a single dtype during sampling (because `numpy.random.choice` converts its input to an array internally). This behaviour is unintentional and surprising. This PR fixes this issue.	2022-01-03 16:32:30 +00:00
Kai Fricke	10290eeb2f	[ci] Pin manylinux docker image (#21341 )	2022-01-03 14:36:21 +00:00
Kai Fricke	489e6945a6	Revert "[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113 )" (#21338 ) This reverts commit `327eb84154`.	2022-01-03 10:21:25 +00:00
Benjamin Black	327eb84154	[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113 )	2022-01-02 21:29:09 +01:00
Ishant Mrinal	ec34185771	[RLlib] RE3 documentation (#21199 )	2022-01-02 17:31:53 +01:00
Carlo Grisetti	ff768ea9d4	[RLlib] Change deprecated `rllib/utils/tf_ops.py` import (#20978 )	2022-01-02 17:29:37 +01:00
Balaji Veeramani	c263008c07	[RLlib] Move `__grouping_doc_end__` (#21321 ) These changes are needed for two reasons. (1) `__grouping_doc_end__` is in the wrong place: if you look at the part of the Ray documentation where the tag is referenced, you'll read > You can use the MultiAgentEnv.with_agent_groups() method to define these groups: However, if you look at the code snippet below, you'll see the implementation of `to_base_env` in addition to the implementation of `with_agent_groups`. To remove `to_base_env` from the code snippet, we need to move `__grouping_doc_end__`. (2) Black cannot format `multi_agent_env.py`: for some reason, Black errors while formatting `multi_agent_env.py`. However, if we move `__grouping_doc_end__` up, the issue is resolved.	2022-01-01 20:11:06 -08:00
Balaji Veeramani	fae5b9b1af	[Core] Disable formatting in `test_add_min_workers_nodes` (#21322 ) Black errors while formatting `test_resource_demand_scheduler.py`. The issue is caused by the [assertions](https://github.com/ray-project/ray/blob/master/python/ray/tests/test_resource_demand_scheduler.py#L383-L428) at the end of `test_add_min_workers_nodes`. To prevent `format.sh` from erroring once we switch to Black, I've disabled formatting around the assertions.	2022-01-01 18:16:33 -08:00
Balaji Veeramani	416bce6378	Ignore E731 in `worker_set.py` and `sampler.py` (#21320 )	2022-01-01 18:05:14 -08:00
Qing Wang	340fbf53c0	[Java] Support actor handle reference counting. (#21249 )	2022-01-01 10:26:22 +08:00
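
Commit 4e8f90aca2 in the list above replaces `abc.ABCMeta` with `abc.ABC` in the Train callbacks. The following is a minimal sketch of why the two spellings behave the same way. The class names (`CallbackViaMeta`, `CallbackViaBase`, `EchoCallback`) and the helper `is_abstract` are hypothetical and are not taken from the Ray codebase.

```python
import abc


class CallbackViaMeta(metaclass=abc.ABCMeta):
    """Hypothetical callback declared with an explicit metaclass."""

    @abc.abstractmethod
    def handle_result(self, result):
        ...


class CallbackViaBase(abc.ABC):
    """The same hypothetical callback, declared by inheriting from abc.ABC."""

    @abc.abstractmethod
    def handle_result(self, result):
        ...


def is_abstract(cls):
    """Return True if instantiating cls raises TypeError (abstract methods left)."""
    try:
        cls()
    except TypeError:
        return True
    return False


class EchoCallback(CallbackViaBase):
    """Concrete subclass: implements the abstract method, so it can be instantiated."""

    def handle_result(self, result):
        return f"result: {result}"
```

Both base classes end up with `abc.ABCMeta` as their metaclass, so neither can be instantiated until a subclass implements `handle_result`.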

11005 commits
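
Commit fa4e41c5b2 in the list above notes that setting variables through `os.environ` couples tests, and switches `test_json` to pytest's `monkeypatch`. The sketch below illustrates the same idea using only the standard library: it swaps in `unittest.mock.patch.dict` as a stand-in for `monkeypatch.setenv`, so it is not the code from that commit. The variable name `EXAMPLE_RESULT_DIR` and all function names are hypothetical.

```python
import os
from unittest import mock

VAR = "EXAMPLE_RESULT_DIR"  # hypothetical variable name, for illustration only


def read_setting():
    """Read the variable the way code under test might."""
    return os.environ.get(VAR, "default")


def leaky_test():
    # Writes directly to os.environ; the value outlives this function,
    # so any test that runs afterwards sees it.
    os.environ[VAR] = "/tmp/leaky"
    return read_setting()


def isolated_test():
    # patch.dict restores os.environ to its previous state when the block exits,
    # which is the same guarantee monkeypatch.setenv gives at test teardown.
    with mock.patch.dict(os.environ, {VAR: "/tmp/isolated"}):
        return read_setting()
```

Running `isolated_test()` leaves the environment unchanged afterwards, whereas `leaky_test()` leaves `EXAMPLE_RESULT_DIR` set for whatever runs next.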