hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 02:21:39 -05:00

Author	SHA1	Message	Date
Kai Fricke	489e6945a6	Revert "[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113 )" (#21338 ) This reverts commit `327eb84154`.	2022-01-03 10:21:25 +00:00
Benjamin Black	327eb84154	[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113 )	2022-01-02 21:29:09 +01:00
Ishant Mrinal	ec34185771	[RLlib] RE3 documentation (#21199 )	2022-01-02 17:31:53 +01:00
Carlo Grisetti	ff768ea9d4	[RLlib] Change deprecated `rllib/utils/tf_ops.py` import (#20978 )	2022-01-02 17:29:37 +01:00
Balaji Veeramani	c263008c07	[RLlib] Move `__grouping_doc_end__` (#21321 ) These changes are needed for two reasons. (1) `__grouping_doc_end__` is in the wrong place: if you look at the part of the Ray documentation where the tag is referenced, you'll read "You can use the MultiAgentEnv.with_agent_groups() method to define these groups:". However, the code snippet below it shows the implementation of `to_base_env` in addition to the implementation of `with_agent_groups`. To remove `to_base_env` from the code snippet, we need to move `__grouping_doc_end__`. (2) Black cannot format `multi_agent_env.py`: for some reason, Black errors while formatting `multi_agent_env.py`. However, if we move `__grouping_doc_end__` up, the issue is resolved.	2022-01-01 20:11:06 -08:00
Balaji Veeramani	fae5b9b1af	[Core] Disable formatting in `test_add_min_workers_nodes` (#21322 ) Black errors while formatting `test_resource_demand_scheduler.py`. The issue is caused by the [assertions](https://github.com/ray-project/ray/blob/master/python/ray/tests/test_resource_demand_scheduler.py#L383-L428) at the end of `test_add_min_workers_nodes`. To prevent `format.sh` from erroring once we switch to Black, I've disabled formatting around the assertions.	2022-01-01 18:16:33 -08:00
Balaji Veeramani	416bce6378	Ignore E731 in `worker_set.py` and `sampler.py` (#21320 )	2022-01-01 18:05:14 -08:00
Qing Wang	340fbf53c0	[Java] Support actor handle reference counting. (#21249 )	2022-01-01 10:26:22 +08:00
Kai Fricke	14ed7cfaaa	[ci] Add repro-ci.py script to automatically setup Buildkite-runner-like instances to debug CI runs (#21292 ) Create an AWS instance to reproduce Buildkite CI builds. This script takes a Buildkite build URL as an argument and creates an AWS instance with the same properties, running the same Docker container as the original Buildkite runner. The user is then attached to this instance and can reproduce any build commands as if they were executed within the runner. This utility can be used to reproduce and debug build failures that come up on the Buildkite runner instances but not on a local machine.	2021-12-31 10:31:50 +00:00
WanXing Wang	412cd6be76	[Core]Add RAY_REDIS_ADDRESS environment to specify external address. (#20966 ) Support the RAY_REDIS_ADDRESS environment variable as an option for `ray start`.	2021-12-31 16:12:56 +08:00
Tao Wang	a78baf4075	[Java]Init gcs client in runtime only if necessary (#21072 ) The gcs client holds a redis connection, but most of the time the gcs client is never used in the worker. We can make the initialization lazy to reduce redis connections. After that, the number of redis connections drops from 2 to 1 per core worker.	2021-12-30 15:44:06 +08:00
Shawn	4f9aceb3a6	[Java] Native memory support (#21256 ) This PR provides universal native memory access support in the Java worker, as mentioned in #21234, which will also be the foundation for later zero-copy and serialization work. The main changes include: * Native memory operations based on `sun.misc.Unsafe` * Little-endian native memory buffer * Native-memory-based IO operations: InputStream/OutputStream, ReadChannel/WriteChannel, MockReadChannel/MockWriteChannel	2021-12-30 15:31:22 +08:00
mwtian	20ca1d85c2	[GCS][Bootstrap 2/n] Fix tests to enable using GCS address for bootstrapping (#21288 ) This PR contains most of the fixes @iycheng made in #21232, to make tests pass with GCS bootstrapping by supporting both Redis and GCS addresses as the bootstrap address. The main change is to use address_info["address"] to obtain the bootstrap address to pass to ray.init(), instead of using address_info["redis_address"]. In a subsequent PR, address_info["address"] will return the Redis or GCS address depending on whether GCS is used to bootstrap.	2021-12-29 19:25:51 -07:00
Jiajun Yao	9776e21842	Revert "Round robin during spread scheduling (#19968 )" (#21293 ) This reverts commit `60388b2834`.	2021-12-30 10:33:06 +09:00
mwtian	0b3fed5ef3	Revert "[Nightly Test] Add a team column to each test config. (#21198 )" (#21289 ) This reverts commit `b5b11b2d06`.	2021-12-30 06:44:51 +09:00
Qing Wang	8d2f53e25b	[Java] Add dependency reduced pom file to gitignore. (#21282 )	2021-12-29 21:49:06 +08:00
mwtian	5377832383	[GCS][Bootstrap 1/n] Support bootstrapping with GCS in node.py (#21267 )	2021-12-28 08:14:38 -07:00
Qing Wang	663e14b232	[Java] Fix namespace test case. (#21280 ) Since Java now supports actor lifetimes, the test should set DETACHED for the detached actors.	2021-12-28 22:31:51 +08:00
WanXing Wang	e5920dee8e	[Core]Refine StealTasks rpc. (#21258 ) The `StealTasks` rpc appears to be no different from other common rpc methods, so it should be implemented with the `VOID_RPC_CLIENT_METHOD` macro. We found this while merging code into our internal codebase.	2021-12-28 14:17:25 +08:00
Philipp Moritz	4b9e865fd7	Unskip remaining tests in test_basic.py on Windows (#21273 )	2021-12-27 21:20:45 -08:00
SangBin Cho	b5b11b2d06	[Nightly Test] Add a team column to each test config. (#21198 ) Please review e2e.py and test_suite belonging to your team! This is the first part of https://docs.google.com/document/d/16IrwerYi2oJugnRf5hvzukgpJ6FAVEpB6stH_CiNMjY/edit# This PR adds a team name to each test suite. If the name is not specified, it will be reported as unspecified. If you are running a local test, and if the new test suite doesn't have a team name specified, it will raise an exception (in this way, we can avoid missing team names in the future). Note that we will aggregate all of test config into a single file, nightly_test.yaml.	2021-12-27 14:42:41 -08:00
Matti Picus	3de18d2ada	WINDOWS: enable passing/skipping tests (#21136 )	2021-12-27 11:59:00 -08:00
Israël Hallé	59209d695b	Includes .pyi files in package data. (#21247 )	2021-12-27 11:50:02 -08:00
Philipp Moritz	583744ab57	Graduate Ray on Windows from experimental to beta (#21268 )	2021-12-27 00:19:48 -08:00
Matti Picus	fcb952e1bc	WINDOWS: unskip passing runtime_env tests (#21252 )	2021-12-26 20:49:02 -08:00
Akash Patel	cbcd03b779	Upgrade cython to 0.29.26 for py310 (#21244 )	2021-12-26 20:26:08 -08:00
xwjiang2010	0b9cdb1eae	[tune] Have one canonical way of stopping trial. (#21021 ) This PR introduces a canonical implementation for stopping trials by collecting scattered logic from process_trial_result back into stop_trial. This way, we know what is expected (e.g. what callbacks are invoked and when they are invoked). This PR also corrects the current faulty logic in which the on_trial_complete callback is invoked before on_trial_checkpoint, which is the source of Syncer clean up issues.	2021-12-25 10:13:30 +01:00
Gagandeep Singh	c5c5fec22b	Unskip `test_standalone` from ci.sh (#21235 )	2021-12-25 00:21:58 -08:00
Yi Cheng	0d537c5d70	[5/gcs] Bootstrap default worker and update pubsub unit test (#21211 ) This PR passes the gcs address to the worker and also updates the pubsub unit test. Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com> Co-authored-by: Mingwei Tian <mwtian@anyscale.com>	2021-12-23 07:57:14 -07:00
Qing Wang	2df27a5f87	[Java] Support ActorLifetime (#21074 ) We add an enum class ActorLifetime to indicate the lifetime of an actor. In this PR, we also add the necessary API to create an actor with a specified lifetime. Currently, it has 2 values: detached and default.	2021-12-23 19:48:56 +08:00
Qing Wang	e653d47533	[Java] Shade some widely used dependencies in bazel_jar_jar rule. (#21237 ) These dependencies are widely used: - com.google.common - com.google.protobuf - com.google.thirdparty We therefore need to shade them to avoid conflicts with jars introduced by users. In this PR, we introduce a `bazel_jar_jar` rule for doing this and also shade them in the maven pom files.	2021-12-23 16:54:31 +08:00
Jiajun Yao	60388b2834	Round robin during spread scheduling (#19968 )	2021-12-22 20:27:34 -08:00
SangBin Cho	99693096d6	[gRPC] Improve blocking call Placement group (#21130 ) Use Sync methods with timeout for placement group RPCs	2021-12-22 17:21:56 -08:00
Yi Cheng	11ab412db1	[4/gcs] Bootstrap global accessor from gcs (#21195 ) This is part of redis removal. This PR enables the global accessor to start from gcs. Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com> Co-authored-by: Mingwei Tian <mwtian@anyscale.com>	2021-12-22 01:27:25 -08:00
Gagandeep Singh	92bf609a08	Unskip tests in ``test_basic_3.py`` (#20433 )	2021-12-22 00:09:32 -08:00
Yi Cheng	0c786b1109	[3/gcs] Bootstrap log monitor and monitor from gcs (#21194 ) This is part of redis removal. This PR enables the log monitor and monitor to bootstrap from gcs. Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com> Co-authored-by: Mingwei Tian <mwtian@anyscale.com>	2021-12-21 23:15:55 -08:00
Simon Mo	cfe0897d05	[CI] Migrate Windows tests to Buildkite (#21227 )	2021-12-21 20:16:34 -08:00
Sidhartha Parhi	5d6409fe2e	[Train] Remove `run_dir` param from `BackendExecutor` (#21231 ) The run_dir argument in ray.train.backend.BackendExecutor.start_training isn't used but causes the following error: if your host computer and job cluster use different operating systems, you get a pathlib error because, for example, you can't instantiate a pathlib.WindowsPath on a Linux system. Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>	2021-12-21 19:54:43 -08:00
Amog Kamsetty	57db4640ca	[Train] [Tune] Refactor MLflow (#20802 ) Pulls out Tune's MLflow logging logic to a shared MLflow util. Adds an MLflow logger callback to Ray Train. Closes #20642	2021-12-21 17:17:52 -08:00
Yi Cheng	09421a4ca6	[2/gcs] Bootstrap dashboard for gcs ha (#21179 ) This is part of the gcs ha project. This PR bootstraps the dashboard with the gcs address instead of redis. Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>	2021-12-21 16:58:03 -08:00
Eric Liang	1db03862a7	Isolate function exports by job in separate queues (#20882 )	2021-12-21 16:19:00 -08:00
Jiajun Yao	7d861a2c58	[Test] Add ray wheel sanity check (#21223 )	2021-12-21 14:24:02 -08:00
Gagandeep Singh	5dc0f90ada	[Windows] Unskipped tests in test_standalone.py (#21213 )	2021-12-21 11:37:23 -08:00
Yi Cheng	f62faca04c	[1/gcs] gcs ha bootstrap for raylet (#21174 ) This is part of #21129. This PR tries to cover the cpp/ray part of the bootstrap. Some updates there: * remove the unused function/tests * some API updates Co-authored-by: mwtian <81660174+mwtian@users.noreply.github.com>	2021-12-21 08:50:42 -08:00
SangBin Cho	5d3042ed9d	[Internal Observability] Record Raylet Gauge (#21049 ) * Revert "[Please revert] Remove new metrics temporarily" This reverts commit baf7846daa3d1dad50dbedac19b7afbae3e197fc. * Addressed code review. * [Please revert] Revert plasma stats for the next PR * improve grammar * Addressed code review v1. * Addressed code review. * Add code owner. * Fix tests. * Add code owner to metric_defs.cc	2021-12-21 00:34:48 -08:00
Sven Mika	62dbf26394	[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984 )	2021-12-21 08:39:05 +01:00
Dmitri Gekhtman	c9cf912a15	[autoscaler] Pass on provider.internal_ip() exceptions during scale down (#21204 ) Treats failures of provider.internal_ip during node drain as non-fatal. For example, if a node is deleted by a third party between the time it's scheduled for termination and drained, there will now be no error on GCP. Closes #21151	2021-12-20 22:23:17 -08:00
qicosmos	d1a27487a3	[C++ Worker] fix uninit ray runtime instance (#21125 ) With some compilers, the static ray runtime in the ray runtime holder may be a new uninitialized instance in a dynamic library, so we need to initialize the ray runtime holder in the dynamic library to make sure the new instance is valid.	2021-12-21 12:07:59 +08:00
Qing Wang	94251fbcc4	[Core] Fix invalid to specify concurrency group at runtime. (#21191 ) This fixes the issue where the concurrency group name of an actor task could not be specified at runtime with the following usage: ```python a.f2.options(concurrency_group="compute").remote() ```	2021-12-21 10:47:47 +08:00
Linsong Chu	61bbecdb7d	[Workflow]add doc for metadata (#20156 ) This PR adds documentation for Workflow Metadata, for which we recently added support in https://github.com/ray-project/ray/pull/19372. Co-authored-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>	2021-12-20 17:24:07 -08:00

10813 commits