Commit graph

9434 commits

Author SHA1 Message Date
qicosmos
ba0084e9c7
[C++ Worker]Add gcs global state accessor (#17976) 2021-09-09 12:08:08 +08:00
Lixin Wei
df803cee98
Revert "Revert "[Core] Fix ServerCall Leaking (#17863)" (#18410)" (#18424) 2021-09-08 19:55:06 -07:00
architkulkarni
5affb074aa
[Test] deflake test_runtime_env.py::test_no_spurious_worker_startup (#17809) 2021-09-08 16:35:08 -07:00
Clark Zinzow
c0ea2755a0
Fix iter_batches dropping batches when prefetching. (#18441) 2021-09-08 15:37:38 -07:00
Clark Zinzow
6fc91fd47e
Create directory on write if it doesn't exist. (#18435) 2021-09-08 15:31:06 -07:00
Simon Mo
6d24214085
[Release] Make sure to uninstall ray for rllib_tests (#18448) 2021-09-08 23:29:40 +01:00
Edward Oakes
f0555f88d6
[runtime_env] Move worker process startup logic to context (#18341) 2021-09-08 17:08:27 -05:00
Antoni Baum
dd6abed6ce
[tune] Fix an edge case where DurableTrainable would not delete checkpoints in remote storage (#18318)
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-09-08 15:00:09 -07:00
Sven Mika
cd22a7d1bb
[RLlib] Add locking to PolicyMap in case it is accessed by a RolloutWorker and the same worker's AsyncSampler or the main LearnerThread. (#18444) 2021-09-08 23:32:23 +02:00
gjoliver
50cdf551ce
[RLlib] Fix test name typo. (#18423)
Co-authored-by: Jun Gong <jungong@mbpro.local>
2021-09-08 23:30:37 +02:00
gjoliver
808b683f81
[RLlib] Add a unittest for learning rate schedule used with APEX agent. (#18389) 2021-09-08 23:29:40 +02:00
Ian Rodney
c91e0eb065
[Dashboard] Increase Actor Snapshot Size (#18433) 2021-09-08 12:06:33 -07:00
Lixin Wei
052ed115e7
[Core] Make It Easier to Grep Debug State Dump (#18382)
* add keyword to debug dump

* fix
2021-09-08 12:03:54 -07:00
Yi Cheng
6011d4197f
[nightly] Add many_nodes_actor_test to nightly test (#18406) 2021-09-08 11:15:48 -07:00
Yi Cheng
7126d01c91
[core] upgrade gtest (#18288)
* up

* up

* format

* up

* flaky fix

* format

* up

* up

* format

* add debug

* up

* up

* up

* up

* up

* format

* fix

* format

* up

* up

* format
2021-09-08 11:15:34 -07:00
Sven Mika
45f60e51a9
[RLlib] DDPPO fixes and benchmarks. (#18390) 2021-09-08 19:39:01 +02:00
Sasha Sobol
f76f14fedf
[client] pass _credentials down from init (#18425) 2021-09-08 10:30:26 -07:00
Clark Zinzow
b30c41759d
[Datasets] Adds tensor column support (tensors-in-tables) via Pandas/Arrow extension types/arrays. (#18301) 2021-09-08 10:09:01 -07:00
mwtian
e427e4a467
Fix flakiness in test_proxy_manager_internal_kv (#18416) 2021-09-08 15:46:45 +03:00
Kai Fricke
dac3a8bc8e
[setup] Upstream conda patches (#17575)
Co-authored-by: Vasilij Litvinov <vasilij.n.litvinov@intel.com>
2021-09-08 10:37:17 +01:00
Lingxuan Zuo
46b941b702
[Streaming] Support streaming metric reporter (#17981)
* Streaming support metric reporter

* fix lint

* fix bazel format lint

* fix lint

* metric deps lint

* lint

* and comments for runtime reporter

* unordered_map instead

* comments

* fix visibility flag

* deps local .so target

* make stats public visibility

* stats lib in public

* add antgroup team tag
2021-09-08 14:36:00 +08:00
Chen Shen
df9c6aa863
[plasma] Check if the get request is removed (#18401) 2021-09-07 21:01:08 -07:00
Edward Oakes
56adaa32f1
[serve] Better logging for exceptions in backend_state.update() (#18402) 2021-09-07 21:40:41 -05:00
Simon Mo
a29da81cfc
Revert "Revert "Fix tracing bug when actors are defined before connecting to …" (#16122) 2021-09-07 16:19:49 -07:00
Edward Oakes
f2afb08125
[runtime_env] Don't modify passed runtime_env dictionary when validating (#18404) 2021-09-07 16:14:28 -07:00
Chen Shen
d65d291579
Revert "[Core] Fix ServerCall Leaking (#17863)" (#18410)
This reverts commit 4f6b50dc46.
2021-09-07 15:47:58 -07:00
Lada Kunc
1a72c49009
[serve] Fix get_handle execution from threads (#18198) 2021-09-07 14:49:36 -07:00
Guyang Song
f104a5aad7
[docs] Fix cpp wheel description (#18386) 2021-09-07 15:45:04 -05:00
Lixin Wei
4f6b50dc46
[Core] Fix ServerCall Leaking (#17863)
* fix backpressure bug

* update comments

* stash

* add test

* add basic tests

* add fixture

* stash

* fix

* draft

* fix

* test added

* fixed

* fixed

* lint

* Update src/ray/rpc/test/grpc_server_test.cc

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* add copyright

* move test service to saperate file

* add ClientCallManager timeout tests

* fix

* lint

* lint

* lint

* test windows CI

* fix

* lint

* lint

* retry windows

* retry windows

* fix mac

* lint

* lint

Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2021-09-07 12:15:43 -07:00
xwjiang2010
64c2f86a22
[Tune] Respect default_resources during Trial.reset(). (#18209) 2021-09-07 19:14:44 +01:00
Clark Zinzow
26b2720915
Add test coverage for writing to fsspec filesystems. (#18394) 2021-09-07 10:16:59 -07:00
Ian Rodney
ec2110e470
[Codeowners] Add Chris & Mingwei to Ray Client proto (#18395) 2021-09-07 09:17:23 -07:00
Jiajun Yao
2740d28fad
[client] Increase timeout for ProxyManager.get_channel (#18350) 2021-09-07 11:06:17 -05:00
Qing Wang
d87441cda7
[Java] ConcurrencyGroup in Java local mode. (#18241)
* WIP

* Fix

* Fix test

* Refine

* Fix lint,

* WIP2

* WIP2

* Refine

* Put a default concurrency group.

* Fix submitting task with concurrency group name.

* Remove unnecessary changes.

* Update java/runtime/src/main/java/io/ray/runtime/task/LocalModeTaskSubmitter.java

Co-authored-by: Kai Yang <kfstorm@outlook.com>

Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-09-07 20:43:31 +08:00
Sven Mika
cabaa3b3c6
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. (#18381) 2021-09-07 11:48:41 +02:00
Jiajun Yao
64040a90a5
Datasets schema should match the columns selection for Parquet (#18361) 2021-09-07 00:41:26 -07:00
Sasha Sobol
f24ccf475e
[client] Add a grpc.ChannelCredentials argument to ray.init (#18365)
Co-authored-by: Thomas Desrosiers <thomas@anyscale.com>
2021-09-07 00:17:13 -07:00
Sven Mika
56f142cac1
[RLlib] Add support for evaluation_num_episodes=auto (run eval for as long as the parallel train step takes). (#18380) 2021-09-07 08:08:37 +02:00
Kai Fricke
f3a3a4bc92
[tune] Queue more than more actor/placement group (#18338) 2021-09-06 09:41:08 -07:00
Sven Mika
5292b70fc6
[RLlib] Add multi-GPU attention net tests to nightly test suite (+ R2D2 tests for LSTM and attention nets). (#18368) 2021-09-06 17:48:05 +02:00
Kai Fricke
d9552e6795
Update release process doc and checklist (#18336)
Co-authored-by: Qing Wang <kingchin1218@126.com>
2021-09-06 14:09:31 +01:00
Sven Mika
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358) 2021-09-06 12:14:20 +02:00
Sven Mika
59f796edf3
[RLlib] Fix crash when using StochasticSampling exploration (most PG-style algos) w/ tf and numpy > 1.19.5 (#18366) 2021-09-06 12:14:00 +02:00
Guyang Song
5a89b47f56
[Event] support set event level (#18275)
Co-authored-by: Hao Chen <chenh1024@gmail.com>
2021-09-06 16:41:49 +08:00
Eric Liang
cbdafa0b63
[doc] Fix various workflow doc bugs (#18357) 2021-09-06 01:39:08 -07:00
Chen Shen
7c9d261dce
[Core][plasma] consolidate stats calculation for plasma store 2021-09-05 22:24:21 -07:00
Richard Liaw
0594deafdf
[tune] allow users to configure bootstrap for docker syncer (#17786) 2021-09-05 22:04:31 -07:00
Richard Liaw
93f7976215
[docs/deps] Clean up dependency ux/docs #18360
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-09-05 22:03:32 -07:00
qicosmos
1da05209b9
[C++ Worker]Add get actor API. (#17897)
* linkopts shared

* add get actor api

* fix

* improve

* reduce some duplicate code

* improve some
2021-09-06 11:46:46 +08:00
Sven Mika
ba58f5edb1
[RLlib] Strictly run evaluation_num_episodes episodes each evaluation run (no matter the other eval config settings). (#18335) 2021-09-05 15:37:05 +02:00