Commit graph

4789 commits

Author SHA1 Message Date
Jiao
c6436ba7d6
[Serve] Add ray serve's logging context manager (#16468)
* Add ray serve's logging context manager

* Add ray serve's logging context manager

run formatting script scripts/format.sh

* fix missing package-lock json file

* linter

Co-authored-by: Jiao Dong <jiaodong@anyscale.com>
2021-06-16 13:17:07 -07:00
Clark Zinzow
00eb833de2
[Core] Stopgap fix for async actor lost object bug, and adds reproduction as test. (#16414)
* Support asyncio with max_concurrency == 1.

* Added test that reproduces lost object error.

* Create a fiber thread per caller instead of sharing a fiber thread among all callers.

* Formatting.

* Remove debug print statement.

* Try to accommodate dumb stupid linter that apparently doesn't know that async list comprehensions landed in Python 3.6, let alone await in list literals.
2021-06-16 12:39:45 -07:00
SangBin Cho
5997d19a5a
[Test] Global gc unit test flakiness fix (#16471) 2021-06-16 09:26:04 -07:00
SangBin Cho
90599d3562
[Pubsub] Use a pubsub module for Ownership based object directory (#16407)
* in progress

* In progress 2

* progress

* OBOD pubsub done

* Fix

* Fix a bug.

* Clean up getObjectLocationOwner

* Fix a build issue.

* Lint issue.

* test fix in progress

* continue debugging

* in progress

* Fix issues again.

* Formatting

* formatting

* fix issues.

* Revert "fix issues."

This reverts commit 2da577e68abc6278e03d64a60e8b96c3136145bf.

* Fix a critical bug.

* Revert "Revert "fix issues.""

This reverts commit 6546ecbd1eb9798de0bf990b30b85a3ca3e5b4ad.

* Addressed code review.
2021-06-16 09:15:13 -07:00
Ian Rodney
90805d302f
[Client] Fix ArgParse (#16456)
Co-authored-by: Ian Rodney <ilr@anyscale.com>
2021-06-15 23:52:02 -07:00
Antoni Baum
ec7d7c8630
[Tune] Add soft imports test (#16450) 2021-06-15 18:50:21 -07:00
Eric Liang
5967cd3cf3
Make placement_group=None work as expected. (#16437)
* update

* add task test

* fix
2021-06-15 18:30:53 -07:00
Antoni Baum
2fb10e6730
[SGD] Add support for native Torch AMP in SGD (#16382)
* SGD native AMP initial commit

* SGD native amp second pass

* Update docs

* Update TorchTrainer doc

* Temp fix release test

* Update release/sgd_tests/sgd_gpu/sgd_gpu_app_config.yaml

Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2021-06-15 17:48:21 -07:00
Amog Kamsetty
ca22df2367
[Dask] Re-enable scheduler on dask_shuffle example (#16405) 2021-06-15 17:47:57 -07:00
Amog Kamsetty
d23494d25a
[CI] Move test_shuffle to Medium tests (#16447)
* move

* unskip test
2021-06-15 17:45:54 -07:00
Amog Kamsetty
3bf8f94fa3
Skip test_shuffle on Travis (#16449) 2021-06-15 13:54:01 -07:00
Ruoyun Huang
562018b55a
[sgd] Use target label count as training batch size (#16400) 2021-06-15 12:09:51 -07:00
junfan.zhang
2abb1e1d38
Fix misleading tips in scripts.py (#16426) 2021-06-15 11:42:05 -07:00
Eric Liang
823c9af20d
Skip test_shuffle_hang on Windows 2021-06-15 11:21:22 -07:00
Antoni Baum
adb3b61c03
[tune] Fix ConcurrencyLimiter batch mode never finishing if searcher limits concurrency itself (#16416) 2021-06-15 11:18:12 -07:00
Eric Liang
1ef207abb6
Call Unblockifneeded (#16422) 2021-06-15 08:40:23 -07:00
Sven Mika
d0014cd351
[RLlib] Policies get/set_state fixes and enhancements. (#16354) 2021-06-15 13:08:43 +02:00
Eric Liang
992437eafe
Yield plasma lock to other threads during long-running gets (#16408) 2021-06-14 16:23:05 -07:00
Eric Liang
f93ca2b673
Make it much simpler to turn on event stats (#16401) 2021-06-14 09:51:24 -07:00
Amog Kamsetty
f9936c4252
[Dask] Dask Example Tests (#16346)
* add examples

* update dask docs

* add build file

* formatting

* fix ci command

* fix

* Update python/ray/util/dask/BUILD

* newline

* fix pytest fixtures

* fixes

* formatting

* fix shuffle example
2021-06-12 20:25:45 -07:00
Qingyun Wu
dae3ac1def
[Tune] Add new searchers from FLAML (#16329) 2021-06-12 02:10:51 -07:00
Xianyang Liu
59f639f9db
[core] Fixes connect from worker node failed (#16045)
* fixes connect from worker node

* add UT

* fixes

* address comments
2021-06-11 18:51:46 -07:00
Chen Shen
24e409f948
[spilled object push optimization 3/3] ObjectManager Push from Spilled Object (#16364) 2021-06-11 15:57:51 -07:00
Philipp Moritz
ab092d901f
Increase redis connection timeout (#16384) 2021-06-11 15:57:35 -07:00
architkulkarni
be1129e04f
[Serve] Add tests for Serve quickstart with ray client (#16344) 2021-06-11 15:43:47 -07:00
Chris K. W
3fa9f2e5d6
[Modin] Add tests for modin (#16260)
Adds modin tests that run with and without ray client.
2021-06-11 12:23:33 -07:00
Kai Fricke
e8f8e9f328
[tune] Adjust searcher sample bounds to match Tune API (#15899) 2021-06-11 14:31:08 +01:00
Eric Liang
47bbca04be
Add fallback allocator stats to "ray memory" (#16362) 2021-06-10 18:33:59 -07:00
mwtian
2e8d8fba02
[Build] Another attempt at building Python 3.9 MacOS wheels (#16347)
* Test rollforward python 3.9

* Upgrade setproctitle in thirdparty

* Update cython requirement to match those in wheel build scripts.

* Fix MANIFEST.in

* Fix file copying in setup.py
2021-06-10 10:20:30 -07:00
SongGuyang
09eebf0cc7
[C++ worker] use gflags to parse arguments in default worker (#16350)
* remove id.h dependence for c++ worker headers

* delete unuseful test

* fix lint

* [C++ worker] support config from RayConfig and command line(gflag)

* check password

* fix

* fix some comments

* optimization

* add files

* default to run cluster mode

* fix

* add node-ip-address params to C++ worker

* fix lint

* use gflags in c++ default_worker
2021-06-11 00:46:12 +08:00
Eric Liang
ae0e38b86d
Remove legacy feature flags / features (#16349) 2021-06-10 09:31:38 -07:00
Eric Liang
d390344a8f
Enable plasma fallback allocations by default (#16244) 2021-06-09 22:05:52 -07:00
SongGuyang
67761a4fc5
[C++ worker] add node-ip-address params to C++ worker (#16253) 2021-06-10 11:10:56 +08:00
architkulkarni
7d029f8e71
[Doc] [Core] [runtime env] Add runtime env doc (#16290) 2021-06-09 20:02:16 -05:00
Siyuan (Ryans) Zhuang
8aee4e5634
[Workflow] Workflow API extension (#16276) 2021-06-09 14:55:01 -07:00
Chen Shen
5fe03667b9
[RFC] add ray.util.get_locations() to look up objects' location. (#16130)
* Implement GetLocationFromOwner at CoreWorker that looks up the locations
for a list of object ids

* plumbing GetLocationAPI to CoreWorker

* introduce primary_node_id in refcounter

* add python tests

* address comments

* fix lint

* remove C++ tests

* more tests

* add more tests

* linter

* lint

* lint

* address comments

* fix merge issue

* nits
2021-06-09 11:30:42 -07:00
SongGuyang
874e947d6f
[runtime env] support create or delete runtime envs in agent (#15904) 2021-06-09 20:22:25 +08:00
Ian Rodney
c2f5ca399f
[Cleanup] Use Constant instead of "RAY_ADDRESS" in code (#16257) 2021-06-08 22:53:56 -07:00
Kai Yang
81be461ba2
[Core] Limit starting workers with maximum_startup_concurrency per worker type (#16214) 2021-06-09 13:11:53 +08:00
Dmitri Gekhtman
41b2e569fb
[autoscaler] Don't rsync cluster state with local node provider (#16281) 2021-06-08 12:27:06 -07:00
Sven Mika
4b8dadccbd
[RLlib] Fix PR 16162: Having added sleep to _NextValueNotReady causes TD3 tests to become flakey. (#16309) 2021-06-08 07:27:02 -07:00
Chris K. W
c8e3ed9eec
[core] Use function_actor_manager.lock when deserializing (#16278)
* use function_actor_manager.lock when deserializing

* add comment and todo

* better comment

* fix comment
2021-06-08 00:13:42 -07:00
Alex Wu
6f5064b7ef
Use pytest not unittest (#16265)
* .

* done

* done

* .

* .

Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-07 12:26:56 -07:00
Alex Wu
9f8f108e3f
[deflek] Split test failure into test failure 4 (#16264)
* .

* .

Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-07 11:54:55 -07:00
Edward Oakes
418dd1e8b9
fix serve start namespace issue and add test (#16291) 2021-06-07 11:30:31 -07:00
Siyuan (Ryans) Zhuang
480e5e822e
Initial workflow API implementation (#16174) 2021-06-07 10:00:15 -07:00
architkulkarni
b88163f010
[Core] [runtime env] Fix injection of ray[default] (#16275) 2021-06-05 17:32:50 -05:00
architkulkarni
b3a0b97737
Revert "[Core] [runtime env] Inject ray[default] into pip dependencies (#16268)" (#16274)
This reverts commit e5fad4bc2d.
2021-06-05 21:26:19 +03:00
Eric Liang
ca861ee47f
update (#16270) 2021-06-05 11:16:01 -07:00
Dmitri Gekhtman
7d1e7a0d4f
[autoscaler] Fix local node provider (#16202)
* Don't override resources for local node provider.

* Wip

* Local node provider prep logic

* ../python/ray/autoscaler/local/defaults.yaml

* wip

* Fix example-full

* defaults comment

* wip

* head type max workers

* sync-state

* No docker

* Fix

* external head ip option

* wip

* move external_ip out of tags

* Update examples

* Update comment

* Skip local defaults

* Config test

* Test external ip

* Change ray start commands to what they were before

* missing yamls

* Fix test

* Remove scary Docker

* Fixes

* Extra test

* address comments

* fixes pre-single-node-type-attempt

* rewrite comment a bit

* One type

* fix

* get rid of pdb

* no placeholders

* fix

* worker nodes and head node optional during launch

* fix

* fix again

* config comment fixes

* mock -> aws, not local

* Update python/ray/autoscaler/_private/local/config.py

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>

* second pop fixed

* Explanatory comments for config logic

* deprecation comments

* Update python/ray/autoscaler/_private/local/config.py

Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>

* update test

* fix

* More descriptive name for local provider check

* Remove external-ip from example minimal and add a more detailed doc string.

* Make clearer the equivalence between a ray restart and non-empty ray-start commands

* extra comment

* Update python/ray/autoscaler/_private/local/node_provider.py

* Update python/ray/autoscaler/_private/commands.py

* Update python/ray/autoscaler/_private/commands.py

* Update python/ray/autoscaler/_private/util.py

* lint

* Update python/ray/autoscaler/_private/local/node_provider.py

Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
2021-06-05 19:29:19 +03:00