
4127 commits

Author SHA1 Message Date
Maksim Smolin
3a134c7224
[RaySGD] Rename PyTorch API endpoints to start with Torch (#7425)
* Start renaming pytorch to torch

* Rename PyTorchTrainer to TorchTrainer

* Rename PyTorch runners to Torch runners

* Finish renaming API

* Rename to torch in tests

* Finish renaming docs + tests

* Run format + fix DeprecationWarning

* fix

* move tests up

* rename

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-03 16:44:42 -08:00
Siyuan (Ryans) Zhuang
f6883bf725
Keep cloudpickle up-to-date with the upstream (#7406) 2020-03-03 13:52:54 -08:00
Edward Oakes
b0bf5450c2
Fix flaky multiprocessing tests (#7413) 2020-03-03 15:07:59 -06:00
ijrsvt
fb76092d75
Re-route asyncio plasma code path through raylet instead of direct plasma connection (#7234) 2020-03-03 15:43:46 -05:00
Philipp Moritz
c2c6d96490
Fix install documentation on readthedocs (#7423) 2020-03-03 11:03:18 -08:00
Edward Oakes
04ec599441
Use ray.kill() in multiprocessing.Pool (#7409) 2020-03-03 12:49:13 -06:00
Allen
b74eb5fce6
Capture output for commands run by the autoscaler (#7381) 2020-03-03 10:19:21 -08:00
mehrdadn
4d42664b2a
Use prctl(PR_SET_PDEATHSIG) on Linux instead of reaper (#7150) 2020-03-03 11:45:42 -06:00
fangfengbin
f5b1062ed9
Fix TwoNodeTest.TestActorTaskCrossNodes test case when GCS service is enabled (#7416) 2020-03-03 19:37:38 +08:00
ijrsvt
584645cc7d
Fix Experimental Async API (#7391) 2020-03-02 22:24:20 -06:00
Edward Oakes
580b017b43
Fix flaky global GC tests (#7407) 2020-03-02 21:03:01 -06:00
Edward Oakes
9e9f1962c7
Enable test_actor_pool in CI (#7405) 2020-03-02 20:24:36 -06:00
Edward Oakes
2b6f00724a
Enable test_joblib in CI (#7404) 2020-03-02 20:03:27 -06:00
Edward Oakes
d69fe54f6d
Temporarily skip testEndToEndReporting (#7402) 2020-03-02 18:27:34 -06:00
Eric Liang
0f88444686
[rllib] Support multi-agent training in pipeline impls, add easy flag to enable (#7338) 2020-03-02 15:16:37 -08:00
Sven Mika
d8eeb96413
Fix issue with torch PPO not handling action spaces of shape=(>1,). (#7398) 2020-03-02 10:53:19 -08:00
Qing Wang
2771af1036
Fix the bug of unregistered workers in worker pool (#7343)
* Fix

* Fix

* Fix compile

* Fix lint

* Fix linting

* Fix testDeleteObject

* Fix linting

* Update src/ray/raylet/worker_pool.cc

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update src/ray/raylet/worker_pool.cc

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update src/ray/raylet/worker_pool.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update src/ray/raylet/worker_pool.cc

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Fix linting

Co-authored-by: Hao Chen <chenh1024@gmail.com>
2020-03-02 16:30:39 +08:00
Siyuan (Ryans) Zhuang
0792b5cb93
Fix the numpy ndarray subclass serialization bug (#7392) 2020-03-01 23:05:59 -08:00
Richard Liaw
48cdca843f
[raysgd] Custom training operator (#7211) 2020-03-01 21:22:48 -08:00
Sven Mika
2d97650b1e
[RLlib] Add Exploration API documentation. (#7373)
* Add Exploration API documentation.

* Add Exploration API documentation.

* Add Exploration API documentation.

* Update exploration docs.
2020-03-01 16:55:41 -08:00
mehrdadn
44aded5272
Bazel mirrors (#7385)
* Switch to mirrors.bazel.build where possible

* Switch from .zip to .tar.gz for smaller downloads (it's also the default download on UNIX)

* Use direct GitHub URLs in Bazel files for clarity

* Don't pass patches to local_repository

* Remove github_repository()

* Switch to GitHub actions/checkout@v2 which is faster

* Use faster extraction method for LLVM on Windows

* Move LLVM_VERSION_WINDOWS to the shell script since it's not a CI-specific value

* Change GITHUB_TOKEN to GITHUB

* Don't show timestamps for GitHub Actions

* Factor out some options from GitHub Actions

* Tell Bazel to stay on the same volume in GitHub Actions

* Display progress output when downloading toolchains

Co-authored-by: GitHub Web Flow <noreply@github.com>
2020-03-01 14:04:06 -08:00
Sven Mika
83e06cd30a
[RLlib] DDPG refactor and Exploration API action noise classes. (#7314)
* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix

* WIP.

* Add TD3 quick Pendulum regression.

* Cleanup.

* Fix.

* LINT.

* Fix.

* Sort quick_learning test cases, add TD3.

* Sort quick_learning test cases, add TD3.

* Revert test_checkpoint_restore.py (debugging) changes.

* Fix old soft_q settings in documentation and test configs.

* More doc fixes.

* Fix test case.

* Fix test case.

* Lower test load.

* WIP.
2020-03-01 11:53:35 -08:00
Eric Liang
3c6b94f3f5
[rllib] Enable performance metrics reporting for RLlib pipelines, add A3C (#7299) 2020-02-28 16:44:17 -08:00
SangBin Cho
50145e668d
Fix ray.remote reference not being visible in the documentation (#7311) 2020-02-28 14:03:08 -08:00
Richard Liaw
fb73d51d4d
[tune] fix hparams for tbx (#7312)
* fix

* test_hist

* remove unnecessary value check

* pbt

* queue

* skip_for_now

* Apply suggestions from code review
2020-02-28 11:51:56 -08:00
Richard Liaw
ca40b0fcc6
[tune][minor] Avoid throwing error when gpu check fails (#7362) 2020-02-28 11:32:44 -08:00
Edward Oakes
f321eaec9b
Working but not passing test (#7358) 2020-02-28 12:57:28 -06:00
Edward Oakes
34488f52f3
Temporarily disable async_test (#7377) 2020-02-28 10:42:41 -08:00
mehrdadn
fb0bc7b947
Partially revert "[Core/RLlib] Move log_once from rllib to ray.util. (#7273)" (#7361)
This partially reverts commit 357232d124.

The addition of python/__init__.py broke the build on Windows. However, this is hard to detect because Bazel doesn't seem to track this dependency. You first have to check out a commit that fails on this issue, and then try to rebuild this commit, so that Bazel actually performs a rebuild.

A useful command line for triggering the exact build is:

bazel build --compile_one_dependency //:python/ray/_raylet.pyx
2020-02-28 10:27:45 -08:00
mehrdadn
5fb5be0ba5
Some bug fixes for Windows (#7374)
* Fix MAP_SHARED check in sys/mman.h

* Fix missing :platform_shims dependency for ray_util

* dlmalloc patch for Arrow
2020-02-28 10:22:32 -08:00
mehrdadn
0efaa9b310
Use Redis for Windows (#7364) 2020-02-28 10:18:56 -08:00
micafan
3f8b1d2756
Fix ServiceBasedGcsGcsClientTest timing bug (#7365) 2020-02-28 12:01:02 -06:00
Edward Oakes
93fe4b0b58
Change actor.__ray_kill__() to ray.kill(actor) (#7360) 2020-02-28 11:55:13 -06:00
Richard Liaw
3fc162f93c
[tune] Add Unit Test for nested PBT + Jenkins (#7324) 2020-02-27 18:17:11 -08:00
mehrdadn
8730996682
Windows changes (#7315) 2020-02-27 15:14:10 -08:00
Sven Mika
0c9e5db9cb
Fix SAC bug (twin Q not used for min'ing over both Q-nets in loss func). (#7354) 2020-02-27 12:49:08 -08:00
Edward Oakes
ced062319d
Decrease test_object_manager put size to avoid OOMs in CI (#7355) 2020-02-27 11:08:10 -08:00
Edward Oakes
cbf55d69a6
Remove serialized from_random object ids in tests (#7340) 2020-02-27 11:04:06 -08:00
Edward Oakes
bd9411f849
Call TriggerGlobalGC when the plasma store is full (#7337) 2020-02-27 11:01:49 -08:00
Sven Mika
357232d124
[Core/RLlib] Move log_once from rllib to ray.util. (#7273)
* Move log_once from rllib to tune.

* Move log_once from rllib to tune.

* LINT.

* Move to ray.util.debug.
2020-02-27 10:40:44 -08:00
Sven Mika
44ac0ead34
[RLlib] rollout.py; make video-recording options more intuitive and add warnings/errors (issue 7121). (#7347) 2020-02-27 10:39:02 -08:00
Edward Oakes
d9027acaf2
Deprecate non-direct-call API (#7336) 2020-02-27 10:37:23 -08:00
Edward Oakes
55ccfb6089
Fix asyncio actor race condition (#7335) 2020-02-27 10:16:04 -08:00
Eric Liang
58073f7260
[rllib] Fix multiagent example crash due to undefined abstract method (#7329)
* fix multiagent example

* 0 workers
2020-02-26 22:54:40 -08:00
mehrdadn
219180b580
Improve .editorconfig entries (#7344) 2020-02-26 19:05:36 -08:00
Edward Oakes
ee0f71e398
Add __commit__ field to ray package in wheels (#7305) 2020-02-26 17:54:22 -08:00
Edward Oakes
2ad9bc5684
Move plasma retry logic into plasma store provider (#7328) 2020-02-26 16:57:02 -08:00
Sven Mika
aec03656d5
[RLlib] TupleActions cannot be exported by Policy: Fixes issues 7231 and 5593. #7333 2020-02-26 15:22:54 -08:00
Eric Liang
b310661338
Add internal_api.global_gc() method, which triggers gc.collect() on all workers (#7327) 2020-02-26 14:09:29 -08:00
mehrdadn
bcecf8b46b
Bazel improvements (#7170) 2020-02-26 12:28:13 -08:00