5323 commits

Author SHA1 Message Date
krfricke
c741d1cf9c
[tune] stdout/stderr logging redirection (#9817)
* Add `log_to_file` parameter, pass to Trainable config, redirect stdout/stderr.

* Add logging handler to root ray logger

* Added test for `log_to_file` parameter

* Added logs, reuse test

* Revert debug change

* Update logdir on reset, flush streams after each train() step

* Remove magic keys from visible config

Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-03 11:18:34 -07:00
Ameer Haj Ali
9089fab0ef
[cluster] On Prem Server First PR (#9663)
* on prem server first commit

* minor fix

* verify error on autoscaling in on prem mode

* lint

* lint

* Tests complete

* add tests to check for backward compatibility

* Fixing comments and autoscaling

* minor fixes

* coordinating server mode

* tests

* lint

* remove unnecessary import

* Resolving Comments

* seperating coordinator and local node provider

Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2020-08-03 10:38:44 -07:00
Michael Luo
4d7bd8c892
[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409) 2020-08-02 18:12:09 +02:00
mehrdadn
b62ec7787f
Ignore grep exit code for shellcheck in format.sh (#9861)
Co-authored-by: Mehrdad <noreply@github.com>
2020-08-02 00:59:05 -07:00
Alex Wu
5b96a88cd7
[Core] Gpu type detection (#9695)
* .

* .

* .

* .

* .

* .

* .

* .

* Test cases

* detection only

* .

* Done?

* .

* .

* Done

* added test case

* .

* .

* .

* .

* .

* .

* Update python/ray/ray_constants.py

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* .

* .

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-08-01 11:43:56 -07:00
chaokunyang
64d6446cf3
change version from 0.1-SNAPSHOT to 0.9.0-SNAPSHOT (#9778) 2020-08-01 22:38:22 +08:00
Sven Mika
b4c527b3f3
[RLlib] Switch to PyTorch 1.6 for testing. (#9790) 2020-08-01 05:21:23 +02:00
Stephanie Wang
37a9c5783c
[core] Report resource load by shape (#9806)
* Report and aggregate resource load by shape

* python test

* python test

* x

* update
2020-07-31 16:57:30 -07:00
Alan Guo
3506910c5d
[autoscaler] Create worker_file_mounts config (#9762) 2020-07-31 14:33:27 -07:00
Eric Liang
b73080c85f
Allow tasks to be used with placement groups (#9738) 2020-07-31 10:51:37 -07:00
mehrdadn
78995d085f
Fix macOS incompatibility in format.sh (#9832)
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-31 09:25:55 -07:00
Kai Yang
006e034cdb
fix lint for ReferenceCountingTest.java (#9837) 2020-07-31 17:00:00 +08:00
Hao Chen
6fb6bd3e61
Refine Java "Ray Core Walkthrough" doc (#9836) 2020-07-31 15:35:43 +08:00
fangfengbin
3900643948
Add actor states definitions & transition diagram doc (#9754) 2020-07-31 15:35:25 +08:00
bermaker
88e8714bcb
Fix ray java worker metric test indentation (#9834) 2020-07-31 14:39:41 +08:00
Richard Liaw
a47121476f
[tune] Remove accidentally added files (#9835) 2020-07-30 21:47:27 -07:00
Kai Yang
02fd950252
[Java] Local and distributed ref counting in Java (#9371) 2020-07-31 11:49:31 +08:00
mehrdadn
e2c0174ab2
Factor out some Bazel options into .bazelrc (#9804)
* Factor out --keep_going in Bazel --config=ci

* Remove Bazel --test_timeout=600 for Windows

* Use global --test_output for Bazel CI

Co-authored-by: Mehrdad <noreply@github.com>
2020-07-30 18:09:31 -07:00
mehrdadn
a7b97b6f8a
Add shellcheck support (#8574) 2020-07-30 18:39:28 -05:00
Eric Liang
73df3f7bd2
Clean up formatting of placement group resources (#9740) 2020-07-30 15:52:32 -07:00
SangBin Cho
940617d092
Make test failure large. (#9822) 2020-07-30 13:11:51 -07:00
krfricke
619e44e54a
[tune] Added WandbLogger (#9725)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-07-30 13:09:03 -07:00
Barak Michener
68f3fec744
*: Centralize requirements.txt and unify dependency versions (#9759)
* python_test: fix cython_examples in doc/ and tests/

* update setup.py to parse the bazel version string better

* all: centralize all python deps into stackable requirements files in python/

* format

* Move cython test into the proper package

* Add cross-reference dependency comments for requirements and setup.py

* re-enable version pinning on CI, fix formatting

* fix up torchvision version

* fix case in shell
2020-07-30 11:22:56 -07:00
SangBin Cho
e6d1e3afe2
Use pass by reference for const auto in for loop. (#9811) 2020-07-30 12:34:24 -05:00
Richard Liaw
0c3b9ebeef
[tune/sgd] Document func_trainable and add checkpoint context (#9739)
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-07-30 09:46:37 -07:00
Sven Mika
e540e425e4
[RLlib] rllib rollout test and bug fixes. (#9779) 2020-07-30 16:17:03 +02:00
Sven Mika
f6bd12eb18
[RLlib] Add tensor-based tests for Schedules and fix some bugs related to using Schedules with tensor time input. (#9782) 2020-07-30 12:49:32 +02:00
Miguel Morales
372114b4ed
Update sampler.py (#9805)
Minor fix for warning string
2020-07-29 22:58:35 -07:00
bermaker
ccd6b90a42
Fix ray java worker metric registry indentation (#9780) 2020-07-30 13:20:24 +08:00
chaokunyang
6464bf55c6
[dist] Mvn deploy (#9777) 2020-07-30 11:48:31 +08:00
Kai Yang
9be5a2f0fc
Fix GCS related tests (#9783) 2020-07-30 11:46:36 +08:00
Hao Chen
260bc52254
Java doc: "Ray Core Walkthrough" page (#8595) 2020-07-30 11:13:38 +08:00
chaokunyang
5aba53e9b2
[dist] Fix travis deploy for java dist (#9768) 2020-07-30 10:59:11 +08:00
SangBin Cho
826f14c824
[Stats] Fix harvestor threads + Fix flaky stats shutdown. (#9745) 2020-07-29 18:57:59 -05:00
mehrdadn
07022f3f11
Fix src/ray/core_worker/common.h deleted constructor (#9785)
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-29 15:49:02 -07:00
Alex Wu
6e294dd90f
[Core] Custom socket name (#9766)
* fix issues

* hot fixes

* test

* test

* socket name change only
2020-07-29 13:19:41 -07:00
Alex Wu
e6696b2533
Fixed stderr logging (9765) 2020-07-29 13:19:04 -07:00
Alex Wu
72297dc46f
[Core] Socket creation race condition bug fixes (#9764)
* fix issues

* hot fixes

* test

* test

* Always info log
2020-07-29 13:17:46 -07:00
Sven Mika
b0b0463161
[RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678) 2020-07-29 21:15:09 +02:00
Bill Chambers
067c2752f8
[TUNE] Tune Docs re-organization (#9600)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-07-29 11:22:44 -07:00
SangBin Cho
d1b37ca7e4
[GCS Actor Management] Fix flaky test_dead_actors. (#9715)
* Fix.

* Add logs.

* Add an unit test.
2020-07-29 10:54:18 -07:00
Tao Wang
2babad9906
[GCS]Use a separate thread in node failure detector to handle heartbeat (#9416)
* use a sole thread to handle heartbeat

* separate signal thread

* use work to avoid exiting when task is underway

* protect shared data structure to avoid deadlock

* add comments

* decrease io service num

* minor changes

* fix test

* per stephanie's comments

* use single io service instead of 1-size io service pool

* typo
2020-07-29 09:58:58 -07:00
Lingxuan Zuo
156067b423
[Stats] enable core worker stats (#9355) 2020-07-29 17:28:33 +08:00
fangfengbin
a484947742
Fix leased worker leak bug if lease worker requests that are still waiting to be scheduled when GCS restarts (#9719) 2020-07-29 14:16:03 +08:00
Kai Yang
2cafc7cebe
[Java] Fix MetricTest.java due to incomplete changes from #9703 (#9770) 2020-07-29 12:18:17 +08:00
Kai Yang
bdc005a4d4
[Java] Use test groups to filter tests of different run modes (#9703) 2020-07-29 11:18:45 +08:00
Simon Mo
9fbfee2424
Pin pytest version (#9767) 2020-07-28 19:54:48 -07:00
mehrdadn
fb5280f21b
Fix some Windows CI issues (#9708)
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-28 18:10:23 -07:00
SangBin Cho
423dc96cc4
Revert "[dist] swap mac/linux wheel build order (#9746)" and "Fix package and upload ray jar (#9742)" (#9758)
* Revert "[dist] swap mac/linux wheel build order (#9746)"

This reverts commit a9340565ff.

* Revert "Fix package and upload ray jar (#9742)"

This reverts commit c290c308fe.
2020-07-28 15:34:29 -07:00
Alex Wu
21af0ceb0c
Register function race (#9346) 2020-07-28 13:51:34 -07:00