Commit graph

72 commits

Author SHA1 Message Date
Richard Liaw
e94bebb1de
[tune] Fix Jenkins tests (#6028) 2019-11-01 16:42:04 -07:00
Richard Liaw
48ba484640
[tune] Test TF2.0, TF1.14, TF1.12 Tensorboard support (#5931) 2019-10-18 13:50:42 -07:00
Richard Liaw
c52bb0621d
[tune] Support TF2.0 on Keras Callback (#5912) 2019-10-15 10:49:50 -07:00
Edward Oakes
abbfe7392f
Bump dev version to 0.8.0.dev6 (#5906) 2019-10-14 11:36:13 +01:00
Richard Liaw
e54c487d18 [hotfix] Docker (#5809)
* configspace

* reorder
2019-09-30 16:39:00 -07:00
Richard Liaw
baf85c6665
[tune/sgd] Fix Jenkins (#5765) 2019-09-27 09:59:08 -07:00
Eric Liang
b5da32df78 Bump Ray version in documentation to dev5 (#5794) 2019-09-27 00:19:17 -07:00
Robert Nishihara
93e103135b Update doc versions from 0.8.0.dev3 to 0.8.0.dev4. (#5585) 2019-08-29 22:42:57 -07:00
Richard Liaw
cdc9227f1b
[tune] ASHA xgboost and lightgbm examples (#5500) 2019-08-22 10:37:59 -07:00
Richard Liaw
d7b309223b
[tune] MLFlow Logger (#5438) 2019-08-14 15:58:18 -07:00
Lisa Dunlap
b7d0733362 [tune] Implement BOHB (#5382) 2019-08-13 12:32:07 -07:00
Simon Mo
18f1e904de Bump 0.8.0.dev2 -> 0.8.0.dev3 (#5409) 2019-08-09 11:37:19 -07:00
Kristian Hartikainen
13fb9fe3db [rllib] Feature/soft actor critic v2 (#5328)
* Add base for Soft Actor-Critic

* Pick changes from old SAC branch

* Update sac.py

* First implementation of sac model

* Remove unnecessary SAC imports

* Prune unnecessary noise and exploration code

* Implement SAC model and use that in SAC policy

* runs but doesn't learn

* clear state

* fix batch size

* Add missing alpha grads and vars

* -200 by 2k timesteps

* doc

* lazy squash

* one file

* ignore tfp

* revert done
2019-08-01 23:37:36 -07:00
Richard Liaw
b6509f46b0
Update wheels to 0.8.0dev2 (#5186) 2019-07-12 17:27:03 -07:00
Richard Liaw
0b540ab492
[tune] Test example checkpointing (#4728) 2019-07-10 01:58:26 -07:00
Richard Liaw
b1827d5fbe
[tune] Update MNIST Example (#4991) 2019-06-25 22:50:15 -07:00
Richard Liaw
bd8aceb896 [ci] Change Jenkins to py3 (#5022)
* conda3

* integration

* add nevergrad, remotedata

* pytest 0.3.1

* otherdockers

* setup

* tune
2019-06-24 21:50:37 -07:00
Philipp Moritz
2e342ef71f Fix tensorflow-1.14 installation in jenkins (#5007) 2019-06-21 11:04:40 -07:00
Robert Nishihara
c3f8fc1c44
Update version number in documentation after release 0.7.0 -> 0.7.1 and 0.8.0.dev0 -> 0.8.0.dev1. (#4941) 2019-06-06 17:22:45 -07:00
Devin Petersohn
a7d01aba9b Update wheel versions in documentation to 0.8.0.dev0 and 0.7.0. (#4847) 2019-05-24 16:49:13 -07:00
Philipp Moritz
b0f6ddf6d1 Remove CMake files (#4493) 2019-04-02 22:17:33 -07:00
Robert Nishihara
c6f12e5219 Update documentation from 0.7.0.dev1 to 0.7.0.dev2. (#4485) 2019-03-26 17:32:53 -07:00
Robert Nishihara
4c80177d6f Unpin gym in Python 2 since gym 0.12 was released. (#4291) 2019-03-07 15:59:30 -08:00
Philipp Moritz
39eed24d47 update version from 0.7.0.dev0 to 0.7.0.dev1 (#4282) 2019-03-06 14:43:09 -08:00
Richard Liaw
a27cb225b6
Modularize Tune tests from multi-node tests (#4204) 2019-03-02 19:21:08 -08:00
Richard Liaw
f7450dbdd7
[tests] Stress tests for Jenkins (#3789)
Stress testing for Jenkins.

TODO:
 - [x] Enable a common keypair for autoscaling 
 - [x] Add automatic timeouts?
 - [x] Switch out key pair one last time before merge
2019-02-26 14:24:37 -08:00
Philipp Moritz
ba52caff37 Make Bazel the default build system (#3898) 2019-02-23 11:58:59 -08:00
Adi Zimmerman
dac1969647 [tune] Add Nevergrad to Tune (#3985) 2019-02-12 11:00:04 -08:00
Adi Zimmerman
9797028a91 [tune] Add scikit-optimize to Tune (#3924) 2019-02-11 17:06:02 -08:00
Robert Nishihara
a654152f9c Pin gym version in Python 2 tests. (#3973) 2019-02-06 23:56:14 -08:00
Andrew Tan
8323419a6d [tune] Add SigOpt Integration (#3844) 2019-02-03 18:23:57 -08:00
Peter Schafhalter
62a0a7bdc7 [tune] Add BayesOpt (#3864)
Adds BayesOpt as a Tune suggestion algorithm.
2019-01-31 16:54:17 -08:00
Philipp Moritz
b3bf608608 Update arrow to reduce plasma IPCs. (#3497) 2018-12-14 23:49:37 -05:00
Eric Liang
32473cf22e
[rllib] Basic Offline Data IO API (#3473) 2018-12-12 13:57:48 -08:00
Richard Liaw
784a6399b0
[tune] Node Fault Tolerance (#3238)
This PR introduces single-node fault tolerance for Tune.

## Previous behavior:
 - Actors will be restarted without checking if resources are available. This can lead to problems if we lose resources.

## New behavior:
 - RUNNING trials will be resumed on another node on a best-effort basis (meaning they will run if resources are available).
 - If the cluster is saturated, RUNNING trials on that failed node will become PENDING and queued.
 - During recovery, TrialSchedulers and SearchAlgorithms should receive notification of this (via `trial_runner.stop_trial`) so that they don’t wait/block for a trial that isn’t running.


Remaining questions:
 - Should `last_result` be consistent during restore? Yes, but not for earlier trials (trials that are yet to be checkpointed).

 - Waiting for some PRs to merge first (#3239)

Closes #2851.
2018-11-21 12:38:16 -08:00
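
The recovery behavior described in #3238 can be illustrated with a minimal sketch. This is not Tune's implementation: `Trial`, `recover_from_node_failure`, and the `free_cpus` mapping are hypothetical names invented for this example, and the notification to schedulers and search algorithms via `trial_runner.stop_trial` is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

PENDING, RUNNING = "PENDING", "RUNNING"


@dataclass
class Trial:
    # Hypothetical stand-in for a Tune trial; not the real class.
    name: str
    node: Optional[str]
    cpus: int
    status: str = RUNNING


def recover_from_node_failure(
    trials: List[Trial], failed_node: str, free_cpus: Dict[str, int]
) -> List[Trial]:
    """Best-effort recovery: move RUNNING trials off the failed node.

    free_cpus maps each surviving node to its free CPU count (an assumption
    made for this sketch). Trials that fit nowhere become PENDING and are
    returned so the caller can queue them.
    """
    requeued = []
    for trial in trials:
        if trial.node != failed_node or trial.status != RUNNING:
            continue  # trial is unaffected by this failure
        # Pick the first surviving node with enough free CPUs, if any.
        target = next(
            (n for n, free in free_cpus.items()
             if n != failed_node and free >= trial.cpus),
            None,
        )
        if target is None:
            # Cluster is saturated: queue the trial instead of restarting it.
            trial.status = PENDING
            trial.node = None
            requeued.append(trial)
        else:
            free_cpus[target] -= trial.cpus
            trial.node = target
    return requeued
```

Checking resources before resuming is the point of the change: under the previous behavior, a restart could be attempted even when the lost node's capacity was no longer available.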
Richard Liaw
f9b58d7b02
[tune] Tweaks to Trainable and Verbosity (#2889) 2018-10-11 23:42:13 -07:00
Robert Nishihara
e467f546b5 Upgrade version of anaconda. (#2730) 2018-08-23 19:14:39 -07:00
Richard Liaw
8e8c733696
[tune] Fix Categorical Space + Add Keras Example (#2401)
Previously did not properly resolve categorical variables for HyperOpt.
2018-07-17 23:52:52 +02:00
Richard Liaw
0048e77093
[rllib] RLlib CLI (#2375) 2018-07-12 19:12:04 +02:00
Alok Singh
fd234e3171 [rllib] Fix A3C PyTorch implementation (#2036)
* Use F.softmax instead of a pointless network layer

Stateless functions should not be network layers.

* Use correct pytorch functions

* Rename argument name to out_size

Matches in_size and makes more sense.

* Fix shapes of tensors

Advantages and rewards both should be scalars, and therefore a list of them
should be 1D.

* Fmt

* replace deprecated function

* rm unnecessary Variable wrapper

* rm all use of torch Variables

Torch does this for us now.

* Ensure that values are flat list

* Fix shape error in conv nets

* fmt

* Fix shape errors

Reshaping the action before stepping in the env fixes a few errors.

* Add TODO

* Use correct filter size

Works when `self.config['model']['channel_major'] = True`.

* Add missing channel major

* Revert reshape of action

This should be handled by the agent or at least in a cleaner way that doesn't
break existing envs.

* Squeeze action

* Squeeze actions along first dimension

This should deal with some cases such as cartpole where actions are scalars
while leaving alone cases where actions are arrays (some robotics tasks).

* try adding pytorch tests

* typo

* fixup docker messages

* Fix A3C for some envs

Pendulum doesn't work since it's an edge case (expects singleton arrays, which
`.squeeze()` collapses to scalars).

* fmt

* nit flake

* small lint
2018-05-30 10:48:11 -07:00
Robert Nishihara
3c76461b22 Remove smart_open install. (#1943) 2018-04-23 23:18:09 -07:00
Robert Nishihara
4379e9cea0 Pin cython version in docker base dependencies file. (#1898) 2018-04-13 20:33:20 -07:00
Richard Liaw
888e70f1be
[tune] HyperOpt Support (v2) (#1763) 2018-04-04 11:08:26 -07:00
James Lamb
6dbf4f6318 Remove vim from base-deps container and reduce number of build layers (#1667) 2018-03-07 10:16:08 -08:00
butchcom
936bebef99 [rllib] Upgrade to OpenAI Gym 0.10.3 (#1601) 2018-03-06 00:31:02 -08:00
Robert Nishihara
5859a2d249 Replace python setup.py install with pip install -e . (#1460) 2018-02-22 11:15:03 -08:00
Simon Mo
a24cc28773 [DataFrame] Add Parquet Support in Build Process (#1531)
* Add shell script for building parquet

* Use parquet ci script; remove anaconda

* Remove gcc flag, use default

* add boost_root

* Fix $TP_DIR reference issue

* fix the PR

* check out specific parquet-cpp commit
2018-02-16 07:18:42 -08:00
Robert Nishihara
7187f9fe56 Pin gym version to 0.9.5 in tests. (#1490) 2018-01-31 15:50:25 -08:00
Robert Nishihara
ab5d4a6010 Bring cloudpickle inside the repository. (#1445)
* Bring cloudpickle version 0.5.2 inside the repo.

* Use internal copy of cloudpickle everywhere.

* Fix linting.

* Import ordering.

* Change __init__.py.

* Set pickler in serialization context.

* Don't check ray location.
2018-01-25 11:36:37 -08:00
Philipp Moritz
26125e1547 Fixing the jenkins tests (#1299)
* trying to fix jenkins tests

* comment out more tests

* remove pytorch stuff

* use non-monotonic clock (monotonic not supported on python 2.7)

* whitespace
2017-12-07 17:03:58 -08:00
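
The Python 2.7 constraint noted in #1299 arises because `time.monotonic` was only added in Python 3.3. The commit itself switched to a non-monotonic clock; the sketch below shows a different, commonly used approach that prefers the monotonic clock when it exists. The names `clock` and `elapsed_since` are hypothetical and do not come from the Ray code base.

```python
import time

# Use time.monotonic where available (Python 3.3+); otherwise fall back to
# the wall-clock time.time, which can jump if the system clock is adjusted.
clock = getattr(time, "monotonic", time.time)


def elapsed_since(start):
    """Return seconds elapsed since `start`, a value obtained from clock()."""
    return clock() - start
```

With the fallback in place, timing code can call `clock()` on either interpreter without branching on the Python version.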