Commit graph

78 commits

Author SHA1 Message Date
Eric Liang
1a1324d2a2
Bump version from 0.8.0.dev6 -> 0.9.0.dev (#6508) 2019-12-16 23:57:42 -08:00
Edward Oakes
f63b64310a
Bump version to 0.8.0.dev7 (#6303) 2019-12-05 18:33:54 -08:00
Simon Mo
ac6aa21411
Fix the autoscaler docker file to use rayproject (#6357) 2019-12-04 16:20:04 -08:00
Simon Mo
22b305223a
Build Docker Containers for Linux Wheels (#6233) 2019-11-27 17:05:36 -08:00
Richard Liaw
62cbc043b4
[tune] tbx logger (#6133)
* tbx

* add_hparams

* fix_hparams

* ok

* ok

* fix

* ok

* fix
2019-11-15 08:45:44 -08:00
daiyaanarfeen
8f6d73a93a [sgd] Extend distributed pytorch functionality (#5675)
* raysgd

* apply fn

* double quotes

* removed duplicate TimerStat

* removed duplicate find_free_port

* imports in pytorch_trainer

* init doc

* ray.experimental

* remove resize example

* resnet example

* cifar

* Fix up after kwargs

* data_dir and dataloader_workers args

* formatting

* loss

* init

* update code

* lint

* smoketest

* better_configs

* fix

* fix

* fix

* train_loader

* fixdocs

* ok

* ok

* fix

* fix_update

* fix

* fix

* done

* fix

* fix

* fix

* small

* lint

* fix

* fix

* fix_test

* fix

* validate

* fix

* fi
2019-11-05 11:16:46 -08:00
Richard Liaw
e94bebb1de
[tune] Fix Jenkins tests (#6028) 2019-11-01 16:42:04 -07:00
Richard Liaw
48ba484640
[tune] Test TF2.0, TF1.14, TF1.12 Tensorboard support (#5931) 2019-10-18 13:50:42 -07:00
Richard Liaw
c52bb0621d
[tune] Support TF2.0 on Keras Callback (#5912) 2019-10-15 10:49:50 -07:00
Edward Oakes
abbfe7392f
Bump dev version to 0.8.0.dev6 (#5906) 2019-10-14 11:36:13 +01:00
Richard Liaw
e54c487d18 [hotfix] Docker (#5809)
* configspace

* reorder
2019-09-30 16:39:00 -07:00
Richard Liaw
baf85c6665
[tune/sgd] Fix Jenkins (#5765) 2019-09-27 09:59:08 -07:00
Eric Liang
b5da32df78 Bump Ray version in documentation to dev5 (#5794) 2019-09-27 00:19:17 -07:00
Robert Nishihara
93e103135b Update doc versions from 0.8.0.dev3 to 0.8.0.dev4. (#5585) 2019-08-29 22:42:57 -07:00
Richard Liaw
cdc9227f1b
[tune] ASHA xgboost and lightgbm examples (#5500) 2019-08-22 10:37:59 -07:00
Richard Liaw
d7b309223b
[tune] MLFlow Logger (#5438) 2019-08-14 15:58:18 -07:00
Lisa Dunlap
b7d0733362 [tune] Implement BOHB (#5382) 2019-08-13 12:32:07 -07:00
Simon Mo
18f1e904de Bump 0.8.0.dev2 -> 0.8.0.dev3 (#5409) 2019-08-09 11:37:19 -07:00
Kristian Hartikainen
13fb9fe3db [rllib] Feature/soft actor critic v2 (#5328)
* Add base for Soft Actor-Critic

* Pick changes from old SAC branch

* Update sac.py

* First implementation of sac model

* Remove unnecessary SAC imports

* Prune unnecessary noise and exploration code

* Implement SAC model and use that in SAC policy

* runs but doesn't learn

* clear state

* fix batch size

* Add missing alpha grads and vars

* -200 by 2k timesteps

* doc

* lazy squash

* one file

* ignore tfp

* revert done
2019-08-01 23:37:36 -07:00
Richard Liaw
b6509f46b0
Update wheels to 0.8.0dev2 (#5186) 2019-07-12 17:27:03 -07:00
Richard Liaw
0b540ab492
[tune] Test example checkpointing (#4728) 2019-07-10 01:58:26 -07:00
Richard Liaw
b1827d5fbe
[tune] Update MNIST Example (#4991) 2019-06-25 22:50:15 -07:00
Richard Liaw
bd8aceb896 [ci] Change Jenkins to py3 (#5022)
* conda3

* integration

* add nevergrad, remotedata

* pytest 0.3.1

* otherdockers

* setup

* tune
2019-06-24 21:50:37 -07:00
Philipp Moritz
2e342ef71f Fix tensorflow-1.14 installation in jenkins (#5007) 2019-06-21 11:04:40 -07:00
Robert Nishihara
c3f8fc1c44
Update version number in documentation after release 0.7.0 -> 0.7.1 and 0.8.0.dev0 -> 0.8.0.dev1. (#4941) 2019-06-06 17:22:45 -07:00
Devin Petersohn
a7d01aba9b Update wheel versions in documentation to 0.8.0.dev0 and 0.7.0. (#4847) 2019-05-24 16:49:13 -07:00
Philipp Moritz
b0f6ddf6d1 Remove CMake files (#4493) 2019-04-02 22:17:33 -07:00
Robert Nishihara
c6f12e5219 Update documentation from 0.7.0.dev1 to 0.7.0.dev2. (#4485) 2019-03-26 17:32:53 -07:00
Robert Nishihara
4c80177d6f Unpin gym in Python 2 since gym 0.12 was released. (#4291) 2019-03-07 15:59:30 -08:00
Philipp Moritz
39eed24d47 update version from 0.7.0.dev0 to 0.7.0.dev1 (#4282) 2019-03-06 14:43:09 -08:00
Richard Liaw
a27cb225b6
Modularize Tune tests from multi-node tests (#4204) 2019-03-02 19:21:08 -08:00
Richard Liaw
f7450dbdd7
[tests] Stress tests for Jenkins (#3789)
Stress testing for Jenkins.

TODO:
 - [x] Enable a common keypair for autoscaling 
 - [x] Add automatic timeouts?
 - [x] Switch out key pair one last time before merge
2019-02-26 14:24:37 -08:00
Philipp Moritz
ba52caff37 Make Bazel the default build system (#3898) 2019-02-23 11:58:59 -08:00
Adi Zimmerman
dac1969647 [tune] Add Nevergrad to Tune (#3985) 2019-02-12 11:00:04 -08:00
Adi Zimmerman
9797028a91 [tune] Add scikit-optimize to Tune (#3924) 2019-02-11 17:06:02 -08:00
Robert Nishihara
a654152f9c Pin gym version in Python 2 tests. (#3973) 2019-02-06 23:56:14 -08:00
Andrew Tan
8323419a6d [tune] Add SigOpt Integration (#3844) 2019-02-03 18:23:57 -08:00
Peter Schafhalter
62a0a7bdc7 [tune] Add BayesOpt (#3864)
Adds BayesOpt as a Tune suggestion algorithm.
2019-01-31 16:54:17 -08:00
Philipp Moritz
b3bf608608 Update arrow to reduce plasma IPCs. (#3497) 2018-12-14 23:49:37 -05:00
Eric Liang
32473cf22e
[rllib] Basic Offline Data IO API (#3473) 2018-12-12 13:57:48 -08:00
Richard Liaw
784a6399b0
[tune] Node Fault Tolerance (#3238)
This PR introduces single-node fault tolerance for Tune.

## Previous behavior:
 - Actors will be restarted without checking if resources are available. This can lead to problems if we lose resources.

## New behavior:
 - RUNNING trials will be resumed on another node on a best-effort basis (meaning they will run if resources are available).
 - If the cluster is saturated, RUNNING trials on that failed node will become PENDING and queued.
 - During recovery, TrialSchedulers and SearchAlgorithms should receive notification of this (via `trial_runner.stop_trial`) so that they don’t wait/block for a trial that isn’t running.


Remaining questions:
 - Should `last_result` be consistent during restore? Yes, but not for earlier trials (trials that are yet to be checkpointed).
 - Waiting for some PRs to merge first (#3239)

Closes #2851.
2018-11-21 12:38:16 -08:00
Richard Liaw
f9b58d7b02
[tune] Tweaks to Trainable and Verbosity (#2889) 2018-10-11 23:42:13 -07:00
Robert Nishihara
e467f546b5 Upgrade version of anaconda. (#2730) 2018-08-23 19:14:39 -07:00
Richard Liaw
8e8c733696
[tune] Fix Categorical Space + Add Keras Example (#2401)
Previously did not properly resolve categorical variables for HyperOpt.
2018-07-17 23:52:52 +02:00
Richard Liaw
0048e77093
[rllib] RLlib CLI (#2375) 2018-07-12 19:12:04 +02:00
Alok Singh
fd234e3171 [rllib] Fix A3C PyTorch implementation (#2036)
* Use F.softmax instead of a pointless network layer

Stateless functions should not be network layers.

* Use correct pytorch functions

* Rename argument name to out_size

Matches in_size and makes more sense.

* Fix shapes of tensors

Advantages and rewards both should be scalars, and therefore a list of them
should be 1D.

* Fmt

* replace deprecated function

* rm unnecessary Variable wrapper

* rm all use of torch Variables

Torch does this for us now.

* Ensure that values are flat list

* Fix shape error in conv nets

* fmt

* Fix shape errors

Reshaping the action before stepping in the env fixes a few errors.

* Add TODO

* Use correct filter size

Works when `self.config['model']['channel_major'] = True`.

* Add missing channel major

* Revert reshape of action

This should be handled by the agent or at least in a cleaner way that doesn't
break existing envs.

* Squeeze action

* Squeeze actions along first dimension

This should deal with some cases such as cartpole where actions are scalars
while leaving alone cases where actions are arrays (some robotics tasks).

* try adding pytorch tests

* typo

* fixup docker messages

* Fix A3C for some envs

Pendulum doesn't work since it's an edge case (expects singleton arrays, which
`.squeeze()` collapses to scalars).

* fmt

* nit flake

* small lint
2018-05-30 10:48:11 -07:00
Robert Nishihara
3c76461b22 Remove smart_open install. (#1943) 2018-04-23 23:18:09 -07:00
Robert Nishihara
4379e9cea0 Pin cython version in docker base dependencies file. (#1898) 2018-04-13 20:33:20 -07:00
Richard Liaw
888e70f1be
[tune] HyperOpt Support (v2) (#1763) 2018-04-04 11:08:26 -07:00
James Lamb
6dbf4f6318 Remove vim from base-deps container and reduce number of build layers (#1667) 2018-03-07 10:16:08 -08:00