Commit graph

87 commits

Author SHA1 Message Date
Eric Liang
1660c9d627
Kill actor child processes on shutdown (#3297)
* example

* add env

* test pg

* change to test

* add atexit test

* Update rllib-env.rst

* comment

* revert unnecessary file

* fix title when actor is idle

* Update python/ray/actor.py

Co-Authored-By: ericl <ekhliang@gmail.com>
2018-11-13 19:16:42 -08:00
Eric Liang
65c27c70cf [rllib] Clean up agent resource configurations (#3296)
Closes #3284
2018-11-13 18:00:03 -08:00
Eric Liang
bd0dbde149
[rllib] Rename ServingEnv => ExternalEnv (#3302) 2018-11-12 16:31:27 -08:00
Eric Liang
53489d2f85
[sgd] Document and add simple MNIST example (#3236) 2018-11-10 21:52:20 -08:00
Eric Liang
813f51769f [rllib] Fix rllib rollouts script and add test (#3211)
## What do these changes do?

Clean up the checkpointing to handle the new checkpoint dirs. Add a test for rollout.py

## Related issue number

https://github.com/ray-project/ray/issues/3206
https://github.com/ray-project/ray/issues/3204
2018-11-05 00:33:25 -08:00
Eric Liang
369cb833fe
[rllib] Implement custom metrics (#3144) 2018-11-03 18:48:32 -07:00
Eric Liang
af0c1174cd
[sgd] Merge sharded param server based SGD implementation (#3033)
This includes most of the TF code used for the OSDI experiment. Perf sanity check on p3.16xl instances: Overall scaling looks ok, with the multi-node results within 5% of OSDI final numbers. This seems reasonable given that hugepages are not enabled here, and the param server shards are placed randomly.

$ RAY_USE_XRAY=1 ./test_sgd.py --gpu --batch-size=64 --num-workers=N \
  --devices-per-worker=M --strategy=<simple|ps> \
  --warmup --object-store-memory=10000000000

Total images per second
GPUs total              | simple | ps
========================================
1                       | 218    |
2 (1 worker)            | 388    |
4 (1 worker)            | 759    |
4 (2 workers)           | 176    | 623
8 (1 worker)            | 985    |
8 (2 workers)           | 349    | 1031
16 (2 nodes, 2 workers) | 600    | 1661
16 (2 nodes, 4 workers) | 468    | 1712   <--- OSDI perf was 1817
2018-10-27 21:25:02 -07:00
Robert Nishihara
658c14282c Remove legacy Ray code. (#3121)
* Remove legacy Ray code.

* Fix cmake and simplify monitor.

* Fix linting

* Updates

* Fix

* Implement some methods.

* Remove more plasma manager references.

* Fix

* Linting

* Fix

* Fix

* Make sure class IDs are strings.

* Some path fixes

* Fix

* Path fixes and update arrow

* Fixes.

* linting

* Fixes

* Java fixes

* Some java fixes

* TaskLanguage -> Language

* Minor

* Fix python test and remove unused method signature.

* Fix java tests

* Fix jenkins tests

* Remove commented out code.
2018-10-26 13:36:58 -07:00
Robert Nishihara
9c1826ed69 Use XRay backend by default. (#3020)
* Use XRay backend by default.

* Remove irrelevant valgrind tests.

* Fix

* Move tests around.

* Fix

* Fix test

* Fix test.

* String/unicode fix.

* Fix test

* Fix unicode issue.

* Minor changes

* Fix bug in test_global_state.py.

* Fix test.

* Linting

* Try arrow change and other object manager changes.

* Use newer plasma client API

* Small updates

* Revert plasma client api change.

* Update

* Update arrow and allow SendObjectHeaders to fail.

* Update arrow

* Update python/ray/experimental/state.py

Co-Authored-By: robertnishihara <robertnishihara@gmail.com>

* Address comments.
2018-10-23 12:46:39 -07:00
Eric Liang
59901a88a0
[rllib] Native support for Dict and Tuple spaces; fix Tuple action spaces; add prev a, r to LSTM (#3051) 2018-10-20 15:21:22 -07:00
Eric Liang
3c891c6ece
[rllib] Parallel-data loading and multi-gpu support for IMPALA (#2766) 2018-10-15 11:02:50 -07:00
Eric Liang
e4bea8d10e
[rllib] Default to truncate_episodes and add some more config validators (#2967)
* update

* link it

* warn about truncation

* fix

* Update rllib-training.rst

* deprecate tests failing
2018-09-30 18:37:55 -07:00
Eric Liang
747253e0f6
[rllib] Don't shuffle samples in PPO when using lstm 2018-09-30 01:13:56 -07:00
Eric Liang
3267676994 [Experimental] Add experimental distributed SGD API (#2858)
* check in sgd api

* idx

* foreach_worker foreach_model

* add feed_dict

* update

* yapf

* typo

* lint

* plasma op change

* fix plasma op

* still not working

* fix

* fix

* comments

* yapf

* silly flake8

* small test
2018-09-19 21:12:37 -07:00
Eric Liang
3a3782c39f
[rllib] Fix LSTM regression on truncated sequences and add regression test (#2898)
* fix

* add test

* yapf

* yapf

* fix space

* Oops that should be lstm: True

* Update cartpole_lstm.py
2018-09-18 15:09:16 -07:00
old-bear
f3c1194be3 [tune] Add AutoML algorithm of GeneticSearcher (#2699)
Add a new search algorithm (genetic) along with the base framework of the searcher (which performs basic jobs such as logging, recording, and organizing in our project).
Note that this is the initial commit. In the following days, we will add an example, unit tests, and other refinements.
2018-09-12 09:17:04 -07:00
Eric Liang
995ac24a2c
[rllib] clarify train batch size for PPO (#2793)
It's possible to configure PPO in a way that ends up discarding most of the samples (they are treated as "stragglers"). Add a warning when this happens, and raise an exception if the waste is particularly egregious.
2018-09-05 12:06:13 -07:00
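Note on 995ac24a2c: the warn-then-raise behaviour described above can be sketched as follows. This is a minimal illustration, not the RLlib implementation. The function name, the two thresholds, and the accounting (samples arrive in whole rounds of `num_workers * sample_batch_size`, and anything beyond the train batch size is discarded) are all assumptions made for the example.

```python
import math
import warnings

# Hypothetical thresholds, chosen only for illustration.
WARN_FRACTION = 0.1
ERROR_FRACTION = 0.5


def check_sample_waste(train_batch_size, num_workers, sample_batch_size):
    """Return the fraction of collected samples that would be discarded.

    Assumes samples are collected in whole rounds of
    num_workers * sample_batch_size until the train batch is filled.
    """
    per_round = num_workers * sample_batch_size
    rounds = math.ceil(train_batch_size / per_round)
    collected = rounds * per_round
    wasted = (collected - train_batch_size) / collected
    if wasted > ERROR_FRACTION:
        raise ValueError(
            "{:.0%} of collected samples would be discarded".format(wasted))
    if wasted > WARN_FRACTION:
        warnings.warn(
            "{:.0%} of collected samples would be discarded".format(wasted))
    return wasted


# A configuration where each round divides the train batch evenly wastes nothing.
print(check_sample_waste(train_batch_size=4000, num_workers=2, sample_batch_size=200))
```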
Eric Liang
df4788e501
[rllib/tune] Add test for fractional gpu support in xray mode; add rllib support for fractional gpu (#2768)
* frac gpu

* doc

* Update rllib-training.rst

* yapf

* remove xray
2018-09-03 11:12:23 -07:00
Eric Liang
b37a283053 [rllib] support local mode (#2795) 2018-09-02 23:02:19 -07:00
Richard Liaw
0347e6418b
[tune] Add PyTorch MNIST Example + Misc. Tweaks (#2708) 2018-08-30 16:18:56 -07:00
Eric Liang
fbe6c59f72
[rllib] Misc fixes, A2C (#2679)
A bunch of minor rllib fixes:

* pull in latest baselines atari wrapper changes (and use deepmind wrapper by default)
* move reward clipping to policy evaluator
* add a2c variant of a3c
* reduce vision network fc layer size to 256 units
* switch to 84x84 images
* doc tweaks
* print timesteps in tune status
2018-08-20 15:28:03 -07:00
Eric Liang
6670880f03
[rllib] Workaround actor creation hang edge case for ape-X (#2661)
* apex hang

* fix

* move pyt to end
2018-08-16 18:03:50 -07:00
Yuhong Guo
9825da7233 Change training tasks to xray for Jenkins tests (#2567) 2018-08-06 13:35:26 -07:00
Yuhong Guo
d2ebe4d9a3 Fix frequent failure of Jenkins CI. (#2490) 2018-08-02 10:28:28 -07:00
Eric Liang
9ea57c2a93
[rllib] Basic IMPALA implementation (using deepmind's reference vtrace.py) (#2504)
* Rename AsyncSamplesOptimizer -> AsyncReplayOptimizer
* Add AsyncSamplesOptimizer that implements the IMPALA architecture
* integrate V-trace with a3c policy graph
* audit V-trace integration
* benchmark compare vs A3C and with V-trace on/off
On PongNoFrameskip-v4, IMPALA scales from 16 to 128 workers and solves Pong in <10 min. For reference, solving this env takes ~40 minutes for Ape-X and several hours for A3C.
2018-08-01 20:53:53 -07:00
Eric Liang
38d00986a5
[rllib] Cleanups: deep merge configs properly; enforce min iter time on APEX (#2500)
The dict merge prevents crashes when tune is trying to get resource requests for agents and you override a config subkey. The min iter time prevents iterations from becoming so short that they incur high overhead. This is easy to run into on Ape-X since throughput can get very high.
2018-07-30 13:25:35 -07:00
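Note on 38d00986a5: the difference between a shallow and a deep config merge can be sketched as below. This is an illustrative helper, not the function RLlib uses. The name `deep_merge` and the config keys are hypothetical.

```python
import copy


def deep_merge(base, overrides):
    """Return a new dict with `overrides` merged into `base`.

    Nested dicts are merged key by key instead of being replaced wholesale.
    """
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical default config with a nested sub-dict.
defaults = {"num_workers": 2, "optimizer": {"num_replay_shards": 4, "debug": False}}
# The user overrides a single subkey.
user = {"optimizer": {"debug": True}}

shallow = dict(defaults, **user)     # replaces the whole "optimizer" dict
merged = deep_merge(defaults, user)  # keeps sibling keys such as num_replay_shards
```

With the shallow merge, any code that later reads a sibling subkey of the overridden dict fails with a `KeyError`. The deep merge keeps those defaults in place.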
Eric Liang
68660453e4
[rllib] Better support and add two-trainer example for multiagent (#2443)
This adds a simple DQN+PPO example for multi-agent. We don't do anything fancy here, just syncing weights between two separate trainers. This potentially wastes some compute, but is very simple to set up.

It might be nice to share experience collection between the top-level trainers in the future.
2018-07-22 05:09:25 -07:00
Eric Liang
807f309b3a
[test] Fix broken rllib test (#2446)
This fixes the broken build.
2018-07-20 13:47:41 -07:00
Eric Liang
8e75d150f7
[rllib] Apex crash when compress_observations: False (#2426)
We shouldn't try to decompress uncompressed data.

Also, fix resource requests for ddpg + GPU.
2018-07-19 15:58:09 -07:00
Richard Liaw
8e8c733696
[tune] Fix Categorical Space + Add Keras Example (#2401)
Previously, categorical variables were not properly resolved for HyperOpt.
2018-07-17 23:52:52 +02:00
Eric Liang
0cecf6b79c
[rllib] Cleanup RNN support and make it work with multi-GPU optimizer (#2394)
Cleanup: TFPolicyGraph now automatically adds loss input entries for state_in_*, so that graph sub-classes don't need to worry about it.

Multi-GPU support:

* Allow setting up model tower replicas with existing state input tensors
* Truncate the per-device minibatch slices so that they are always a multiple of max_seq_len.
2018-07-17 06:55:46 +02:00
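Note on 0cecf6b79c: the truncation rule in the commit above amounts to rounding the per-device slice size down to the nearest multiple of `max_seq_len`. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def truncated_slice_size(per_device_batch_size, max_seq_len):
    """Round a per-device minibatch slice down to a multiple of max_seq_len."""
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    return (per_device_batch_size // max_seq_len) * max_seq_len


# 130 timesteps per device with sequences of length 20 keeps 6 whole sequences.
print(truncated_slice_size(130, 20))
```

Rounding down means a slice never ends partway through a sequence, at the cost of dropping up to `max_seq_len - 1` timesteps per device.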
Eric Liang
b316afeb43 [rllib] Add debug info back to PPO and fix optimizer compatibility (#2366) 2018-07-12 19:22:46 +02:00
Richard Liaw
0048e77093
[rllib] RLlib CLI (#2375) 2018-07-12 19:12:04 +02:00
Richard Liaw
4d7da9f668
[rllib] Remove "Common", cleanup some code (#2348) 2018-07-08 13:03:53 -07:00
Eric Liang
8aa56c12e6
[rllib] Document "v2" APIs (#2316)
* re

* wip

* wip

* a3c working

* torch support

* pg works

* lint

* rm v2

* consumer id

* clean up pg

* clean up more

* fix python 2.7

* tf session management

* docs

* dqn wip

* fix compile

* dqn

* apex runs

* up

* imports

* ddpg

* quotes

* fix tests

* fix last r

* fix tests

* lint

* pass checkpoint restore

* kwar

* nits

* policy graph

* fix yapf

* com

* class

* pyt

* vectorization

* update

* test cpe

* unit test

* fix ddpg2

* changes

* wip

* args

* faster test

* common

* fix

* add alg option

* batch mode and policy serving

* multi serving test

* todo

* wip

* serving test

* doc async env

* num envs

* comments

* thread

* remove init hook

* update

* fix ppo

* comments1

* fix

* updates

* add jenkins tests

* fix

* fix pytorch

* fix

* fixes

* fix a3c policy

* fix squeeze

* fix trunc on apex

* fix squeezing for real

* update

* remove horizon test for now

* multiagent wip

* update

* fix race condition

* fix ma

* t

* doc

* st

* wip

* example

* wip

* working

* cartpole

* wip

* batch wip

* fix bug

* make other_batches None default

* working

* debug

* nit

* warn

* comments

* fix ppo

* fix obs filter

* update

* wip

* tf

* update

* fix

* cleanup

* cleanup

* spacing

* model

* fix

* dqn

* fix ddpg

* doc

* keep names

* update

* fix

* com

* docs

* clarify model outputs

* Update torch_policy_graph.py

* fix obs filter

* pass thru worker index

* fix

* rename

* vlad torch comments

* fix log action

* debug name

* fix lstm

* remove unused ddpg net

* remove conv net

* revert lstm

* wip

* wip

* cast

* wip

* works

* fix a3c

* works

* lstm util test

* doc

* clean up

* update

* fix lstm check

* move to end

* fix sphinx

* fix cmd

* remove bad doc

* envs

* vec

* doc prep

* models

* rl

* alg

* up

* clarify

* copy

* async sa

* fix

* comments

* fix a3c conf

* tune lstm

* fix reshape

* fix

* back to 16

* tuned a3c update

* update

* tuned

* optional

* merge

* wip

* fix up

* move pg class

* rename env

* wip

* update

* tip

* alg

* readme

* fix catalog

* readme

* doc

* context

* remove prep

* comma

* add env

* link to paper

* paper

* update

* rnn

* update

* wip

* clean up ev creation

* fix

* fix

* fix

* fix lint

* up

* no comma

* ma

* Update run_multi_node_tests.sh

* fix

* sphinx is stupid

* sphinx is stupid

* clarify torch graph

* no horizon

* fix config

* sb

* Update test_optimizers.py
2018-07-01 00:05:08 -07:00
Eric Liang
b197c0c404
[rllib] General RNN support (#2299)
* wip

* cls

* re

* wip

* wip

* a3c working

* torch support

* pg works

* lint

* rm v2

* consumer id

* clean up pg

* clean up more

* fix python 2.7

* tf session management

* docs

* dqn wip

* fix compile

* dqn

* apex runs

* up

* imports

* ddpg

* quotes

* fix tests

* fix last r

* fix tests

* lint

* pass checkpoint restore

* kwar

* nits

* policy graph

* fix yapf

* com

* class

* pyt

* vectorization

* update

* test cpe

* unit test

* fix ddpg2

* changes

* wip

* args

* faster test

* common

* fix

* add alg option

* batch mode and policy serving

* multi serving test

* todo

* wip

* serving test

* doc async env

* num envs

* comments

* thread

* remove init hook

* update

* fix ppo

* comments1

* fix

* updates

* add jenkins tests

* fix

* fix pytorch

* fix

* fixes

* fix a3c policy

* fix squeeze

* fix trunc on apex

* fix squeezing for real

* update

* remove horizon test for now

* multiagent wip

* update

* fix race condition

* fix ma

* t

* doc

* st

* wip

* example

* wip

* working

* cartpole

* wip

* batch wip

* fix bug

* make other_batches None default

* working

* debug

* nit

* warn

* comments

* fix ppo

* fix obs filter

* update

* wip

* tf

* update

* fix

* cleanup

* cleanup

* spacing

* model

* fix

* dqn

* fix ddpg

* doc

* keep names

* update

* fix

* com

* docs

* clarify model outputs

* Update torch_policy_graph.py

* fix obs filter

* pass thru worker index

* fix

* rename

* vlad torch comments

* fix log action

* debug name

* fix lstm

* remove unused ddpg net

* remove conv net

* revert lstm

* wip

* wip

* cast

* wip

* works

* fix a3c

* works

* lstm util test

* doc

* clean up

* update

* fix lstm check

* move to end

* fix sphinx

* fix cmd

* remove bad doc

* clarify

* copy

* async sa

* fix

* comments

* fix a3c conf

* tune lstm

* fix reshape

* fix

* back to 16

* tuned a3c update

* update

* tuned

* optional

* fix catalog

* remove prep
2018-06-27 22:51:04 -07:00
Eric Liang
1251abf0d1
[rllib] Modularize Torch and TF policy graphs (#2294)
* wip

* cls

* re

* wip

* wip

* a3c working

* torch support

* pg works

* lint

* rm v2

* consumer id

* clean up pg

* clean up more

* fix python 2.7

* tf session management

* docs

* dqn wip

* fix compile

* dqn

* apex runs

* up

* imports

* ddpg

* quotes

* fix tests

* fix last r

* fix tests

* lint

* pass checkpoint restore

* kwar

* nits

* policy graph

* fix yapf

* com

* class

* pyt

* vectorization

* update

* test cpe

* unit test

* fix ddpg2

* changes

* wip

* args

* faster test

* common

* fix

* add alg option

* batch mode and policy serving

* multi serving test

* todo

* wip

* serving test

* doc async env

* num envs

* comments

* thread

* remove init hook

* update

* fix ppo

* comments1

* fix

* updates

* add jenkins tests

* fix

* fix pytorch

* fix

* fixes

* fix a3c policy

* fix squeeze

* fix trunc on apex

* fix squeezing for real

* update

* remove horizon test for now

* multiagent wip

* update

* fix race condition

* fix ma

* t

* doc

* st

* wip

* example

* wip

* working

* cartpole

* wip

* batch wip

* fix bug

* make other_batches None default

* working

* debug

* nit

* warn

* comments

* fix ppo

* fix obs filter

* update

* wip

* tf

* update

* fix

* cleanup

* cleanup

* spacing

* model

* fix

* dqn

* fix ddpg

* doc

* keep names

* update

* fix

* com

* docs

* clarify model outputs

* Update torch_policy_graph.py

* fix obs filter

* pass thru worker index

* fix

* rename

* vlad torch comments

* fix log action

* debug name

* fix lstm

* remove unused ddpg net

* remove conv net

* revert lstm

* cast

* clean up

* fix lstm check

* move to end

* fix sphinx

* fix cmd

* remove bad doc

* clarify

* copy

* async sa

* fix
2018-06-26 13:17:15 -07:00
Eric Liang
a9a26b7560
[rllib] Part 2 of multiagent support (#2286)
* wip

* cls

* re

* wip

* wip

* a3c working

* torch support

* pg works

* lint

* rm v2

* consumer id

* clean up pg

* clean up more

* fix python 2.7

* tf session management

* docs

* dqn wip

* fix compile

* dqn

* apex runs

* up

* imports

* ddpg

* quotes

* fix tests

* fix last r

* fix tests

* lint

* pass checkpoint restore

* kwar

* nits

* policy graph

* fix yapf

* com

* class

* pyt

* vectorization

* update

* test cpe

* unit test

* fix ddpg2

* changes

* wip

* args

* faster test

* common

* fix

* add alg option

* batch mode and policy serving

* multi serving test

* todo

* wip

* serving test

* doc async env

* num envs

* comments

* thread

* remove init hook

* update

* fix ppo

* comments1

* fix

* updates

* add jenkins tests

* fix

* fix pytorch

* fix

* fixes

* fix a3c policy

* fix squeeze

* fix trunc on apex

* fix squeezing for real

* update

* remove horizon test for now

* multiagent wip

* update

* fix race condition

* fix ma

* t

* doc

* st

* wip

* example

* wip

* working

* cartpole

* wip

* batch wip

* fix bug

* make other_batches None default

* working

* debug

* nit

* warn

* comments

* fix ppo

* fix obs filter

* update

* fix obs filter

* pass thru worker index

* fix

* fix log action

* debug name

* fix sphinx
2018-06-25 22:33:57 -07:00
Eric Liang
e5724a9cfe
[rllib] Add a simple REST policy server and client example (#2232)
* wip

* cls

* re

* wip

* wip

* a3c working

* torch support

* pg works

* lint

* rm v2

* consumer id

* clean up pg

* clean up more

* fix python 2.7

* tf session management

* docs

* dqn wip

* fix compile

* dqn

* apex runs

* up

* imports

* ddpg

* quotes

* fix tests

* fix last r

* fix tests

* lint

* pass checkpoint restore

* kwar

* nits

* policy graph

* fix yapf

* com

* class

* pyt

* vectorization

* update

* test cpe

* unit test

* fix ddpg2

* changes

* wip

* args

* faster test

* common

* fix

* add alg option

* batch mode and policy serving

* multi serving test

* todo

* wip

* serving test

* doc async env

* num envs

* comments

* thread

* remove init hook

* update

* policy serve

* spaces

* checkpoint

* no train

* fix ppo

* comments1

* fix

* updates

* add jenkins tests

* fix

* fix pytorch

* fix

* fixes

* fix a3c policy

* fix squeeze

* fix trunc on apex

* fix squeezing for real

* update

* remove horizon test for now

* fix race condition

* update

* com

* update

* add test

* Update run_multi_node_tests.sh

* use curl

* curl

* kill

* Update run_multi_node_tests.sh

* Update run_multi_node_tests.sh

* fix import

* update
2018-06-20 13:22:39 -07:00
Eric Liang
7dee2c6735
[rllib] Envs for vectorized execution, async execution, and policy serving (#2170)
## What do these changes do?

**Vectorized envs**: Users can either implement `VectorEnv`, or alternatively set `num_envs=N` to auto-vectorize gym envs (this vectorizes just the action computation part).

```
# CartPole-v0 on single core with 64x64 MLP:

# vector_width=1:
Actions per second 2720.1284458322966

# vector_width=8:
Actions per second 13773.035334888269

# vector_width=64:
Actions per second 37903.20472563333
```
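
As a rough illustration of why batching helps, the sketch below steps several env copies together so the policy is called once per step for all of them. The classes and method names here (`CountingEnv`, `ToyVectorEnv`, `reset_all`, `step_all`) are hypothetical and are not the `VectorEnv` interface added in this change.

```python
class CountingEnv:
    """Hypothetical toy env: the observation is a step counter."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = float(action)
        done = self.t >= self.horizon
        return self.t, reward, done


class ToyVectorEnv:
    """Holds N env copies and steps them together with a list of actions."""

    def __init__(self, make_env, num_envs):
        self.envs = [make_env() for _ in range(num_envs)]

    def reset_all(self):
        return [env.reset() for env in self.envs]

    def step_all(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obs, rewards, dones = (list(x) for x in zip(*results))
        return obs, rewards, dones


policy_calls = 0


def batched_policy(obs_batch):
    """Stand-in policy: one call computes actions for every env in the batch."""
    global policy_calls
    policy_calls += 1
    return [1 for _ in obs_batch]


vec_env = ToyVectorEnv(CountingEnv, num_envs=8)
obs = vec_env.reset_all()
total_reward = 0.0
for _ in range(3):
    actions = batched_policy(obs)
    obs, rewards, dones = vec_env.step_all(actions)
    total_reward += sum(rewards)
```

Eight envs advance three steps each with only three policy calls. With a real network, that amortization is where a throughput gain would come from.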

**Async envs**: The more general form of `VectorEnv` is `AsyncVectorEnv`, which allows agents to execute out of lockstep. We use this as an adapter to support `ServingEnv`. Since we can convert any other form of env to `AsyncVectorEnv`, utils.sampler has been rewritten to run against this interface.

**Policy serving**: This provides an env which is not stepped. Rather, the env executes in its own thread, querying the policy for actions via `self.get_action(obs)`, and reporting results via `self.log_returns(rewards)`. We also support logging of off-policy actions via `self.log_action(obs, action)`. This is a more convenient API for some use cases, and also provides parallelizable support for policy serving (for example, if you start an HTTP server in the env) and ingestion of offline logs (if the env reads from serving logs).
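
The inversion of control described here can be sketched with a thread and two queues. This is a toy under stated assumptions, not the `ServingEnv` implementation. Only the method names `get_action` and `log_returns` come from the text above. The class `ToyServingEnv`, the `serve` loop, the queues, and the toy reward rule are hypothetical.

```python
import queue
import threading


class ToyServingEnv(threading.Thread):
    """Hypothetical env that runs its own loop and asks the policy for actions."""

    def __init__(self, num_steps):
        super().__init__(daemon=True)
        self.num_steps = num_steps
        self._obs_queue = queue.Queue()
        self._action_queue = queue.Queue()
        self.total_reward = 0.0

    def get_action(self, obs):
        # Hand the observation to the policy side and block until it answers.
        self._obs_queue.put(obs)
        return self._action_queue.get()

    def log_returns(self, reward):
        self.total_reward += reward

    def run(self):
        obs = 0
        for _ in range(self.num_steps):
            action = self.get_action(obs)
            # Toy dynamics: reward 1 when the action matches the parity of obs.
            self.log_returns(1.0 if action == obs % 2 else 0.0)
            obs += 1
        self._obs_queue.put(None)  # sentinel: the env thread has finished


def serve(env, policy):
    """Policy-side loop: answer queries until the env signals completion."""
    env.start()
    while True:
        obs = env._obs_queue.get()
        if obs is None:
            break
        env._action_queue.put(policy(obs))
    env.join()
    return env.total_reward


result = serve(ToyServingEnv(num_steps=5), lambda obs: obs % 2)
```

The env drives the interaction and the policy only responds, which is why the same shape fits an env that wraps a request handler or replays logged data.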

Any of these types of envs can be passed to RLlib agents. RLlib handles conversions internally in CommonPolicyEvaluator, for example:
```
gym.Env => rllib.VectorEnv => rllib.AsyncVectorEnv
rllib.ServingEnv => rllib.AsyncVectorEnv
```
2018-06-18 11:55:32 -07:00
Eric Liang
71eb558eb0 [rllib] Refactor rllib to have a common sample collection pathway (#2149) 2018-06-09 00:21:35 -07:00
Alok Singh
fd234e3171 [rllib] Fix A3C PyTorch implementation (#2036)
* Use F.softmax instead of a pointless network layer

Stateless functions should not be network layers.

* Use correct pytorch functions

* Rename argument name to out_size

Matches in_size and makes more sense.

* Fix shapes of tensors

Advantages and rewards both should be scalars, and therefore a list of them
should be 1D.

* Fmt

* replace deprecated function

* rm unnecessary Variable wrapper

* rm all use of torch Variables

Torch does this for us now.

* Ensure that values are flat list

* Fix shape error in conv nets

* fmt

* Fix shape errors

Reshaping the action before stepping in the env fixes a few errors.

* Add TODO

* Use correct filter size

Works when `self.config['model']['channel_major'] = True`.

* Add missing channel major

* Revert reshape of action

This should be handled by the agent or at least in a cleaner way that doesn't
break existing envs.

* Squeeze action

* Squeeze actions along first dimension

This should deal with some cases such as cartpole where actions are scalars
while leaving alone cases where actions are arrays (some robotics tasks).

* try adding pytorch tests

* typo

* fixup docker messages

* Fix A3C for some envs

Pendulum doesn't work since it's an edge case (expects singleton arrays, which
`.squeeze()` collapses to scalars).

* fmt

* nit flake

* small lint
2018-05-30 10:48:11 -07:00
Eric Liang
7ab890f4a1 [tune] [rllib] Automatically determine RLlib resources and add queueing mechanism for autoscaling (#1848) 2018-04-16 16:58:15 -07:00
alvkao58
15a668dd12 [RLLib] DDPG (#1685) 2018-04-11 15:08:39 -07:00
Richard Liaw
888e70f1be
[tune] HyperOpt Support (v2) (#1763) 2018-04-04 11:08:26 -07:00
Richard Liaw
9b361115c3
[tune] Added Async HyperBand example (#1709) 2018-03-16 13:25:29 -07:00
Richard Liaw
78716094b5
[tune] Async Hyperband (#1595) 2018-03-04 14:05:56 -08:00
Eric Liang
ecb811c26e
[rllib] Ape-X implementation and DQN refactor to handle replay in policy optimizer (#1604)
* minimal apex checkin

* cleanup dqn options

* actor utils

* Sun Feb 25 17:39:54 PST 2018

* update

* compression refactor

* fix

* add test

* fix models

* Sun Feb 25 21:46:27 PST 2018

* Wed Feb 28 10:26:34 PST 2018

* Wed Feb 28 10:28:09 PST 2018

* Wed Feb 28 10:42:59 PST 2018

* refactor

* Wed Feb 28 11:17:19 PST 2018

* Wed Feb 28 11:42:08 PST 2018

* Wed Feb 28 11:42:13 PST 2018

* Wed Feb 28 11:59:02 PST 2018

* Wed Feb 28 11:59:58 PST 2018

* Wed Feb 28 12:00:08 PST 2018

* Wed Feb 28 12:02:19 PST 2018

* Wed Feb 28 13:44:31 PST 2018

* Wed Feb 28 17:01:20 PST 2018

* Sat Mar  3 14:55:59 PST 2018

* make optimizer construction explicit

* Sat Mar  3 18:23:08 PST 2018

* Sat Mar  3 18:24:28 PST 2018

* Sat Mar  3 18:49:28 PST 2018

* Sat Mar  3 18:50:42 PST 2018

* Sat Mar  3 18:56:10 PST 2018
2018-03-04 12:25:25 -08:00
Eric Liang
80d7def9dc
[autoscaler] [tune] More doc fixes (#1560)
* Fri Feb 16 13:53:50 PST 2018

* Sat Feb 17 15:32:08 PST 2018

* Sat Feb 17 15:44:59 PST 2018

* fix

* Sun Feb 18 14:46:24 PST 2018

* Sun Feb 18 14:46:37 PST 2018

* Sun Feb 18 14:55:52 PST 2018

* Sun Feb 18 15:14:32 PST 2018

* Wed Feb 21 17:34:17 PST 2018

* Sun Feb 25 17:51:17 PST 2018

* Sun Feb 25 22:18:40 PST 2018

* Wed Feb 28 13:19:05 PST 2018

* Wed Feb 28 13:22:13 PST 2018

* Wed Feb 28 13:33:29 PST 2018

* Wed Feb 28 13:35:33 PST 2018

* add ex

* Fri Mar  2 12:50:17 PST 2018

* Fri Mar  2 12:54:31 PST 2018
2018-03-03 13:01:49 -08:00
Richard Liaw
c2ad800cbf
[rllib] Registry fix for DQN Replay Evaluators (#1593) 2018-02-25 22:30:11 -08:00