Commit graph

97 commits

Author SHA1 Message Date
Eric Liang
b45bed4bce
[rllib] Propagate model options correctly in ARS / ES, to action dist of PPO (#2974)
* fix

* fix

* fix it

* propagate conf to action dist

* move carla example too

* rr

* Update policies.py

* wip

* lint
2018-10-01 12:49:39 -07:00
Eric Liang
8ea926c266
[rllib] _init renamed to _build_layers in example 2018-07-12 19:21:58 +02:00
Alok Singh
f795173b51 Use flake8-comprehensions (#1976)
* Add flake8 to Travis

* Add flake8-comprehensions

[flake8 plugin](https://github.com/adamchainz/flake8-comprehensions) that
checks for useless constructions.

* Use generators instead of lists where appropriate

A lot of the builtins can take in generators instead of lists.

This commit applies `flake8-comprehensions` to find them.

* Fix lint error

* Fix some string formatting

The rest can be fixed in another PR

* Fix compound literals syntax

This should probably be merged after #1963.

* dict() -> {}

* Use dict literal syntax

dict(...) -> {...}

* Rewrite nested dicts

* Fix hanging indent

* Add missing import

* Add missing quote

* fmt

* Add missing whitespace

* rm duplicate pip install

This is already installed in another file.

* Fix indent

* move `merge_dicts` into utils

* Bring up to date with `master`

* Add automatic syntax upgrade

* rm pyupgrade

In case users want to still use it on their own, the upgrade-syn.sh script was
left in the `.travis` dir.
2018-05-20 16:15:06 -07:00
Alok Singh
cdf94c18a4 Clean up syntax for supported Python versions. (#1963)
* Use set/dict literal syntax

Ran code through [pyupgrade](https://github.com/asottile/pyupgrade). This is
supported in every Python version 2.7+.

* Drop unnecessary string format specification

No need to specify 0,1.. if paramters are passed in order.

* Revert "Drop unnecessary string format specification"

This reverts commit efa5ec85d30ff69f34e5ed93e31343fea7647bcb.

* Undo changes to cloudpickle

Drop use of set literal until cloudpickle uses it.

* Reformat code with YAPF

We need to set up a git pre-push hook to automatically run this stuff.
2018-05-03 07:45:11 -07:00
Eric Liang
e4b17e03f6
updates (#1896) 2018-04-13 00:57:00 -07:00
Eric Liang
72595cca0d [tune] Change tune resource request syntax to be less confusing (#1764)
* update

* update examples

* Wed Mar 21 15:19:56 PDT 2018

* Wed Mar 21 15:21:32 PDT 2018

* Update train_a3c.py

* Update train.py

* fix resources accounting
2018-03-23 06:25:01 -07:00
butchcom
936bebef99 [rllib] Upgrade to OpenAI Gym 0.10.3 (#1601) 2018-03-06 00:31:02 -08:00
Robert Nishihara
e96acc26f7 Fix MNIST downloading problems in parameter server examples. (#1457)
* Fix MNIST downloading problems in parameter server examples.

* Improve seeding.

* Fixes.
2018-01-25 14:14:37 -08:00
Eric Liang
173f1d629a
[tune] Ray Tune API cleanup (#1454)
Remove rllib dep: trainable is now a standalone abstract class that can be easily subclassed.

Clean up hyperband: fix debug string and add an example.

Remove YAML api / ScriptRunner: this was never really used.

Move ray.init() out of run_experiments(): This provides greater flexibility and should be less confusing since there isn't an implicit init() done there. Note that this is a breaking API change for tune.
2018-01-24 16:55:17 -08:00
Eric Liang
424bd7f74d
[rllib] improve custom env docs (#1447)
* env docs

* add env

* update env

* Fri Jan 19 18:55:34 PST 2018
2018-01-19 21:36:18 -08:00
Eric Liang
c60ccbad46 [carla] [rllib] Add support for carla nav planner and scenarios from paper (#1382)
* wip

* Sat Dec 30 15:07:28 PST 2017

* log video

* video doesn't work well

* scenario integration

* Sat Dec 30 17:30:22 PST 2017

* Sat Dec 30 17:31:05 PST 2017

* Sat Dec 30 17:31:32 PST 2017

* Sat Dec 30 17:32:16 PST 2017

* Sat Dec 30 17:34:11 PST 2017

* Sat Dec 30 17:34:50 PST 2017

* Sat Dec 30 17:35:34 PST 2017

* Sat Dec 30 17:38:49 PST 2017

* Sat Dec 30 17:40:39 PST 2017

* Sat Dec 30 17:43:00 PST 2017

* Sat Dec 30 17:43:04 PST 2017

* Sat Dec 30 17:45:56 PST 2017

* Sat Dec 30 17:46:26 PST 2017

* Sat Dec 30 17:47:02 PST 2017

* Sat Dec 30 17:51:53 PST 2017

* Sat Dec 30 17:52:54 PST 2017

* Sat Dec 30 17:56:43 PST 2017

* Sat Dec 30 18:27:07 PST 2017

* Sat Dec 30 18:27:52 PST 2017

* fix train

* Sat Dec 30 18:41:51 PST 2017

* Sat Dec 30 18:54:11 PST 2017

* Sat Dec 30 18:56:22 PST 2017

* Sat Dec 30 19:05:04 PST 2017

* Sat Dec 30 19:05:23 PST 2017

* Sat Dec 30 19:11:53 PST 2017

* Sat Dec 30 19:14:31 PST 2017

* Sat Dec 30 19:16:20 PST 2017

* Sat Dec 30 19:18:05 PST 2017

* Sat Dec 30 19:18:45 PST 2017

* Sat Dec 30 19:22:44 PST 2017

* Sat Dec 30 19:24:41 PST 2017

* Sat Dec 30 19:26:57 PST 2017

* Sat Dec 30 19:40:37 PST 2017

* wip models

* reward bonus

* test prep

* Sun Dec 31 18:45:25 PST 2017

* Sun Dec 31 18:58:28 PST 2017

* Sun Dec 31 18:59:34 PST 2017

* Sun Dec 31 19:03:33 PST 2017

* Sun Dec 31 19:05:05 PST 2017

* Sun Dec 31 19:09:25 PST 2017

* fix train

* kill

* add tuple preprocessor

* Sun Dec 31 20:38:33 PST 2017

* Sun Dec 31 22:51:24 PST 2017

* Sun Dec 31 23:14:13 PST 2017

* Sun Dec 31 23:16:04 PST 2017

* Mon Jan  1 00:08:35 PST 2018

* Mon Jan  1 00:10:48 PST 2018

* Mon Jan  1 01:08:31 PST 2018

* Mon Jan  1 14:45:44 PST 2018

* Mon Jan  1 14:54:56 PST 2018

* Mon Jan  1 17:29:29 PST 2018

* switch to euclidean dists

* Mon Jan  1 17:39:27 PST 2018

* Mon Jan  1 17:41:47 PST 2018

* Mon Jan  1 17:44:18 PST 2018

* Mon Jan  1 17:47:09 PST 2018

* Mon Jan  1 20:31:02 PST 2018

* Mon Jan  1 20:39:33 PST 2018

* Mon Jan  1 20:40:55 PST 2018

* Mon Jan  1 20:55:06 PST 2018

* Mon Jan  1 21:05:52 PST 2018

* fix env path

* merge richards fix

* fix hash

* Mon Jan  1 22:04:00 PST 2018

* Mon Jan  1 22:25:29 PST 2018

* Mon Jan  1 22:30:42 PST 2018

* simplified reward function

* add framestack

* add env configs

* simplify speed reward

* Tue Jan  2 17:36:15 PST 2018

* Tue Jan  2 17:49:16 PST 2018

* Tue Jan  2 18:10:38 PST 2018

* add lane keeping simple mode

* Tue Jan  2 20:25:26 PST 2018

* Tue Jan  2 20:30:30 PST 2018

* Tue Jan  2 20:33:26 PST 2018

* Tue Jan  2 20:41:42 PST 2018

* ppo lane keep

* simplify discrete actions

* Tue Jan  2 21:41:05 PST 2018

* Tue Jan  2 21:49:03 PST 2018

* Tue Jan  2 22:12:23 PST 2018

* Tue Jan  2 22:14:42 PST 2018

* Tue Jan  2 22:20:59 PST 2018

* Tue Jan  2 22:23:43 PST 2018

* Tue Jan  2 22:26:27 PST 2018

* Tue Jan  2 22:27:20 PST 2018

* Tue Jan  2 22:44:00 PST 2018

* Tue Jan  2 22:57:58 PST 2018

* Tue Jan  2 23:08:51 PST 2018

* Tue Jan  2 23:11:32 PST 2018

* update dqn reward

* Thu Jan  4 12:29:40 PST 2018

* Thu Jan  4 12:30:26 PST 2018

* Update train_dqn.py

* fix
2018-01-05 21:32:41 -08:00
Eric Liang
0ae660ce4e [carla] In carla example, save all images and measurements to local disk (#1350)
* revamp saving

* smaller jpgs

* hide verbose

* Tue Dec 19 22:25:01 PST 2017

* make sure temp dirs sort lexiographically

* save total reward too

* zero pad i

* 160x160 dqn

* ever higher res dqn
2017-12-21 15:19:55 -08:00
Eric Liang
6724f57b03 [Examples] Add Carla test env (#1343)
* add carla example

* add reward

* set obs

* Sun Dec 17 16:06:00 PST 2017

* add spec

* fix measurement

* add train script

* resize to 80x80

* null

* initial small training run

* robustify env, clean up action space

* clean up vars

* switch to town2 which is faster

* tunify train.py

* add discrete mode

* update

* fix excessive brakinG

* fix the weather

* rename

* redirect output and from future import

* doc

* update

* fix rebase

* allow dqn gpu growht

* adjust dqn hyperparams

* better ppo parameters
2017-12-19 12:57:58 -08:00
Philipp Moritz
2c0d5544ac Add streaming MapReduce example (#1251)
Add streaming MapReduce example.
2017-11-27 21:38:35 -08:00
Melih Elibol
e066bcf633 Synchronous parameter server example. (#1220)
* Synchronous parameter server example.

* Added sync parameter server example to documentation index.

* Consolidate documentation and minor simplifications.

* Fix linting.
2017-11-15 17:49:31 -08:00
Daniel Suo
4f0da6f81c Add basic functionality for Cython functions and actors (#1193)
* Add basic functionality for Cython functions and actors

* Fix up per @pcmoritz comments

* Fixes per @richardliaw comments

* Fixes per @robertnishihara comments

* Forgot double quotes when updating masked_log

* Remove import typing for Python 2 compatibility
2017-11-09 17:49:06 -08:00
Robert Nishihara
1bf276cc08 Basic parameter server example. (#1198)
* Basic parameter server example.

* Consolidate files.

* Whitespace.

* Add documentation.
2017-11-08 23:40:51 -08:00
Richard Liaw
797f4fcbf3 Fixing Lint after flake upgrade (#1162)
* Fixing Lint after flake upgrade

* more lint fixes
2017-10-26 21:02:07 -05:00
Abishek Bhat
6da7761d5d Fix overlooked typo. (#1158)
Without this the example script would crash with an UnboundLocalError.
2017-10-25 07:40:52 -07:00
Wapaul1
c26c7553bc Resnet Example Uses tf.Datasets now (#960)
Change Resnet example to use tf.Datasets instead of queues.
2017-09-20 14:14:04 -07:00
ustcfriend
9ec3608eca Fix resnet crash by setting config.gpu_options.allow_growth = True. (#971) 2017-09-12 22:36:06 -07:00
Robert Nishihara
4b76335157 Allow ResNet example to run on multiple machines. (#891)
* Allow a redis address to be passed into the ResNet example.

* Update documentation.
2017-08-29 21:37:53 -07:00
Si-Yuan
8099cdeb9d Fix: 'hyperopt_adaptive' example keeps fake 'best_hyperparameters' (#883) 2017-08-28 22:47:16 -07:00
Robert Nishihara
80e8426b5e Test example applications and rllib in jenkins tests. (#707)
* Test example applications in Jenkins.

* Fix default upload_dir argument for Algorithm class.

* Fix evolution strategies.

* Comment out policy gradient example which doesn't seem to work.

* Set --env-name for evolution strategies.
2017-07-16 18:51:33 +00:00
Robert Nishihara
e0867c8845 Switch Python indentation from 2 spaces to 4 spaces. (#726)
* 4 space indentation for actor.py.

* 4 space indentation for worker.py.

* 4 space indentation for more files.

* 4 space indentation for some test files.

* Check indentation in Travis.

* 4 space indentation for some rl files.

* Fix failure test.

* Fix multi_node_test.

* 4 space indentation for more files.

* 4 space indentation for remaining files.

* Fixes.
2017-07-13 21:53:57 +00:00
Eric Liang
2d81edfcdc [rllib] Move a3c implementation from examples/ to python/ray/rllib/ (#698)
* rllib v0

* fix imports

* lint

* comments

* update docs

* a3c wip

* a3c wip

* report stats

* update doc

* name is too long

* fix small bug

* propagate exception on error

* fetch metrics

* fix lint
2017-06-29 15:49:56 +00:00
Eric Liang
a674ec958c [rllib] Move policy gradient and evolution strategies algorithms from examples/ to ray/rllib/ (#694)
* rllib v0

* fix imports

* lint

* comments

* update docs
2017-06-25 22:13:03 +00:00
Philipp Moritz
9bcaaaeaf5 Debugging for policy gradients (#681)
* configuration option for tensorflow debugger

* add model checkpointing

* fix linting

* make it possible to run without checkpointing

* fix

* loading from checkpoint and expose debugger through cli

* todo for filters

* Fix typo.
2017-06-18 17:58:41 -07:00
Eric Liang
4374ad1453 Policy gradient example: Support multi-GPU training (#584)
* add tf metrics

* comments

* fix network scopes

* add doc

* initial work

* try with 3 virtual cpus

* clean up metrics

* use format string

* fix trace level

* back to pong

* always run summary on cpu

* plot intermediate and final sgd stats

* add back a global step

* update

* add timeline

* use staging area and reuse weights properly

* stage at cpu

* whoops, stage only the batch

* clean up a bit

* fix py flake

* wip

* create an optimizer graph per device

* print timeline on 5th batch instead

* print examples per second

* log placement for training ops

* force placement on cpu:0

* try separating weights onto different gpus

* try using nccl

* add cpu fallback

* remove space from date

* check has gpu device

* fix flag config

* checkpoint

* wip

* update

* add some timing

* trace loading

* try cpu

* revert that

* remove expensive test

* lint

* cleanups

* clean up timers

* clean it up a bit

* fix code for non-scalar action spaces

* address some nits

* fix quotes

* efficient shuffling between sgd epochs
2017-06-13 06:03:25 +00:00
Philipp Moritz
690fe10bb6 Save policies for Evolution Strategies (#638)
Save policies for evolution strategies.
2017-06-04 16:21:19 -07:00
Philipp Moritz
679910496e fix policy gradients for mujoco domains (#589) 2017-05-24 18:39:37 -07:00
Eric Liang
06241daf61 Policy gradient example: record stats for tensorboard (#577)
* add tf metrics

* comments

* fix network scopes

* add doc

* use format string

* fix trace level

* plot intermediate and final sgd stats

* add back a global step
2017-05-21 14:51:24 -07:00
Robert Nishihara
b62693ca67 Fix Python 2 bug in hyperopt example. (#575) 2017-05-19 16:12:13 -07:00
Wapaul1
f861124b9a Added python2 support and check for outdated tf (#562)
Improve the Evolutionary Strategies example.
2017-05-17 20:42:17 -07:00
Robert Nishihara
ec2534422b Remove register_class from API. (#550)
* Perform ray.register_class under the hood.

* Fix bug.

* Release worker lock when waiting for imports to arrive in get.

* Remove calls to register_class from examples and tests.

* Clear serialization state between tests.

* Fix bug and add test for multiple custom classes with same name.

* Fix failure test.

* Fix linting and cleanups to python code.

* Fixes to documentation.

* Implement recursion depth for recursively registering classes.

* Fix linting.

* Push warning to user if waiting for class for too long.

* Fix typos.

* Don't export FunctionToRun if pickling the function fails.

* Don't broadcast class definition when pickling class.
2017-05-16 18:38:52 -07:00
Robert Nishihara
3ebfd850e1 Make example applications pep8 compliant. (#553)
* Test examples for pep8 compliance.

* Make rl_pong example pep8 compliant.

* Make policy gradient example pep8 compliant.

* Make lbfgs example pep8 compliant.

* Make hyperopt example pep8 compliant.

* Make a3c example pep8 compliant.

* Make evolution strategies example pep8 compliant.

* Make resnet example pep8 compliant.

* Fix.
2017-05-16 14:12:18 -07:00
Wapaul1
31bf0e8da4 Improved the Resnet Example. (#551)
* Initial updates

* Mostly done

* Now works with no arguments

* Changed version check
2017-05-15 22:40:41 -07:00
Robert Nishihara
3c5375345f Initial version of evolution strategies example. (#544)
* Initial commit of evolution strategies example.

* Some small simplifications.

* Update example to use new API.

* Add example to documentation.
2017-05-14 17:53:51 -07:00
Robert Nishihara
9f91eb8c91 Change API for remote function declaration, actor instantiation, and actor method invocation. (#541)
* Direction substitution of @ray.remote -> @ray.task.

* Changes to make '@ray.task' work.

* Instantiate actors with Class.remote() instead of Class().

* Convert actor instantiation in tests and examples from Class() to Class.remote().

* Change actor method invocation from object.method() to object.method.remote().

* Update tests and examples to invoke actor methods with .remote().

* Fix bugs in jenkins tests.

* Fix example applications.

* Change @ray.task back to @ray.remote.

* Changes to make @ray.actor -> @ray.remote work.

* Direct substitution of @ray.actor -> @ray.remote.

* Fixes.

* Raise exception if @ray.actor decorator is used.

* Simplify ActorMethod class.
2017-05-14 00:01:20 -07:00
Richard Liaw
94f32db5e6 A3C Polishing (#385)
* number

* gym doesn't have versioning

* Benchmarks

* visualization

* formatting

* small fix for tensorboard

* first pass removing universe dependency

* code

* results polish

* removed extra line

* removed universe dependency

* doc

* remove gym versioning stuff

* changes as suggested

* nit
2017-04-11 22:51:52 -07:00
Wapaul1
6d9820ef5d Added tensorboard to resnet (#374)
Added tensorboard to resnet example.
2017-03-17 18:36:23 -07:00
Philipp Moritz
4af0aa6258 Atari on pixels (#364)
* pong on pixels working (not cleaned up)

* make training compatible with all atari games

* cartpole runs

* Update documentation and usage for policy gradients.
2017-03-14 13:31:29 -07:00
Robert Nishihara
99583f5b08 Clean up rl_pong example. (#365)
* Clean up RL pong example.

* More troubleshooting instructions.

* Typo.

* Fix typo.
2017-03-11 21:16:36 -08:00
Wapaul1
b1cb48159a Examples updated with actors. (#358)
* Updated examples with actors

* Small changes, and convert documentation from MD to RST.
2017-03-11 15:30:31 -08:00
Richard Liaw
b463d9e5c7 Initial A3C Example - PongDeterministic-v3 (#331)
* Initializing A3C code

* Modifications for Ray usage

* cleanup

* removing universe dependency

* fixes (not yet working

* hack

* documentation

* Cleanup

* Preliminary Portion

Make sure to change when merging

* RL part

* Cleaning up Driver and Worker code

* Updating driver code

* instructions...

* fixed

* Minor changes.

* Fixing cmake issues

* ray instruction

* updating port to new universe

* Fix for env.configure

* redundant commands

* Revert scipy.misc -> cv2 and raise exception for wrong gym version.
2017-03-11 00:57:53 -08:00
Philipp Moritz
555dcf35a2 Add policy gradient example. (#344)
* add policy gradient example

* fix typos

* Minor changes plus some documentation.

* Minor fixes.
2017-03-07 23:42:44 -08:00
Wapaul1
c66178bcd7 Resnet Adapted to Ray (#229)
* Initial conversion

* Further changes

* fixes

* some changes

* Fixes

* Added data pipeline

* Added updates to cifar

* Currently borken need sep pr

* Added test for retriving variables from an optimizer

* Removed FlAG ref in environment variables

* Added comments to test

* Addressed comments

* Added updates

* Made further changes for tfutils

* Fixed finalized bug

* Removed ipython

* Added accuracy printing

* Temp commit

* added fixes

* changes

* Added writing to file

* Fixes for gpus

* Cleaned up code

* Temp commit

* Gpu support fully implemented

* Updated to use num_gpus for actors

* Finished testing gpus implementation

* Changed to be more in line with origin implementation

* Updated test to use actors

* Added support for cpu only systems

* Now works with no cpus

* Minor changes and some documentation.
2017-03-07 01:07:32 -08:00
Robert Nishihara
0a233b7144 Update hyperparameter optimization example. (#332)
* Update hyperparameter optimization example.

* Remove early stopping.
2017-03-04 10:45:15 -08:00
Robert Nishihara
1a997ed279 Move documentation to ReadTheDocs. (#326) 2017-02-27 21:14:31 -08:00
Wapaul1
db7297865f Added functionality for retrieving variables from control dependencies (#220)
* Added test for retriving variables from an optimizer

* Added comments to test

* Addressed comments

* Fixed travis bug

* Added fix to circular controls

* Added set for explored operations and duplicate prefix stripping

* Removed embeded ipython

* Removed prefix, use seperate graph for each network

* Removed redundant imports

* Addressed comments and added separate graph to initializer

* fix typos

* get rid of prefix in documentation
2017-01-30 19:17:42 -08:00