74 commits

Author SHA1 Message Date
Robert Nishihara
80e8426b5e Test example applications and rllib in jenkins tests. (#707)
* Test example applications in Jenkins.

* Fix default upload_dir argument for Algorithm class.

* Fix evolution strategies.

* Comment out policy gradient example which doesn't seem to work.

* Set --env-name for evolution strategies.
2017-07-16 18:51:33 +00:00
Robert Nishihara
e0867c8845 Switch Python indentation from 2 spaces to 4 spaces. (#726)
* 4 space indentation for actor.py.

* 4 space indentation for worker.py.

* 4 space indentation for more files.

* 4 space indentation for some test files.

* Check indentation in Travis.

* 4 space indentation for some rl files.

* Fix failure test.

* Fix multi_node_test.

* 4 space indentation for more files.

* 4 space indentation for remaining files.

* Fixes.
2017-07-13 21:53:57 +00:00
Eric Liang
2d81edfcdc [rllib] Move a3c implementation from examples/ to python/ray/rllib/ (#698)
* rllib v0

* fix imports

* lint

* comments

* update docs

* a3c wip

* a3c wip

* report stats

* update doc

* name is too long

* fix small bug

* propagate exception on error

* fetch metrics

* fix lint
2017-06-29 15:49:56 +00:00
Eric Liang
a674ec958c [rllib] Move policy gradient and evolution strategies algorithms from examples/ to ray/rllib/ (#694)
* rllib v0

* fix imports

* lint

* comments

* update docs
2017-06-25 22:13:03 +00:00
Philipp Moritz
9bcaaaeaf5 Debugging for policy gradients (#681)
* configuration option for tensorflow debugger

* add model checkpointing

* fix linting

* make it possible to run without checkpointing

* fix

* loading from checkpoint and expose debugger through cli

* todo for filters

* Fix typo.
2017-06-18 17:58:41 -07:00
Eric Liang
4374ad1453 Policy gradient example: Support multi-GPU training (#584)
* add tf metrics

* comments

* fix network scopes

* add doc

* initial work

* try with 3 virtual cpus

* clean up metrics

* use format string

* fix trace level

* back to pong

* always run summary on cpu

* plot intermediate and final sgd stats

* add back a global step

* update

* add timeline

* use staging area and reuse weights properly

* stage at cpu

* whoops, stage only the batch

* clean up a bit

* fix py flake

* wip

* create an optimizer graph per device

* print timeline on 5th batch instead

* print examples per second

* log placement for training ops

* force placement on cpu:0

* try separating weights onto different gpus

* try using nccl

* add cpu fallback

* remove space from date

* check has gpu device

* fix flag config

* checkpoint

* wip

* update

* add some timing

* trace loading

* try cpu

* revert that

* remove expensive test

* lint

* cleanups

* clean up timers

* clean it up a bit

* fix code for non-scalar action spaces

* address some nits

* fix quotes

* efficient shuffling between sgd epochs
2017-06-13 06:03:25 +00:00
Philipp Moritz
690fe10bb6 Save policies for Evolution Strategies (#638)
Save policies for evolution strategies.
2017-06-04 16:21:19 -07:00
Philipp Moritz
679910496e fix policy gradients for mujoco domains (#589) 2017-05-24 18:39:37 -07:00
Eric Liang
06241daf61 Policy gradient example: record stats for tensorboard (#577)
* add tf metrics

* comments

* fix network scopes

* add doc

* use format string

* fix trace level

* plot intermediate and final sgd stats

* add back a global step
2017-05-21 14:51:24 -07:00
Robert Nishihara
b62693ca67 Fix Python 2 bug in hyperopt example. (#575) 2017-05-19 16:12:13 -07:00
Wapaul1
f861124b9a Added python2 support and check for outdated tf (#562)
Improve the Evolutionary Strategies example.
2017-05-17 20:42:17 -07:00
Robert Nishihara
ec2534422b Remove register_class from API. (#550)
* Perform ray.register_class under the hood.

* Fix bug.

* Release worker lock when waiting for imports to arrive in get.

* Remove calls to register_class from examples and tests.

* Clear serialization state between tests.

* Fix bug and add test for multiple custom classes with same name.

* Fix failure test.

* Fix linting and cleanups to python code.

* Fixes to documentation.

* Implement recursion depth for recursively registering classes.

* Fix linting.

* Push warning to user if waiting for class for too long.

* Fix typos.

* Don't export FunctionToRun if pickling the function fails.

* Don't broadcast class definition when pickling class.
2017-05-16 18:38:52 -07:00
Robert Nishihara
3ebfd850e1 Make example applications pep8 compliant. (#553)
* Test examples for pep8 compliance.

* Make rl_pong example pep8 compliant.

* Make policy gradient example pep8 compliant.

* Make lbfgs example pep8 compliant.

* Make hyperopt example pep8 compliant.

* Make a3c example pep8 compliant.

* Make evolution strategies example pep8 compliant.

* Make resnet example pep8 compliant.

* Fix.
2017-05-16 14:12:18 -07:00
Wapaul1
31bf0e8da4 Improved the Resnet Example. (#551)
* Initial updates

* Mostly done

* Now works with no arguments

* Changed version check
2017-05-15 22:40:41 -07:00
Robert Nishihara
3c5375345f Initial version of evolution strategies example. (#544)
* Initial commit of evolution strategies example.

* Some small simplifications.

* Update example to use new API.

* Add example to documentation.
2017-05-14 17:53:51 -07:00
Robert Nishihara
9f91eb8c91 Change API for remote function declaration, actor instantiation, and actor method invocation. (#541)
* Direct substitution of @ray.remote -> @ray.task.

* Changes to make '@ray.task' work.

* Instantiate actors with Class.remote() instead of Class().

* Convert actor instantiation in tests and examples from Class() to Class.remote().

* Change actor method invocation from object.method() to object.method.remote().

* Update tests and examples to invoke actor methods with .remote().

* Fix bugs in jenkins tests.

* Fix example applications.

* Change @ray.task back to @ray.remote.

* Changes to make @ray.actor -> @ray.remote work.

* Direct substitution of @ray.actor -> @ray.remote.

* Fixes.

* Raise exception if @ray.actor decorator is used.

* Simplify ActorMethod class.
2017-05-14 00:01:20 -07:00
Richard Liaw
94f32db5e6 A3C Polishing (#385)
* number

* gym doesn't have versioning

* Benchmarks

* visualization

* formatting

* small fix for tensorboard

* first pass removing universe dependency

* code

* results polish

* removed extra line

* removed universe dependency

* doc

* remove gym versioning stuff

* changes as suggested

* nit
2017-04-11 22:51:52 -07:00
Wapaul1
6d9820ef5d Added tensorboard to resnet (#374)
Added tensorboard to resnet example.
2017-03-17 18:36:23 -07:00
Philipp Moritz
4af0aa6258 Atari on pixels (#364)
* pong on pixels working (not cleaned up)

* make training compatible with all atari games

* cartpole runs

* Update documentation and usage for policy gradients.
2017-03-14 13:31:29 -07:00
Robert Nishihara
99583f5b08 Clean up rl_pong example. (#365)
* Clean up RL pong example.

* More troubleshooting instructions.

* Typo.

* Fix typo.
2017-03-11 21:16:36 -08:00
Wapaul1
b1cb48159a Examples updated with actors. (#358)
* Updated examples with actors

* Small changes, and convert documentation from MD to RST.
2017-03-11 15:30:31 -08:00
Richard Liaw
b463d9e5c7 Initial A3C Example - PongDeterministic-v3 (#331)
* Initializing A3C code

* Modifications for Ray usage

* cleanup

* removing universe dependency

* fixes (not yet working)

* hack

* documentation

* Cleanup

* Preliminary Portion

Make sure to change when merging

* RL part

* Cleaning up Driver and Worker code

* Updating driver code

* instructions...

* fixed

* Minor changes.

* Fixing cmake issues

* ray instruction

* updating port to new universe

* Fix for env.configure

* redundant commands

* Revert scipy.misc -> cv2 and raise exception for wrong gym version.
2017-03-11 00:57:53 -08:00
Philipp Moritz
555dcf35a2 Add policy gradient example. (#344)
* add policy gradient example

* fix typos

* Minor changes plus some documentation.

* Minor fixes.
2017-03-07 23:42:44 -08:00
Wapaul1
c66178bcd7 Resnet Adapted to Ray (#229)
* Initial conversion

* Further changes

* fixes

* some changes

* Fixes

* Added data pipeline

* Added updates to cifar

* Currently broken, needs separate PR

* Added test for retrieving variables from an optimizer

* Removed FlAG ref in environment variables

* Added comments to test

* Addressed comments

* Added updates

* Made further changes for tfutils

* Fixed finalized bug

* Removed ipython

* Added accuracy printing

* Temp commit

* added fixes

* changes

* Added writing to file

* Fixes for gpus

* Cleaned up code

* Temp commit

* Gpu support fully implemented

* Updated to use num_gpus for actors

* Finished testing gpus implementation

* Changed to be more in line with origin implementation

* Updated test to use actors

* Added support for cpu only systems

* Now works with no cpus

* Minor changes and some documentation.
2017-03-07 01:07:32 -08:00
Robert Nishihara
0a233b7144 Update hyperparameter optimization example. (#332)
* Update hyperparameter optimization example.

* Remove early stopping.
2017-03-04 10:45:15 -08:00
Robert Nishihara
1a997ed279 Move documentation to ReadTheDocs. (#326) 2017-02-27 21:14:31 -08:00
Wapaul1
db7297865f Added functionality for retrieving variables from control dependencies (#220)
* Added test for retrieving variables from an optimizer

* Added comments to test

* Addressed comments

* Fixed travis bug

* Added fix to circular controls

* Added set for explored operations and duplicate prefix stripping

* Removed embedded ipython

* Removed prefix, use separate graph for each network

* Removed redundant imports

* Addressed comments and added separate graph to initializer

* fix typos

* get rid of prefix in documentation
2017-01-30 19:17:42 -08:00
Robert Nishihara
ba8933e10f Update tutorial. (#196)
* Update tutorial.

* Small updates to documentation and code.
2017-01-10 23:52:38 -08:00
Robert Nishihara
87d8d05792 Rename reusable variables -> environment variables. (#195) 2017-01-10 20:14:33 -08:00
Wapaul1
aaf3be3c53 Fixed lbfgs for ray-cluster (#180)
* Updated lbfgs example to include TensorflowVariables

* Whitespace.
2017-01-10 18:40:06 -08:00
Robert Nishihara
be4a37bf37 Various cleanups: remove start_ray_local from ray.init, remove unused code, fix "pip install numbuf". (#193)
* Remove start_ray_local from ray.init and change default number of workers to 10.

* Remove alexnet example.

* Move array methods to experimental.

* Remove TRPO example.

* Remove old files.

* Compile plasma when we build numbuf.

* Address comments.
2017-01-10 17:35:27 -08:00
Wapaul1
c45342e39d Updated code to mesh with get_weights returning a dict and new tf code (#187)
* Updated code to mesh with get_weights returning a dict and new tf code

* Added tf.global_variables_initializer to hyperopt example as well

* Small fix.

* Small name change.
2017-01-07 14:25:45 -08:00
Wapaul1
417c04bac8 Removed iteritems and xrange for python3 in rl_pong (#182)
* Removed iteritems and xrange for python3

* Remove unused variable.
2017-01-05 20:37:00 -08:00
Robert Nishihara
ddba1df802 Start working toward Python3 compatibility. (#117) 2016-12-11 12:25:31 -08:00
Robert Nishihara
336a904404 Implement repr, hash, and richcompare for ObjectIDs. (#33)
* Implement repr, hash, and richcompare for ObjectIDs.

* Addressing comments.

* Partially fix example applications.
2016-11-11 09:18:36 -08:00
Robert Nishihara
1c3aaf7189 Update documentation (#445)
* Update documentation for serialization.

* Update documentation for reusable variables.

* Update documentation for using Ray with TensorFlow. This change is to allow code blocks to be copied and pasted into a Python interpreter.

* Fix documentation for hyperparameter optimization example.
2016-10-12 15:41:00 -07:00
Richard Liaw
8925580fb6 nit for filter (#443) 2016-09-26 20:58:19 -07:00
Richard Liaw
d22321da7a Changes to run TRPO (#442)
Filter/new updates

final keras changes

some changes
2016-09-26 20:20:45 -07:00
Robert Nishihara
91f16a3df0 Migrate repositories to ray-project. (#438)
* Migrate repositories to ray-project.

* Update numbuf to the migrated version.
2016-09-17 00:52:05 -07:00
Robert Nishihara
4863a5155c Cleanup setting and getting of tensorflow weights. (#385)
* Cleanup setting and getting of tensorflow weights.

* Add documentation for using TensorFlow.

* Group get_weights and set_weights in a function.

* Update readme.
2016-09-16 23:05:14 -07:00
Wapaul1
d5815673a5 Changed ray.select() to ray.wait() and its functionality (#426)
* Re-implemented select, changed name to wait

* Changed tests for select to tests for wait

* Updated the hyperopt example to match wait

* Small fixes and improve example readme.

* Make tests pass.
2016-09-14 17:14:11 -07:00
Robert Nishihara
1ad663b689 Add more print statements to lbfgs app to help debug. (#420) 2016-09-08 11:43:26 -07:00
Philipp Moritz
3548797202 [API] Implement get for multiple objects (#398)
* [API] Implement get for multiple objects

* Small fixes.
2016-09-02 18:02:44 -07:00
Robert Nishihara
5cf1d60cb2 Fix documentation. 2016-08-30 17:20:00 -07:00
Robert Nishihara
fb7ccef493 Allow remote decorator to be used with no parentheses. 2016-08-30 16:38:26 -07:00
Robert Nishihara
b87912cb2f Remove typing module. 2016-08-29 22:16:19 -07:00
Robert Nishihara
d7f313a026 Remove type information from remote decorator. 2016-08-29 22:05:59 -07:00
Wapaul1
7246013008 Implement select to enable waiting for a specific number of remote objects to be ready. (#369) 2016-08-15 16:51:59 -07:00
Robert Nishihara
13df8302e6 enable running example apps in cluster mode (#357) 2016-08-08 16:01:13 -07:00
Philipp Moritz
eae27f23ac TRPO example (#336) 2016-08-01 18:40:34 -07:00