Eric Liang
81bed0fef8
[tune] Add internal implementation overview + image for task timeline ( #1254 )
2017-11-26 10:57:32 -08:00
Richard Liaw
f34d705178
[rllib] Update Docs for RLLib ( #1248 )
* init_changes
* last_changes
* addressing comments
* fix comments
* update
* nit
2017-11-24 10:36:57 -08:00
Robert Nishihara
7af5292646
Give error if a worker has a version mismatch for Python Ray, or clou… ( #1245 )
* Give error if a worker has a version mismatch for Python Ray, or cloudpickle.
* Check version when attaching driver to cluster.
* Only do check if the version info is present.
* Bug fix.
* Fix typo.
2017-11-23 23:31:03 -08:00
Eric Liang
ddfe00b7e8
[tune] Documentation for Ray.tune ( #1243 )
2017-11-23 11:31:59 -08:00
Robert Nishihara
477a40f76d
Prohibit returning actor handles and also update actor documentation. ( #1246 )
* Prohibit returning actor handles and also update actor documentation.
* Clarify documentation.
2017-11-23 09:37:24 -08:00
Robert Nishihara
2ae5a8484f
Upgrade cloudpickle to 0.5.2. ( #1244 )
2017-11-22 20:23:04 -08:00
Robert Nishihara
e0a340ee7e
Allow actors to pin at most 1000 dummy objects at a time. ( #1241 )
* Allow actors to pin at most 1000 dummy objects at a time.
* Fix linting.
2017-11-22 13:38:01 -08:00
Eric Liang
ad044cbe8f
changes ( #1237 )
2017-11-20 21:15:54 -08:00
Eric Liang
316f9e2bb7
[tune] Support user-defined trainable functions / classes / envs with a shared object registry ( #1226 )
2017-11-20 17:52:43 -08:00
Eric Liang
9233e496cc
Raise exception when getting the task results of workers that died ( #1224 )
* wip
* with test
* add timeout
* also add test for f
* remove on cleanup
* update
* wip
* fix tests
* mark actor removed in redis
* clang-format
* fix bug when no-inprogress tasks
* try to set task status done
* Add comment.
2017-11-20 15:18:39 -08:00
Eric Liang
ae4e1dd396
[tune] [rllib] Allow checkpointing to object store instead of local disk ( #1212 )
* wip
* use normal pickle
* fix checkpoint test
* comment
* Comment
* fix test
* fix lint
* fix py 3.5
* Update agent.py
* fix lint
2017-11-19 00:36:43 -08:00
Robert Nishihara
9a2e37a63e
Don't record event log on driver. ( #1217 )
2017-11-16 23:17:59 -08:00
Robert Nishihara
0eae917766
[rllib] Clean up evolution strategies example. ( #1225 )
* Remove ES observation statistics.
* Consolidate policy classes.
* Remove random stream.
* Move rollout function out of policy.
* Consolidate policy initialization.
* Replace act implementation with sess.run.
* Remove tf_utils.
* Remove variable scope.
* Remove unused imports.
* Use regular TF session.
* Use MeanStdFilter.
* Minor.
* Clarify naming.
* Update documentation.
* eps -> episodes
* Report noiseless evaluation runs.
* Clean up naming.
* Update documentation.
* Fix some bugs.
* Make it run on atari.
* Don't add action noise during evaluation runs.
* Add ES to checkpoint/restore test.
* Small cleanups and remove redundant calls to get_weights.
* Remove outdated comment.
2017-11-16 21:58:30 -08:00
Richard Liaw
eadb998643
[tune] Make HyperBand Usable ( #1215 )
2017-11-16 10:31:42 -08:00
Richard Liaw
3a0206a1f4
[tune] Parallel Coordinate Visualization Notebook ( #1218 )
2017-11-16 00:42:28 -08:00
Eric Liang
009f59defc
[tune] [rllib] Centralized driver logging ( #1208 )
* logger v2
* add logger
* lint
* todo
* viskit works now
* doc
* remove none check
* fix timeout
* Missing Numpy for Sigmoid data
2017-11-15 22:11:47 -08:00
Richard Liaw
71f8cd2403
[tune] Fixing up Hyperband ( #1207 )
* Fixing up Hyperband
* nit
* cleanup
* Timing test Added
* added_exception_back
* fixup_tests
* reverse placement
* fixes_and_tests
* fix
* fix
* fixlint
* cleanup_timing
* lint
* Update hyperband.py
2017-11-12 12:05:32 -08:00
Eric Liang
7c38f964b7
[tune] Add command line support for choosing early stopping schedulers ( #1209 )
* command line support
* add checkpoint freq
* fix other flags
* fix
* docs
* doc
2017-11-12 12:05:18 -08:00
Richard Liaw
afdc87323f
[rllib] PyTorch Models for A3C ( #1187 )
* fixing policy
* Compute Action is singular, fixed weird issue with arrays
* remove vestige
* extraneous ipdb
* Can Drop in Pytorch Model
* lint
* introducing models
* fix base policy
* Missed this from last time
* lint
* removedolds
* getting vision working
* LINT
* trying to fix test dependencies
* requiremnets
* try
* tryconda
* yes
* shutup
* flake_passes
* changes
* removing weight initializer for lstm for now
* unused
* adam
* clip
* zero
* properscaling
* weight
* try
* fix up pytorch visionnet
* bias correction
* fix model
* same visionnet
* matching_bad_things
* test
* try locking
* fixing_linear
* naming
* lint
* FORJENKINS
* clouds
* lint
* Lint + removed dependencies
* removed dependencies
* format
2017-11-12 00:20:33 -08:00
Philipp Moritz
e798a652bc
Change TaskSpec to allow multiple object IDs per argument. ( #1204 )
* Implement object ID bags
* linting
* fix tests
* fix linting
* fix comments
2017-11-10 16:33:34 -08:00
Stephanie Wang
07f0532b9b
Local scheduler filters out dead clients during reconstruction ( #1182 )
* Object table lookup returns vector of DBClientID instead of address strings
* Add node IP address to DBClient notification
* DB client cache stores entire DB client, convert addresses to std::string
* get cached db client returns the client
* Expose a call to initialize the redis cache
* Local scheduler filters out dead clients during reconstruction
* Remove node ip address from dbclient, use aux_address for plasma managers
* Get entire db client entry when not found in cache
* Fix common tests
* Fix address in tests
* Push error to driver if driver task did the put
* Address Robert's comments and cleanup
* Remove unused Redis command
* Fix db test
2017-11-10 11:29:24 -08:00
Daniel Suo
4f0da6f81c
Add basic functionality for Cython functions and actors ( #1193 )
* Add basic functionality for Cython functions and actors
* Fix up per @pcmoritz comments
* Fixes per @richardliaw comments
* Fixes per @robertnishihara comments
* Forgot double quotes when updating masked_log
* Remove import typing for Python 2 compatibility
2017-11-09 17:49:06 -08:00
Robert Nishihara
11f8f8bd8c
Document --num-workers better. ( #1201 )
2017-11-09 17:02:18 -08:00
Richard Liaw
6197b260b8
Fix Jenkins issue introduced by Variant Generator ( #1194 )
* try fix
* shorten
* added a flag
* finish
* Fix linting.
2017-11-09 00:56:20 -08:00
Robert Nishihara
3a37d1cf7d
Pin cloudpickle to 0.4.1. ( #1200 )
2017-11-08 21:14:09 -08:00
Robert Nishihara
1c6b30b5e2
Move all config constants into single file. ( #1192 )
* Initial pass at factoring out C++ configuration into a single file.
* Expose config through Python.
* Forward declarations.
* Fixes with Python extensions
* Remove old code.
* Consistent naming for constants.
* Fixes
* Fix linting.
* More linting.
* Whitespace
* rename config -> _config.
* Move config inside a class.
* update naming convention
* Fix linting.
* More linting
* More linting.
* Add in some more constants.
* Fix linting
2017-11-08 11:10:38 -08:00
Eric Liang
52888e4c6f
[tune] Improve the tune Python API and variant generation ( #1154 )
* new variant gen
* wip
* Sat Oct 21 18:21:34 PDT 2017
* update
* comment
* fix
* update
* update readme
* fix
* Update README.rst
* Update README.rst
* fix repeat
* update
* note on restore
2017-11-06 23:41:17 -08:00
Richard Liaw
6222ec3bd7
[tune] hyperband ( #1156 )
* trial scheduler interface
* remove
* wip median stopping
* remove
* median stopping rule
* update
* docs
* update
* Revrt
* update
* hyperband untested
* small changes before moving on
* added endpoints
* good changes
* init tests
* smore tests
* unfinished tests
* testing
* testing code
* morbugs
* fixes
* end
* tests and typo
* nit
* try this
* tests
* testing
* lint
* lint
* lint
* comments and docs
* almost screwed up
* lint
2017-11-06 22:30:25 -08:00
Eric Liang
d06beacd84
[tune] Implement median stopping rule ( #1170 )
* trial scheduler interface
* remove
* wip median stopping
* remove
* median stopping rule
* update
* docs
* update
* Revrt
* update
* comments
* fix tesT
2017-11-03 11:25:02 -07:00
Philipp Moritz
fdf069bd1d
update version to 0.2.2 ( #1178 )
2017-11-01 20:41:24 -07:00
Robert Nishihara
3317d38278
Replace hostnames with numerical IP addresses in redis address. ( #1177 )
* Replace hostnames with numerical IP addresses in redis address.
* Also do conversion for node_ip_address. Add test.
* Simplifications.
2017-11-01 17:13:22 -07:00
Eric Liang
202e7bf19a
fix ( #1174 )
2017-11-01 13:45:39 -07:00
Richard Liaw
dc66a2d7d5
[rllib] A3C Refactoring ( #1166 )
* fixing policy
* Compute Action is singular, fixed weird issue with arrays
* remove vestige
* extraneous ipdb
* Can Drop in Pytorch Model
* lint
* naming
* finish comments
2017-10-29 11:12:17 -07:00
Eric Liang
4cace0976d
[rllib] Fix DQN inefficiency, and cleanup for different modes of parallelism ( #1151 )
* initial checkin
* flake
* dqn
* docs
* add tuned pong
* remove
* upd
* add both
* better gamma
* update
* Last nit
2017-10-29 10:52:30 -07:00
Richard Liaw
304c3cade4
[tune] 10 second timeout for stopping ( #1169 )
* 10 second timeout for stopping
* prints for travis
* lint
* try better returning mechanism
* lint
2017-10-29 00:49:29 -07:00
Robert Nishihara
6852e8839e
Expose custom serializers through the API. ( #1147 )
* Expose custom serializers through the API.
* minor renaming
* Add test.
* Remove comment.
* Clean up assertions.
2017-10-29 00:08:55 -07:00
Eric Liang
3b157ab933
[tune] Allow resources to not all be assigned to the driver ( #1150 )
* dgpu
* update
* update
* update
* also support cmdline
* limit
* Update README.rst
* documentation
* typo
* small coverage for driver_gpu_limit
* lint
* fix lint
2017-10-28 22:16:05 -07:00
Robert Nishihara
f59867850e
Upgrade to cloudpickle 0.4.1. ( #1164 )
* Upgrade to cloudpickle 0.4.1.
* Catch more general exceptions thrown by cloudpickle.
2017-10-28 01:35:35 -07:00
Eric Liang
2b6c7af8ad
[tune] Trial scheduler interface ( #1160 )
* trial scheduler interface
* remove
* update
2017-10-27 13:29:15 -07:00
Richard Liaw
797f4fcbf3
Fixing Lint after flake upgrade ( #1162 )
* Fixing Lint after flake upgrade
* more lint fixes
2017-10-26 21:02:07 -05:00
Eric Liang
cd9dc398ff
[rllib] Support discrete observation spaces such as FrozenLake-v0 ( #1140 )
* add
* remove transform_shape
* fix test
* fix
2017-10-23 23:16:52 -07:00
Richard Liaw
0c9817fa76
[tune] Tune Pausing ( #1136 )
* fix yaml bug
* add ext agent
* gpus
* update
* tuning
* docs
* Sun Oct 15 21:09:25 PDT 2017
* lint
* update
* Sun Oct 15 22:39:55 PDT 2017
* Sun Oct 15 22:40:17 PDT 2017
* Sun Oct 15 22:43:06 PDT 2017
* Sun Oct 15 22:46:06 PDT 2017
* Sun Oct 15 22:46:21 PDT 2017
* Sun Oct 15 22:48:11 PDT 2017
* Sun Oct 15 22:48:44 PDT 2017
* Sun Oct 15 22:49:23 PDT 2017
* Sun Oct 15 22:50:21 PDT 2017
* Sun Oct 15 22:53:00 PDT 2017
* Sun Oct 15 22:53:34 PDT 2017
* Sun Oct 15 22:54:33 PDT 2017
* Sun Oct 15 22:54:50 PDT 2017
* Sun Oct 15 22:55:20 PDT 2017
* Sun Oct 15 22:56:56 PDT 2017
* Sun Oct 15 22:59:03 PDT 2017
* fix
* Update tune_mnist_ray.py
* remove script trial
* fix
* reorder
* fix ex
* py2 support
* upd
* comments
* comments
* cleanup readme
* fix trial
* annotate
* Update rllib.rst
* init pausing
* Docs, Lint
* fix danglings and restore endpoint moved to trialrunner
* renaming
* nit
* start always starts from checkpoint
* smalls
* nits
* lint
* last change
2017-10-22 23:04:15 -07:00
Eric Liang
81ca27dc08
[rllib] [minor] Rename agent_id to experiment_tag ( #1143 )
* tagstr
* doc
* rename
* fix test
2017-10-22 18:44:18 -07:00
Robert Nishihara
97c6369b49
Update arrow to include custom serializer for pytorch and register default serialization handlers. ( #1152 )
* Update arrow to include custom serializer for pytorch.
* Call pyarrow function for registering default custom serialization handlers.
* Change class ID used in serialization context for object IDs.
2017-10-21 21:24:10 -07:00
Stephanie Wang
af47737bd5
Prototype distributed actor handles ( #1137 )
* Add actor handle ID to the task spec
* Local scheduler dispatches actor tasks according to a task counter per handle
* Fix python test
* Allow passing actor handles into tasks. Not completely working yet. Also this is very messy.
* Fixes, should be roughly working now.
* Refactor actor handle wrapper
* Fix __init__ tests
* Terminate actor when the original handle goes out of scope
* TODO and a couple test cases
* Make tests for unsupported cases
* Fix Python mode tests
* Linting.
* Cache actor definitions that occur before ray.init() is called.
* Fix export actor class
* Deterministically compute actor handle ID
* Fix __getattribute__
* Fix string encoding for python3
* doc
* Add comment and assertion.
2017-10-19 23:49:59 -07:00
Philipp Moritz
2f45ac9e95
Make travis runs less verbose. ( #1145 )
* make travis runs less verbose
* update
* more -q flags
2017-10-19 22:25:56 -07:00
Robert Nishihara
8ab56b5906
Always redirect redis stdout/stderr. ( #1142 )
2017-10-19 17:09:09 -07:00
Eric Liang
782125ef3f
warn if agent failed ( #1141 )
2017-10-19 11:39:25 -07:00
Eric Liang
5a50e0e1d7
[rllib] Add the ability to run arbitrary Python scripts with ray.tune ( #1132 )
* fix yaml bug
* add ext agent
* gpus
* update
* tuning
* docs
* Sun Oct 15 21:09:25 PDT 2017
* lint
* update
* Sun Oct 15 22:39:55 PDT 2017
* Sun Oct 15 22:40:17 PDT 2017
* Sun Oct 15 22:43:06 PDT 2017
* Sun Oct 15 22:46:06 PDT 2017
* Sun Oct 15 22:46:21 PDT 2017
* Sun Oct 15 22:48:11 PDT 2017
* Sun Oct 15 22:48:44 PDT 2017
* Sun Oct 15 22:49:23 PDT 2017
* Sun Oct 15 22:50:21 PDT 2017
* Sun Oct 15 22:53:00 PDT 2017
* Sun Oct 15 22:53:34 PDT 2017
* Sun Oct 15 22:54:33 PDT 2017
* Sun Oct 15 22:54:50 PDT 2017
* Sun Oct 15 22:55:20 PDT 2017
* Sun Oct 15 22:56:56 PDT 2017
* Sun Oct 15 22:59:03 PDT 2017
* fix
* Update tune_mnist_ray.py
* remove script trial
* fix
* reorder
* fix ex
* py2 support
* upd
* comments
* comments
* cleanup readme
* fix trial
* annotate
* Update rllib.rst
2017-10-18 11:49:28 -07:00
Robert Nishihara
f3e3c7ec71
Add is_actor_checkpoint_method to TaskSpec. ( #1117 )
* Add is_actor_checkpoint_method to TaskSpec.
* Fix linting.
* Fix rebase error.
* Fix errors from rebase.
2017-10-15 16:52:10 -07:00