Commit graph

336 commits

Author SHA1 Message Date
shane
9af8dc568a testing with --rm and docker run (#1240)
Add --rm to docker run for Jenkins tests.
2017-11-22 10:20:04 -08:00
Eric Liang
316f9e2bb7 [tune] Support user-defined trainable functions / classes / envs with a shared object registry (#1226) 2017-11-20 17:52:43 -08:00
Eric Liang
9233e496cc Raise exception when getting the task results of workers that died (#1224)
* wip

* with test

* add timeout

* also add test for f

* remove on cleanup

* update

* wip

* fix tests

* mark actor removed in redis

* clang-format

* fix bug when no-inprogress tasks

* try to set task status done

* Add comment.
2017-11-20 15:18:39 -08:00
Eric Liang
28f1e12940 [rllib] [build-fix] ES iterations get unexpectedly long (#1235)
* fix very long es

* Revert prior change.

* Shorten ES jenkins tests.
2017-11-20 14:42:42 -08:00
Robert Nishihara
0eae917766 [rllib] Clean up evolution strategies example. (#1225)
* Remove ES observation statistics.

* Consolidate policy classes.

* Remove random stream.

* Move rollout function out of policy.

* Consolidate policy initialization.

* Replace act implementation with sess.run.

* Remove tf_utils.

* Remove variable scope.

* Remove unused imports.

* Use regular TF session.

* Use MeanStdFilter.

* Minor.

* Clarify naming.

* Update documentation.

* eps -> episodes

* Report noiseless evaluation runs.

* Clean up naming.

* Update documentation.

* Fix some bugs.

* Make it run on atari.

* Don't add action noise during evaluation runs.

* Add ES to checkpoint/restore test.

* Small cleanups and remove redundant calls to get_weights.

* Remove outdated comment.
2017-11-16 21:58:30 -08:00
Richard Liaw
eadb998643 [tune] Make HyperBand Usable (#1215) 2017-11-16 10:31:42 -08:00
Richard Liaw
71f8cd2403 [tune] Fixing up Hyperband (#1207)
* Fixing up Hyperband

* nit

* cleanup

* Timing test Added

* added_exception_back

* fixup_tests

* reverse placement

* fixes_and_tests

* fix

* fix

* fixlint

* cleanup_timing

* lint

* Update hyperband.py
2017-11-12 12:05:32 -08:00
Eric Liang
7c38f964b7 [tune] Add command line support for choosing early stopping schedulers (#1209)
* command line support

* add checkpoint freq

* fix other flags

* fix

* docs

* doc
2017-11-12 12:05:18 -08:00
Richard Liaw
afdc87323f [rllib] PyTorch Models for A3C (#1187)
* fixing policy

* Compute Action is singular, fixed weird issue with arrays

* remove vestige

* extraneous ipdb

* Can Drop in Pytorch Model

* lint

* introducing models

* fix base policy

* Missed this from last time

* lint

* removedolds

* getting vision working

* LINT

* trying to fix test dependencies

* requirements

* try

* tryconda

* yes

* shutup

* flake_passes

* changes

* removing weight initializer for lstm for now

* unused

* adam

* clip

* zero

* properscaling

* weight

* try

* fix up pytorch visionnet

* bias correction

* fix model

* same visionnet

* matching_bad_things

* test

* try locking

* fixing_linear

* naming

* lint

* FORJENKINS

* clouds

* lint

* Lint + removed dependencies

* removed dependencies

* format
2017-11-12 00:20:33 -08:00
Daniel Suo
4f0da6f81c Add basic functionality for Cython functions and actors (#1193)
* Add basic functionality for Cython functions and actors

* Fix up per @pcmoritz comments

* Fixes per @richardliaw comments

* Fixes per @robertnishihara comments

* Forgot double quotes when updating masked_log

* Remove import typing for Python 2 compatibility
2017-11-09 17:49:06 -08:00
Richard Liaw
6197b260b8 Fix Jenkins issue introduced by Variant Generator (#1194)
* try fix

* shorten

* added a flag

* finish

* Fix linting.
2017-11-09 00:56:20 -08:00
Eric Liang
52888e4c6f [tune] Improve the tune Python API and variant generation (#1154)
* new variant gen

* wip

* Sat Oct 21 18:21:34 PDT 2017

* update

* comment

* fix

* update

* update readme

* fix

* Update README.rst

* Update README.rst

* fix repeat

* update

* note on restore
2017-11-06 23:41:17 -08:00
Richard Liaw
6222ec3bd7 [tune] hyperband (#1156)
* trial scheduler interface

* remove

* wip median stopping

* remove

* median stopping rule

* update

* docs

* update

* Revert

* update

* hyperband untested

* small changes before moving on

* added endpoints

* good changes

* init tests

* smore tests

* unfinished tests

* testing

* testing code

* morbugs

* fixes

* end

* tests and typo

* nit

* try this

* tests

* testing

* lint

* lint

* lint

* comments and docs

* almost screwed up

* lint
2017-11-06 22:30:25 -08:00
Eric Liang
d06beacd84 [tune] Implement median stopping rule (#1170)
* trial scheduler interface

* remove

* wip median stopping

* remove

* median stopping rule

* update

* docs

* update

* Revert

* update

* comments

* fix test
2017-11-03 11:25:02 -07:00
Robert Nishihara
3317d38278 Replace hostnames with numerical IP addresses in redis address. (#1177)
* Replace hostnames with numerical IP addresses in redis address.

* Also do conversion for node_ip_address. Add test.

* Simplifications.
2017-11-01 17:13:22 -07:00
Robert Nishihara
6852e8839e Expose custom serializers through the API. (#1147)
* Expose custom serializers through the API.

* minor renaming

* Add test.

* Remove comment.

* Clean up assertions.
2017-10-29 00:08:55 -07:00
Richard Liaw
797f4fcbf3 Fixing Lint after flake upgrade (#1162)
* Fixing Lint after flake upgrade

* more lint fixes
2017-10-26 21:02:07 -05:00
Eric Liang
cd9dc398ff [rllib] Support discrete observation spaces such as FrozenLake-v0 (#1140)
* add

* remove transform_shape

* fix test

* fix
2017-10-23 23:16:52 -07:00
Richard Liaw
0c9817fa76 [tune] Tune Pausing (#1136)
* fix yaml bug

* add ext agent

* gpus

* update

* tuning

* docs

* Sun Oct 15 21:09:25 PDT 2017

* lint

* update

* Sun Oct 15 22:39:55 PDT 2017

* Sun Oct 15 22:40:17 PDT 2017

* Sun Oct 15 22:43:06 PDT 2017

* Sun Oct 15 22:46:06 PDT 2017

* Sun Oct 15 22:46:21 PDT 2017

* Sun Oct 15 22:48:11 PDT 2017

* Sun Oct 15 22:48:44 PDT 2017

* Sun Oct 15 22:49:23 PDT 2017

* Sun Oct 15 22:50:21 PDT 2017

* Sun Oct 15 22:53:00 PDT 2017

* Sun Oct 15 22:53:34 PDT 2017

* Sun Oct 15 22:54:33 PDT 2017

* Sun Oct 15 22:54:50 PDT 2017

* Sun Oct 15 22:55:20 PDT 2017

* Sun Oct 15 22:56:56 PDT 2017

* Sun Oct 15 22:59:03 PDT 2017

* fix

* Update tune_mnist_ray.py

* remove script trial

* fix

* reorder

* fix ex

* py2 support

* upd

* comments

* comments

* cleanup readme

* fix trial

* annotate

* Update rllib.rst

* init pausing

* Docs, Lint

* fix danglings and restore endpoint moved to trialrunner

* renaming

* nit

* start always starts from checkpoint

* smalls

* nits

* lint

* last change
2017-10-22 23:04:15 -07:00
Eric Liang
81ca27dc08 [rllib] [minor] Rename agent_id to experiment_tag (#1143)
* tagstr

* doc

* rename

* fix test
2017-10-22 18:44:18 -07:00
Stephanie Wang
af47737bd5 Prototype distributed actor handles (#1137)
* Add actor handle ID to the task spec

* Local scheduler dispatches actor tasks according to a task counter per handle

* Fix python test

* Allow passing actor handles into tasks. Not completely working yet. Also this is very messy.

* Fixes, should be roughly working now.

* Refactor actor handle wrapper

* Fix __init__ tests

* Terminate actor when the original handle goes out of scope

* TODO and a couple test cases

* Make tests for unsupported cases

* Fix Python mode tests

* Linting.

* Cache actor definitions that occur before ray.init() is called.

* Fix export actor class

* Deterministically compute actor handle ID

* Fix __getattribute__

* Fix string encoding for python3

* doc

* Add comment and assertion.
2017-10-19 23:49:59 -07:00
Eric Liang
5a50e0e1d7 [rllib] Add the ability to run arbitrary Python scripts with ray.tune (#1132)
* fix yaml bug

* add ext agent

* gpus

* update

* tuning

* docs

* Sun Oct 15 21:09:25 PDT 2017

* lint

* update

* Sun Oct 15 22:39:55 PDT 2017

* Sun Oct 15 22:40:17 PDT 2017

* Sun Oct 15 22:43:06 PDT 2017

* Sun Oct 15 22:46:06 PDT 2017

* Sun Oct 15 22:46:21 PDT 2017

* Sun Oct 15 22:48:11 PDT 2017

* Sun Oct 15 22:48:44 PDT 2017

* Sun Oct 15 22:49:23 PDT 2017

* Sun Oct 15 22:50:21 PDT 2017

* Sun Oct 15 22:53:00 PDT 2017

* Sun Oct 15 22:53:34 PDT 2017

* Sun Oct 15 22:54:33 PDT 2017

* Sun Oct 15 22:54:50 PDT 2017

* Sun Oct 15 22:55:20 PDT 2017

* Sun Oct 15 22:56:56 PDT 2017

* Sun Oct 15 22:59:03 PDT 2017

* fix

* Update tune_mnist_ray.py

* remove script trial

* fix

* reorder

* fix ex

* py2 support

* upd

* comments

* comments

* cleanup readme

* fix trial

* annotate

* Update rllib.rst
2017-10-18 11:49:28 -07:00
Eric Liang
802941994d [rllib] Use RLlib preprocessors in DQN (fixes PongDeterministic-v4) (#1124)
* fix pong

* rename

* update
2017-10-14 20:16:36 -07:00
Stephanie Wang
15486a14a0 Refactor actor task queues (#1118)
* Refactor add_task_to_actor_queue into queue_actor_task and insert_actor_task_queue

* Refactor actor task queue to share the waiting task queue

* Fix
2017-10-13 20:52:11 -07:00
Eric Liang
79ea205b3e [rllib] Initial work on integrating hyperparameter search tool (#1107)
* clean up train

* update

* update train script

* add tuned examples

* add agent catalog

* add tune lib

* update

* fix

* tests

* remove

* train docs

* comments

* todo

* fix resource parsing

* fix cr test

* add test

* try to fix travis test
2017-10-13 16:18:16 -07:00
Stephanie Wang
3764f2f2e1 Actor checkpointing with object lineage reconstruction (#1004)
* Worker reports error in previous task, actor task counter is incremented after task is successful

* Refactor actor task execution

- Return new task counter in GetTaskRequest
- Update worker state for actor tasks inside of the actor method executor

* Manually invoked checkpoint method

* Scheduling for actor checkpoint methods

* Fix python bugs in checkpointing

* Return task success from worker to local scheduler instead of actor counter

* Kill local schedulers halfway through actor execution instead of waiting for all tasks to execute once

* Remove redundant actor tasks during dispatch, reconstruct missing dependencies for actor tasks

* Make executor for temporary actor methods

* doc

* Set default argument for whether the previous task was a success

* Refactor actor method call

* Simplify checkpoint task submission

* lint

* fix philipp's comments

* Add missing line

* Make actor reconstruction tests run faster

* Unimportant whitespace.

* Unimportant whitespace.

* Update checkpoint method signature

* Documentation and handle exceptions during checkpoint save/resume

* Rename get_task message field to actor_checkpoint_failed

* Fix bug.

* Remove debugging check, redirect test output
2017-10-12 09:53:32 -07:00
Robert Nishihara
7a954f4b5f Use monotonic clock for some python tests. (#1112) 2017-10-11 19:58:59 -07:00
Robert Nishihara
a52a1e893f Automatically set CUDA_VISIBLE_DEVICES when worker gets task. (#1044)
* Automatically set CUDA_VISIBLE_DEVICES when worker gets task.

* Add test.
2017-10-06 18:38:08 -07:00
Robert Nishihara
4669c59fa8 Release GPU resources as soon as an actor exits. (#1088)
* Release GPU resources as soon as an actor exits.

* Add a test.

* Store local_scheduler_id and driver_id in the worker object instead of the actor object.
2017-10-06 17:58:19 -07:00
Stephanie Wang
aebe9f9374 Fix actor garbage collection by breaking cyclic references (#1064)
* Fix bug in wait_for_pid_to_exit, add test for actor deletion.

* Fix actor garbage collection by breaking cyclic references

* Add test for calling actor method immediately after actor creation.

* Fix bug, must dispatch tasks when workers are killed.

* Fix python test

* Fix cyclic reference problem by creating ActorMethod objects on the fly.

* Try simply increasing the time allowed for many_drivers_test.py.
2017-10-05 00:55:33 -07:00
Eric Liang
6ecc899cf2 [rllib] Fix DQN checkpoint/restore and enable test in jenkins (#1063)
* fix dqn restore and add test

* Update .gitignore

* Update test_checkpoint_restore.py

* add checkpoint restore
2017-10-03 23:17:54 -07:00
Richard Liaw
cb6dea94bc [rllib] Fix Preprocessor for ATARI (#1066)
* Removing squeeze, fix atari preprocessing

* nit comment

* comments

* jenkins

* Lint
2017-10-03 18:45:02 -07:00
Philipp Moritz
57bd1d6ff5 Specialize Serialization for OrderedDict (#1035)
Specialize Serialization for OrderedDict and defaultdict
2017-10-02 17:33:10 -07:00
Robert Nishihara
ad61af7333 Workaround for passing empty list to ray.wait. (#1043)
* Workaround for passing empty list to ray.wait.

* Add test for passing empty list to wait.
2017-10-01 11:45:02 -07:00
Richard Liaw
16e82b43d1 [rllib] Changes for preprocessors (#1033)
* Changes for preprocessors

* removed comments

* Changes + push for lint

* linted

* adding dependency for travis

* linting won't pass

* reordering

* needed for testing

* added comments

* pip it

* pip dependencies
2017-09-30 13:11:20 -07:00
Robert Nishihara
b991dc8900 Add flag for ignoring the UI, don't start UI in jenkins tests. (#1021) 2017-09-29 15:22:51 -07:00
Eric Liang
9f3a4fce50 [rllib] Parallelize sample collection and gradient computation in DQN (#746)
* wip

* works with cartpole

* lint

* fix pg

* comment

* action dist rename

* preprocessor

* fix test

* typo

* fix the action[0] nonsense

* revert

* satisfy the lint

* wip

* wip

* works with cartpole

* lint

* fix pg

* comment

* action dist rename

* preprocessor

* fix test

* typo

* fix the action[0] nonsense

* revert

* satisfy the lint

* Minor indentation changes.

* fix merge

* add humanoid

* initial dqn refactor

* remove tfutil

* fix calls

* fix tf errors 1

* closer

* runs now

* lint

* tensorboard graph

* fix linting

* more 4 space

* fix

* fix lint

* more lint

* oops

* es parity

* remove example.py

* fix training bug

* add cartpole demo

* try fixing cartpole

* allow model options, configure cartpole

* debug

* simplify

* no dueling

* avoid out of file handles

* Test dqn in jenkins.

* Minor formatting.

* lint

* fix py3

* fix issue

* remove checkpoint

* revert

* Fixit

* sanity check configs

* update cuda

* fix

* parallel gradient computation

* update

* upd

* bug

* upd

* always record training stats

* fix

* comments

* revert assert

* add gpu mask

* fofset

* a tie

* Merge

* fix

* fix

* fix examples

* A3C -> DQN

* fix dqn test

* remove submodule

* fix linting
2017-09-29 00:06:51 -07:00
Zongheng Yang
5a50e80b63 Make Monitor remove dead Redis entries from exiting drivers. (#994)
* WIP: removing OL, OI, TT on client exit; no saving yet.

* ray_redis_module.cc: update header comment.

* Cleanup: just the removal.

* Reformat via yapf: use pep8 style instead of google.

* Checkpoint addressing comments (partially)

* Add 'b' marker before strings (py3 compat)

* Add MonitorTest.

* Use `isort` to sort imports.

* Remove some loggings

* Fix flake8 noqa marker runtest.py

* Try to separate tests out to monitor_test.py

* Rework cleanup algorithm: correct logic

* Extend tests to cover multi-shard cases

* Add some small comments and formatting changes.
2017-09-26 00:11:38 -07:00
gycn
a432285e77 Disable parallelization for Actors and ray.wait for debugging (#961)
Support actors and ray.wait in PYTHON_MODE.
2017-09-17 00:12:50 -07:00
Stephanie Wang
74ac80631b Local scheduler sends a null heartbeat to global scheduler (#962)
* Local scheduler sends a null heartbeat to global scheduler to notify death

* Add whitespace.

* Speed up component failures test

* Free local scheduler state upon plasma manager disconnection
2017-09-12 10:45:21 -07:00
Eric Liang
e17412a72b fix free log std param (#964) 2017-09-11 18:52:48 -07:00
Stephanie Wang
99c8b1f38c Actor fault tolerance using object lineage reconstruction (#902)
* Revert Python actor reconstruction

* Actor reconstruction using object lineage

* Add dummy arguments and return values for actor tasks

* Pin dummy outputs for actor tasks

* Skip checkpointing test for now

* TODOs

* minor edits

* Generate dummy object dependencies in Python, not C

* Fix linting.

* Move actor counter and dummy objects inside of the actor handle

* Refactor Worker._process_task, suppress exception propagation for sequential actor tasks
2017-09-10 19:29:28 -07:00
Eric Liang
d8aa826e63 [webui] Scalability fixes for the task timeline and visualizations (#935)
* fixes

* comments

* fix test

* Update ui.py

* upd

* Fix linting.
2017-09-10 15:47:44 -07:00
Eric Liang
1ebfe9608f [rllib] Add downscale and frameskip options for Montezumas (#908)
* up

* update

* fix

* update

* update

* update

* api break

* Update run_multi_node_tests.sh

* fix
2017-09-02 17:20:56 -07:00
Stephanie Wang
7496c98010 Fault tolerance race (#894)
* Remove race between local scheduler disconnecting and global scheduler assigning a task

* Fix number of workers started in component failures test

* Fix race between global scheduler retrying a task assignment and monitor cleaning up task table. The global scheduler should only retry the task assignment if the local scheduler is still alive.

* Clean up task_table_update callback if failure

* Look up current local scheduler mapping when retrying actor task submission

* Log warning if no subscribers received a task table update

* Clean up database handle memory in local scheduler
2017-08-30 22:20:50 -07:00
Philipp Moritz
164a8f368e [rllib] Rename algorithms (#890)
* rename algorithms

* fix

* fix jenkins test

* fix documentation

* fix
2017-08-29 16:56:42 -07:00
Robert Nishihara
e1831792f8 For PPO, rename num_agents -> num_workers. (#882) 2017-08-28 23:11:06 -07:00
Philipp Moritz
791bee343f [rllib] Implement GAE for PPO (#849)
* make information available for GAE

* buggy version of GAE estimator

* fix

* add more logging and reweight losses

* fix logging

* fix loss

* adapt advantage calculation

* update gae

* standardize returns

* don't normalize td lambda ret

* fix

* don't standardize advantages

* do standardization earlier

* different standardization

* initializer

* drop into the debugger

* fix tensorflow broadcasting bug

* vf clipping

* don't standardize tdlambdaret

* different standardization

* use huber loss for value function

* refactor -- first half

* it runs

* fix

* update

* documentation

* linting and tests

* fix linting

* naming

* fix

* linting

* fix

* remove prefix madness

* fixes

* fix

* add value function example

* fix linting

* remove newline
2017-08-23 20:35:47 -07:00
Alexey Tumanov
fc885bd918 Adding basic support for a user-interpretable resource label (#761)
* adding support for the user-interpretable label(UIR)

* more plumbing for num_uirs further upstream; set to infty when specified on cmd line

* pass default num_uirs for actors; update GlobalStateAPI

* support num_uirs in ray.init()

* local scheduler resource accounting: support num_uirs; prep for vectorized resource accounting

* global scheduler test updated

* Fix bug introduced by rebase.

* Rename UIR -> CustomResource and add test.

* Small changes and use constexpr instead of macros.

* Linting and some renaming.

* Reorder some code.

* Remove cpus_in_use and fix bug.

* Add another test and make a small change.

* Rephrase documentation about feature stability.
2017-08-08 02:53:59 -07:00
Philipp Moritz
862e56000b [rllib] Unify RLLib examples and add jenkins test for policy gradients (#815)
* add jenkins test

* correct handling of the number of iterations

* convert policy gradient and evolution strategies script

* convert DQN

* fix A3C

* fix

* fix

* fixes

* remove redundant A3C example
2017-08-07 19:05:48 -07:00