Augusto Yao
0d90a17426
Pass cleanup argument to start_monitor. ( #1040 )
2017-09-30 15:35:25 -07:00
Wapaul1
97b3355adc
Register Class Only Creates Entry in Redis Once ( #1038 )
Don't export the same custom class definition multiple times.
2017-09-30 15:30:27 -07:00
Richard Liaw
16e82b43d1
[rllib] Changes for preprocessors ( #1033 )
* Changes for preprocessors
* removed comments
* Changes + push for lint
* linted
* adding dependency for travis
* linting won't pass
* reordering
* needed for testing
* added comments
* pip it
* pip dependencies
2017-09-30 13:11:20 -07:00
Alexey Tumanov
2d0f439b7b
hugepage + plasma directory support plumbing + documentation ( #1030 )
* hugepage + plasma directory support plumbing + documentation
* Indentation fix.
* huge_pages_enabled --> huge_pages
* One more change
2017-09-30 09:56:52 -07:00
Robert Nishihara
ce278aa06a
Fix valgrind tests. ( #1037 )
* Comment out local scheduler valgrind test.
* Fix free/delete error.
* More free -> delete errors
* One more free -> delete and also clean up callback state in plasma manager.
* Add set -x to run_valgrind scripts.
* Fix valgrind error in CreateLocalSchedulerInfoMessage.
2017-09-30 00:11:09 -07:00
Eric Liang
ba153adc4c
Downgrade severity of most common messages ( #1039 )
* downgrade severity of most common messages
* update
2017-09-30 00:01:49 -07:00
Eric Liang
b118cef49e
[webui] Allow timeline scroll-to-zoom without holding ALT ( #993 )
* Allow timeline scroll-to-zoom without holding ALT
* Update build_ui.sh
* Update build_ui.sh
* Update build_ui.sh
* Update build_ui.sh
* Retry when getting catapult.
2017-09-29 21:35:12 -07:00
Robert Nishihara
b991dc8900
Add flag for ignoring the UI, don't start UI in jenkins tests. ( #1021 )
2017-09-29 15:22:51 -07:00
Eric Liang
9f3a4fce50
[rllib] Parallelize sample collection and gradient computation in DQN ( #746 )
* wip
* works with cartpole
* lint
* fix pg
* comment
* action dist rename
* preprocessor
* fix test
* typo
* fix the action[0] nonsense
* revert
* satisfy the lint
* wip
* wip
* works with cartpole
* lint
* fix pg
* comment
* action dist rename
* preprocessor
* fix test
* typo
* fix the action[0] nonsense
* revert
* satisfy the lint
* Minor indentation changes.
* fix merge
* add humanoid
* initial dqn refactor
* remove tfutil
* fix calls
* fix tf errors 1
* closer
* runs now
* lint
* tensorboard graph
* fix linting
* more 4 space
* fix
* fix linT
* more lint
* oops
* es parity
* remove example.py
* fix training bug
* add cartpole demo
* try fixing cartpole
* allow model options, configure cartpole
* debug
* simplify
* no dueling
* avoid out of file handles
* Test dqn in jenkins.
* Minor formatting.
* lint
* fix py3
* fix issue
* remove checkpoint
* revert
* Fixit
* sanity check configs
* update cuda
* fix
* parallel gradient computation
* update
* upd
* bug
* upd
* always record training stats
* fix
* comments
* revert assert
* add gpu mask
* fofset
* a tie
* Merge
* fix
* fix
* fix examples
* A3C -> DQN
* fix dqn test
* remove submodule
* fix linting
2017-09-29 00:06:51 -07:00
Peter Schafhalter
10027974b1
Replaced ObjectWaitRequests with unordered map ( #990 )
* Replaced ObjectWaitRequests with unordered map
* Pass C++ STL object by reference
* Formatting changes and typos.
2017-09-28 15:29:26 -07:00
Eric Liang
19562f6ce5
[rllib] Fix issues with PPO model restoration ( #1018 )
* fix filter
* add test
* lint
* fix
* commit
* Update a3c.py
2017-09-28 13:12:06 -07:00
Zongheng Yang
427dee511b
Fill out specs of the task table in ray_redis_module.cc. ( #1024 )
* Fill out specs of the task table in ray_redis_module.cc.
* local scheduler field in task table
* linting
2017-09-27 23:45:58 -07:00
Peter Schafhalter
bb76d4ca0a
PlasmaRequestBuffer data structure updates ( #1023 )
* Replaced utstring with std::string
* Converted transfer_queue to a list
* Converted pending_object_transfers to unordered_map
* Fix free/delete bug and small modifications.
2017-09-27 19:50:37 -07:00
Philipp Moritz
b020e6bf1f
fix installation instructions ( #999 )
2017-09-27 13:48:23 -07:00
Robert Nishihara
116fe168b5
Download boost 1.65.1 from bintray. ( #1019 )
* Download boost 1.65.1 from bintray.
* Pass --no-check-certificate to wget.
2017-09-27 13:25:05 -07:00
Zongheng Yang
5a50e80b63
Make Monitor remove dead Redis entries from exiting drivers. ( #994 )
* WIP: removing OL, OI, TT on client exit; no saving yet.
* ray_redis_module.cc: update header comment.
* Cleanup: just the removal.
* Reformat via yapf: use pep8 style instead of google.
* Checkpoint addressing comments (partially)
* Add 'b' marker before strings (py3 compat)
* Add MonitorTest.
* Use `isort` to sort imports.
* Remove some loggings
* Fix flake8 noqa marker runtest.py
* Try to separate tests out to monitor_test.py
* Rework cleanup algorithm: correct logic
* Extend tests to cover multi-shard cases
* Add some small comments and formatting changes.
2017-09-26 00:11:38 -07:00
Peter Schafhalter
6e9657e696
Replaced utstring with std::string ( #1009 )
2017-09-24 22:42:17 -07:00
Wapaul1
c26c7553bc
Resnet Example Uses tf.Datasets now ( #960 )
Change Resnet example to use tf.Datasets instead of queues.
2017-09-20 14:14:04 -07:00
Eric Liang
5c70faf76b
Update common.py ( #996 )
2017-09-19 10:10:56 -07:00
gycn
a432285e77
Disable parallelization for Actors and ray.wait for debugging ( #961 )
Support actors and ray.wait in PYTHON_MODE.
2017-09-17 00:12:50 -07:00
Philipp Moritz
73f40bd844
[rllib] user defined preprocessor ( #985 )
* add register_preprocessor to ModelCatalog
* add pytest
* make staticmethod a classmethod
* update
* install gym on travis
* fix linting
* fix
2017-09-16 15:53:19 -07:00
Wapaul1
29ac95d87a
Web UI Documentation ( #983 )
* Initial Draft of Documentation
* Cleanup
* Fix line lengths and modify some text.
2017-09-16 15:41:52 -07:00
Eric Liang
98142ef51f
fix checkpoint ( #988 )
2017-09-16 15:29:36 -07:00
Peter Schafhalter
241612709e
Data structure updates to plasma manager ( #937 )
* Implemented local_available_objects as an unordered set
* Implemented fetch_requests as an unordered map
* Fixed bug and changed fetch_requests from pointer to object
* free(PlasmaManagerState *) -> delete PlasmaManagerState *
* removed unnecessary newline
* Make local_available_objects not a pointer.
* Attempt to safely iterate over unordered_map and remove elements.
2017-09-15 20:09:29 -07:00
Philipp Moritz
6601bb5f9e
[rllib] Make observation filter optional ( #940 )
* make observation filter optional
* fix linting
2017-09-14 17:37:19 -07:00
Robert Nishihara
413140df38
Autogenerate catapult files if they are not already present. ( #978 )
* Autogenerate catapult files if they are not already present.
* Fix bash syntax.
2017-09-14 12:37:33 -07:00
Richard Liaw
d516d9440e
Fixing local directory ( #977 )
* Fixing local directory
Enables ability to set custom local directory; code may be messy.
* Create all intermediate parent directories
2017-09-14 10:33:52 -07:00
Philipp Moritz
1eb8c83314
[rllib] Initial RLLib documentation ( #969 )
* initial documentation for RLLib
* more RL documentation
* fix linting
* fix comments
* update
* fix
2017-09-12 23:38:21 -07:00
ustcfriend
9ec3608eca
Fix resnet crash by setting config.gpu_options.allow_growth = True. ( #971 )
2017-09-12 22:36:06 -07:00
Eric Liang
9f42ef6a4f
[rllib] Make sure to always record stats like time elapsed, timesteps ( #965 )
* always record training stats
* fix
* comments
* revert assert
* nan
* fix
2017-09-12 14:28:16 -07:00
Stephanie Wang
74ac80631b
Local scheduler sends a null heartbeat to global scheduler ( #962 )
* Local scheduler sends a null heartbeat to global scheduler to notify death
* Add whitespace.
* Speed up component failures test
* Free local scheduler state upon plasma manager disconnection
2017-09-12 10:45:21 -07:00
Philipp Moritz
dd4e99b481
Fix ray website ( #963 )
* update instructions
* update blog
* fix images
* Remove outdated documentation.
2017-09-11 23:11:15 -07:00
Eric Liang
e17412a72b
fix free log std param ( #964 )
2017-09-11 18:52:48 -07:00
Stephanie Wang
99c8b1f38c
Actor fault tolerance using object lineage reconstruction ( #902 )
* Revert Python actor reconstruction
* Actor reconstruction using object lineage
* Add dummy arguments and return values for actor tasks
* Pin dummy outputs for actor tasks
* Skip checkpointing test for now
* TODOs
* minor edits
* Generate dummy object dependencies in Python, not C
* Fix linting.
* Move actor counter and dummy objects inside of the actor handle
* Refactor Worker._process_task, suppress exception propagation for sequential actor tasks
2017-09-10 19:29:28 -07:00
Eric Liang
d8aa826e63
[webui] Scalability fixes for the task timeline and visualizations ( #935 )
* fixes
* comments
* fix test
* Update ui.py
* upd
* Fix linting.
2017-09-10 15:47:44 -07:00
Robert Nishihara
f3c1248d98
Clone catapult and generate html files during installation. ( #956 )
* Clone catapult and generate static html during setup.
* Include UI files in installation.
* Fix directory to clone catapult to and fix linting.
* Use absolute path.
* Make sure we find a sufficiently new version of python2 when building wheels.
* Copy the trace_viewer_full.html file to the local directory if it is not present.
* Make sure wheels fail to build if UI is not included.
2017-09-10 13:41:16 -07:00
Philipp Moritz
546ba23ceb
Upgrade to latest arrow to include set serialization speedups ( #957 )
* update arrow to pull in the set serialization speedups
* remove _register_class for set
2017-09-10 00:12:17 -07:00
Robert Nishihara
d6612a93a2
Add mailing list to README and documentation. ( #950 )
* Add mailing list to documentation.
* Add contact page to documentation.
2017-09-09 10:21:51 -07:00
Peter Schafhalter
8906a920f7
Implemented wait_requests as vector ( #943 )
2017-09-08 13:39:54 -07:00
Eric Liang
953878364e
[webui] Print out timeline link for full-screen trace viewing ( #936 )
* up
* update
2017-09-06 01:41:21 -07:00
Wapaul1
e19e2c6284
Print jupyter notebook token when starting web UI. ( #887 )
* User now only needs to copy url to get to notebook
* Fixed duplicate code
* Added function to print url
* Added exception for calling function on worker
* Stored webui url in Redis
* Fix linting and simplify code.
* Now uses 24 bytes hex token
* Fixed python 3 compatibility
* Fix linting and python 3 compat
* Added comment explaining generating the token.
* Removed newline
* Small fixes.
* Fixed jenkins failure
* Rebased and changed formatting
* Revert "changed formatting"
This reverts commit 226510cf0cdcaab9cf42ad30bd9588a963683592.
2017-09-05 23:31:44 -07:00
Robert Nishihara
853969225b
Sleep longer when starting plasma manager in valgrind case to catch errors where port bind fails. ( #934 )
2017-09-05 20:58:12 -07:00
Philipp Moritz
7030ef366f
Rebase Ray on latest arrow (remove numbuf from Ray). ( #910 )
* remove some stuff
* put get roundtrip working
* fixes
* more fixes
* cleanup
* fix tests
* latest arrow
* fixes
* fix tests
* fix linting
* rebase
* fixes
* fix bug
* bring back libgcc error
* fix linting
* use official arrow repo
* fixes
2017-09-04 22:58:49 -07:00
Eric Liang
a2814567e1
[webui] Quick fix to timeline on task failure ( #930 )
* foo
* update
* Move _add_missing_timestamps to task_profiles function.
2017-09-04 22:58:19 -07:00
Eric Liang
63d8d11714
[webui] Checkboxes should go to the left of their labels ( #932 )
2017-09-04 17:05:13 -07:00
Robert Nishihara
d8010723d7
Attempt to wget boost up to 20 times during installation. ( #927 )
2017-09-04 14:42:29 -07:00
Robert Nishihara
d5eec0c2cd
Pin opencv-python version to 3.2.0.8 in dockerfile. ( #926 )
2017-09-03 23:51:59 -07:00
Robert Nishihara
8ed03b1cf0
Make task timeline work with ipywidgets==7.0.0, change slider default values. ( #925 )
* Make task timeline work with ipywidgets==7.0.0.
* Change initial UI slider values from 70-100 to 0-100.
2017-09-03 23:15:46 -07:00
Stephanie Wang
ae0212b399
Fix failing task table test ( #924 )
2017-09-03 22:41:38 -07:00
Peter Schafhalter
2c19ae97a3
Implemented db_client_cache as unordered_map ( #921 )
* Implemented db_client_cache as unordered_map
* Fix for memory leak
* Fixed linting
2017-09-03 17:26:05 -07:00