Michael Luo
|
e5dded917c
|
SAC site changes (#6759)
|
2020-01-09 18:13:42 -08:00 |
|
Eric Liang
|
69c5a2bc3c
|
Warn if OMP_NUM_THREADS is set (#6729)
|
2020-01-08 14:59:07 -08:00 |
|
Frithjof
|
872a3522aa
|
Add machinable to list of projects using Tune (#6737)
|
2020-01-07 15:10:17 -08:00 |
|
Eric Liang
|
a6c8c342b7
|
Better document guarantees provided by par iter API (#6726)
* update
* Update doc/source/iter.rst
* Update doc/source/iter.rst
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
|
2020-01-07 14:41:50 -08:00 |
|
Robert Nishihara
|
5e43b25e8c
|
Document fault tolerance behavior. (#6698)
|
2020-01-06 22:34:06 -08:00 |
|
Simon Mo
|
6285851743
|
Add sphinx copy button (#6694)
* Add sphinx copy button
* Update requirements-doc.txt
Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>
|
2020-01-04 19:31:49 -06:00 |
|
Ujval Misra
|
ca651af1d7
|
[tune] Async restores and S3/GCP-capable trial FT (#6376)
* Initial commit for asynchronous save/restore
* Set stage for cloud checkpointable trainable.
* Refactor log_sync and sync_client.
* Add durable trainable impl.
* Support delete in cmd based client
* Fix some tests and such
* Cleanup, comments.
* Use upload_dir instead.
* Revert files belonging to other PR in split.
* Pass upload_dir into trainable init.
* Pickle checkpoint at driver, more robust checkpoint_dir discovery.
* Cleanup trainable helper functions, fix tests.
* Addressed comments.
* Fix bugs from cluster testing, add parameterized cluster tests.
* Add trainable util test
* package_ref
* pbt_address
* Fix bug after running pbt example (_save returning dir).
* get cluster tests running, other bug fixes.
* raise_errors
* Fix deleter bug, add durable trainable example.
* Fix cluster test bugs.
* filelock
* save/restore bug fixes
* .
* Working cluster tests.
* Lint, revert to tracking memory checkpoints.
* Documentation, cleanup
* fixinitialsync
* fix_one_test
* Fix cluster test bug
* nit
* lint
* Revert tune md change
* Fix basename bug for directories.
* lint
* fix_tests
* nit_fix
* Add __init__ file.
* Move to utils package
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-01-02 20:40:53 -08:00 |
|
Harrison Feng
|
57061a15cf
|
[docs] configure.rst with --num-cpus (#6678)
--num-cpus -> --num-gpus
Signed-off-by: Harrison Feng <feng.harrison@gmail.com>
|
2020-01-02 20:33:41 -08:00 |
|
Eric Liang
|
895f2727fb
|
Add experimental parallel iterators API (#6644)
|
2020-01-02 13:45:26 -08:00 |
|
Michael Luo
|
1cb335487e
|
SAC for Mujoco Environments (#6642)
|
2019-12-31 00:16:54 -08:00 |
|
Edward Oakes
|
2a66529fb7
|
Add multiprocessing.Pool API (#6194)
|
2019-12-29 21:40:58 -06:00 |
|
Eric Liang
|
677004ee3d
|
Add 'ray stat' command for debugging (#6622)
* wip
* wip
* wip
* iterate
* move
* fix thread safety
|
2019-12-28 14:40:32 -08:00 |
|
Robert Nishihara
|
96f2f8ff10
|
Stop testing Python 2.7 and building Python 2.7 wheels. (#6601)
|
2019-12-27 20:47:49 -08:00 |
|
Robert Nishihara
|
8724e5ffd5
|
Start WebUI by default. (#6493)
|
2019-12-27 13:49:07 -08:00 |
|
zhu-eric
|
65297e65f0
|
Experimental Actor Pool (#6055)
* mod_table
* Example fix for gallery
* lint
* nit
* nit
* fix
* gallery
* remove table for now
* training, object store, tune, actors, advanced
* start tf code
* first cut tf
* yapf
* pytorch
* add torch example
* torch
* parallel
* tune
* tuning
* reviewsready
* finetune
* fix
* move_code
* update conf
* compile
* init hyperparameter
* Start images
* overview
* extra
* fix
* works
* update-ps-example
* param_actor
* fix
* examples
* simple
* simplify_pong
* flake8 and run hyperopt
* add comments
* add comments
* add suggestion
* add suggestion
* suggestions
* add suggestion
* add suggestions
* fixed in wrong area
* last edit
* finish changes
* add line
* format
* reset
* tests and docs
* fix tests
* bazelify
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2019-12-26 14:35:10 -08:00 |
|
micafan
|
b98b288ffd
|
[GCS] Change GCS Test to cc_test (#6596)
|
2019-12-26 14:34:35 +08:00 |
|
Yuhao Yang
|
df4533c649
|
[tune] demo exporting trained models in pbt examples (#6533)
|
2019-12-26 02:14:49 +01:00 |
|
Edward Oakes
|
6b1a57542e
|
Add actor.__ray_kill__() to terminate actors immediately (#6523)
|
2019-12-23 23:12:57 -06:00 |
|
Edward Oakes
|
1b14fbe179
|
Hotfix ray download links in documentation (#6572)
|
2019-12-21 23:00:29 +01:00 |
|
Rong Rong
|
3af5fe60e7
|
modify developer tips page to add PR and CI tips (#6530)
|
2019-12-18 16:20:05 -08:00 |
|
Simon Mo
|
e530c37b0e
|
Use localhost and set redis password by default (#6481)
|
2019-12-17 19:41:19 -08:00 |
|
Eric Liang
|
6725a61bda
|
Release 0.8.0 test logs (#6512)
|
2019-12-17 15:56:50 -08:00 |
|
Eric Liang
|
1a1324d2a2
|
Bump version from 0.8.0.dev6 -> 0.9.0.dev (#6508)
|
2019-12-16 23:57:42 -08:00 |
|
Edward Oakes
|
8636d67b72
|
Improve release docs and add results from 0.7.7 (#6506)
* Improve docs, add logs
* add logs
* microbenchmark
* lint
|
2019-12-16 15:51:39 -08:00 |
|
Richard Liaw
|
5719a05757
|
[sgd] Add support for multi-model multi-optimizer training (#6317)
|
2019-12-15 15:19:45 -08:00 |
|
Philipp Moritz
|
f5d10eea0b
|
[Projects] Refactor cluster specification (#6488)
|
2019-12-14 22:43:06 -08:00 |
|
Yuhao Yang
|
ad4da17899
|
[Tune] Add example and tutorial for DCGAN (#6400)
|
2019-12-13 14:15:44 -08:00 |
|
Richard Liaw
|
4ff6ca89f4
|
[docs] slight doc modifications (#6466)
|
2019-12-13 10:38:17 -08:00 |
|
Maltimore
|
0ec613c95a
|
[rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj (#6438)
|
2019-12-11 15:00:53 -08:00 |
|
Dean Wampler
|
abb4fb3f8e
|
Added small section on installation when using Anaconda. (#6427)
|
2019-12-11 10:23:41 -08:00 |
|
Ameer Haj Ali
|
1a9948eef9
|
Update rllib-examples.rst (#6396)
|
2019-12-08 16:21:50 -08:00 |
|
Victor Le
|
4e24c805ee
|
AlphaZero and Ranked reward implementation (#6385)
|
2019-12-07 12:08:40 -08:00 |
|
Yuhao Yang
|
c327ae152f
|
[doc] Update the test command in getting-involved. (#6347)
|
2019-12-07 11:03:52 -08:00 |
|
Edward Oakes
|
f63b64310a
|
Bump version to 0.8.0.dev7 (#6303)
|
2019-12-05 18:33:54 -08:00 |
|
Philipp Moritz
|
dd27bfbb75
|
Rename .rayproject to ray-project (#6278)
|
2019-12-05 16:15:42 -08:00 |
|
Eric Liang
|
bc5e259264
|
[rllib] Add a doc section on computing actions (#6326)
* options doc
* add note
* hint shr
* doc update
|
2019-12-03 00:10:50 -08:00 |
|
Shital Shah
|
670cb6374e
|
Doc enhancement: use build.sh for ray, clarification on how rllib selects VisionNetwork, note on setup-dev.py for rllib. (#6092)
|
2019-12-02 22:19:01 -08:00 |
|
Richard Liaw
|
0b3d5d989b
|
[docs] Add public materials (#6331)
* startup
* update tune readme
* usingrah
|
2019-12-02 19:59:23 -08:00 |
|
Eric Liang
|
0b0a16982a
|
[doc] Use .options() (#6323)
* options doc
* add note
* hint shr
|
2019-12-01 17:24:00 -08:00 |
|
Richard Liaw
|
d3c7a8fda5
|
[docs] yarn update (#6173)
|
2019-11-19 16:15:08 -08:00 |
|
Yuhao Yang
|
d3ff2252c4
|
[doc] Fix link to getting involved
|
2019-11-18 12:59:14 -08:00 |
|
Eric Liang
|
8fc2272f43
|
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181)
|
2019-11-18 10:39:07 -08:00 |
|
Ujval Misra
|
2965dc1b72
|
[tune] Fault tolerance improvements (#5877)
* Precede ray.get with ray.wait.
* Trigger checkpoint deletes locally in Trainable
* Clean-up code.
* Minor changes.
* Track best checkpoint so far again
* Pulled checkpoint GC out of Trainable.
* Added comments, error logging.
* Immediate pull after checkpoint taken; rsync source delete on pull
* Minor doc fixes
* Fix checkpoint manager bug
* Fix bugs, tests, formatting
* Fix bugs, feature flag for force sync.
* Fix test.
* Fix minor bugs: clear proc and less verbose sync_on_checkpoint warnings.
* Fix bug: update IP of last_result.
* Fixed message.
* Added a lot of logging.
* Changes to ray trial executor.
* More bug fixes (logging after failure), better logging.
* Fix richards bug and logging
* Add comments.
* try-except
* Fix heapq bug.
* .
* Move handling of no available trials to ray_trial_executor (#1)
* Fix formatting bug, lint.
* Addressed Richard's comments
* Revert tests.
* fix rebase
* Fix trial location reporting.
* Fix test
* Fix lint
* Rebase, use ray.get w/ timeout, lint.
* lint
* fix rebase
* Address richard's comments
|
2019-11-18 01:14:41 -08:00 |
|
Richard Liaw
|
62cbc043b4
|
[tune] tbx logger (#6133)
* tbx
* add_hparams
* fix_hparams
* ok
* ok
* fix
* ok
* fix
|
2019-11-15 08:45:44 -08:00 |
|
Edward Oakes
|
385783fcec
|
Ray on YARN + Skein Documentation (#6119)
|
2019-11-14 15:06:05 -08:00 |
|
Eric Liang
|
243b1b7281
|
[rllib] Add microbatch optimizer with A2C example (#6161)
|
2019-11-14 12:14:00 -08:00 |
|
Ujval Misra
|
e3e3ad4b25
|
Add timeout param to ray.get (#6107)
|
2019-11-14 00:50:04 -08:00 |
|
Eric Liang
|
e4565c9cc6
|
Reduce RLlib log verbosity (#6154)
|
2019-11-13 18:50:45 -08:00 |
|
Edward Oakes
|
5780ec1b62
|
Refresh ObjectIDs in raylet for stopgap GC (#6109)
|
2019-11-10 23:12:59 -08:00 |
|
Adam Gleave
|
c157e93ba1
|
[tune] Retry failed tasks with checkpointing disabled (#6126)
* Allow recovery for failed tasks without checkpointing
* Update docs
|
2019-11-09 19:35:27 -08:00 |
|