Ujval Misra
ca651af1d7
[tune] Async restores and S3/GCP-capable trial FT ( #6376 )
...
* Initial commit for asynchronous save/restore
* Set stage for cloud checkpointable trainable.
* Refactor log_sync and sync_client.
* Add durable trainable impl.
* Support delete in cmd based client
* Fix some tests and such
* Cleanup, comments.
* Use upload_dir instead.
* Revert files belonging to other PR in split.
* Pass upload_dir into trainable init.
* Pickle checkpoint at driver, more robust checkpoint_dir discovery.
* Cleanup trainable helper functions, fix tests.
* Addressed comments.
* Fix bugs from cluster testing, add parameterized cluster tests.
* Add trainable util test
* package_ref
* pbt_address
* Fix bug after running pbt example (_save returning dir).
* get cluster tests running, other bug fixes.
* raise_errors
* Fix deleter bug, add durable trainable example.
* Fix cluster test bugs.
* filelock
* save/restore bug fixes
* .
* Working cluster tests.
* Lint, revert to tracking memory checkpoints.
* Documentation, cleanup
* fixinitialsync
* fix_one_test
* Fix cluster test bug
* nit
* lint
* Revert tune md change
* Fix basename bug for directories.
* lint
* fix_tests
* nit_fix
* Add __init__ file.
* Move to utils package
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-01-02 20:40:53 -08:00
Harrison Feng
57061a15cf
[docs] configure.rst with --num-cpus ( #6678 )
...
--num-cpus -> --num-gpus
Signed-off-by: Harrison Feng <feng.harrison@gmail.com>
2020-01-02 20:33:41 -08:00
Eric Liang
895f2727fb
Add experimental parallel iterators API ( #6644 )
2020-01-02 13:45:26 -08:00
Robert Nishihara
d2c6457832
Remove public facing references to --redis-address. ( #6631 )
2019-12-31 13:21:53 -08:00
Michael Luo
1cb335487e
SAC for Mujoco Environments ( #6642 )
2019-12-31 00:16:54 -08:00
Philipp Moritz
735f282494
Use 0.9.0.dev0 as the version tag ( #6630 )
2019-12-30 10:14:07 -08:00
Richard Liaw
646643a588
[doc] remove redundant PS example ( #6634 )
2019-12-29 20:54:42 -08:00
Edward Oakes
2a66529fb7
Add multiprocessing.Pool API ( #6194 )
2019-12-29 21:40:58 -06:00
Eric Liang
677004ee3d
Add 'ray stat' command for debugging ( #6622 )
...
* wip
* wip
* wip
* iterate
* move
* fix thread safety
2019-12-28 14:40:32 -08:00
Robert Nishihara
96f2f8ff10
Stop testing Python 2.7 and building Python 2.7 wheels. ( #6601 )
2019-12-27 20:47:49 -08:00
Robert Nishihara
8724e5ffd5
Start WebUI by default. ( #6493 )
2019-12-27 13:49:07 -08:00
zhu-eric
65297e65f0
Experimental Actor Pool ( #6055 )
...
* mod_table
* Example fix for gallery
* lint
* nit
* nit
* fix
* gallery
* remove table for now
* training, object store, tune, actors, advanced
* start tf code
* first cut tf
* yapf
* pytorch
* add torch example
* torch
* parallel
* tune
* tuning
* reviewsready
* finetune
* fix
* move_code
* update conf
* compile
* init hyperparameter
* Start images
* overview
* extra
* fix
* works
* update-ps-example
* param_actor
* fix
* examples
* simple
* simplify_pong
* flake8 and run hyperopt
* add comments
* add comments
* add suggestion
* add suggestion
* suggestions
* add suggestion
* add suggestions
* fixed in wrong area
* last edit
* finish changes
* add line
* format
* reset
* tests and docs
* fix tests
* bazelify
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2019-12-26 14:35:10 -08:00
micafan
b98b288ffd
[GCS] Change GCS Test to cc_test ( #6596 )
2019-12-26 14:34:35 +08:00
Yuhao Yang
df4533c649
[tune] demo exporting trained models in pbt examples ( #6533 )
2019-12-26 02:14:49 +01:00
Edward Oakes
6b1a57542e
Add actor.__ray_kill__() to terminate actors immediately ( #6523 )
2019-12-23 23:12:57 -06:00
Edward Oakes
1b14fbe179
Hotfix ray download links in documentation ( #6572 )
2019-12-21 23:00:29 +01:00
Rong Rong
3af5fe60e7
modify developer tips page to add PR and CI tips ( #6530 )
2019-12-18 16:20:05 -08:00
Simon Mo
e530c37b0e
Use localhost and set redis password by default ( #6481 )
2019-12-17 19:41:19 -08:00
Eric Liang
6725a61bda
Release 0.8.0 test logs ( #6512 )
2019-12-17 15:56:50 -08:00
Eric Liang
2530eb90dc
Move tf.test.is_gpu_available() to after session init ( #6515 )
...
* move to after session init
* script fixes
2019-12-17 14:55:39 -08:00
Eric Liang
1a1324d2a2
Bump version from 0.8.0.dev6 -> 0.9.0.dev ( #6508 )
2019-12-16 23:57:42 -08:00
Edward Oakes
8636d67b72
Improve release docs and add results from 0.7.7 ( #6506 )
...
* Improve docs, add logs
* add logs
* microbenchmark
* lint
2019-12-16 15:51:39 -08:00
Richard Liaw
5719a05757
[sgd] Add support for multi-model multi-optimizer training ( #6317 )
2019-12-15 15:19:45 -08:00
Philipp Moritz
f5d10eea0b
[Projects] Refactor cluster specification ( #6488 )
2019-12-14 22:43:06 -08:00
Yuhao Yang
ad4da17899
[Tune] Add example and tutorial for DCGAN ( #6400 )
2019-12-13 14:15:44 -08:00
Richard Liaw
4ff6ca89f4
[docs] slight doc modifications ( #6466 )
2019-12-13 10:38:17 -08:00
alindkhare
76e678d775
[Serve] Added deadline awareness ( #6442 )
...
* [Serve] Added deadline awareness
Added deadline awareness while enqueuing a query
Using Blist sorted-list implementation (ascending order) to get queries according to their specified deadlines. [buffer_queues]
Exposed slo_ms via handle/http request
Added slo example
The queries in example will be executed in almost the opposite order of which they are fired
Added slo pytest
Added check for slo_ms to not be negative
Included the changes suggested
* Linting Corrections
* Adding the code changes suggested by format.sh
* Added the suggested changes
Added justification for blist
Added blist in travis/ci/install-dependencies.sh
* Fixed linting issues
* Added blist to ray/doc/requirements-doc.txt
2019-12-11 16:41:54 -08:00
Maltimore
0ec613c95a
[rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj ( #6438 )
2019-12-11 15:00:53 -08:00
Dean Wampler
abb4fb3f8e
Added small section on installation when using Anaconda. ( #6427 )
2019-12-11 10:23:41 -08:00
Ameer Haj Ali
1a9948eef9
Update rllib-examples.rst ( #6396 )
2019-12-08 16:21:50 -08:00
Dean Wampler
65694cdc4c
[bug] Attempt to fix links not working. ( #6390 )
2019-12-07 14:31:50 -08:00
Victor Le
4e24c805ee
AlphaZero and Ranked reward implementation ( #6385 )
2019-12-07 12:08:40 -08:00
Yuhao Yang
c327ae152f
[doc] Update the test command in getting-involved. ( #6347 )
2019-12-07 11:03:52 -08:00
Dean Wampler
53d62d3eec
Expanded with new pages for getting started, etc. Blog links unchanged. ( #6388 )
2019-12-06 15:18:47 -08:00
Edward Oakes
f63b64310a
Bump version to 0.8.0.dev7 ( #6303 )
2019-12-05 18:33:54 -08:00
Philipp Moritz
dd27bfbb75
Rename .rayproject to ray-project ( #6278 )
2019-12-05 16:15:42 -08:00
Eric Liang
bc5e259264
[rllib] Add a doc section on computing actions ( #6326 )
...
* options doc
* add note
* hint shr
* doc update
2019-12-03 00:10:50 -08:00
Shital Shah
670cb6374e
Doc enhancement: use build.sh for ray, clarification on how rllib selects VisionNetwork, note on setup-dev.py for rllib. ( #6092 )
2019-12-02 22:19:01 -08:00
Richard Liaw
0b3d5d989b
[docs] Add public materials ( #6331 )
...
* startup
* update tune readme
* usingrah
2019-12-02 19:59:23 -08:00
Eric Liang
0b0a16982a
[doc] Use .options() ( #6323 )
...
* options doc
* add note
* hint shr
2019-12-01 17:24:00 -08:00
Philipp Moritz
a4437813eb
[Projects] Unify hyphen vs underscore handling for arguments ( #6208 )
2019-11-20 23:52:41 -08:00
Richard Liaw
d3c7a8fda5
[docs] yarn update ( #6173 )
2019-11-19 16:15:08 -08:00
Yuhao Yang
d3ff2252c4
[doc] Fix link to getting involved
2019-11-18 12:59:14 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO ( #6181 )
2019-11-18 10:39:07 -08:00
Ujval Misra
2965dc1b72
[tune] Fault tolerance improvements ( #5877 )
...
* Precede ray.get with ray.wait.
* Trigger checkpoint deletes locally in Trainable
* Clean-up code.
* Minor changes.
* Track best checkpoint so far again
* Pulled checkpoint GC out of Trainable.
* Added comments, error logging.
* Immediate pull after checkpoint taken; rsync source delete on pull
* Minor doc fixes
* Fix checkpoint manager bug
* Fix bugs, tests, formatting
* Fix bugs, feature flag for force sync.
* Fix test.
* Fix minor bugs: clear proc and less verbose sync_on_checkpoint warnings.
* Fix bug: update IP of last_result.
* Fixed message.
* Added a lot of logging.
* Changes to ray trial executor.
* More bug fixes (logging after failure), better logging.
* Fix richards bug and logging
* Add comments.
* try-except
* Fix heapq bug.
* .
* Move handling of no available trials to ray_trial_executor (#1 )
* Fix formatting bug, lint.
* Addressed Richard's comments
* Revert tests.
* fix rebase
* Fix trial location reporting.
* Fix test
* Fix lint
* Rebase, use ray.get w/ timeout, lint.
* lint
* fix rebase
* Address richard's comments
2019-11-18 01:14:41 -08:00
Richard Liaw
62cbc043b4
[tune] tbx logger ( #6133 )
...
* tbx
* add_hparams
* fix_hparams
* ok
* ok
* fix
* ok
* fix
2019-11-15 08:45:44 -08:00
Edward Oakes
385783fcec
Ray on YARN + Skein Documentation ( #6119 )
2019-11-14 15:06:05 -08:00
Eric Liang
243b1b7281
[rllib] Add microbatch optimizer with A2C example ( #6161 )
2019-11-14 12:14:00 -08:00
Ujval Misra
e3e3ad4b25
Add timeout param to ray.get ( #6107 )
2019-11-14 00:50:04 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity ( #6154 )
2019-11-13 18:50:45 -08:00