micafan
970cd78701
[GCS] refactor the GCS Client Dynamic Resource Interface ( #6266 )
2020-01-03 14:07:37 +08:00
Ujval Misra
ca651af1d7
[tune] Async restores and S3/GCP-capable trial FT ( #6376 )
...
* Initial commit for asynchronous save/restore
* Set stage for cloud checkpointable trainable.
* Refactor log_sync and sync_client.
* Add durable trainable impl.
* Support delete in cmd based client
* Fix some tests and such
* Cleanup, comments.
* Use upload_dir instead.
* Revert files belonging to other PR in split.
* Pass upload_dir into trainable init.
* Pickle checkpoint at driver, more robust checkpoint_dir discovery.
* Cleanup trainable helper functions, fix tests.
* Addressed comments.
* Fix bugs from cluster testing, add parameterized cluster tests.
* Add trainable util test
* package_ref
* pbt_address
* Fix bug after running pbt example (_save returning dir).
* get cluster tests running, other bug fixes.
* raise_errors
* Fix deleter bug, add durable trainable example.
* Fix cluster test bugs.
* filelock
* save/restore bug fixes
* .
* Working cluster tests.
* Lint, revert to tracking memory checkpoints.
* Documentation, cleanup
* fixinitialsync
* fix_one_test
* Fix cluster test bug
* nit
* lint
* Revert tune md change
* Fix basename bug for directories.
* lint
* fix_tests
* nit_fix
* Add __init__ file.
* Move to utils package
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-01-02 20:40:53 -08:00
Harrison Feng
57061a15cf
[docs] configure.rst with --num-cpus ( #6678 )
...
--num-cpus -> --num-gpus
Signed-off-by: Harrison Feng <feng.harrison@gmail.com>
2020-01-02 20:33:41 -08:00
Robert Nishihara
92e44a5dc8
Deprecate redis_address argument in favor of address. ( #6654 )
2020-01-02 20:18:34 -08:00
Jing Ge
d39e76f2ce
rename interface and class for task assigner based on suitable pattern. ( #6664 )
2020-01-03 11:13:36 +08:00
Simon Mo
9fe90cdafc
Fix async actor recursion limitation ( #6672 )
...
* Do not start threadpool when using async
* Turn function_executor into a generator
* Add new test for high concurrency and bump the default
* Set direct call
2020-01-02 19:45:13 -06:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. ( #6658 )
2020-01-02 17:42:13 -08:00
Sven
f1b56fa5ee
PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). ( #6650 )
...
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).
* Fix LINT line-len errors.
* Fix LINT errors.
* Fix `tf_pg_policy` imports (formerly: `pg_policy`).
* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).
* Move PG test into agents/pg/tests directory.
All test cases will be located near the classes that are tested and
then built into the Bazel/Travis test suite.
* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.
* Fix remaining import errors for agents/pg/...
* Fix circular dependency in pg imports.
* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
Robert Nishihara
d206445caf
Use Travis deploy v2. ( #6674 )
2020-01-02 16:00:51 -08:00
Yunzhi Zhang
8a0a30b5f0
[Dashboard] display actor status and infeasible tasks ( #6652 )
...
* expose actor status and protobuf message of infeasible tasks
* move infeasible tasks into actor tree
* add pytest for displaying infeasible tasks info
* fix base64 decoding
* fix race condition after #6629 merged
2020-01-02 14:27:59 -08:00
Eric Liang
895f2727fb
Add experimental parallel iterators API ( #6644 )
2020-01-02 13:45:26 -08:00
Ion
3dddbef6d9
Release cpu blocked ( #6611 )
2020-01-02 13:43:25 -08:00
chenk008
3a2a4335b6
Ray operator go.mod file ( #6660 )
...
* change .gitignore for go.mod
* change gitignore and add go.mod for ray-operator
2020-01-02 11:55:16 -06:00
fangfengbin
a13781d70e
Add actor checkpoint methods to gcs server actor info handler ( #6663 )
2020-01-02 19:31:54 +08:00
micafan
a7e9d63979
[GCS] Add actor checkpoint related methods to accessor ( #6605 )
2020-01-02 12:36:52 +08:00
fangfengbin
255aa0796a
Add heartbeat methods to gcs server node info handler ( #6647 )
2020-01-02 12:36:23 +08:00
Robert Nishihara
9baa002069
Remove deprecated global state. ( #6655 )
2019-12-31 22:40:47 -08:00
chenk008
4150d444a1
ray-operator support bazel build ( #6639 )
...
* support bazel build
* add bazel gazelle script in README
2019-12-31 22:28:51 -08:00
Zhijun Fu
91a98d2295
[rpc] refactor GRPC client ( #6637 )
...
* refactor RPC client
* remove unused code
* format
* fix
* resolve comments
* format
* update
* fix
* fix python pb build failure
* lint
2019-12-31 22:28:25 -08:00
mehrdadn
f4b29dae9c
Perform Bazel install directly in Windows CI ( #6653 )
2019-12-31 20:48:08 -08:00
Robert Nishihara
480206eef8
Remove some Python 2 compatibility code. ( #6624 )
2019-12-31 17:14:58 -08:00
Philipp Moritz
ecddaafd94
Add actor table to global state API ( #6629 )
2019-12-31 15:11:59 -08:00
mehrdadn
a4d64de39a
Perform LLVM install directly inside Windows CI ( #6588 )
...
* Perform LLVM install directly inside Windows CI
* Pin the LLVM download version
Co-authored-by: GitHub Web Flow <noreply@github.com>
2019-12-31 13:23:19 -08:00
Robert Nishihara
d2c6457832
Remove public facing references to --redis-address. ( #6631 )
2019-12-31 13:21:53 -08:00
Michael Luo
1cb335487e
SAC for Mujoco Environments ( #6642 )
2019-12-31 00:16:54 -08:00
micafan
cdc1ce4ebf
[GCS] Add heartbeat methods to NodeInfoAccessor ( #6604 )
2019-12-31 14:17:35 +08:00
Yunzhi Zhang
65acb54553
[Dashboard] Logical view backend for dashboard ( #6590 )
2019-12-30 13:08:08 -08:00
Sven
8b16847c02
Get utils ready for better Agent torch support. ( #6561 )
2019-12-30 12:27:32 -08:00
Philipp Moritz
735f282494
Use 0.9.0.dev0 as the version tag ( #6630 )
2019-12-30 10:14:07 -08:00
Richard Liaw
646643a588
[doc] remove redundant PS example ( #6634 )
2019-12-29 20:54:42 -08:00
Edward Oakes
2a66529fb7
Add multiprocessing.Pool API ( #6194 )
2019-12-29 21:40:58 -06:00
Eric Liang
e2bc489a18
Port webui nits from original pr that enables it ( #6628 )
...
* backport changes
* Update test_webui.py
2019-12-29 19:19:43 -08:00
Mitchell Stern
3e0f07468f
Make JSON schema for projects more explicit ( #6550 )
2019-12-29 16:41:53 -08:00
Qstar
10338fde0c
Ray operator: controller code and guide to use ( #6501 )
2019-12-29 10:14:47 -06:00
Eric Liang
7c1e0e5715
Implement wait_local for wait ( #6524 )
2019-12-28 17:40:49 -08:00
Eric Liang
677004ee3d
Add 'ray stat' command for debugging ( #6622 )
...
* wip
* wip
* wip
* iterate
* move
* fix thread safety
2019-12-28 14:40:32 -08:00
Robert Nishihara
92db13023c
Fix unused variable compilation error. ( #6625 )
2019-12-28 12:50:14 -08:00
Eric Liang
022954ac09
[rllib] Tuple action dist tensors not reduced properly in eager mode ( #6615 )
2019-12-28 09:51:09 -08:00
fangfengbin
8a51efebfb
Add gcs server object info handler ( #6621 )
2019-12-28 22:44:27 +08:00
Robert Nishihara
ff82613b66
Fix test_actor.py test_kill. ( #6623 )
2019-12-27 22:39:17 -08:00
alindkhare
a76fadb899
[Serve] Adding BackendConfig ( #6541 )
2019-12-27 23:34:50 -06:00
Robert Nishihara
96f2f8ff10
Stop testing Python 2.7 and building Python 2.7 wheels. ( #6601 )
2019-12-27 20:47:49 -08:00
Robert Nishihara
8724e5ffd5
Start WebUI by default. ( #6493 )
2019-12-27 13:49:07 -08:00
Zhijun Fu
088ce2d1e1
Fix hang on actor creation task failure ( #6617 )
2019-12-27 10:48:17 -08:00
micafan
a492333f4e
[GCS] refactor the GCS Client Object Interface ( #5695 )
2019-12-27 15:18:54 +08:00
fangfengbin
3814b6d5f3
Add gcs server node info handler ( #6595 )
2019-12-27 15:08:38 +08:00
Eric Liang
3af84ada47
Revert "[rllib] remove exists call ( #6168 )" ( #6616 )
...
This reverts commit a68cda0a33.
2019-12-26 22:44:26 -08:00
Eric Liang
46acb02aa4
Fix verbose shutdown error and test_env_with_subprocesses ( #6614 )
2019-12-26 22:43:39 -08:00
Robert Nishihara
eb0813ea35
Re-enable UI tests for wheels. ( #6602 )
2019-12-26 22:34:56 -08:00
Eric Liang
d3db9e9c1e
By default, reconstruction should only be enabled for actor creation. ( #6613 )
...
* wip
* fix
* fix
2019-12-26 19:57:50 -08:00