3818 commits

Author SHA1 Message Date
Michael Luo
e5dded917c SAC site changes (#6759) 2020-01-09 18:13:42 -08:00
chenk008
f69081242e Ray operator travis (#6731) 2020-01-09 16:16:08 -06:00
Sven
60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
fangfengbin
ca454c5c1b Add task reconstruction function to task info handler (#6711) 2020-01-09 15:37:42 +08:00
Yunzhi Zhang
3673835f30 Fix spurious warning message when submitting many tasks (#6752) 2020-01-08 22:52:46 -08:00
micafan
1211e6a1fc [GCS] Add async register nodes to GCS Client (#6742) 2020-01-09 10:51:22 +08:00
Eric Liang
69c5a2bc3c Warn if OMP_NUM_THREADS is set (#6729) 2020-01-08 14:59:07 -08:00
Eric Liang
a745886242 Disable HTTP proxy for gRPC connections (#6744)
* disable http proxy for grpc

* add test
2020-01-08 09:23:22 -08:00
micafan
0a5d0109a4 add actor table data creation method to pb_util.h (#6746) 2020-01-08 22:39:17 +08:00
chaokunyang
70c7d47c09 [Streaming] java cross lang streaming graph (#6689) 2020-01-08 17:32:35 +08:00
micafan
91a3fa0157 [GCS] access task reconstruction in TaskInfoAccessor (#6688)
* add task lease interface to TaskInfoAccessor

* impl of task lease

* support accessing task lease in TaskInfoAccessor

* update raylet usage of task lease

* add comment

* fix lint

* fix UT of TaskDependencyManager

* fix UT of ReconstructionPolicy

* rm useless code from UT

* add task reconstruction methods to gcs

* fix ut of RedisGcsClient

* update test

* update comments
2020-01-08 16:59:06 +08:00
Lixin Wei
859dbad155 Fix estimate_available_memory() in utils.py (#6302) 2020-01-08 15:22:47 +08:00
fangfengbin
303d1a959b Add task lease method to task info handler (#6710)
* add task lease methods to task info handler

* rebase master
2020-01-08 14:25:55 +08:00
Tianyi Chen
9dacebec1a [Streaming] Add configuration with owner config. (#6687) 2020-01-08 11:19:01 +08:00
Frithjof
872a3522aa Add machinable to list of projects using Tune (#6737) 2020-01-07 15:10:17 -08:00
Edward Oakes
5f843cd998 Clean up stress_testing_config.yaml (#6738)
* Clean up stress_testing_config.yaml

* comment
2020-01-07 17:05:07 -06:00
Eric Liang
a6c8c342b7 Better document guarantees provided by par iter API (#6726)
* update

* Update doc/source/iter.rst

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update doc/source/iter.rst

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-01-07 14:41:50 -08:00
Zhijun Fu
329b9440ba fix missing override for HandleWaitForObjectEviction (#6733) 2020-01-07 13:20:35 -08:00
Zhijun Fu
72335dbe46 [rpc] refactor RPC server code (#6661)
* refactor RPC client

* remove unused code

* format

* fix

* resolve comments

* format

* update

* refactor rpc server

* update

* format

* fix

* fix

* Update src/ray/rpc/worker/core_worker_server.h

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* resolve comments

* format

* update

* update

* add a comment

* fix

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-01-07 22:03:42 +08:00
Michał Słapek
aaeb3c44a5 [tune] Add _change_working_directory to RayTrialExecutor (#6228) (#6320)
* [tune] Add _switch_working_directory to RayTrialExecutor (#6228)

* Make _switch_working_directory before warn_if_slow

* Rename _switch_working_directory to _change_working_directory
2020-01-07 01:51:04 -08:00
Robert Nishihara
5e43b25e8c Document fault tolerance behavior. (#6698) 2020-01-06 22:34:06 -08:00
Ujval Misra
20ba7ef647 [tune] Move util to utils package (#6682)
* Move util.py to utils

* Fix import
2020-01-06 18:11:02 -08:00
Edward Oakes
78d6290a65 Add kubectl to autoscaler docker image (#6721) 2020-01-06 17:30:51 -06:00
Edward Oakes
2a4d2c6e9e Basic reference counting & pinning (#6554) 2020-01-06 17:30:26 -06:00
mehrdadn
c9855c9769 Remove std::move<std::shared_ptr>(...) to avoid bugs (#6720) 2020-01-06 17:17:26 -06:00
Eric Liang
63363e19be Update bug_report.md (#6704) 2020-01-06 10:55:04 -08:00
Zhijun Fu
5bb20f6ac9 remove unused params in grpc macros (#6677)
* remove unused params in grpc macros

* format

* fix

* format

* fix
2020-01-06 21:35:40 +08:00
mehrdadn
76c986bdc7 Windows compatibility stubs (#6706) 2020-01-05 21:21:17 -08:00
mehrdadn
e6165cb14b Fix master as it seems to have been broken via these conflicting commits: (#6708)
c51fbfb453
2228079481

Co-authored-by: GitHub Web Flow <noreply@github.com>
2020-01-06 12:29:21 +08:00
fangfengbin
1000e3322d Add gcs server task info handler (#6695) 2020-01-06 11:09:32 +08:00
Lingxuan Zuo
c51fbfb453 [streaming] Message bundle use inplacement instance (#6606)
* streaming message bundle use inplacement instance

* fix typo & enable common test

* fix compiler warning

* block copy for serialization

* add reference

* remove streaming common test to travis script
2020-01-06 11:04:29 +08:00
mehrdadn
2228079481 Fix missing overrides (#6703) 2020-01-05 17:00:23 -08:00
Philipp Moritz
e15bd8ff1a Run core worker tests in thread sanitizer and fix thread safety issues (#6701) 2020-01-05 16:18:21 -08:00
micafan
cc110ff1e3 [GCS] Add task lease methods to TaskInfoAccessor (#6645) 2020-01-05 13:54:33 +08:00
Simon Mo
6285851743 Add sphinx copy button (#6694)
* Add sphinx copy button

* Update requirements-doc.txt

Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>
2020-01-04 19:31:49 -06:00
Yunzhi Zhang
816b84808d [Dashboard] Display memory usage of nodes and core workers (#6671) 2020-01-03 20:12:42 -08:00
micafan
fd379934b6 rm DirectActorTable (#6684) 2020-01-03 16:28:26 -08:00
Harrison Feng
ca876c1ecb Make sure dashboard link can be clicked directly. (#6683) 2020-01-03 16:17:16 -08:00
Robert Nishihara
80e77f7025 Revert accidental changes to test file. (#6681) 2020-01-03 14:23:45 -08:00
fangfengbin
b8669bc06c Add node resources methods to gcs server node info handler (#6685) 2020-01-03 20:06:49 +08:00
Ujval Misra
5b40408678 [tune] Remove py2.7-specific code (#6665)
* Remove backwards compatibility py2.7 code.

* Use exist_ok=True in ray

* nit

* nit

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-01-03 01:03:13 -08:00
micafan
970cd78701 [GCS] refactor the GCS Client Dynamic Resource Interface (#6266) 2020-01-03 14:07:37 +08:00
Ujval Misra
ca651af1d7 [tune] Async restores and S3/GCP-capable trial FT (#6376)
* Initial commit for asynchronous save/restore

* Set stage for cloud checkpointable trainable.

* Refactor log_sync and sync_client.

* Add durable trainable impl.

* Support delete in cmd based client

* Fix some tests and such

* Cleanup, comments.

* Use upload_dir instead.

* Revert files belonging to other PR in split.

* Pass upload_dir into trainable init.

* Pickle checkpoint at driver, more robust checkpoint_dir discovery.

* Cleanup trainable helper functions, fix tests.

* Addressed comments.

* Fix bugs from cluster testing, add parameterized cluster tests.

* Add trainable util test

* package_ref

* pbt_address

* Fix bug after running pbt example (_save returning dir).

* get cluster tests running, other bug fixes.

* raise_errors

* Fix deleter bug, add durable trainable example.

* Fix cluster test bugs.

* filelock

* save/restore bug fixes

* .

* Working cluster tests.

* Lint, revert to tracking memory checkpoints.

* Documentation, cleanup

* fixinitialsync

* fix_one_test

* Fix cluster test bug

* nit

* lint

* Revert tune md change

* Fix basename bug for directories.

* lint

* fix_tests

* nit_fix

* Add __init__ file.

* Move to utils package

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-01-02 20:40:53 -08:00
Harrison Feng
57061a15cf [docs] configure.rst with --num-cpus (#6678)
--num-cpus -> --num-gpus

Signed-off-by: Harrison Feng <feng.harrison@gmail.com>
2020-01-02 20:33:41 -08:00
Robert Nishihara
92e44a5dc8 Deprecate redis_address argument in favor of address. (#6654) 2020-01-02 20:18:34 -08:00
Jing Ge
d39e76f2ce rename interface and class for task assigner based on suitable pattern. (#6664) 2020-01-03 11:13:36 +08:00
Simon Mo
9fe90cdafc Fix async actor recursion limitation (#6672)
* Do not start threadpool when using async

* Turn function_executor into a generator

* Add new test for high concurrency and bump the default

* Set direct call
2020-01-02 19:45:13 -06:00
Robert Nishihara
39a3459886 Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Sven
f1b56fa5ee PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). (#6650)
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).

* Fix LINT line-len errors.

* Fix LINT errors.

* Fix `tf_pg_policy` imports (formerly: `pg_policy`).

* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).

* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
  then built into the Bazel/Travis test suite.

* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.

* Fix remaining import errors for agents/pg/...

* Fix circular dependency in pg imports.

* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
Robert Nishihara
d206445caf Use Travis deploy v2. (#6674) 2020-01-02 16:00:51 -08:00