Eric Liang
9a590ac6a5
[rllib] Fix custom model metrics in multi-device case ( #7640 )
...
* fix example
* add example test
* lin
2020-03-23 12:40:22 -07:00
aannadi
8adc84ccb9
[Dashboard] Add sorted columns and TensorBoard to Tune tab ( #7140 )
2020-03-23 12:30:51 -07:00
Richard Liaw
e311013afd
[tune] Reformat Sections of API Reference ( #7706 )
...
* moveit
* moveit
* docstrings to ref
* Update tune-usage.rst
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-03-23 12:23:21 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. ( #7482 )
2020-03-23 12:19:30 -07:00
Robert Nishihara
ee8c9ff732
Remove six and cloudpickle from setup.py. ( #7694 )
2020-03-23 11:42:05 -07:00
Robert Nishihara
1a0c9228d0
Remove pytest from setup.py and other minor changes. ( #7700 )
2020-03-23 08:46:56 -07:00
ZhuSenlin
74825db804
Fix TestGcsRedisFailureDetector ( #7710 )
...
* fix test_gcs_redis_failure_detector
* fix test_gcs_redis_failure_detector
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
2020-03-23 22:48:53 +08:00
Simon Mo
afad0ed085
[Serve] Add async, multi methods support for serve actors ( #7682 )
2020-03-23 00:45:26 -07:00
ZhuSenlin
039961b63a
rename ActorTable to LogBasedActorTable and add new ActorTable ( #7643 )
2020-03-23 15:05:43 +08:00
SangBin Cho
79767fe425
Fix wording in dashboard documentation. ( #7703 )
2020-03-22 22:16:40 -07:00
Robert Nishihara
8b4c2b7e88
Remove unnecessary handling of setproctitle and psutil. ( #7702 )
2020-03-22 22:06:42 -07:00
Robert Nishihara
4d722bf003
Remove dependence on funcsigs. ( #7701 )
2020-03-22 21:37:24 -07:00
Edward Oakes
8b4f5a9431
Remove non-direct-call code from core worker ( #7625 )
2020-03-22 19:20:08 -05:00
Richard Liaw
81d311031b
[tune] Update API Reference Page ( #7671 )
...
* widerdocs
* init
* docs
* fix
* moveit
* mix
* better_docs
* remove
* Apply suggestions from code review
Co-Authored-By: Sven Mika <sven@anyscale.io>
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-03-22 16:42:20 -07:00
Eric Liang
288933ec6b
[rllib] Fix shared metrics context in parallel iterators ( #7666 )
...
* debug
* build
* update
* wip
* wpi
* update
* recurisve sync
* comment
* stream
* fix
* Update .travis.yml
2020-03-22 14:15:01 -07:00
Sven Mika
2fb219a658
[Ray RLlib] Fix tree import ( #7662 )
...
* Rollback.
* Fix import tree error by adding meaningful error and replacing by tf.nest wherever possible.
* LINT.
* LINT.
* Fix.
* Fix log-likelihood test case failing on travis.
2020-03-22 13:51:24 -07:00
Eric Liang
86f89fc3b3
[tune] Higher timeout for progress reporter test ( #7679 )
...
* wip
* medium size
2020-03-22 13:47:08 -07:00
Stephanie Wang
ba86a02b37
[core] Revert lineage pinning ( #7499 ) ( #7692 )
...
* Revert "fix (#7681 )"
This reverts commit 6a12a31b2e.
* Revert "[core] Pin lineage of plasma objects that are still in scope (#7499 )"
This reverts commit 014929e658.
2020-03-21 18:35:43 -07:00
Simon Mo
89d959fd6a
Stop gap solution for cython functions breaking in memory monitor ( #7687 )
2020-03-21 15:16:12 -07:00
Zhijun Fu
a7a5d172b1
[core] fix bug that actor tasks from reconstructed actor is ignored by scheduling queue ( #7637 )
2020-03-21 13:05:24 +08:00
SangBin Cho
1b90196bef
[doc] Dashboard documentation ( #7304 )
...
* Completed the first half of dashboard documentation.
* Dashboard document initial versions.
* Formatting.
* Fixed tune note is not visible.
* Half of comments from code reivew are handled.
* Fixed based on code review.
* Improved memory usage page.
* Addressed code review.
* Fixed image not found issue.
* Add gitkeep again.
* Refactored document.
* Addressed Robert's feedback.
* Addressed code reviews.
* Addressed last comments.
* Update doc/source/ray-dashboard.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-20 22:00:33 -07:00
Stephanie Wang
6a12a31b2e
fix ( #7681 )
2020-03-20 18:53:28 -07:00
Edward Oakes
ec50037ee1
Use go1.12 in lint build ( #7680 )
2020-03-20 14:52:41 -07:00
Edward Oakes
31845f17a5
[docs] Add documentation for reference counting and 'ray memory' ( #7661 )
2020-03-20 15:47:00 -05:00
Edward Oakes
58dc70f90e
[minor] Remove get_global_worker(), RuntimeContext ( #7638 )
2020-03-20 15:45:29 -05:00
Eric Liang
7ebc6783e4
[rllib] Add back get_policy_output method for SAC model ( #7604 )
2020-03-20 12:44:04 -07:00
Eric Liang
9392cdbf74
[rllib] Add high-performance external application connector ( #7641 )
2020-03-20 12:43:57 -07:00
Stephanie Wang
014929e658
[core] Pin lineage of plasma objects that are still in scope ( #7499 )
...
* Add a lineage_ref_count to References
* Refactor TaskManager to store TaskEntry as a struct
* Refactor to fix deadlock between TaskManager and ReferenceCounter
Add references to task specs
* Pin TaskEntries and References in the lineage of any ObjectIDs in scope
* Fix deadlock, convert num_plasma_returns to a set of object IDs
* fix unit tests
* Feature flag
* Do not release lineage for objects that were promoted to plasma
* fix build
* fix build
* Remove num executions
* Simplify num return values
* Remove unused
* doc
* Set num returns
* Move lineage pinning flag to ReferenceCounter
* comments
* Fixes
* Remove irrelevant test (replaced by ref counting tests)
2020-03-20 10:56:43 -07:00
fyrestone
a1ae935839
Java call Python use structured function descriptors ( #7634 )
2020-03-20 17:29:45 +08:00
ZhuSenlin
7d08b418fc
fix test_worker_stats ( #7655 )
...
* fix test_worker_stats
* fix lint error
* fix lint error
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
2020-03-20 14:53:40 +08:00
mehrdadn
e69664b74b
Miscellaneous Windows compatibility bugfixes ( #7658 )
...
* Windows compatibility bug fixes
* Use WSASend/WSARecv as WSASendMsg/WSARecvMsg do not work with TCP sockets
* Clean up some TODOs
* Fix duplicate compilations
* RedisAsioClient boost::asio::error::connection_reset
Co-authored-by: Mehrdad <noreply@github.com>
2020-03-19 19:32:53 -07:00
Stephanie Wang
c7cae036c3
[core] Only drain references for non-actor workers on shutdown ( #7668 )
...
* Only drain ref counter for non-actor tasks
* Don't force kill actors that have gone out of scope
2020-03-19 18:46:16 -07:00
Eric Liang
5a112ab212
Remove object store memory cap ( #7654 )
2020-03-19 16:00:30 -07:00
Clark Zinzow
c37f6e745a
Remove duplicate jsonschema from setup.py ( #7665 )
2020-03-19 13:12:47 -07:00
Edward Oakes
90b553ed05
[operator] Use headless service for head node ( #7622 )
2020-03-19 10:31:56 -05:00
Edward Oakes
c78b52b5b2
Set RayCluster as service owner ( #7621 )
2020-03-19 10:30:44 -05:00
fangfengbin
0d0a41f598
[GCS]Tie lifecycle of gcs service and redis together ( #7601 )
2020-03-19 19:52:35 +08:00
Stephanie Wang
b499100a88
Enable distributed ref counting by default ( #7628 )
...
* enable
* Turn on eager eviction
* Shorten tests and drain ReferenceCounter
* Don't force kill actor handles that have gone out of scope, lint
* Fix locks
* Cleanup Plasma Async Callback (#7452 )
* [rllib][tune] fix some nans (#7611 )
* Change /tmp to platform-specific temporary directory (#7529 )
* [Serve] UI Improvements (#7569 )
* bugfix about test_dynres.py (#7615 )
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
* Java call Python actor method use actor.call (#7614 )
* bug fix about useage of absl::flat_hash_map::erase and absl::flat_hash_set::erase (#7633 )
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
* [Java] Make both `RayActor` and `RayPyActor` inheriting from `BaseActor` (#7462 )
* [Java] Fix the issue that the cached value in `RayObject` is serialized (#7613 )
* Add failure tests to test_reference_counting (#7400 )
* Fix typo in asyncio documentation (#7602 )
* Fix segfault
* debug
* Force kill actor
* Fix test
2020-03-18 22:39:21 -07:00
fangfengbin
fca9dc73e1
Fix test_raylet_pending_tasks test case failed ( #7636 )
2020-03-19 11:09:38 +08:00
Seung Hyeon, Kim
ee49f4a875
[tune] Fix an example for _Brackets of async hyperband scheduler ( #7538 )
2020-03-18 19:06:32 -07:00
Stephanie Wang
35a4bfc885
[core] Fix leak for subscribing to object dependencies in NodeManager ( #7630 )
...
* Fix GetDependencies
* lint
2020-03-18 11:01:29 -07:00
Richard Liaw
ea10cd212c
[tune] add accessible trial_info ( #7378 )
...
* add accessible trial_info
* trial name and info
* doc
* fix
gp
* Update doc/source/tune-package-ref.rst
* Apply suggestions from code review
* fix
* trial
* fixtest
* testfix
2020-03-17 23:44:18 -07:00
Eric Liang
745b9d643d
First pass at `ray memory` command for memory debugging ( #7589 )
2020-03-17 20:45:07 -07:00
Landcold7
e6a045df48
Fix typo in asyncio documentation ( #7602 )
2020-03-17 10:37:37 -05:00
Edward Oakes
c1b0f9ccdf
Add failure tests to test_reference_counting ( #7400 )
2020-03-17 10:30:21 -05:00
Hao Chen
7678418210
[Java] Fix the issue that the cached value in `RayObject` is serialized ( #7613 )
2020-03-17 22:07:41 +08:00
Kai Yang
6b888b0247
[Java] Make both `RayActor` and `RayPyActor` inheriting from `BaseActor` ( #7462 )
2020-03-17 21:45:56 +08:00
ZhuSenlin
dfa5d9b8e9
bug fix about useage of absl::flat_hash_map::erase and absl::flat_hash_set::erase ( #7633 )
...
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
2020-03-17 19:39:56 +08:00
fyrestone
7697ea2be2
Java call Python actor method use actor.call ( #7614 )
2020-03-17 14:52:43 +08:00
ZhuSenlin
ffa9df4683
bugfix about test_dynres.py ( #7615 )
...
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
2020-03-17 13:58:44 +08:00