Eric Liang
1ce745cf44
Add automatic local GC and plasma debug logs every 10 minutes by default ( #12804 )
2020-12-11 17:09:58 -08:00
Sven Mika
abb1eefdc2
[RLlib] Issue 12483: Discrete observation space error: "ValueError: ('Observation ({}) outside given space ..." when doing Trainer.compute_action. ( #12787 )
2020-12-11 22:43:30 +01:00
Alex Wu
676ec363f6
[Object Manager] Pull Manager refactor ( #12335 )
2020-12-11 11:56:23 -08:00
Simon Mo
3d8c1cbae6
[Serve] Fix Serve Release Tests ( #12777 )
2020-12-11 11:53:47 -08:00
Eric Liang
4ad4463be6
Add comments to clarify purpose of new scheduler queues ( #12730 )
...
* update
* clarify
* update
2020-12-11 11:53:09 -08:00
fangfengbin
9ded69fdaa
[Hotfix] Fix python client lint error ( #12783 )
2020-12-11 10:15:53 -08:00
Simon Mo
68d7fa2137
Fix exit_actor in asyncio mode ( #12693 )
2020-12-11 09:35:17 -08:00
Edward Oakes
699ded5328
[serve] Initial commit for CLI ( #12770 )
2020-12-11 10:31:29 -06:00
Sven Mika
74c98ac38e
[RLlib] Issue 12244: Unable to restore multi-agent PPOTFPolicy's Model (from exported). ( #12786 )
2020-12-11 16:13:38 +01:00
Tao Wang
295b6e5ce4
Split heartbeat message ( #12535 )
...
* first
* xxx
* Split heartbeat message
* only report resource usage when changed
* Fix GetAllResourceUsage
* Fix report resource usage
* Increase default heartbeat interval
* regularize heartbeat interval in test case
2020-12-11 21:19:57 +08:00
Lixin Wei
867d2a8aa3
[Streaming] Add more documents. ( #12746 )
...
* add doc
draft
draft
draft
draft
draft
fix
fix
fix
fix
fix
fix
fix
Update README.md
fix
fix
fix
* md to rst
* fix
* fix
* fix
* jpg modified
* add getting involved
* jpg modified
* Update README.rst
* fix
* fix
2020-12-11 20:36:17 +08:00
Sven Mika
a082ea18b8
[RLlib] Issue 12212: "TFEagerPolicy has no attribute action_sampler_fn".
2020-12-11 12:57:33 +01:00
Stephanie Wang
86b0741026
[new scheduler] Allocate resources for spilled back task to a local view of the remote node ( #12711 )
...
* Force report heartbeats if remote resources may be dirty
* lint
* typo
* typo
* unit test
* debug
* Revert "lint"
This reverts commit 6dc7e982ffee98185665eb7c3c8fde0d91938919.
* Revert "Force report heartbeats if remote resources may be dirty"
This reverts commit cbfa9405197df62f874107d55b46715ceae2abd2.
* Local view of resources
* debug travis
* debug
* debug
* debug
* weaken test
* cleanups
* lint
* Revert "debug travis"
This reverts commit 11ff5f4f84e64e9fbd4eecda5b3c7fd07a7130a4.
* revert
* const view, remove unused
2020-12-10 22:43:29 -05:00
Barak Michener
b7f246c451
[ray_client] Include multiple facets of the Ray API ( #12736 )
2020-12-10 19:09:34 -08:00
Sumanth Ratna
8d1ad25545
[docs] Add troubleshooting section to installation page ( #12659 )
...
* Add troubleshooting section to installation docs
* Set fix instructions lang to bash
* Update doc/source/installation.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-10 18:56:56 -08:00
Ian Rodney
9b3ef2f340
[docs] Fix Docker links ( #12702 )
...
* switch autoscaler -> ray-ml
* add more tables
2020-12-10 18:08:48 -08:00
Edward Oakes
62d6b0a558
Fix max_task_retries for named actors ( #12762 )
2020-12-10 18:24:55 -06:00
Edward Oakes
0e90cbcd19
Remove unused ci/performance_tests ( #12767 )
2020-12-10 18:23:16 -06:00
Edward Oakes
c7b6ec88ef
[serve] Make serve __del__ log DEBUG level ( #12766 )
2020-12-10 18:14:55 -06:00
Edward Oakes
3c44c0d3e4
[serve] Long polling for routes in http server ( #12724 )
2020-12-10 18:02:02 -06:00
Lee moon soo
006856b9a1
fix gpu base image name in build-docker.sh script ( #12642 )
2020-12-10 14:31:59 -08:00
Sumanth Ratna
932837eb4c
[streaming] Remove unused imports in streaming CI tests ( #12722 )
2020-12-10 16:27:06 -06:00
Ruoyun Huang
2e084959a1
Fix a wrong import in test_performance.py ( #12734 )
2020-12-10 16:26:21 -06:00
Eric Squires
231ecffa3d
add tags.lock and tags.temp to .gitignore ( #12752 )
...
These can be temporarily created by vim.
2020-12-10 14:24:32 -08:00
Eric Squires
9f70293700
Remove debug extras from setup.py ( #12751 )
2020-12-10 16:23:11 -06:00
architkulkarni
3fd3cb96ed
[Utils] Add Queue async and batch methods ( #12578 )
2020-12-10 10:49:18 -06:00
Ian Rodney
38ba238606
[serve] Create FutureResults from ControllerAPI ( #12577 )
2020-12-10 10:44:08 -06:00
Sven Mika
deb33bce84
[RLlib] Add DQN SoftQ learning test case. ( #12712 )
2020-12-10 14:55:19 +01:00
Kai Yang
e3b5deb741
[Multi-tenancy] Delete flag enable_multi_tenancy and remove old code path ( #10573 )
2020-12-10 19:01:40 +08:00
Robert Nishihara
d681991773
Add Discourse to readme and make it more prominent in docs. ( #12740 )
2020-12-10 01:13:40 -08:00
Ian Rodney
cf30630d2e
[docker] Use legacy resolver ( #12741 )
2020-12-10 01:12:46 -08:00
Ameer Haj Ali
2f8e308444
[autoscaler] LoadMetrics missed logger.debug ( #12714 )
2020-12-09 17:19:36 -08:00
Ian Rodney
a9da4f3201
[docker] Make Ray-ml more compatible ( #12574 )
2020-12-09 16:03:39 -08:00
Stephanie Wang
a776209aec
Revert "Fix dashboard agent check ppid is raylet pid ( #12256 )" ( #12729 )
...
This reverts commit 3ce9286977.
2020-12-09 17:20:38 -05:00
dHannasch
d455cae036
Add period to error message. ( #12716 )
2020-12-09 15:58:21 -06:00
Richard Liaw
974570b4fb
oops ( #12728 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-09 13:38:10 -08:00
Keqiu Hu
ee012532fb
[core] Use node manager client pool for GCS service #10398 ( #12368 )
...
* raylet client pool
* Fix merging conflict
* Fix documentation typo
* fix linting
* address comments
* fix typo
* remove unintended logging
* address comments
* fix bazel file lint error
2020-12-09 12:44:40 -08:00
architkulkarni
8b9197ea8c
[Doc] replace github discussion link with discourse ( #12684 )
2020-12-09 12:43:45 -08:00
Edward Oakes
c9873cdbc3
[Serve] Remove unused assign_request wrapper ( #12721 )
2020-12-09 12:22:43 -08:00
Alex Wu
0b6e44efb8
[New scheduler] Cluster Resource Scheduler dynamic resources (for placement groups) ( #12518 )
...
* prepare implemented
* dynamic resources
* .
* commit
* .
* .
* Still needs to be cleaned up
* Passes basic tests + cleanup
* .
* .
* .
* Apply suggestions from code review
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* fix
* lint
Co-authored-by: Alex <alex@anyscale.com>
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2020-12-09 12:05:31 -08:00
fangfengbin
ef9ebbc636
[GCS]GCS based Actor Scheduling support actor colocation ( #12707 )
...
* [GCS]GCS based Actor Scheduling support actor colocation
* fix review comment
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2020-12-09 11:54:23 -08:00
Sven Mika
ea25482f6a
WIP. ( #12706 )
2020-12-09 11:49:21 -08:00
Ian Rodney
19542c5eb0
[docker] Default to ray-ml image ( #12703 )
2020-12-09 11:49:16 -08:00
architkulkarni
6f3aacd087
[serve] Clarify conda env docs ( #12679 )
2020-12-09 13:35:48 -06:00
Sven Mika
f6241302a8
[RLlib] Fix issue 12678: MultiAgentBatch has no attribute total. ( #12704 )
2020-12-09 16:41:13 +01:00
fyrestone
3ce9286977
Fix dashboard agent check ppid is raylet pid ( #12256 )
...
* Dashboard agent check ppid is raylet pid
* Improve implementation
* Refine code
* Make the RAY_NODE_PID environment required for dashboard agent
Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-12-09 09:12:34 -05:00
Stephanie Wang
840de49161
Fix race condition between failure detection and references going out of scope ( #12573 )
...
* fix
* lint
* fix initialization
2020-12-08 23:49:55 -08:00
Sven Mika
28108c905b
[RLlib] Tf-eager policy bug fix: Duplicate model call in compute_gradients. ( #12682 )
2020-12-09 08:03:58 +01:00
Eric Liang
cab46b7931
Improve issue templates ( #12687 )
...
* update
* Update .github/ISSUE_TEMPLATE/bug_report.md
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-08 22:29:03 -08:00
Alex Wu
bd7e26b768
[Autoscaler] Temporarily suppress "Removed stale ip mappings" message. ( #12689 )
2020-12-08 21:55:10 -08:00