Alex Wu
9ca159aa0b
[Autoscaler] Multi node commands ( #10236 )
2020-08-25 23:35:38 -07:00
Amog Kamsetty
8c0503ddd3
[Tune] Convert PBT DCGAN Example to Function API ( #10246 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-25 22:34:19 -07:00
Antoni Baum
87ed20738e
[tune] Add on_pause, on_unpause to ConcurrencyLimiter ( #10320 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-25 22:33:17 -07:00
Simon Mo
ed3fdd2c0b
[Serve] Remove register_custom_serializer ( #10331 )
2020-08-25 21:20:43 -07:00
Edward Oakes
cbd9632f3a
Fix wait timeout logic ( #10199 )
2020-08-25 22:41:39 -05:00
fyrestone
08adbb371f
Cross language exception ( #10023 )
2020-08-26 10:46:05 +08:00
Robert Nishihara
79eefbf357
Better checking that ray.init() has been called. ( #10261 )
2020-08-25 17:13:11 -07:00
Stephanie Wang
d4537ac1ce
[core] Try to schedule tasks locally before spilling over to remote nodes ( #10302 )
...
* Regression test
* Spillback
* Remove check for actor tasks
2020-08-25 15:01:59 -07:00
Richard Liaw
146d91385c
[tune] custom trial directory name ( #10214 )
2020-08-25 12:52:54 -07:00
SangBin Cho
3b3ca96a4e
[Placement Group] Wait ( #10259 )
...
* Initial progress done.
* Fix wrong test.
* Improve tests.
* Update code.
* Addressed code review and merge conflict.
* Addressed code review.
2020-08-24 20:14:48 -07:00
Richard Liaw
6dc22a6d68
[autoscaler] Fix logging regression ( #10280 )
2020-08-24 14:25:12 -07:00
fyrestone
05c103af94
[Dashboard] Start the new dashboard ( #10131 )
...
* Use new dashboard if environment var RAY_USE_NEW_DASHBOARD exists; new dashboard startup
* Make fake client/build/static directory for dashboard
* Add test_dashboard.py for new dashboard
* Travis CI enable new dashboard test
* Update new dashboard
* Agent manager service
* Add agent manager
* Register agent to agent manager
* Add a new line to the end of agent_manager.cc
* Fix merge; Fix lint
* Update dashboard/agent.py
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* Update dashboard/head.py
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* Fix bug
* Add tests for dashboard
* Fix
* Remove const from Process::Kill() & Fix bugs
* Revert error check of execute_after
* Raise exception from DashboardAgent.run
* Add more tests.
* Fix compile on Linux
* Use dict comprehension instead of dict(generator)
* Fix lint
* Fix windows compile
* Fix lint
* Test Windows CI
* Revert "Test Windows CI"
This reverts commit 945e01051ec95cff5fcc1c0bc37045b46e7ad9a6.
* Fix ParseWindowsCommandLine bug
* Update src/ray/util/util.cc
Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>
Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>
2020-08-24 13:24:23 -07:00
Max Fitton
832f5cdccb
[Dashboard] Memory View Group by Stack Trace and UI Overhaul ( #10227 )
2020-08-24 14:54:42 -05:00
PidgeyBE
a82124d304
Update memory_monitor.py ( #9212 )
2020-08-24 10:29:01 -07:00
Eric Liang
4761eacc3e
[autoscaler] Also account for head node resources in multi node type autoscaling ( #10230 )
2020-08-24 10:26:22 -07:00
Ian Rodney
f051c2852e
[docker] docker cp correctly into container ( #10253 )
2020-08-24 09:18:34 -07:00
SangBin Cho
1f54acd274
[Tech Debt] Use f-string for python/ray/*.py ( #10268 )
...
* In progress.
* Done with critical path.
* Modified cluster_utils.py and log_monitor.py
* Addressed code review.
2020-08-23 22:01:31 -07:00
fangfengbin
b61a79efd7
[Placement Group]Fix SigSegv bug ( #10262 )
...
* fix SigSegv bug
* fix review comments
* fix ut bug
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-23 11:33:40 -07:00
Richard Liaw
73c4246332
[Core] fix-bad-stack ( #10266 )
2020-08-23 10:33:29 -07:00
Yu Shan
5264f888e4
fix iterable dataset (issue 9899) ( #9952 )
2020-08-22 19:40:38 -07:00
Maksim Smolin
245c0a9e43
[cli] Tests ( #10057 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-22 13:29:10 -07:00
fangfengbin
8362029dcf
[Placement Group]Fix CrossLanguageInvocationTest failure ( #10257 )
...
* add part code
* rebase master
* add part code
* rebase master
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-22 12:12:00 -07:00
Richard Liaw
6bd5458bef
[tune] cleanup error messaging/diagnose_serialization helper ( #10210 )
2020-08-22 11:50:49 -07:00
Richard Liaw
24ee496b89
[tune] support rerunning failed trials ( #10060 )
2020-08-22 09:59:05 -07:00
krfricke
c31876002d
[tune/rllib] made wandb compatible with rllib trainables ( #10252 )
2020-08-21 17:25:52 -07:00
Richard Liaw
f87669372d
[cli] enable log-new-style by default ( #10213 )
2020-08-21 15:21:43 -07:00
fangfengbin
36c6c4b298
[Placement group] Check if placement group bundle index is valid ( #10194 )
...
* add part code
* rebase master
* add java testcase
* fix review comments
* fix lint error
* rebase master
* fix lint error
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-21 11:04:56 -07:00
Max Fitton
17f801dc69
Make get_py_stack return more stack frames ( #9512 )
2020-08-21 13:02:12 -05:00
SangBin Cho
92664249e8
Partially Use f string ( #10218 )
...
* flynt. trial 1.
* Trial 1.
* Addressed code review.
2020-08-20 18:21:16 -07:00
architkulkarni
07cd815e5a
[Serve] Type hints for API ( #10205 )
2020-08-20 15:33:04 -07:00
Stephanie Wang
85e57a7a98
[Object spilling] Look up the location of the primary raylet from the owner's metadata ( #10197 )
...
* Get the primary copy from the owner, python test, some node manager fixes
* fixes and todo
* update
* lint
* fix build
2020-08-20 14:46:59 -07:00
Eric Liang
0baf992a4f
[hotfix] [autoscaler] Address remaining comments on renaming instance => node ( #10229 )
...
* more renaming
* fix import
2020-08-20 14:37:41 -07:00
Eric Liang
85a6876119
[autoscaler] Rename instance_type => node_type, TAG_RAY_INSTANCE_TYPE => TAG_RAY_USER_NODE_TYPE ( #10207 )
2020-08-20 12:27:11 -07:00
Amog Kamsetty
8d466749ee
[Tune] PBT hyperparam_mutations fix ( #10217 )
2020-08-20 12:02:29 -07:00
fangfengbin
a462ae2747
[Placement Group]Add strict spread strategy ( #10174 )
...
* support STRICT_SPREAD strategy
* fix review comments
* rebase master
* fix lint error
* fix lint error
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-08-20 10:18:58 -07:00
SangBin Cho
224933b5e4
[Placement Group] Remove API part 2 ( #10215 )
...
* Initial progress done.
* Fix mistake.
* Addressed code review.
* Fix cpp build issue.
* Addressed code review.
2020-08-20 09:50:13 -07:00
Eric Liang
538cb802d5
[autoscaler] Refactor multi node type autoscaler config ( #10190 )
2020-08-19 20:46:00 -07:00
Richard Liaw
2fd59de05d
[autoscaler] hotfix - swallowed error for missing yaml ( #10212 )
2020-08-19 20:02:56 -07:00
Amog Kamsetty
9ff687c093
[SGD][Docs] docs for training/ validation results ( #10181 )
2020-08-19 17:22:28 -07:00
Amog Kamsetty
44e254788a
[Tune] PBT hyperparam_mutations improvements ( #10170 )
2020-08-19 16:50:19 -07:00
Alex Wu
b70dce0d02
[autoscaler] Hotfix bad None check ( #10196 )
2020-08-19 13:27:20 -07:00
fangfengbin
9734dbca3e
[Placement Group]Reschedule bundles when the node of bundles is dead ( #10021 )
2020-08-19 13:24:42 -07:00
Edward Oakes
888f0a2c60
[serve] Use ray.experimental.metrics ( #10185 )
2020-08-19 13:03:22 -05:00
architkulkarni
de46464aa3
[Experimental] Queue: replace polling with async actor ( #10120 )
2020-08-19 11:55:42 -05:00
Sven Mika
2cbe29a7fa
[RLlib] Curiosity minor fixes, do-overs, and testing. ( #10143 )
2020-08-19 17:49:50 +02:00
Max Fitton
9c5e5a9757
[Dashboard] Fix and Recommit Reverted Group by Actor Class PR ( #10186 )
...
* Revert "Revert "[Dashboard] Group by Actor Class (#10147 )" (#10180 )"
This reverts commit e4d2ca620a.
* Fix metrics test to agree with the new logical view API
* lint2
Co-authored-by: Max Fitton <max@semprehealth.com>
2020-08-18 20:55:58 -07:00
Edward Oakes
ba0f531da0
[serve] Remove SLO code and blist dependency ( #10075 )
2020-08-18 17:52:36 -05:00
SangBin Cho
263df6163c
[Placement Group] Placement group remove api part 1 ( #10063 )
...
* Added basic rpc calls.
* fix issues.
* Fix the gcs server not getting request issue.
* In Progress.
* Basic logic done. Tests are required.
* In progress.
* In progress in refactoring context.
* Revert "In progress in refactoring context."
This reverts commit 38236256cf1306c60dd203e75d45ceb4509c8106.
* Working now.
* Python test works.
* Lint.
* Addressed code review.
* Addressed code review.
* Lint.
* Added unit tests.
* Done, but one of unit tests fail
* Addressed code review.
* Addressed the last code review.
* Fix the wrong test case.
2020-08-18 12:44:00 -07:00
Lixin Wei
d188becec2
[Python Worker] Add pid to log file name ( #10149 )
...
Co-authored-by: Alex Wu <alex@anyscale.io>
2020-08-18 11:48:48 -07:00
Simon Mo
bedc2c24c8
Export Metrics in OpenCensus Protobuf Format ( #10080 )
2020-08-18 11:32:42 -07:00