Dmitri Gekhtman
7fec19dad2
[kubernetes][operator][minutiae] Backwards compatibility of operator ( #13623 )
2021-01-22 14:07:25 -06:00
Sven Mika
d629292d63
[RLlib] Add grad_clip config option to MARWIL and stabilize grad clipping against inf global_norms. ( #13634 )
2021-01-22 19:36:02 +01:00
architkulkarni
da5928304a
[Metrics] Cache metrics ports in a file at each node ( #13501 )
...
* cache metric ports in a file at each node
* remove old assignment of export port
* lint
* lint
* move e2e test to top of file to avoid shutdown bug
2021-01-22 09:59:20 -08:00
Kai Yang
90f1e408de
[Java] Add fetchLocal parameter in Ray.wait() ( #13604 )
2021-01-22 17:55:00 +08:00
Amog Kamsetty
00c14ce4a4
[Object Spilling] Skip flaky tests ( #13628 )
...
* skip flaky tests
* lint
* skip one more
* fix
2021-01-22 00:31:33 -08:00
Amog Kamsetty
39755fdb20
Revert "[Serve] Refactor BackendState" ( #13626 )
...
This reverts commit 68038741ac.
2021-01-21 23:06:15 -08:00
Tao Wang
aa5d7a5e6c
[Dashboard] Don't set node actors when node_id of actor is Nil ( #13573 )
...
* Don't set node actors when node_id of actor is Nil
* add test per comment
2021-01-21 20:18:34 -08:00
Xianyang Liu
4ecd29ea2b
[dashboard] Fixes dashboard issues when environments have set http_proxy ( #12598 )
...
* fixes ray start with http_proxy
* format
* fixes
* fixes
* increase timeout
* address comments
2021-01-21 20:10:01 -08:00
Ameer Haj Ali
1fbb752f42
[autoscaler] remove worker_default_node_type that is useless. ( #13588 )
2021-01-21 17:04:38 -08:00
Nikita Vemuri
4e01a9ec38
[Autoscaler] Ensure ubuntu is owner of docker host mount folder ( #13579 )
...
* change ownership to ubuntu if root
* use ssh user in cluster config
* formatting
Co-authored-by: Nikita Vemuri <nikitavemuri@Nikitas-MacBook-Pro.local>
2021-01-21 17:01:55 -08:00
Stephanie Wang
0998d69968
[core] Admission control for pulling objects to the local node ( #13514 )
...
* Admission control, TODO: tests, object size
* Unit tests for admission control and some bug fixes
* Add object size to object table, only activate pull if object size is known
* Some fixes, reset timer on eviction
* doc
* update
* Trigger OOM from the pull manager
* don't spam
* doc
* Update src/ray/object_manager/pull_manager.cc
Co-authored-by: Eric Liang <ekhliang@gmail.com>
* Remove useless tests
* Fix test
* osx build
* Skip broken test
* tests
* Skip failing tests
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-01-21 16:46:42 -08:00
Amog Kamsetty
ccc901f662
add 3.8 ( #13608 )
2021-01-21 16:38:51 -08:00
Amog Kamsetty
20acc3b05e
Revert "Inline small objects in GetObjectStatus response. ( #13309 )" ( #13615 )
...
This reverts commit a82fa80f7b.
2021-01-21 16:10:34 -08:00
Dmitri Gekhtman
87ca102c93
[Kubernetes] Unit test for cluster launch and teardown using K8s Operator ( #13437 )
2021-01-21 12:00:37 -06:00
Ian Rodney
68038741ac
[serve] Refactor BackendState to use ReplicaState classes ( #13406 )
2021-01-21 11:16:02 -06:00
Clark Zinzow
a82fa80f7b
Inline small objects in GetObjectStatus response. ( #13309 )
2021-01-21 09:15:18 -08:00
Kai Yang
92f1e0902e
[Java] Fix return of java doc ( #13601 )
2021-01-21 23:57:20 +08:00
Michael Luo
587f207c2f
[RLlib] Support for D4RL + Semi-working CQL Benchmark ( #13550 )
2021-01-21 16:43:55 +01:00
Saeid
d11e62f9e6
[RLlib] Fix problem in preprocessing nested MultiDiscrete ( #13308 )
2021-01-21 16:36:11 +01:00
Sven Mika
daf0bef285
[RLlib] Dreamer: Fix broken import and add compilation test case. ( #13553 )
2021-01-21 16:30:26 +01:00
Alex Wu
b9ac3878ae
[Autoscaler] Display node status tag in autoscaler status ( #13561 )
...
* .
* .
* .
* .
* .
* lint
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-01-20 19:20:54 -08:00
Siyuan (Ryans) Zhuang
a09997dc9e
[Core] Remove 'PlasmaBuffer' in the buffer header ( #13188 )
2021-01-20 12:01:44 -08:00
Edward Oakes
b796de4104
[metrics] Check that all tag_keys are set when recording ( #13420 )
2021-01-20 13:09:44 -06:00
dmatch01
fd6882176a
Fix for operator role definition to add raycluster/finalizer ( #13567 )
2021-01-20 13:02:02 -06:00
Kai Fricke
8804758409
[xgboost] Add XGBoost release tests ( #13456 )
...
* Add XGBoost release tests
* Add more xgboost release tests
* Use failure state manager
* Add release test documentation
* Fix wording
* Automate fault tolerance tests
2021-01-20 18:40:23 +01:00
Eric Liang
e6412efdf5
Extra fix ray client newline ( #13577 )
2021-01-20 09:23:14 -08:00
ZhuSenlin
2e7c2b774f
[Core] add thread name to help performance profiling ( #13506 )
2021-01-20 20:34:28 +08:00
Kai Fricke
6c23bef2a7
[tune] Allow actor reuse for new trials ( #13549 )
...
* Allow actor reuse for new trials
* Fix tests and update conf when starting new trial
* Move magic config to `reset_trial`
2021-01-20 11:25:33 +01:00
Daan Klijn
800304acfb
[tune] wandb - WandbLogger now also accepts wandb.data_types.Video ( #13169 )
2021-01-20 01:19:54 -08:00
Eric Liang
d0f224d5cf
Revert "Pipe monitor.err logs to driver" ( #13574 )
...
This reverts commit a0d08c2cc6.
2021-01-20 00:29:19 -08:00
Tao Wang
b2a6e55289
[GCS] Only publish fields used by sub clients in WorkerTableData ( #13508 )
2021-01-20 16:14:59 +08:00
Keqiu Hu
6c9088eb62
[core] refactor disconnect message processing and enrich WorkExitType ( #13527 )
...
* [core] refactor disconnect message processing and enrich WorkExitType
add changes from refactor pr
fix type typo
fix typo
fix
* address comments
* also update WorkerTableData
* fix tests
2021-01-19 22:09:46 -08:00
SangBin Cho
e544c008df
Fix restoration request dedup issues. ( #13546 )
2021-01-19 15:28:54 -08:00
Stephanie Wang
bfe147a6a8
Debug info to GCS pub sub ( #13564 )
2021-01-19 14:55:23 -08:00
Eric Liang
a0d08c2cc6
Pipe monitor.err logs to driver
2021-01-19 12:27:07 -08:00
Simon Mo
c963cbc038
Fix Docker Permission for Serve release test again ( #13543 )
2021-01-19 12:23:30 -08:00
Dmitri Gekhtman
7b4a97c610
Make AWSNodeProvider.create_node return nodes created ( #13498 )
...
* Make AWSNodeProvider.create_node return node config
* return-dict
* Node provider interface create node return type Any
* Type clarification.
* Delete debug code
* Oops reset example-full changes
* Return type specified. GCP create node returns None.
* Article
2021-01-19 12:17:46 -08:00
Amog Kamsetty
20016c983f
[Tune] MLflow Credentials ( #13533 )
2021-01-19 11:55:13 -08:00
Edward Oakes
9b071eb449
[metrics] Better validation for tags ( #13421 )
2021-01-19 13:26:51 -06:00
SangBin Cho
99375c4cfc
[Object Spilling] Remove retries and use a timer instead. ( #13175 )
2021-01-19 11:01:45 -08:00
fyrestone
86d5000047
Fix passing env on windows ( #13253 )
2021-01-19 10:04:38 -06:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. ( #13238 )
2021-01-19 14:22:36 +01:00
Sven Mika
e74947cc94
[RLlib] Env directory cleanup and tests. ( #13082 )
2021-01-19 10:09:39 +01:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. ( #13397 )
2021-01-19 09:51:35 +01:00
Sven Mika
a65ee92b69
[RLlib] MARWIL loss function test case and cleanup. ( #13455 )
2021-01-19 09:51:05 +01:00
Todd A. Anderson
2506a6cd0e
Remove PYTHON_MODE that is not defined in Ray so that import * will work from other packages. ( #13544 )
2021-01-18 23:07:01 -08:00
SameerF
701038e410
Fix typo ( #13098 )
2021-01-18 19:28:10 -08:00
Richard Liaw
7a2997ea8c
[tune] support experiment checkpointing for grid search ( #13357 )
2021-01-18 19:24:36 -08:00
Ameer Haj Ali
1fbc3ddfac
Add ability to not start Monitor when calling ray start ( #13505 )
2021-01-18 18:31:53 -08:00
Simon Mo
fb16dd5265
Add Dashboard Python Test to Buildkite ( #13530 )
2021-01-18 17:20:45 -08:00