Commit graph

8768 commits

Author SHA1 Message Date
Chen Shen
645d8fcaf0
[logging][rfc] add RAY_LOG_EVERY_N and RAY_LOG_EVERY_MS (#17018)
* introduce log-every-n

* add n

* linter

* add license
2021-07-13 19:14:28 -07:00
fyrestone
f1faa79a04
[Dashboard][event] Basic event module (#16985)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

* Make monitor events task cancellable

* Fix error

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-13 19:08:39 -07:00
Sven Mika
ce6dfc9b2d
[RLlib] Update tf1.x vs tf2.x documentation and eager example script. (#17030)
2021-07-13 20:02:17 -04:00
SangBin Cho
63ebfe2f2d
Revert back to ray.init (#17047)
2021-07-13 14:36:27 -07:00
Philipp Moritz
ac912f0ce1
Allow using breakpoint() to drop into Ray debugger (#17025)
* Set PYTHONBREAKPOINT

* update tests

* update

* update docs

* fix docs

* skip ray functions

* ok

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

* breakpoint() is only working in Python > 3.6

* add note

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-13 13:52:17 -07:00
Grzegorz Bartyzel
d553d4da6c
[RLlib] DQN (Rainbow): Fix torch noisy layer support and loss (#16716)
2021-07-13 16:48:06 -04:00
Sven Mika
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014)
2021-07-13 14:01:30 -04:00
Antoine Galataud
16f1011c07
[RLlib] Issue 15910: APEX current learning rate not updated on local worker (#15911)
2021-07-13 14:01:00 -04:00
Ian Rodney
fac6045c87
[GCP] Allow Head Node to Launch Workers with IAM Role (#17027)
2021-07-13 10:44:34 -07:00
Amog Kamsetty
38b5b6d24c
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036)
This reverts commit e4123fff27.
2021-07-13 09:57:15 -07:00
Kai Fricke
27d80c4c88
[RLlib] ONNX export for tensorflow (1.x) and torch (#16805)
2021-07-13 12:38:11 -04:00
Kai Fricke
3380b68b54
[RLlib] Issue 16683: Fix last infos dict (#16999).
2021-07-13 11:33:48 -04:00
Edward Oakes
f7759fa484
[core] Add ray.util.list_actors() API (#16642)
2021-07-13 10:00:28 -05:00
Sven Mika
e4123fff27
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)
2021-07-13 06:38:14 -04:00
Tao Wang
90187433b1
[Java] Remove redis dependency(jedis) in java lang layer (#17029)
2021-07-13 17:34:10 +08:00
Ian Rodney
9cb80fcf17
[Client][Proxy] Handle Non-Default Redis Password (#16885)
2021-07-12 23:57:51 -07:00
Tao Wang
5b7e76770d
[Java] Use gcs client instead of redis client to get session dir (#16773)
* Use gcs client instead of redis client to get session dir

* fix compile and add comments

* fix compile

* lint

* fix

* lint

* lint

* Update src/ray/gcs/gcs_client/global_state_accessor.h

Co-authored-by: Qing Wang <kingchin1218@126.com>

* Update java/runtime/src/main/java/io/ray/runtime/RayNativeRuntime.java

Co-authored-by: Qing Wang <kingchin1218@126.com>

* per comment

Co-authored-by: Qing Wang <kingchin1218@126.com>
2021-07-13 14:01:22 +08:00
Eric Liang
e7350ff828
Fix flaky test_plasma_unlimited::test_fallback_allocation_failure (#17016)
* fix

* fix catch
2021-07-12 20:17:23 -07:00
chenk008
f7bcfc5324
[Core] add scheduler_cpu_share_enabled (#16920)
2021-07-12 20:04:32 -07:00
Eric Liang
7a1e8fdb8b
Cleanup info logs in raylet (#17015)
2021-07-12 19:43:44 -07:00
Siyuan (Ryans) Zhuang
a8b57c78d6
[Workflow] Workflow management - Part II (#16907)
2021-07-12 17:31:23 -07:00
Eric Liang
fa0ff057d6
Add a new autoscaling shuffle test (#16948)
2021-07-12 16:54:38 -07:00
Qing Wang
4bde71ca86
[Java][Core] Support get current actor handle. (#14900)
2021-07-12 15:27:54 -07:00
Ian Rodney
8c8b7770cb
[Docs] Add Ray Client Server Port to Docs (#17003)
2021-07-12 14:05:57 -07:00
Amog Kamsetty
a14342ce6f
Revert "[Dashboard][event] Basic event module (#16698)" (#17004)
This reverts commit 66ea099897.
2021-07-12 11:22:46 -07:00
Amog Kamsetty
df3dd81348
[rllib] skip highly flaky tests (#17010)
2021-07-12 11:18:28 -07:00
Amog Kamsetty
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action." (#17002)
This reverts commit 7862dd64ea.
2021-07-12 11:09:14 -07:00
corentinmarek
24e00fcb1b
Add initialization for transport params for non s3 storage (#16054)
2021-07-12 10:47:49 -07:00
Edward Oakes
87e6f99b9c
[serve] Bump timeouts on test_deploy and re-enable (#16969)
2021-07-12 11:38:02 -05:00
Wansoo Kim
c9e8c12f8c
[Refactor] Minor Refactoring and Typing (#16964)
2021-07-12 15:37:07 +01:00
Kai Fricke
fce8fa2668
[tune] use bayesopt for quick start example (which actually converges) (#16997)
2021-07-12 14:50:32 +01:00
Tao Wang
eed0ffc6ff
[Core]Align storage of session_dir in java/python so it can be accessed u… (#16958)
* Align storage of session_dir in java/python so they can be accessed using internal kv manager

* align cpp
2021-07-12 17:42:13 +08:00
qicosmos
298d2afc35
[Ray Log] remove glog dependency (#16077)
2021-07-12 17:06:52 +08:00
gurunath
e3966f59e3
[tune] explicitly raising tune import Error “[tune]” (#16575)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-11 23:40:10 -07:00
Antoni Baum
0935ec30d0
[tune] Add information about environment variables to tune.run docstring (#16980)
2021-07-11 17:20:17 -07:00
Sven Mika
55a90e670a
[RLlib] Trainer.add_policy() not working for tf, if added policy is trained afterwards. (#16927)
2021-07-11 23:41:38 +02:00
Chen Shen
667f53a0a2
add stress test (#16977)
2021-07-11 09:59:41 -07:00
Alex Wu
b08795582b
Disable runtime envs in scalability envelope (#16978)
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-07-11 09:53:15 -07:00
Julius Frost
a88b217d3f
[rllib] Enhancements to Input API for customizing offline datasets (#16957)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-10 15:05:25 -07:00
Francesco Stranieri
01c533c171
[rlib] Independent bound for each dimension AssertionError #16845 (#16860)
* Fix AssertionError for Box space type

Restored support for Box space type with independent bound for each dimension.

* Removed unnecessary assertion for Box space type
2021-07-10 14:48:35 -07:00
Scott Graham
3334357c58
[autoscaler] [azure] Fix Azure Autoscaling Failures (#16640)
Co-authored-by: Scott Graham <scgraham@microsoft.com>
2021-07-10 11:55:00 -07:00
SangBin Cho
33e319e9d7
[Tests] Remove app level error from nightly tests (#16968)
* Completed

* Fix tests

* increase the node wait timeout

Signed-off-by: SangBin Cho <rkooo567@gmail.com>
2021-07-09 12:20:42 -07:00
fyrestone
66ea099897
[Dashboard][event] Basic event module (#16698)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-09 10:25:30 -07:00
Nikita Vemuri
6d36d7ed7e
[Serve] Call FastAPIWrapper class constructor before startup hooks (#16941)
* run constructor before startup hooks

* address comments

Co-authored-by: Nikita Vemuri <nikitavemuri@nikitas-mbp.attlocal.net>
2021-07-09 09:39:32 -07:00
Dmitri Gekhtman
27a9ae5e13
[autoscaler][gcp] Retry GCP BrokenPipeError (#16952)
2021-07-08 13:54:29 -07:00
Maxim Egorushkin
9cb5c9a422
Never convert trial_id to float when loading progress.csv. (#16959)
* Never convert trial_id to float when loading progress.csv.

* Formatting updated.

Co-authored-by: Maxim Egorushkin <maxim.egorushkin@gmail.com>
2021-07-08 11:06:11 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. (#16774)
2021-07-08 17:31:34 +02:00
Sven Mika
9f6a92163b
[RLlib] Remove old UsageTrackingDict code. (#16867)
2021-07-08 17:27:52 +02:00
SongGuyang
560fd15568
[C++ worker] support build and add C++ worker to python wheel (#16496)
2021-07-08 14:42:26 +08:00
matthewdeng
264e2df7e2
[release] update modin_xgboost_test to use anyscale connect (#16942)
2021-07-07 22:37:41 -07:00