Amog Kamsetty
ae2e1f0883
[Parallel Iterators] Batching + Pipelining optimizations ( #7931 )
* batching + get_shard pipelining
* duplicate fix
* formatting
* adding performance benchmark
* minor changes
* turn batching off by default
2020-05-26 00:37:57 -07:00
Kai Yang
26cffb9c7c
Fix shutdown hook in worker mode ( #8098 )
2020-05-26 15:23:44 +08:00
fyrestone
f39760a4d3
Use uuid4() for actor creation function id hash ( #8589 )
2020-05-26 15:20:03 +08:00
fangfengbin
c41976938d
Add node table subscribe retry when gcs service restart ( #8591 )
2020-05-26 14:42:48 +08:00
Tao Wang
7e5b3dc0d9
GCS server task info handler use storage instead of redis accessor ( #8584 )
2020-05-26 10:38:31 +08:00
Eric Liang
90b05983d6
Lower ASAN build parallelism to avoid OOMs ( #8592 )
* fix it
* Update .travis.yml
2020-05-25 12:20:01 -07:00
fangfengbin
765d470c40
Add gcs object manager ( #8298 )
2020-05-25 17:21:35 +08:00
fangfengbin
f22d12d2fc
fix TestGetUncommittedLineage npe bug ( #8585 )
2020-05-25 15:48:58 +08:00
fangfengbin
229af662c6
Add job table&actor table subscribe retry when gcs service restart ( #8442 )
2020-05-25 14:38:25 +08:00
Edward Oakes
860eb6f13a
Update named actor API ( #8559 )
2020-05-24 20:08:03 -05:00
Tao Wang
92c2e41dfd
[GCS]profile info getting implementation based gcs service ( #8536 )
2020-05-24 22:23:01 +08:00
Luca Cappelletti
822de1b7f7
[Tune] Introduced preliminary random search to BayesOpt ( #8541 )
2020-05-23 12:20:43 -07:00
Jan Blumenkamp
d6f78f58dc
Fix missing learning rate and entropy coeff schedule for torch PPO ( #8572 )
2020-05-23 10:54:18 -07:00
fangfengbin
2ab1b773d4
GCS server worker info handler use storage instead of redis accessor ( #8543 )
2020-05-23 23:17:36 +08:00
Eric Liang
351839bf69
Revert "GCS server task info handler use storage instead of redis accessor ( #8531 )" ( #8562 )
This reverts commit 9823e15311.
2020-05-22 19:16:43 -07:00
Kai Yang
2e5e789294
Allow enabling logging in core worker with empty log_dir ( #8529 )
2020-05-22 18:02:37 +08:00
Sven Mika
8870270164
[RLlib] Add QMIX support for complex obs spaces (Issue 8523). ( #8533 )
2020-05-22 10:17:51 +02:00
fangfengbin
9823e15311
GCS server task info handler use storage instead of redis accessor ( #8531 )
2020-05-22 12:04:03 +08:00
Siyuan (Ryans) Zhuang
83a819572b
Update the pickle5 revision to match the upstream candidate ( #8493 )
2020-05-21 18:21:37 -07:00
Eric Liang
bb8d3c5cd0
ASAN build for ray core tests ( #8431 )
2020-05-21 15:11:03 -07:00
SangBin Cho
aa1cbe8abc
[Dashboard] Ray memory dashboard backend ( #8461 )
2020-05-21 12:22:28 -07:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers ( #8345 )
2020-05-21 10:16:18 -07:00
Hao Chen
d27e6da1b2
Fix a lint issue ( #8530 )
2020-05-21 16:12:44 +08:00
Sven Mika
3a234ed9e3
[RLlib] Error: "Unknown trainable [some rllib algo name]" ( #8525 )
2020-05-21 08:59:32 +02:00
fangfengbin
e261b4778e
Adjust the state initialization sequence and put it after core worker google logging initialization ( #8511 )
2020-05-21 11:30:28 +08:00
Simon Mo
ed2f434593
[Serve] Start Replicas in Parallel ( #8433 )
2020-05-20 19:46:03 -07:00
Edward Oakes
a76434ccde
Add ability to specify worker and driver ports ( #8071 )
2020-05-20 15:31:13 -05:00
Sven Mika
d76578700d
[RLlib] Policy.compute_single_action() broken for nested actions (Issue 8411). ( #8514 )
2020-05-20 22:29:08 +02:00
mehrdadn
ebf060d484
Make more tests run on Windows ( #8446 )
* Remove worker Wait() call due to SIGCHLD being ignored
* Port _pid_alive to Windows
* Show PID as well as TID in glog
* Update TensorFlow version for Python 3.8 on Windows
* Handle missing Pillow on Windows
* Work around dm-tree PermissionError on Windows
* Fix some lint errors on Windows with Python 3.8
* Simplify torch requirements
* Quiet git clean
* Handle finalizer issues
* Exit with the signal number
* Get rid of wget
* Fix some Windows compatibility issues with tests
Co-authored-by: Mehrdad <noreply@github.com>
2020-05-20 12:25:04 -07:00
Eric Liang
aa7a58e92f
[rllib] Support training intensity for dqn / apex ( #8396 )
2020-05-20 11:22:30 -07:00
Ian Rodney
f56b3be916
[Docs] Add Cancelation to main docs. ( #8508 )
* Update walkthrough.rst
* Adding example
* Better example
* Better example
* Adding Ray Kill Info
2020-05-20 10:31:57 -07:00
Lingxuan Zuo
cd706f40c4
[Stats] add nodeaddress tag for stats test ( #8423 )
2020-05-20 12:30:01 -05:00
Luca Cappelletti
c9898eff24
[Tune] Added method to integrate previous analysis in BO ( #8486 )
2020-05-19 23:26:43 -07:00
Bill Chambers
f8f7efc24f
[Serve] Rename RayServe -> "Ray Serve" in Documentation ( #8504 )
2020-05-19 19:13:54 -07:00
Edward Oakes
85cb721f19
[serve] Fix worker replica leak ( #8506 )
2020-05-19 20:51:50 -05:00
Simon Mo
c9c84c87f4
[Serve] Add Instructions for GPU ( #8495 )
2020-05-19 18:33:58 -07:00
Ian Rodney
1163ddbe45
Remove timeouts in test_cancel ( #8272 )
2020-05-19 12:35:16 -05:00
mehrdadn
8da084bc54
Try to address linting issues ( #8485 )
2020-05-19 10:29:17 -05:00
internetcoffeephone
a73c488c74
Change tf_utils.py get_weights to evaluate all tensors at once rather than calling tensor.eval per-tensor. ( #8491 )
2020-05-18 22:06:03 -07:00
Hao Chen
6c5ea32857
Fix installing pickle5-backport for Python 3.8.2 ( #8453 )
2020-05-18 17:03:13 -07:00
Luca Cappelletti
5b330de182
[Tune] Introduced patience to early stopping ( #8484 )
2020-05-18 13:12:16 -07:00
Luca Cappelletti
d1ef70da16
[Tune] Added default values for utility kwargs ( #8488 )
2020-05-18 13:10:43 -07:00
Robert Nishihara
14aeb30473
[Serve] Require traffic weights to sum more closely to 1. ( #8476 )
2020-05-18 11:46:34 -07:00
Max Fitton
0fadc11437
[dashboard] Only show workers from the correct cluster ( #8434 )
2020-05-18 13:30:41 -05:00
Max Fitton
13231ba63b
Rename redis-port to port and add default ( #8406 )
2020-05-18 13:25:34 -05:00
Robert Nishihara
2cff471d2c
Don't print Redis connection warning in ray.init(). ( #8475 )
2020-05-18 11:19:13 -07:00
Richard Liaw
b6c4f45ae0
[tune] Fix links ( #8477 )
2020-05-18 10:08:29 -07:00
Edward Oakes
9a721ed71a
Link to serve in tune overview ( #8487 )
2020-05-18 11:29:38 -05:00
Sven Mika
796a834c48
[RLlib] Attention Net integration into ModelV2 and learning RL example. ( #8371 )
2020-05-18 17:26:40 +02:00
fangfengbin
9347a5d10c
Add global state accessor of jobs ( #8401 )
2020-05-18 20:32:05 +08:00