Commit graph

3387 commits

Author SHA1 Message Date
Zhijun Fu
ea9376c9ce Fix flaky core worker tests because of race condition in gcs client subscription (#5735) 2019-09-24 22:47:38 +08:00
Kai Yang
c580955840 [Java] Fix some potential bugs about Ray.shutdown() (#5693) 2019-09-24 10:44:17 +08:00
Ujval Misra
a4659a8f8b [tune] Add support for function-based stopping condition (#5754) 2019-09-23 18:39:00 -07:00
Mitchell Stern
b03147e7bf Update call to py-spy to conform to new API (#5758) 2019-09-23 14:52:23 -07:00
Eric Liang
249ca2cf9e [rllib] add blog posts to examples list (#5762)
* add blog post

* remove

* link
2019-09-23 10:42:21 -07:00
Edward Oakes
61e5d674be Push driver task in core worker (#5752) 2019-09-23 10:53:55 -05:00
Edward Oakes
62bc30c1cf Validate redis address parameters (#5746)
* Validate redis address params

* Fix comment

* Add check
2019-09-23 10:52:34 -05:00
Mitchell Stern
98dcc1d440 [Dashboard] Add initial version of new dashboard (#5730) 2019-09-23 08:50:40 -07:00
Eric Liang
56ab9a00bb [autoscaler] cache stopped nodes, no screen on attach (#5741) 2019-09-22 17:30:35 -07:00
Philipp Moritz
5f5873b182 [Projects] Start multiple sessions via session start (#5740) 2019-09-22 01:36:23 -07:00
Robert Nishihara
1cfadf032e Properly test Python wheels in Travis. (#5749) 2019-09-21 18:03:10 -07:00
Richard Liaw
e00071721a [tune] tf2.0 testing and supporting callables (#5738) 2019-09-21 17:01:14 -07:00
Robert Nishihara
c91a37f622 Set redis password in slurm deployment documentation. (#5747) 2019-09-21 15:33:15 -07:00
Hersh Godse
d17b35494d [tune] Save/Restore for Suggestion Algs (#5719) 2019-09-21 11:11:57 -07:00
Vince Jankovics
7e214fd95e [tune] TensorBoard HParams for TF2.0 (#5678) 2019-09-21 11:06:34 -07:00
Kilian Batzner
79b9c70ad6 Add local_tf_session_args to unknown subkeys whitelist (#5742)
* Add local_tf_session_args to unknown subkeys whitelist

* Remove trailing whitespace
2019-09-20 10:32:49 -07:00
Eric Liang
fb3b232c0e [rllib] Properly flatten 2-d observations as input to FCnet (#5733) 2019-09-19 12:10:31 -07:00
Matthew A. Wright
3131e1742d [rllib] Qmix off by 1 in double Q calculation (#5731)
* Qmix fix.

-Current version of double Q learning is incorrect; it selects actions
at timestep t instead of t+1 when computing the t+1 Q value.

* Allow extra obs dict keys

* Move Q-value-computing replay code to own function

* Run the autoformatter

* use better terms in comments ("policy" network instead of "live" network)
2019-09-18 18:12:30 -07:00
gehring
8903bcd0c3 [rllib] Tracing for eager tensorflow policies with tf.function (#5705)
* Added tracing of eager policies with `tf.function`

* lint

* add config option

* add docs

* wip

* tracing now works with a3c

* typo

* none

* file doc

* returns

* syntax error

* syntax error
2019-09-17 01:44:20 -07:00
Robert Nishihara
f74aaf2619 Add more links for getting involved.git status (#5708) 2019-09-16 20:26:03 -07:00
Philipp Moritz
a6dd794818 [Projects] Fix template path (#5716) 2019-09-16 19:58:54 -07:00
Philipp Moritz
b1aadd863b Fix project templates in wheel (#5714) 2019-09-16 15:21:59 -07:00
Philipp Moritz
e4e1a57ca5 [Projects] Allow named sessions (#5706) 2019-09-16 13:00:46 -07:00
Philipp Moritz
f4deecb5ab Fix travis error in direct_actor_transport.cc (#5710) 2019-09-15 22:19:20 -07:00
Eric Liang
4bf7de084d Speed up TaskSpecification copy (#5709) 2019-09-15 19:57:34 -07:00
Richard Liaw
2b2eb4debb [tune] Checkpoint and Sync at end (#5699) 2019-09-15 15:58:58 -07:00
Robert Nishihara
baac370099 Deprecate old global state API. (#5484)
* Deprecate old global state API.

* Remove unnecessary returns.
2019-09-15 09:13:15 -07:00
Eric Liang
09968a3c55 [revert] Disable monitor error logging to stdout #5692 2019-09-14 22:32:48 -07:00
Eric Liang
4979b8c4d9 Ordered execution of tasks per actor handle (#5664) 2019-09-14 22:31:33 -07:00
Edward Oakes
a8888c5ff4 [flaky test] Fix test_calling_start_ray_head (#5644) 2019-09-14 22:27:45 -07:00
Robert Nishihara
74a34b736d Call ray.put in ray.init() to speed up first object store access. (#5685) 2019-09-14 21:27:32 -07:00
Simon Mo
1560ace65a Use set comprehensions (#5707) 2019-09-14 15:44:25 -07:00
Edward Oakes
a5d7de6aaf [core worker] Python core worker normal task submission (#5566) 2019-09-14 13:02:53 -07:00
Simon Mo
5f88823c49 [Serve] Rewrite Ray.Serve From Scratch (#5562)
* Commit and format files

* address stylistic concerns

* Replace "Usage" by "Example" in doc

* Rename srv to serve

* Add serve to CI process; Fix 3.5 compat

* Improve determine_tests_to_run.py

* Quick cosmetic for determine_tests

* Address comments

* Address comments

* Address comment

* Fix typos and grammar

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update python/ray/experimental/serve/global_state.py

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Use __init__ for Query and WorkIntent class

* Remove dataclasses dependency

* Rename oid to object_id for clarity

* Rename produce->enqueue_request, consume->dequeue_request

* Address last round of comment
2019-09-13 21:36:56 -07:00
Si-Yuan
4c964c0941 Initial implementation for pickle5 support (#5611) 2019-09-13 17:54:14 -07:00
Simon Mo
fc9f03cd96 Fix queue actor init in setup_queue_actor fixture (#5676) 2019-09-13 12:35:44 -07:00
Eric Liang
3ed18d0b59 Fix edge case in autoscaler with poor bin packing (#5702)
* fix edge case

* fix for general case
2019-09-13 11:46:10 -07:00
Stephanie Wang
1d4a11a433 Only use git repo if .git exists (#5701) 2019-09-13 11:34:34 -07:00
Edward Oakes
07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Kai Yang
1b880191b0 Replace NotImplementedException with UnsupportedOperationException (#5694) 2019-09-12 00:40:26 -07:00
Edward Oakes
ee5db5b67f Raise error if space in redis password (#5673) 2019-09-11 20:58:39 -07:00
Edward Oakes
0bf79cfbde Properly short circuit core worker Get() on exception (#5672) 2019-09-11 18:38:14 -07:00
Ashwinee Panda
946ebfaa3c [rllib] Validate that entropy coeff is not an integer (#5687)
* Validate that entropy coeff is not an integer

Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float.

* Cast to float instead

Also move this check after the negative value check
2019-09-11 14:35:42 -07:00
Eric Liang
faeaa34bdd Deflake cluster heartbeat test (#5552) 2019-09-11 12:26:04 -07:00
Eric Liang
bc6a95deb0 [rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent (#5683) 2019-09-11 12:15:34 -07:00
Eric Liang
2fdefe19b7 Take into account queue length in autoscaling (#5684) 2019-09-11 11:31:35 -07:00
Philipp Moritz
9ce6dd9b88 [Projects] Add "session execute" (#5681) 2019-09-11 00:50:05 -07:00
Hersh Godse
336aef1774 [tune] Save and Restore for bayesopt (#5623) 2019-09-10 13:11:59 -07:00
Robert Nishihara
4d16677a68 Fix PyPI version in readme. (#5662) 2019-09-09 19:54:57 -07:00
Simon Mo
147e7d46ec [Flaky tests] Fix test fork (#5671)
* Start testing test_fork

Maybe queue actor takes too long to initialize, that's why we are
seeing "Many python processes started" since most of the python
tasks are blocked on ray.get

* Add a comment
2019-09-09 19:21:20 -07:00