Commit graph

3463 commits

Author SHA1 Message Date
Siyuan (Ryans) Zhuang
f48293f96d
Fix deprecated warning (#6142) 2019-11-11 17:49:15 -08:00
Simon Mo
c75ada9e04
[Autoscaler][K8s] Enforce memory limit in k8s yaml (#6138)
* Enforce memory limit in k8s yaml

* Update python/ray/autoscaler/kubernetes/example-full.yaml

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* Line wrap
2019-11-11 14:06:34 -08:00
Adi Zimmerman
776b071f3b [tune] Let Search Algorithms use early stopped trials (#5651) 2019-11-11 09:38:14 -08:00
Edward Oakes
5780ec1b62
Refresh ObjectIDs in raylet for stopgap GC (#6109) 2019-11-10 23:12:59 -08:00
Philipp Moritz
decaa65cd6
Use pickle by default for serialization (#5978) 2019-11-10 18:12:18 -08:00
Adam Gleave
01aee8d970 [autoscaler] Retry creating EC2 instances in new AZ (#6129) 2019-11-09 19:44:27 -08:00
Miguel Morales
d17ae5ad7a Update hyperband-cartpole.yaml (#6121)
Typo
2019-11-09 19:39:03 -08:00
Adam Gleave
c157e93ba1 [tune] Retry failed tasks with checkpointing disabled (#6126)
* Allow recovery for failed tasks without checkpointing

* Update docs
2019-11-09 19:35:27 -08:00
Philipp Moritz
ccbcc4bafa
Use gRPC and Bazel 1.0 (#6002) 2019-11-08 15:58:28 -08:00
Eric Liang
afca6d3d87
Object store full with cyclic python references (#6114) 2019-11-08 14:08:24 -08:00
Edward Oakes
83378a8610
Improve flaky test_warning_monitor_died (#6113) 2019-11-08 12:11:15 -08:00
Eric Liang
4044af8520
Try to enable dashboard (again) (#6069)
* Revert "Revert "Enable the Ray dashboard by default (#5976)" (#6068)"

This reverts commit 1a3e97cf23.

* fix tests that assume the dashboard isn't a job

* travis
2019-11-08 10:48:48 -08:00
Philipp Moritz
5a05eaaa54 Fix compilation on master (#6116) 2019-11-07 22:38:42 -08:00
Eric Liang
4a28306186
Allow large returns from direct actor calls (#6088) 2019-11-07 21:28:55 -08:00
Edward Oakes
ca53af4d0f
Add pending task dependencies to ObjectID ref counting (#6054) 2019-11-07 18:37:10 -08:00
Eric Liang
1f043daf69
[rllib] Fix and add test for LR annealing config 2019-11-07 12:17:27 -08:00
Simon Mo
fcb6bdbc39
[Doc] Document Actor.options API (#6099)
* Document Actor.options API

* Undocument _remote
2019-11-06 23:12:23 -08:00
Edward Oakes
9820c10a09 Simplify gRPC service definition for the worker (#6095) 2019-11-06 13:00:39 -08:00
David Bignell
3f83b2daa9 [rllib] Rollout extensions (#6065)
* Rollout improvements

* Make info-saving optional, to avoid breaking change.

* Store generating ray version in checkpoint metadata

* Keep the linter happy

* Add small rollout test

* Terse.

* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang
2a0225dd25
[rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 (#6087) 2019-11-05 11:36:29 -08:00
daiyaanarfeen
8f6d73a93a [sgd] Extend distributed pytorch functionality (#5675)
* raysgd

* apply fn

* double quotes

* removed duplicate TimerStat

* removed duplicate find_free_port

* imports in pytorch_trainer

* init doc

* ray.experimental

* remove resize example

* resnet example

* cifar

* Fix up after kwargs

* data_dir and dataloader_workers args

* formatting

* loss

* init

* update code

* lint

* smoketest

* better_configs

* fix

* fix

* fix

* train_loader

* fixdocs

* ok

* ok

* fix

* fix_update

* fix

* fix

* done

* fix

* fix

* fix

* small

* lint

* fix

* fix

* fix_test

* fix

* validate

* fix

* fi
2019-11-05 11:16:46 -08:00
Mitchell Stern
82be14f943 Move gRPC calls outside of Raylet stats lock (#6090) 2019-11-05 00:47:15 -08:00
mehrdadn
e312f3d282 Compatibility issues (#6071)
* Pass -f - to tar to force stdin on Windows

* Quote paths that may contain spaces (causes issues on Windows)

* Copy over Windows code from Arrow for glog signal handle uninstall

* Add missing COPTS to build rules since we'll need them for Windows compatibility

* Begin adding COPTS for Windows compatibility

* Disable glog on Arrow until we change WIN32 to _WIN32 there

* Missing header files that cause problems on Windows

* WORD typedef conflicts with Windows; remove it

* uint -> unsigned int wherever we're dealing with milliseconds (signed version is already int)

* uint -> unsigned int for enums

* uint -> size_t, wherever we're dealing with sizes or indices into arrays

* Work around Boost 1.68 bug in detecting clang-cl (revert this after upgrading)

* Missing #include <unistd.h>

* Add check for signal handler uninstallation failure

* Linting issue
2019-11-05 00:08:14 -08:00
Philipp Moritz
fefe050a58
Fix running out of file descriptors in the WebUI (#6086) 2019-11-04 21:17:36 -08:00
Edward Oakes
043d1f4094 Return RayObjects to core worker (#6052) 2019-11-04 20:27:57 -08:00
visatish
18241f4a2d [tune] Added resources_per_trial arg to validate_save_restore u… (#6032) 2019-11-04 13:24:46 -08:00
Simon Mo
c23eae5998
[Serve] Fix router-worker communication (#5961)
* Half way there, needs the strict queuing fix

* Fix scale down, use callback

* Cleanup

* Address comments

* Comment, nit

* Fix docstring
2019-11-04 11:29:21 -08:00
Eric Liang
8485304e83
Support concurrent Actor calls in Ray (#6053) 2019-11-04 01:14:35 -08:00
Eric Liang
fbad6f543b Try fixing actor handle destruction on py2 (#6076) 2019-11-03 22:46:40 -08:00
Philipp Moritz
1c5446851a
Use Plasma with LRU refreshing integrated (#6050) 2019-11-03 16:19:05 -08:00
Philipp Moritz
894885593c Fix prometheus-cpp failure (#6073) 2019-11-03 15:05:47 -08:00
Eric Liang
1a3e97cf23
Revert "Enable the Ray dashboard by default (#5976)" (#6068)
This reverts commit 6166ef3e09.
2019-11-01 17:08:37 -07:00
Richard Liaw
e94bebb1de
[tune] Fix Jenkins tests (#6028) 2019-11-01 16:42:04 -07:00
Eric Liang
fb34928a2a
[minor] Perf optimizations for direct actor task submission (#6044)
* merge optimizations

* fix

* fix memory err

* optimize

* fix tests

* fix serialization of method handles

* document weakref

* fix check

* bazel format

* disable on 2
2019-11-01 14:41:14 -07:00
Eric Liang
eef4ad3bba
Report census view data as part of raylet node stats (#6060) 2019-11-01 14:26:09 -07:00
Simon Mo
c8d7065bf3
[CI] Use rerunfailures instead of flaky (#6061)
* Use rerunfailures instead of flaky

* Lint
2019-11-01 13:59:03 -07:00
Eric Liang
6166ef3e09
Enable the Ray dashboard by default (#5976) 2019-11-01 12:19:01 -07:00
Simon Mo
7f5b3502da
Implement Detached Actor (#6036)
* Arg propagation works

* Implement persistent actor

* Add doc

* Initialize is_persistent_

* Rename persistent->detached

* Address comment

* Make tests pass

* Address comment

* Python2 compatibility

* Fix naming, py2

* Lint
2019-11-01 10:28:23 -07:00
Philipp Moritz
f7455839bf
Expose raylet info to dashboard (#6045) 2019-10-31 17:36:59 -07:00
Eric Liang
c86f945520
Support pass by ref args for direct actor calls (#6040) 2019-10-31 16:55:10 -07:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode (#6001) 2019-10-31 15:16:02 -07:00
Edward Oakes
16e9dfd2e1
Exit workers when raylet dies unexpectedly (#6014) 2019-10-30 20:29:25 -07:00
Edward Oakes
e9e78871b9
Remove unused function definition caching (#6042) 2019-10-30 16:41:18 -07:00
Simon Mo
56f3e96887
[Serve] Use ray's cloudpickle (#6051)
* Revert "Add cloudpickle as doc requirements (#6037)"

This reverts commit 03ce3b7c5b.

* Use ray's vendored cloudpickle
2019-10-30 15:21:09 -07:00
Qing Wang
4636fc2b78 Fix java ci (#5964) 2019-10-30 14:50:53 -07:00
Eric Liang
8ebba202df
[minor] Reduce perf overhead of object ref tracking (#6041) 2019-10-29 18:14:51 -07:00
Eric Liang
b89cac976a
Basic direct actor call support in Python (#5991) 2019-10-28 22:09:04 -07:00
Simon Mo
4c4342c165
Bring back pytest-sugar (#6038)
* Add cloudpickle as doc requirements

* Bring back pytest-sugar

* Revert "Add cloudpickle as doc requirements"

This reverts commit 2206e9e62ee20d93638e115f07a3fc933cbad9a3.
2019-10-28 20:24:28 -07:00
Simon Mo
03ce3b7c5b
Add cloudpickle as doc requirements (#6037) 2019-10-28 18:25:02 -07:00
Simon Mo
9e2c5f8218
[Serve] Put global state in remote actor (#5937)
* Making progress

* Impl done, start debugging

* Tests all pass

* Add test, fix

* Update doc

* Fix type
2019-10-28 11:43:47 -07:00