3664 commits

Author SHA1 Message Date
Eric Liang
1a1324d2a2 Bump version from 0.8.0.dev6 -> 0.9.0.dev (#6508) 2019-12-16 23:57:42 -08:00
mehrdadn
9948a3779d Simplify patches and make them more robust (#6478)
* Get rid of 'index' lines in patches, which are unnecessary and likely wrong anyway (esp. when there are multiple patches)

* Simplify patches to remove unnecessary context and make them more robust
2019-12-16 19:28:06 -08:00
Edward Oakes
38b43fb3ca Optimize O(n^2) behavior in dependency resolver (#6509)
* Optimize O(n^2) behavior in dependency resolver

* fix check

* checks
2019-12-16 18:41:02 -08:00
Mitchell Stern
6cb34b699e Expose extra node info from raylet stats (#6511) 2019-12-16 18:22:37 -08:00
Yunzhi Zhang
ce1c9a87a7 Expand dashboard by default (#6505) 2019-12-16 17:17:29 -08:00
mehrdadn
74b2e871b7 Tentative workaround for some forks and signals on Windows (#6362)
* Platform shims for Windows

* Tentative workaround for some forks and signals on Windows

* Rewrite WorkerPool::StartProcess by moving spawnvp wrapper to a separate function

* Separate spawnvp the wrappers for POSIX and Windows

* Fix rv use
2019-12-16 16:57:49 -08:00
Edward Oakes
8636d67b72 Improve release docs and add results from 0.7.7 (#6506)
* Improve docs, add logs

* add logs

* microbenchmark

* lint
2019-12-16 15:51:39 -08:00
Mitchell Stern
b7d23405fe [Dashboard] Change default port from 8080 to 8265 (#6503)
* [Dashboard] Change default port from 8080 to 8265

* Revise order of imports in pip install setup command
2019-12-16 14:25:23 -08:00
Edward Oakes
b1e83d83d1 Print summaries for stress tests (#6498) 2019-12-16 14:14:48 -08:00
Mitchell Stern
1531c21dbd [Dashboard] Add remaining features from old dashboard (#6489)
* [Dashboard] Add remaining features from old dashboard

* Fix linting errors

* Set cluster uptime statistic to N/A

* Use proper singular or plural words for workers column

* Ignore .js, .jsx, .ts, .tsx files in check-git-clang-format-output.sh

* Fix bash quote issue
2019-12-16 11:21:18 -08:00
Kai Yang
b7d5c8f220 [Java] Fix multiple FunctionManagers creating multiple ClassLoaders (#6434) 2019-12-16 14:04:44 +08:00
Ujval Misra
e38b25edfb Fix duplicate progress output. (#6497) 2019-12-15 21:53:24 -08:00
Richard Liaw
5719a05757 [sgd] Add support for multi-model multi-optimizer training (#6317) 2019-12-15 15:19:45 -08:00
Kai Yang
c2499c802f disable actor checkpointing and reconstruction test in direct call mode (#6490) 2019-12-15 17:54:31 +08:00
Kai Yang
c3ef8581d2 [Java] fix UT segmentation fault on exit (#6455)
* fix segmentation fault in Java test

* update comments

* address comments
2019-12-15 17:52:34 +08:00
Kai Yang
cd250ba0bc [Java] ID length fix (#6454) 2019-12-15 16:01:05 +08:00
Philipp Moritz
f5d10eea0b [Projects] Refactor cluster specification (#6488) 2019-12-14 22:43:06 -08:00
Kai Yang
9cc0ecc6ff Fix duplicated logging if log dir is not set (#6342) 2019-12-15 13:29:36 +08:00
ZhuSenlin
6c0531683f Add gcs server as well as the unit test (#6401) 2019-12-15 13:23:42 +08:00
Philipp Moritz
afae8406da Make sure numpy >= 1.16.0 is installed for fast pickling support (#6486)
* Make sure numpy >= 1.16.0 is installed

* Works for 1.15.4

* lint

* formatting

* update

* put check into the right place

* lint
2019-12-14 16:36:49 -08:00
Tim Gates
ac8f8143e7 Fix simple typo: verion -> version (#6485)
Closes #6484
2019-12-14 15:37:55 -08:00
Edward Oakes
e2b7459bfc Fix worker exit cleanup (#6450)
* working but ugly

* comments

* proper but hanging in grpc server destructor

* grpc server shutdown deadline

* fix disconnect

* lint

* shutdown_only in test

* replace shutdown
2019-12-13 16:52:50 -08:00
Eugene Vinitsky
3cb499632e (Bug Fix): Remove the extra 0.5 in the Diagonal Gaussian entropy (#6475) 2019-12-13 14:42:30 -08:00
Yuhao Yang
ad4da17899 [Tune] Add example and tutorial for DCGAN (#6400) 2019-12-13 14:15:44 -08:00
Eric Liang
be5dd8eb5e Enable direct calls by default (#6367)
* wip

* add

* timeout fix

* const ref

* comments

* fix

* fix

* Move actor state into actor handle

* comments 2

* enable by default

* temp reorder

* some fixes

* add debug code

* tmp

* fix

* wip

* remove dbg

* fix compile

* fix

* fix check

* remove non direct tests

* Increment ref count before resolving value

* rename

* fix another bug

* tmp

* tmp

* Fix object pinning

* build change

* lint

* ActorManager

* tmp

* ActorManager

* fix test component failures

* Remove old code

* Remove unused

* fix

* fix

* fix resources

* fix advanced

* eric's diff

* blacklist

* blacklist

* cleanup

* annotate

* disable tests for now

* remove

* fix

* fix

* clean up verbosity

* fix test

* fix concurrency test

* Update .travis.yml

* Update .travis.yml

* Update .travis.yml

* split up analysis suite

* split up trial runner suite

* fix detached direct actors

* fix

* split up advanced tesT

* lint

* fix core worker test hang

* fix bad check fail which breaks test_cluster.py in tune

* fix some minor diffs in test_cluster

* less workers

* make less stressful

* split up test

* retry flaky tests

* remove old test flags

* fixes

* lint

* Update worker_pool.cc

* fix race

* fix

* fix bugs in node failure handling

* fix race condition

* fix bugs in node failure handling

* fix race condition

* nits

* fix test

* disable heartbeatS

* disable heartbeatS

* fix

* fix

* use worker id

* fix max fail

* debug exit

* fix merge, and apply [PATCH] fix concurrency test

* [patch] fix core worker test hang

* remove NotifyActorCreation, and return worker on completion of actor creation task

* remove actor diied callback

* Update core_worker.cc

* lint

* use task manager

* fix merge

* fix deadlock

* wip

* merge conflits

* fix

* better sysexit handling

* better sysexit handling

* better sysexit handling

* check id

* better debug

* task failed msg

* task failed msg

* retry failed tasks with delay

* retry failed tasks with delay

* clip deps

* fix

* fix core worker tests

* fix task manager test

* fix all tests

* cleanup

* set to 0 for direct tests

* dont check worker id for ownership rpc

* dont check worker id for ownership rpc

* debug messages

* add comment

* remove debug statements

* nit

* check worker id

* fix test

* owner

* fix tests
2019-12-13 13:58:04 -08:00
Richard Liaw
3754effafc Make setup-dev.py more resilient (#6467)
* fix_tests

* link_tests
2019-12-13 11:32:04 -08:00
Richard Liaw
4ff6ca89f4 [docs] slight doc modifications (#6466) 2019-12-13 10:38:17 -08:00
Eric Liang
335dade1e6 Check worker id for all core worker RPCs (#6472)
* check worker id

* fix test

* owner

* fix tests

* comments
2019-12-13 10:15:56 -08:00
Eric Liang
eb6f3f86e5 Seed using multiple samples (#6471) 2019-12-12 21:41:19 -08:00
Stephanie Wang
c57dcc82d1 Port actor creation to use direct calls (#6375) 2019-12-12 19:50:51 -08:00
Philipp Moritz
74b454c614 Fix overriding of params dictionary (#6445) 2019-12-12 19:15:13 -08:00
Eric Liang
5a5c94939f [direct call] Retry failed tasks with delay (#6453)
* retry failed tasks with delay

* set to 0 for direct tests
2019-12-12 17:12:38 -08:00
Zack Polizzi
9e9c524823 Update pong-apex tuned example (#6462) 2019-12-12 10:57:55 -08:00
Kai Yang
3adbe29450 fix core worker test hanging due to heartbeat is not working (#6416) 2019-12-12 18:16:28 +08:00
micafan
8c1520d18e [GCS] refactor the GCS Client Job Interface (#5503) 2019-12-12 16:57:32 +08:00
wanxing
40211bed4b [Streaming]Fix default JobID (#6436) 2019-12-12 14:37:17 +08:00
wanxing
64d8626d6d Optimize ray::LocalMemoryBuffer performance (#6384) 2019-12-11 21:49:52 -08:00
Edward Oakes
032e8553c7 use numpy in long-running tests (#6448) 2019-12-11 17:53:30 -08:00
alindkhare
76e678d775 [Serve] Added deadline awareness (#6442)
* [Serve] Added deadline awareness

Added deadline awareness while enqueuing a query
Using Blist sorted-list implementation (ascending order) to get queries according to their specified deadlines. [buffer_queues]
Exposed slo_ms via handle/http request
Added slo example 
The queries in example will be executed in almost the opposite order of which they are fired
Added slo pytest
Added check for slo_ms to not be negative
Included the changes suggested

* Linting Corrections

* Adding the code changes suggested by format.sh

* Added the suggested changes

Added justification for blist
Added blist in travis/ci/install-dependencies.sh

* Fixed linting issues

* Added blist to ray/doc/requirements-doc.txt
2019-12-11 16:41:54 -08:00
Maltimore
0ec613c95a [rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj (#6438) 2019-12-11 15:00:53 -08:00
Robert Nishihara
240e8f5279 Fix error message when failing to start UI if grpcio not installed. (#6433) 2019-12-11 14:56:13 -08:00
Eric Liang
58ac8639b9 Fix bad checks and race condition from actor_deaths and node_failures tests (#6411) 2019-12-11 14:47:24 -08:00
Eric Liang
b3eb374817 [tune] Really disable retries by default 2019-12-11 13:12:28 -08:00
Edward Oakes
82f7dbc7a7 Increase TaskID size by 2 bytes, taken from JobID (#6425)
* Increase TaskID size by 2 bytes, taken from JobID

* comments

* check max job id

* fix doc

* fix local mode
2019-12-11 10:45:14 -08:00
Dean Wampler
abb4fb3f8e Added small section on installation when using Anaconda. (#6427) 2019-12-11 10:23:41 -08:00
Kai Yang
a131082767 fix startup worker process count for multi-threading (#6382) 2019-12-11 20:19:49 +08:00
Yuhao Yang
3db8faab0d [tune] fix log dir race condition (#6420) 2019-12-10 21:00:19 -08:00
Simon Mo
c61db84b8d Bump dev6->dev7 for two files not changed yet. (#6428) 2019-12-10 20:58:14 -08:00
Hao Chen
5cc3e1341a [Java] Cache result in RayObjectImpl (#6414) 2019-12-11 11:26:01 +08:00
Edward Oakes
044527adb8 Remove ref counting dependencies on ray.get() (#6412)
* Remove ref counting dependencies on Get()

* comment

* don't send IDs when disabled

* pass through internal config

* fix

* allow reinit

* remove flag
2019-12-10 18:11:34 -08:00