Commit graph

3877 commits

Author SHA1 Message Date
Kai Yang
978d72be0a Disable port sharing in grpc server (#6479) 2019-12-18 14:48:54 +08:00
Kai Yang
7e6defca72 fix ClassLoaderTest.java (#6502) 2019-12-18 14:48:25 +08:00
Simon Mo
e530c37b0e Use localhost and set redis password by default (#6481) 2019-12-17 19:41:19 -08:00
Philipp Moritz
bbe5d83eb8 Windows CI for Ray on github actions (#6519)
* Windows CI for Ray

* Update main.yml

* Does this work?

* Download Bazel to current directory

* Install required packages via Chocolatey

* Improve the Windows CI script
2019-12-17 17:33:33 -08:00
Eric Liang
6725a61bda Release 0.8.0 test logs (#6512) 2019-12-17 15:56:50 -08:00
mehrdadn
ab5a9f0946 Patch hiredis for Windows (#6446) 2019-12-17 15:32:47 -08:00
Eric Liang
2530eb90dc Move tf.test.is_gpu_available() to after session init (#6515)
* move to after session init

* script fixes
2019-12-17 14:55:39 -08:00
Philipp Moritz
4d71ab83cf require packaging (#6517) 2019-12-17 12:01:14 -08:00
Ujval Misra
81197e47c7 [tune] Refactor syncer (#6496)
* Refactor syncer and log_sync.

* Fix documentation.

* Remove delete from api

* Rename to get_node_syncer
2019-12-17 05:25:16 -08:00
mehrdadn
7a24144bfd Polish Bazel build scripts (#6424)
* Polish Bazel build scripts

* Remove glog references from streaming_logging.cc

* Move out COPTS and reference them

* Disable streaming on Windows

* Remove -fno-gnu-unique
2019-12-17 02:38:36 -08:00
mehrdadn
9d6e03aba0 Fix use of select() instead of poll() (#6477)
* Fix Arrow poll() patch

- Negative timeout for poll() was not translated to infinite timeout for select()
- Only use select() on Windows, as other systems limit the range of the file descriptors

* Apply poll() -> select() patch to Redis's ae.c as well
2019-12-17 02:33:37 -08:00
Yunzhi Zhang
166560e428 [Dashboard] displays resources row (#6516) 2019-12-17 01:05:57 -08:00
Simon Mo
840d9c126f Move travis build script to after the deploy stage (#6518)
* move travis build script to after the deploy stage

* Add skip cleanup
2019-12-17 00:03:05 -08:00
Eric Liang
1a1324d2a2 Bump version from 0.8.0.dev6 -> 0.9.0.dev (#6508) 2019-12-16 23:57:42 -08:00
mehrdadn
9948a3779d Simplify patches and make them more robust (#6478)
* Get rid of 'index' lines in patches, which are unnecessary and likely wrong anyway (esp. when there are multiple patches)

* Simplify patches to remove unnecessary context and make them more robust
2019-12-16 19:28:06 -08:00
Edward Oakes
38b43fb3ca Optimize O(n^2) behavior in dependency resolver (#6509)
* Optimize O(n^2) behavior in dependency resolver

* fix check

* checks
2019-12-16 18:41:02 -08:00
Mitchell Stern
6cb34b699e Expose extra node info from raylet stats (#6511) 2019-12-16 18:22:37 -08:00
Yunzhi Zhang
ce1c9a87a7 Expand dashboard by default (#6505) 2019-12-16 17:17:29 -08:00
mehrdadn
74b2e871b7 Tentative workaround for some forks and signals on Windows (#6362)
* Platform shims for Windows

* Tentative workaround for some forks and signals on Windows

* Rewrite WorkerPool::StartProcess by moving spawnvp wrapper to a separate function

* Separate the spawnvp wrappers for POSIX and Windows

* Fix rv use
2019-12-16 16:57:49 -08:00
Edward Oakes
8636d67b72 Improve release docs and add results from 0.7.7 (#6506)
* Improve docs, add logs

* add logs

* microbenchmark

* lint
2019-12-16 15:51:39 -08:00
Mitchell Stern
b7d23405fe [Dashboard] Change default port from 8080 to 8265 (#6503)
* [Dashboard] Change default port from 8080 to 8265

* Revise order of imports in pip install setup command
2019-12-16 14:25:23 -08:00
Edward Oakes
b1e83d83d1 Print summaries for stress tests (#6498) 2019-12-16 14:14:48 -08:00
Mitchell Stern
1531c21dbd [Dashboard] Add remaining features from old dashboard (#6489)
* [Dashboard] Add remaining features from old dashboard

* Fix linting errors

* Set cluster uptime statistic to N/A

* Use proper singular or plural words for workers column

* Ignore .js, .jsx, .ts, .tsx files in check-git-clang-format-output.sh

* Fix bash quote issue
2019-12-16 11:21:18 -08:00
Kai Yang
b7d5c8f220 [Java] Fix multiple FunctionManagers creating multiple ClassLoaders (#6434) 2019-12-16 14:04:44 +08:00
Ujval Misra
e38b25edfb Fix duplicate progress output. (#6497) 2019-12-15 21:53:24 -08:00
Richard Liaw
5719a05757 [sgd] Add support for multi-model multi-optimizer training (#6317) 2019-12-15 15:19:45 -08:00
Kai Yang
c2499c802f disable actor checkpointing and reconstruction test in direct call mode (#6490) 2019-12-15 17:54:31 +08:00
Kai Yang
c3ef8581d2 [Java] fix UT segmentation fault on exit (#6455)
* fix segmentation fault in Java test

* update comments

* address comments
2019-12-15 17:52:34 +08:00
Kai Yang
cd250ba0bc [Java] ID length fix (#6454) 2019-12-15 16:01:05 +08:00
Philipp Moritz
f5d10eea0b [Projects] Refactor cluster specification (#6488) 2019-12-14 22:43:06 -08:00
Kai Yang
9cc0ecc6ff Fix duplicated logging if log dir is not set (#6342) 2019-12-15 13:29:36 +08:00
ZhuSenlin
6c0531683f Add gcs server as well as the unit test (#6401) 2019-12-15 13:23:42 +08:00
Philipp Moritz
afae8406da Make sure numpy >= 1.16.0 is installed for fast pickling support (#6486)
* Make sure numpy >= 1.16.0 is installed

* Works for 1.15.4

* lint

* formatting

* update

* put check into the right place

* lint
2019-12-14 16:36:49 -08:00
Tim Gates
ac8f8143e7 Fix simple typo: verion -> version (#6485)
Closes #6484
2019-12-14 15:37:55 -08:00
Edward Oakes
e2b7459bfc Fix worker exit cleanup (#6450)
* working but ugly

* comments

* proper but hanging in grpc server destructor

* grpc server shutdown deadline

* fix disconnect

* lint

* shutdown_only in test

* replace shutdown
2019-12-13 16:52:50 -08:00
Eugene Vinitsky
3cb499632e (Bug Fix): Remove the extra 0.5 in the Diagonal Gaussian entropy (#6475) 2019-12-13 14:42:30 -08:00
Yuhao Yang
ad4da17899 [Tune] Add example and tutorial for DCGAN (#6400) 2019-12-13 14:15:44 -08:00
Eric Liang
be5dd8eb5e Enable direct calls by default (#6367)
* wip

* add

* timeout fix

* const ref

* comments

* fix

* fix

* Move actor state into actor handle

* comments 2

* enable by default

* temp reorder

* some fixes

* add debug code

* tmp

* fix

* wip

* remove dbg

* fix compile

* fix

* fix check

* remove non direct tests

* Increment ref count before resolving value

* rename

* fix another bug

* tmp

* tmp

* Fix object pinning

* build change

* lint

* ActorManager

* tmp

* ActorManager

* fix test component failures

* Remove old code

* Remove unused

* fix

* fix

* fix resources

* fix advanced

* eric's diff

* blacklist

* blacklist

* cleanup

* annotate

* disable tests for now

* remove

* fix

* fix

* clean up verbosity

* fix test

* fix concurrency test

* Update .travis.yml

* Update .travis.yml

* Update .travis.yml

* split up analysis suite

* split up trial runner suite

* fix detached direct actors

* fix

* split up advanced test

* lint

* fix core worker test hang

* fix bad check fail which breaks test_cluster.py in tune

* fix some minor diffs in test_cluster

* less workers

* make less stressful

* split up test

* retry flaky tests

* remove old test flags

* fixes

* lint

* Update worker_pool.cc

* fix race

* fix

* fix bugs in node failure handling

* fix race condition

* fix bugs in node failure handling

* fix race condition

* nits

* fix test

* disable heartbeatS

* disable heartbeatS

* fix

* fix

* use worker id

* fix max fail

* debug exit

* fix merge, and apply [PATCH] fix concurrency test

* [patch] fix core worker test hang

* remove NotifyActorCreation, and return worker on completion of actor creation task

* remove actor died callback

* Update core_worker.cc

* lint

* use task manager

* fix merge

* fix deadlock

* wip

* merge conflicts

* fix

* better sysexit handling

* better sysexit handling

* better sysexit handling

* check id

* better debug

* task failed msg

* task failed msg

* retry failed tasks with delay

* retry failed tasks with delay

* clip deps

* fix

* fix core worker tests

* fix task manager test

* fix all tests

* cleanup

* set to 0 for direct tests

* dont check worker id for ownership rpc

* dont check worker id for ownership rpc

* debug messages

* add comment

* remove debug statements

* nit

* check worker id

* fix test

* owner

* fix tests
2019-12-13 13:58:04 -08:00
Richard Liaw
3754effafc Make setup-dev.py more resilient (#6467)
* fix_tests

* link_tests
2019-12-13 11:32:04 -08:00
Richard Liaw
4ff6ca89f4 [docs] slight doc modifications (#6466) 2019-12-13 10:38:17 -08:00
Eric Liang
335dade1e6 Check worker id for all core worker RPCs (#6472)
* check worker id

* fix test

* owner

* fix tests

* comments
2019-12-13 10:15:56 -08:00
Eric Liang
eb6f3f86e5 Seed using multiple samples (#6471) 2019-12-12 21:41:19 -08:00
Stephanie Wang
c57dcc82d1 Port actor creation to use direct calls (#6375) 2019-12-12 19:50:51 -08:00
Philipp Moritz
74b454c614 Fix overriding of params dictionary (#6445) 2019-12-12 19:15:13 -08:00
Eric Liang
5a5c94939f [direct call] Retry failed tasks with delay (#6453)
* retry failed tasks with delay

* set to 0 for direct tests
2019-12-12 17:12:38 -08:00
Zack Polizzi
9e9c524823 Update pong-apex tuned example (#6462) 2019-12-12 10:57:55 -08:00
Kai Yang
3adbe29450 fix core worker test hanging due to heartbeat is not working (#6416) 2019-12-12 18:16:28 +08:00
micafan
8c1520d18e [GCS] refactor the GCS Client Job Interface (#5503) 2019-12-12 16:57:32 +08:00
wanxing
40211bed4b [Streaming]Fix default JobID (#6436) 2019-12-12 14:37:17 +08:00
wanxing
64d8626d6d Optimize ray::LocalMemoryBuffer performance (#6384) 2019-12-11 21:49:52 -08:00