Commit graph

208 commits

Author SHA1 Message Date
Kai Fricke
d8d8901192
[ci/tune] Remove deprecated jenkins_only tag from test tags (#19287) 2021-10-12 10:05:46 +01:00
Matti Picus
9ca34c7192
add dependencies to BUILD.bazel and update windows bazel to 4.2.1 (#19132)
* add dependencies to BUILD.bazel and update windows bazel to 4.2.1

* fixes from review
2021-10-11 10:25:19 -07:00
SangBin Cho
0ef0d9a77d
Revert "[core] Assign tasks to the first available worker (#18167)" (#19180)
This reverts commit 545db13800.
2021-10-07 10:38:37 -07:00
Stephanie Wang
545db13800
[core] Assign tasks to the first available worker (#18167)
* Convert worker pool to queue

* Start up to backlog size more workers

* fixes

* Prestart workers according to num available CPUs

* lint

* x

* Update src/ray/raylet/worker_pool.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* Update src/ray/raylet/worker_pool.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* dedicated workers

* Fix tests

* x

* fix

* asan

* asan

* Workers can only exec tasks with same job ID

* size_t for runtime env hash, fix unit tests

* include job ID in runtime env hash, remove from worker registration msg

* x

* conflict

* debug

* Schedule and dispatch periodically, skip if no new tasks

* Update src/ray/common/task/task_spec.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* Update src/ray/raylet/scheduling/cluster_task_manager.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>

* Update src/ray/raylet/worker_pool.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-10-05 13:45:50 -07:00
Kai Fricke
3dc176c42e
[ci/tune] Add SGD and Tune GPU pipeline step to CI (#18469)
* [ci/tune] Add Tune GPU pipeline step to CI

* cont.

* add sgd gpu tests

* format yaml, fix imports

* install horovod; fix line wrapping

* set GPU per worker to 0.5

* fix import

* move test to 4gpu machine

* fix lint

* lint

* set visible devices

* pull in tf gpu fix

* Fix Tune GPU pipeline step

* nit

* Disable GPU tests until we have some

* Re-add empty rllib tests

Co-authored-by: Matthew Deng <matthew.j.deng@gmail.com>
2021-10-01 18:34:05 -07:00
architkulkarni
0f0b161ea1
Revert "Revert "[Serve] [doc] Improve runtime env doc"" (#18943)
* Revert "Revert "[Serve] [doc] Improve runtime env doc (#18782)" (#18935)"

This reverts commit e4f4c79252.
2021-09-30 13:28:44 -05:00
Yi Cheng
e4f4c79252
Revert "[Serve] [doc] Improve runtime env doc (#18782)" (#18935)
This reverts commit d4d71985d5.
2021-09-27 21:52:13 -07:00
architkulkarni
d4d71985d5
[Serve] [doc] Improve runtime env doc (#18782) 2021-09-27 16:12:03 -05:00
mwtian
43ac18bbc0
[Build] include minimal debug info in C++ build; upgrade clang-format to 12 (#18888)
* Revert "Revert "[Build] include minimal debug info in C++ build; upgrade clang-format to 12 (#18840)" (#18886)"

This reverts commit f851a072f3.

* use gcc 8
2021-09-24 17:59:05 -07:00
Chen Shen
f851a072f3
Revert "[Build] include minimal debug info in C++ build; upgrade clang-format to 12 (#18840)" (#18886)
This reverts commit 07e1366383.
2021-09-24 12:55:08 -07:00
mwtian
07e1366383
[Build] include minimal debug info in C++ build; upgrade clang-format to 12 (#18840)
* debug info and clang-format

* doc

* fix

* no clang-format on all files

* gcc

* keep gcc 7
2021-09-24 12:26:33 -07:00
Chen Shen
35aa944ef4
Fix thread-safety in global state accessor (#18746) 2021-09-19 12:01:31 -07:00
mwtian
efdbfcfdfb
[Build] Generate Bazel config for compiling with clang and libc++ in CI (#18622)
* Add Bazel config for building with llvm. Upgrade C++ std to 17.

* Fix redis. Try fixing asan and tsan

* Fix asan and format

* Update comments.

Co-authored-by: Chen Shen <scv119@gmail.com>
2021-09-17 19:01:07 -07:00
Sven Mika
8a72824c63
[RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) 2021-09-15 22:16:48 +02:00
Antoni Baum
7e95f330d5
[ci] Fix xgboost_ray install from git (#18640) 2021-09-15 18:07:15 +01:00
Edward Oakes
7736cdd91d
[dashboard] Rename "new_dashboard" -> "dashboard" (#18214) 2021-09-15 11:17:15 -05:00
Antoni Baum
eeb67a42cc
pip install xgboost_ray -> xgboost_ray[default] (#18607)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-09-15 14:45:56 +01:00
Simon Mo
497c5f56fa
[CI] Temporary disable worker-in-container test (#18606)
* revert again

* disable tmp
2021-09-14 22:38:20 -07:00
SangBin Cho
0684531e22
[Test] Break down placement group tests (#18612) 2021-09-14 21:55:18 -07:00
mwtian
a3f399ef10
[Client] fix propagating errors to async calls during disconnect, and other cleanup (#18539)
* cleanup tests and errors for clients

* Fix lock and async get

* rerun

* Avoid running callback under lock. Make lock non-reentrant

* Add all necessary apis

* Removed unused APIs
2021-09-14 18:48:27 +03:00
Yi Cheng
7d1f408de9
[workflow] Move experimental/workflow to workflow (#18521) 2021-09-13 17:45:18 -07:00
Chen Shen
5f57079041
use clang for C++ debug testing (#18343) 2021-09-09 15:48:36 -07:00
mwtian
26fd10c9e8
[CI] Add clang-tidy to lint (#18124)
* clang-tidy

* fix

* fix script

* test clang compiler

* fix clang-tidy rules

* Fix windows and other issues.

* Fix

* Improve information when running check-git-clang-tidy-output.sh on different OS
2021-09-09 00:41:53 -07:00
Simon Mo
a29da81cfc
Revert "Revert "Fix tracing bug when actors are defined before connecting to …" (#16122) 2021-09-07 16:19:49 -07:00
ellimac54
772d25cc38
Add Initial Windows Dockerfile (#17474) 2021-09-03 11:41:06 -07:00
Kai Fricke
fb38d06cfb
Move RLLib GPU release test dependencies to ml docker (#18208) 2021-09-03 09:35:18 +01:00
matthewdeng
a3123b6860
[SGD] v2 Horovod backend (#18047)
* [SGD] add Horovod backend

* address comments: set CUDA_VISIBLE_DEVICES, refactor code

* fix gpu test

* fix lint/test import

* address comments, add example cluster config

* delay horovod imports
2021-08-31 12:54:59 -07:00
Kai Fricke
a8dbc44f9a
[ci] minimal dependency install test (#18071) 2021-08-31 15:26:25 +02:00
Sven Mika
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 2021-08-31 14:56:53 +02:00
Kai Fricke
012f9eb687
[buildkite] Fix jar upload directory (#18253) 2021-08-31 11:18:34 +02:00
Simon Mo
2e0b816d64
[Buildkite] Upload jars to os specific dir (#18229) 2021-08-31 09:32:01 +02:00
Antoni Baum
5be6bda4cf
[tests] Add Ludwig CI test (#18126) 2021-08-30 12:27:39 -07:00
Amog Kamsetty
3b77840c1b
PyTorch Lightning Updates (#17876) 2021-08-27 23:15:51 -07:00
Chen Shen
7e3e0d1535
[Test] Add C++ tsan test (#17875) 2021-08-24 00:57:32 -07:00
Kai Fricke
d058f98546
[RLlib] Add GPU tests to CI (run per-PR). (#17891)
Co-authored-by: simon-mo <simon.mo@hey.com>
2021-08-24 09:20:45 +02:00
Chen Shen
0f894e9cbd
revert ebs cold start (#18010) 2021-08-23 13:40:31 -07:00
Chen Shen
e369ecab43
Fix EBS cold start in Mac (#18001) 2021-08-22 20:03:59 -07:00
Chen Shen
31482563c2
[Test] fix-mac-test by avoiding cold start (#17988) 2021-08-20 15:04:29 -07:00
Chen Shen
880797d5c2
[Core][Test] Add ubsan support for C++ tests (#17812)
* support ubsan

* update
2021-08-17 10:22:03 -07:00
SangBin Cho
4971e13941
[Build] Asan wheel test (#17685)
* in progress

* ASAN tests.

* d

* in progress

* in progress without the asan wheel

* Support the asan wheel.

* Support the asan wheels

* Not build a binary for asan

* Fix issues

* Remove a wrong build

* Separate out asan wheel build

* Try preparing more deps.

* ip

* Try different version

* done

* d

* Trial

* Another try

* Another try

* skip cpp build to see what happens

* add more des

* ip

* abc

* Try next

* completed

* try

* Try without static libasan

* dbg

* Try static link

* Fix issues

* abc
2021-08-17 10:21:41 -07:00
Sven Mika
f3bbe4ea44
[RLlib] Test cases/BUILD cleanup; split "everything else" (longest running one rn) tests in 2. (#17640) 2021-08-16 22:01:01 +02:00
Simon Mo
61ac06cc6d
[Buildkite] Fix zsh bug so latest wheels are pushed correctly (#17831) 2021-08-13 15:33:14 -07:00
qicosmos
a2a1c46c83
[C++ Worker]Fix for mac (#17633)
* linkopts shared

* replace gflags with absl flags

* fix

* add test option

* fix

* add cpp worker to mac ci

* fix

* support empty redis password;mod arc argv

* add encoding

* test

* ignore example test on mac

* support mac

* fix

* fix and update doc

* fix

* fix run.sh

* fix init

* fix typo

* fix run.sh

* fix lint

Co-authored-by: 久龙 <guyang.sgy@antfin.com>
2021-08-13 12:22:37 +08:00
Clark Zinzow
d6eeb5dc70
[Datasets] Add local and S3 filesystem test coverage for file-based datasources. (#17158) 2021-08-12 08:39:31 -07:00
Simon Mo
c315596ed2
[Buildkite] Migrate macOS wheel builds (#16913) 2021-08-07 21:54:34 -07:00
Chen Shen
0fd3f761b9
[ci][rfc] build debug wheels and run python test on debug build (#17399)
* enable debug mode

* add

* :upload debug wheels

* upload debug wheels

* add

* fix bug

* add dbg

* Update python/setup.py

Co-authored-by: Simon Mo <simon.mo@hey.com>

* skip windows

Co-authored-by: Simon Mo <simon.mo@hey.com>
2021-08-05 17:58:19 -07:00
Eric Liang
d4f9d3620e
Move ray.data out of experimental (#17560) 2021-08-04 13:31:10 -07:00
architkulkarni
756a4e7a90
[Core] [runtime env] update tests to use ray.init(runtime_env=...) and add e2e test (#17232) 2021-07-26 11:21:30 -05:00
Sven Mika
5231fdd996
[Testing] Split RLlib example scripts CI tests into 4 jobs (from 2). (#17331) 2021-07-26 10:52:55 -04:00
matthewdeng
fdbeef6046
[SGD] RaySGD v2 skeleton code (#17300)
* [SGD] RaySGD v2 skeleton code

* add build file

* move file

* empty

* rename

* address comments

* add method interfaces

* move BUILD file out of tests dir

Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
2021-07-25 17:39:24 -07:00