250 commits

Author SHA1 Message Date
mehrdadn
ad4ac9aa70
Add clang-iwyu (#7081)
* Add iwyu

Co-authored-by: GitHub Web Flow <noreply@github.com>
2020-02-07 16:19:46 -08:00
ijrsvt
0826f95e1c
Including psutil & setproctitle (#7031) 2020-02-05 14:16:58 -08:00
Eric Liang
fbc545c03b
[rllib] Support parallel, parameterized evaluation (#6981)
* eval api

* update

* sync eval filters

* sync fix

* docs

* update

* docs

* update

* link

* nit

* doc updates

* format
2020-02-01 22:12:12 -08:00
SangBin Cho
c9f5def56a
Show lint download commands if tools not installed (#6984) 2020-01-31 10:42:09 -08:00
Ameer Haj Ali
b8135da122
Adding dependencies for scikit-learn in travis (#6969)
* Revert "Revert "Support of scikit-learn with ray joblib backend (#6925)" (#6957)"

This reverts commit 86100bc119.

* adding scikit-learn to dependencies
2020-01-30 09:46:54 -08:00
Amog Kamsetty
11d90d6d0c
Change files tested in Travis by changing git diff from 2-dot to 3-dot (#6960). 2020-01-30 09:26:44 -08:00
Richard Liaw
037aa2b961
[sgd] Refactor PyTorch SGD Documentation. (#6910)
* Refactor documentation and directory structure

* update loss

* more examples

* fix comments

* more code

* svgs

* formatting

* more_docs

* more writing

* comments ready

* move

* whitespace

* examples

* fix

* bold

* pytorch

* batch

* fix

* fix test

* Apply suggestions from code review

* quarantinegp

* tests/

* fix missing
2020-01-29 08:51:01 -08:00
Sven Mika
446cbdf2e0 [RLlib] Fix issue (bug): LSTM + non-shared vf + PPO + tuple actions (#6890)
* Add `RandomEnv` example to examples folder.
Convert warning into Error message when using an LSTM in a non-shared-vf network (after the warning, the program would crash).

* LINT.

* Fix issue #6884. LSTM + non-shared vf NN + PPO crashes when using a Tuple action space.

* LINT

* Change warning message for Model: shared_vf=False, LSTM=True cases.

* Bug fix.

* Add examples/random_env.py test to Jenkins.
2020-01-24 10:29:35 -08:00
Sven Mika
ae9a3a2237 [RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865) 2020-01-22 17:02:58 -08:00
Sven Mika
c957ed58ed [RLlib] Implement PPO torch version. (#6826) 2020-01-20 23:06:50 -08:00
Edward Oakes
b750bd7fc9
Use 2xlarge instances in long running tests (#6802) 2020-01-15 19:47:59 -06:00
mehrdadn
4780b52ea8 Polish clang-format and let it run in a local repo (#6793) 2020-01-15 11:17:49 -08:00
Edward Oakes
fc473e6a08
Use us-west-2 for application stress tests (#6782) 2020-01-13 15:01:03 -06:00
chaokunyang
723fe86882 [Java] Fix building Java with maven (#6764)
* lint

* gen_maven_deps for ray_java_pkg
2020-01-11 14:26:27 +08:00
Sven
60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Edward Oakes
5f843cd998
Clean up stress_testing_config.yaml (#6738)
* Clean up stress_testing_config.yaml

* comment
2020-01-07 17:05:07 -06:00
Sven
f1b56fa5ee PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). (#6650)
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).

* Fix LINT line-len errors.

* Fix LINT errors.

* Fix `tf_pg_policy` imports (formerly: `pg_policy`).

* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).

* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
  then built into the Bazel/Travis test suite.

* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.

* Fix remaining import errors for agents/pg/...

* Fix circular dependency in pg imports.

* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
mehrdadn
f4b29dae9c Perform Bazel install directly in Windows CI (#6653) 2019-12-31 20:48:08 -08:00
Robert Nishihara
d2c6457832
Remove public facing references to --redis-address. (#6631) 2019-12-31 13:21:53 -08:00
Philipp Moritz
735f282494
Use 0.9.0.dev0 as the version tag (#6630) 2019-12-30 10:14:07 -08:00
Robert Nishihara
96f2f8ff10 Stop testing Python 2.7 and building Python 2.7 wheels. (#6601) 2019-12-27 20:47:49 -08:00
Robert Nishihara
eb0813ea35
Re-enable UI tests for wheels. (#6602) 2019-12-26 22:34:56 -08:00
Philipp Moritz
eaee672b7f
Revert "Perform Bazel install directly in Windows CI (#6529)" (#6593)
This reverts commit c5f141013b.
2019-12-24 16:39:07 -08:00
micafan
687de41273 [GCS] refactor the GCS Client Node Interface (#6010) 2019-12-24 20:36:37 +08:00
mehrdadn
c5f141013b Perform Bazel install directly in Windows CI (#6529) 2019-12-22 16:14:51 -08:00
Chaokun Yang
7bbfa85c66 [Streaming] Streaming data transfer java (#6474) 2019-12-22 10:56:05 +08:00
Simon Mo
26ec500ef9
Implement async get for direct actor call (#6339) 2019-12-18 11:50:21 -08:00
Eric Liang
6725a61bda
Release 0.8.0 test logs (#6512) 2019-12-17 15:56:50 -08:00
Eric Liang
1a1324d2a2
Bump version from 0.8.0.dev6 -> 0.9.0.dev (#6508) 2019-12-16 23:57:42 -08:00
Edward Oakes
b1e83d83d1
Print summaries for stress tests (#6498) 2019-12-16 14:14:48 -08:00
Mitchell Stern
1531c21dbd [Dashboard] Add remaining features from old dashboard (#6489)
* [Dashboard] Add remaining features from old dashboard

* Fix linting errors

* Set cluster uptime statistic to N/A

* Use proper singular or plural words for workers column

* Ignore .js, .jsx, .ts, .tsx files in check-git-clang-format-output.sh

* Fix bash quote issue
2019-12-16 11:21:18 -08:00
Richard Liaw
5719a05757
[sgd] Add support for multi-model multi-optimizer training (#6317) 2019-12-15 15:19:45 -08:00
Philipp Moritz
f5d10eea0b
[Projects] Refactor cluster specification (#6488) 2019-12-14 22:43:06 -08:00
Yuhao Yang
ad4da17899 [Tune] Add example and tutorial for DCGAN (#6400) 2019-12-13 14:15:44 -08:00
Eric Liang
be5dd8eb5e
Enable direct calls by default (#6367)
* wip

* add

* timeout fix

* const ref

* comments

* fix

* fix

* Move actor state into actor handle

* comments 2

* enable by default

* temp reorder

* some fixes

* add debug code

* tmp

* fix

* wip

* remove dbg

* fix compile

* fix

* fix check

* remove non direct tests

* Increment ref count before resolving value

* rename

* fix another bug

* tmp

* tmp

* Fix object pinning

* build change

* lint

* ActorManager

* tmp

* ActorManager

* fix test component failures

* Remove old code

* Remove unused

* fix

* fix

* fix resources

* fix advanced

* eric's diff

* blacklist

* blacklist

* cleanup

* annotate

* disable tests for now

* remove

* fix

* fix

* clean up verbosity

* fix test

* fix concurrency test

* Update .travis.yml

* Update .travis.yml

* Update .travis.yml

* split up analysis suite

* split up trial runner suite

* fix detached direct actors

* fix

* split up advanced test

* lint

* fix core worker test hang

* fix bad check fail which breaks test_cluster.py in tune

* fix some minor diffs in test_cluster

* less workers

* make less stressful

* split up test

* retry flaky tests

* remove old test flags

* fixes

* lint

* Update worker_pool.cc

* fix race

* fix

* fix bugs in node failure handling

* fix race condition

* fix bugs in node failure handling

* fix race condition

* nits

* fix test

* disable heartbeats

* disable heartbeats

* fix

* fix

* use worker id

* fix max fail

* debug exit

* fix merge, and apply [PATCH] fix concurrency test

* [patch] fix core worker test hang

* remove NotifyActorCreation, and return worker on completion of actor creation task

* remove actor died callback

* Update core_worker.cc

* lint

* use task manager

* fix merge

* fix deadlock

* wip

* merge conflicts

* fix

* better sysexit handling

* better sysexit handling

* better sysexit handling

* check id

* better debug

* task failed msg

* task failed msg

* retry failed tasks with delay

* retry failed tasks with delay

* clip deps

* fix

* fix core worker tests

* fix task manager test

* fix all tests

* cleanup

* set to 0 for direct tests

* dont check worker id for ownership rpc

* dont check worker id for ownership rpc

* debug messages

* add comment

* remove debug statements

* nit

* check worker id

* fix test

* owner

* fix tests
2019-12-13 13:58:04 -08:00
Edward Oakes
032e8553c7
use numpy in long-running tests (#6448) 2019-12-11 17:53:30 -08:00
alindkhare
76e678d775 [Serve] Added deadline awareness (#6442)
* [Serve] Added deadline awareness

Added deadline awareness while enqueuing a query
Using Blist sorted-list implementation (ascending order) to get queries according to their specified deadlines. [buffer_queues]
Exposed slo_ms via handle/http request
Added slo example
The queries in the example will be executed in almost the reverse of the order in which they are fired
Added slo pytest
Added check for slo_ms to not be negative
Included the changes suggested

* Linting Corrections

* Adding the code changes suggested by format.sh

* Added the suggested changes

Added justification for blist
Added blist in travis/ci/install-dependencies.sh

* Fixed linting issues

* Added blist to ray/doc/requirements-doc.txt
2019-12-11 16:41:54 -08:00
Simon Mo
c61db84b8d Bump dev6->dev7 for two files not changed yet. (#6428) 2019-12-10 20:58:14 -08:00
Chaokun Yang
6272907a57 [Streaming] Streaming data transfer and python integration (#6185) 2019-12-10 20:33:24 +08:00
Victor Le
4e24c805ee AlphaZero and Ranked reward implementation (#6385) 2019-12-07 12:08:40 -08:00
Edward Oakes
f63b64310a
Bump version to 0.8.0.dev7 (#6303) 2019-12-05 18:33:54 -08:00
Philipp Moritz
a454c815f1
Fix long running stress tests (#6374) 2019-12-05 18:29:41 -08:00
Philipp Moritz
dd27bfbb75
Rename .rayproject to ray-project (#6278) 2019-12-05 16:15:42 -08:00
Eric Liang
4c6739476b
[rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365) 2019-12-05 10:13:54 -08:00
Simon Mo
31113aeded
Use rayproject repo (#6353) 2019-12-03 22:36:40 -08:00
Eric Liang
e5863d7914
Force tune tests to run in direct call mode (#6301)
* force tune direct mode

* force tune

* fix

* Update run_multi_node_tests.sh
2019-11-27 19:58:33 -08:00
Simon Mo
dd80c6e6d4 Hotfix make docker images building optional (#6309)
* Make docker build optional

* Fix syntax error
2019-11-27 20:52:21 -06:00
Simon Mo
22b305223a
Build Docker Containers for Linux Wheels (#6233) 2019-11-27 17:05:36 -08:00
Edward Oakes
141d667cee
Fix bash syntax error in test-wheels.sh (#6290) 2019-11-26 13:15:54 -06:00
Edward Oakes
7f8de61441 [hotfix] Remove python/ray/tests/__init__.py (#6279)
* Remove python/ray/tests/__init__.py for bazel

* Comment out checks
2019-11-25 17:04:20 -08:00