Commit graph

3935 commits

Author SHA1 Message Date
Eric Liang
8b4b49662b
Force OMP_NUM_THREADS=1 if unset (#6998)
* force omp

* update

* set

* workers

* link
2020-02-01 11:46:11 -08:00
roireshef
3c60caa448
[rllib] implemented compute_advantages without gae (#6941) 2020-01-31 22:25:45 -08:00
Edward Oakes
92525f35d1
Remove raylet client from Python worker (#6018) 2020-01-31 18:23:01 -08:00
Edward Oakes
341a921d81
Remove vanilla pickle serialization for task arguments (#6948) 2020-01-31 16:52:43 -08:00
Yutai Zhou
9b6794cbb0
[rllib] updated policy definition link (#6989) 2020-01-31 16:22:11 -08:00
Edward Oakes
8f07d21d19
Remove thread sanitizer from CI (#6996) 2020-01-31 14:15:27 -08:00
Jaroslaw Rzepecki
67319bc887
[RLlib] Update MARWIL to use tf policy template (#6975)
* update MARWIL to use tf policy template

* formatting fixes
2020-01-31 12:57:52 -08:00
Edward Oakes
4a78b60cf7
Remove link to meetup RSVP from docs (#6995) 2020-01-31 11:32:50 -08:00
SangBin Cho
c9f5def56a
Show lint download commands if tools not installed (#6984) 2020-01-31 10:42:09 -08:00
Sven Mika
211a9be9a5
[RLlib] Bug fix: PR anneals beta parameter beyond final given value. (#6973)
* Bug fix: PR anneals beta parameter beyond final given value.

* LINT.

* Trigger travis re-test.
2020-01-31 09:55:03 -08:00
Sven Mika
2ccf08ad10
[RLlib] Bug fix: DQN goes into negative epsilon values after reaching explora… (#6971)
* Bug fix: DQN goes into negative epsilon values after reaching exploration percentage.

* Add `epsilon_initial_eps` to SAC to pass test_nested_spaces.py.

* Add `exploration_initial_eps` to QMIX default config.
2020-01-31 09:54:12 -08:00
Simon Mo
4e2c4302e8
Remove test_gather_benchmark (#6983) 2020-01-31 09:42:05 -08:00
Maksim Smolin
64c8996a43
[raysgd] Update to fix examples out of the box (#6966)
* Update tf-example-sgd dependencies, AMI, and instance type

* Make PyTorch dependency optional

* Re-implement optional torch import

* Update tensorflow_train_example

* Setup tf-example-sgd config for SGD development

* Document the MultiWorkerMirroredStrategy behavior

* Run scripts/format

* Undo GPU default for CI

* Remove dev deploy file_mounts

* Update docs on tf_runner and tf_trainer

* Fix formatting

* Remove the debug file-mounts again

* Disable cifar example GPU usage by default so CI runs properly

* Mark failing PyTorch test as flaky

* Clarify the tf SGD sanity check

* Run format script

* Update tf-example-sgd.yaml

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-01-31 01:16:57 -08:00
roireshef
dc7a555260
[rllib] Feature/histograms in tensorboard (#6942)
* Added histogram functionality to custom metrics infrastructure (another tab in tensorboard)

* updated example to include histogram metric

* added histograms to TBXLogger

* add episode rewards

* lint

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-01-30 22:02:53 -08:00
SangBin Cho
df518849ed
Remove ray.wait timeout warning for milliseconds (#6980) 2020-01-30 19:07:52 -08:00
Amog Kamsetty
c8bf0715a6
[Parallel Iterator] Local Shuffle (#6921)
* adding local shuffle and corresponding tests

* fix quotes

* addressing comments and adding seed argument

* formatting

* fix formatting issues

* change test size from small to medium

* addressing comments
2020-01-30 12:27:38 -08:00
Sven Mika
136ada5fb9
[RLlib] Experiment with py_func as a means to further unify tf and torch (Schedule classes). (#6951) 2020-01-30 11:27:57 -08:00
Ameer Haj Ali
b8135da122
Adding dependencies for scikit-learn in travis (#6969)
* Revert "Revert "Support of scikit-learn with ray joblib backend (#6925)" (#6957)"

This reverts commit 86100bc119.

* adding scikit-learn to dependencies
2020-01-30 09:46:54 -08:00
Simon Mo
660eef6502
[Serve] Async Router (#6873) 2020-01-30 09:34:47 -08:00
Simon Mo
1e3a34b223
Rewrite the async api documentation (#6936)
* Rewrite the async api documentation

* Apply suggestions from code review

Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>

* clarify comment

* Add quickstart

* Add reference for async in ray.get ray.wait docstring

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-01-30 09:34:09 -08:00
Amog Kamsetty
11d90d6d0c
Change files tested in Travis by changing git diff from 2-dot to 3-dot (#6960). 2020-01-30 09:26:44 -08:00
Richard Liaw
5ab395236b
[tune] Experiment stopping API (#6886) 2020-01-30 00:34:08 -08:00
Simon Mo
5bdfc50bf6
Update the macos wheel name (#6961) 2020-01-29 15:23:43 -08:00
Eric Liang
86100bc119
Revert "Support of scikit-learn with ray joblib backend (#6925)" (#6957)
This reverts commit a7ecda6017.
2020-01-29 14:56:09 -08:00
ijrsvt
ff706660d2
Kill Actor UI addition (#6955) 2020-01-29 14:32:19 -08:00
Edward Oakes
c2be794f10
Remove try/except import asyncio for python 2 (#6947) 2020-01-29 09:17:07 -08:00
Richard Liaw
037aa2b961
[sgd] Refactor PyTorch SGD Documentation. (#6910)
* Refactor documentation and directory structure

* update loss

* more examples

* fix comments

* more code

* svgs

* formatting

* more_docs

* more writing

* comments ready

* move

* whitespace

* examples

* fix

* bold

* pytorch

* batch

* fix

* fix test

* Apply suggestions from code review

* quarantinegp

* tests/

* fix missing
2020-01-29 08:51:01 -08:00
Edward Oakes
bfaee49880
Add 0.8.1 release test results (#6943) 2020-01-28 14:14:33 -08:00
Sven Mika
4c97348cb6
[RLlib] Schedule-classes multi-framework support. (#6926) 2020-01-28 11:07:55 -08:00
Simon Mo
26d749bc18
[Dashboard] Render HTML inline (#6932) 2020-01-28 10:39:22 -08:00
Ameer Haj Ali
81238945b9
Update index.rst (#6935) 2020-01-27 18:35:48 -06:00
Eric Liang
e659699ca9
[tune] Fix directory naming regression (#6839) 2020-01-27 15:53:40 -08:00
Alex Wu
d9a2294298
Ssh identities only (#6931) 2020-01-27 17:01:21 -06:00
Richard Liaw
e0078a0d78
[autoscaler][minor] default -> latest_dlami (#6922)
* config

* latest

* Update python/ray/autoscaler/aws/config.py
2020-01-27 14:34:07 -08:00
Ameer Haj Ali
a7ecda6017
Support of scikit-learn with ray joblib backend (#6925) 2020-01-27 15:00:00 -06:00
Simon Mo
396d7fafc8
UI improvement for asyncio (#6905) 2020-01-27 12:45:51 -08:00
mehrdadn
bde575b8dd
Revert "Use Boost.Process instead of pid_t (#6510)" (#6909)
This reverts commit fb8e3615d5.
2020-01-26 10:26:44 -06:00
Eric Liang
2fb53396ad
[rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00
hyggan
552156f22d
[tune] Handles nan case for AsyncHyperBand (#6916) 2020-01-25 17:26:30 -08:00
Ujval Misra
ed9de8b2fa
[tune] Expose progress reporter to users (#6915)
* Pluggable progress reporter

* Fix types

* Fix bug, address comments

* lint

* Add convenience function and test

* lint

* Use trials instead of trial_runner

* Add docs

* Update docs

* Fix doc examples

* More doc updates

* Address comments, add configurable frequency

* use reward
2020-01-25 12:28:05 -08:00
Eric Liang
2e88e2e773
Split up bazel test into tune / non tune tests (#6846)
* fix it

* move

* Update .travis.yml
2020-01-25 12:25:12 -08:00
Yunzhi Zhang
aa5427ca78
[Dashboard] Kill actor (#6906) 2020-01-24 17:21:44 -08:00
Mitchell Stern
33423627ca
[Dashboard] Add profiling button to logical view (#6901) 2020-01-24 11:52:14 -08:00
Sven Mika
446cbdf2e0
[RLlib] Fix issue (bug): LSTM + non-shared vf + PPO + tuple actions (#6890)
* Add `RandomEnv` example to examples folder.
Convert warning into Error message when using an LSTM in a non-shared-vf network (after the warning, the program would crash).

* LINT.

* Fix issue #6884. LSTM + non-shared vf NN + PPO crashes when using a Tuple action space.

* LINT

* Change warning message for Model: shared_vf=False, LSTM=True cases.

* Bug fix.

* Add examples/random_env.py test to Jenkins.
2020-01-24 10:29:35 -08:00
Daniel Edgecumbe
e516c50745
[autoscaler]: Kill workers if the monitor raises an exception (#3977)
Co-authored-by: CJosephides <cjosephides@gmail.com>
2020-01-23 14:12:52 -06:00
Qing Wang
cfbde39ba8
[Java] Generate head redis port randomly (#6879)
* Random head port

* address comments.
2020-01-23 23:37:41 +08:00
AnanthHari
aa2a0cb6da
Fixes empty state argument in compute_single_action method (#6894)
* Fixes empty `state` parameter in compute_single_action method

* Fixed style
2020-01-23 00:42:52 -08:00
Ujval Misra
1558307ac4
[tune] Prevent MEMORY checkpoints from breaking trial FT (#6691)
* Prevent MEMORY checkpoints from breaking FT

* Add save/pause/resume/restore test

* change checkpoint return value based on status

* Fix test_checkpoint_manager_tests.

* Fix test + checkpoint manager bug

* lint

* Add docstring

* Add docstring to checkpoint_manager constructor

* Change variable name for clarity

* Revert on_checkpoint docstring wording

* Break after success

* nit: more informative warning

* Quarantine test
2020-01-22 23:17:09 -08:00
Yunzhi Zhang
0834bda8c1
[Dashboard] Display actor task execution info (#6705)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
2020-01-22 22:33:55 -08:00
Sven Mika
ae9a3a2237
[RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865) 2020-01-22 17:02:58 -08:00