2221 commits

Author SHA1 Message Date
ijrsvt
ff706660d2
Kill Actor UI addition (#6955) 2020-01-29 14:32:19 -08:00
Edward Oakes
c2be794f10
Remove try/except import asyncio for python 2 (#6947) 2020-01-29 09:17:07 -08:00
Richard Liaw
037aa2b961
[sgd] Refactor PyTorch SGD Documentation. (#6910)
* Refactor documentation and directory structure

* update loss

* more examples

* fix comments

* more code

* svgs

* formatting

* more_docs

* more writing

* comments ready

* move

* whitespace

* examples

* fix

* bold

* pytorch

* batch

* fix

* fix test

* Apply suggestions from code review

* quarantinegp

* tests/

* fix missing
2020-01-29 08:51:01 -08:00
Simon Mo
26d749bc18
[Dashboard] Render HTML inline (#6932) 2020-01-28 10:39:22 -08:00
Eric Liang
e659699ca9
[tune] Fix directory naming regression (#6839) 2020-01-27 15:53:40 -08:00
Alex Wu
d9a2294298 Ssh identities only (#6931) 2020-01-27 17:01:21 -06:00
Richard Liaw
e0078a0d78
[autoscaler][minor] default -> latest_dlami (#6922)
* config

* latest

* Update python/ray/autoscaler/aws/config.py
2020-01-27 14:34:07 -08:00
Ameer Haj Ali
a7ecda6017 Support of scikit-learn with ray joblib backend (#6925) 2020-01-27 15:00:00 -06:00
Simon Mo
396d7fafc8
UI improvement for asyncio (#6905) 2020-01-27 12:45:51 -08:00
mehrdadn
bde575b8dd Revert "Use Boost.Process instead of pid_t (#6510)" (#6909)
This reverts commit fb8e3615d5.
2020-01-26 10:26:44 -06:00
Eric Liang
2fb53396ad [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00
hyggan
552156f22d [tune] Handles nan case for AsyncHyperBand (#6916) 2020-01-25 17:26:30 -08:00
Ujval Misra
ed9de8b2fa [tune] Expose progress reporter to users (#6915)
* Pluggable progress reporter

* Fix types

* Fix bug, address comments

* lint

* Add convenience function and test

* lint

* Use trials instead of trial_runner

* Add docs

* Update docs

* Fix doc examples

* More doc updates

* Address comments, add configurable frequency

* use reward
2020-01-25 12:28:05 -08:00
Yunzhi Zhang
aa5427ca78 [Dashboard] Kill actor (#6906) 2020-01-24 17:21:44 -08:00
Mitchell Stern
33423627ca [Dashboard] Add profiling button to logical view (#6901) 2020-01-24 11:52:14 -08:00
Daniel Edgecumbe
e516c50745 [autoscaler]: Kill workers if the monitor raises an exception (#3977)
Co-authored-by: CJosephides <cjosephides@gmail.com>
2020-01-23 14:12:52 -06:00
Ujval Misra
1558307ac4 [tune] Prevent MEMORY checkpoints from breaking trial FT (#6691)
* Prevent MEMORY checkpoints from breaking FT

* Add save/pause/resume/restore test

* change checkpoint return value based on status

* Fix test_checkpoint_manager_tests.

* Fix test + checkpoint manager bug

* lint

* Add docstring

* Add docstring to checkpoint_manager constructor

* Change variable name for clarity

* Revert on_checkpoint docstring wording

* Break after success

* nit: more informative warning

* Quarantine test
2020-01-22 23:17:09 -08:00
Yunzhi Zhang
0834bda8c1 [Dashboard] Display actor task execution info (#6705)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
2020-01-22 22:33:55 -08:00
Sven Mika
ae9a3a2237 [RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865) 2020-01-22 17:02:58 -08:00
Simon Mo
5f527816fe
Fix async actor high cpu utilization when idle (#6877) 2020-01-22 16:07:08 -08:00
Simon Mo
4dd41844d0
Ignore blocking ray.wait if timeout is zero (#6891) 2020-01-22 16:05:34 -08:00
Richard Liaw
2b0e93586f
[autoscaler] Auto-replace "DEFAULT" with most recent DLAMI (#6848)
* try_this

* fix

* actual fix

* default
2020-01-21 13:54:04 -08:00
Richard Liaw
4edfaf2f38
[tune] Support callable objects in variant generation (#6849)
* minorcallable

* format
2020-01-21 10:24:25 -08:00
Stephanie Wang
815cd0e39a
Task and actor fate sharing with the owner process (#6818)
* Add test

* Kill workers leased by failed workers

* merge

* shorten test

* Add node failure test case

* Fix FromBinary for nil IDs, add assertions

* Test

* Fate sharing on node removal, fix owner address bug

* lint

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: Zhijun Fu <37800433+zhijunfu@users.noreply.github.com>

* fix

* Remove unneeded test

* fix IDs

Co-authored-by: Zhijun Fu <37800433+zhijunfu@users.noreply.github.com>
2020-01-20 16:44:04 -08:00
Philipp Moritz
96e2c1ae74
[Projects] Add small tutorial for projects (#6641) 2020-01-20 09:33:41 -08:00
Robert Nishihara
c2cbb85a43
Fix flaky test test_feature_flag (#6850) 2020-01-19 20:59:03 -08:00
Richard Liaw
341ddd0a09
[tune] Default to TensorboardX and include in requirements. (#6836) 2020-01-19 01:49:33 -08:00
Richard Liaw
8a9bd18606
[tune] Remove keras dependency (#6827) 2020-01-18 23:24:42 -08:00
Yuhao Yang
9b1d2953de [tune] set correct path when deleting checkpoint folder (#6758) 2020-01-17 23:11:03 -08:00
Mitchell Stern
763818b476 [Dashboard] Add static assets for speedscope v1.5.3 (#6822) 2020-01-17 20:53:53 -08:00
Yunzhi Zhang
3acf3c7675 [Dashboard] Add actor task counter (#6820) 2020-01-17 15:43:56 -08:00
Simon Mo
8f246c17b5
Initialize async plasma for async actors (#6813)
* Initialize async plasma for async actors

* Address comment
2020-01-17 14:58:06 -08:00
Ameer Haj Ali
9f9c3f5026 adding context parameter for pool with a warning for not being supported (#6776) 2020-01-17 16:57:18 -06:00
Edward Oakes
30776450a3
num_cpus=1 by default in Pool (#6812) 2020-01-17 13:28:25 -06:00
Qstar
0f3205af0b [Projects] Delete pods associated with the project when running ray session stop (#6787) 2020-01-17 10:42:30 -08:00
Mitchell Stern
9f96091aef [Dashboard] Add logical view displaying actor tree (#6810)
* [Dashboard] Add logical view displaying actor tree

* Fix key error in test_raylet_info_endpoint
2020-01-17 10:25:27 -08:00
Yuhao Yang
5f36e6eacb [tune] get checkpoints paths for a trial after tuning (#6643) 2020-01-17 10:15:04 -08:00
Mitchell Stern
8e8b66a4b8 Add route for /favicon.ico to fix missing favicon (#6815) 2020-01-16 21:03:21 -06:00
Richard Liaw
232be5a058
[sgd] fault tolerance for pytorch + revamp documentation (#6465) 2020-01-16 18:38:27 -08:00
Mitchell Stern
05674c219f Accept any port in test_get_webui in test_webui.py (#6804) 2020-01-15 23:16:35 -06:00
mehrdadn
fb8e3615d5 Use Boost.Process instead of pid_t (#6510)
* Use Boost.Process instead of pid_t

This will let us handle child processes (mostly) uniformly across platforms.
TODO: There is no SIGTERM on Windows; achieving something equivalent is fairly involved.
2020-01-15 20:05:02 -08:00
Ziyad Edher
c480d1d1e4 Treat static methods as class methods instead of instance methods in actors (#6756)
* Treat static methods as class methods rather than instance methods

* Add tests for static methods in actors

* Revert formatting changes

* Readd future imports

* Restructure static method check

* Documentation enhancements

* Fix linting issues
2020-01-15 19:38:41 -06:00
Edward Oakes
4227fd1b60
fix flaky test_wait (#6791) 2020-01-14 14:43:16 -06:00
Edward Oakes
3ea3b56eb1
Hotfix missing fields in multiprocessing.Pool (#6784) 2020-01-13 16:39:33 -06:00
Sven Mika
4ee566129f Ignore io.UnsupportedOperation error when "Enabling nice stack traces on SIGSEGV etc." in worker.py::connect(). (#6771)
- Fixes RLlib tf-eager test cases for all agents when run locally on Ubuntu and Mac.
2020-01-13 14:31:13 -08:00
Philipp Moritz
a26431f587
Upgrade react-scripts to fix #6739 (#6769) 2020-01-13 11:58:21 -08:00
Edward Oakes
a950e95c7d
Use exit() in __kill_actor__ (#6760) 2020-01-13 11:37:59 -06:00
chaokunyang
4097d076d4 Package ray java jars into wheels (#6600) 2020-01-10 11:41:00 +08:00
Sven
60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Eric Liang
69c5a2bc3c
Warn if OMP_NUM_THREADS is set (#6729) 2020-01-08 14:59:07 -08:00