Edward Oakes
7f9ddfcfd8
Only access route_table and policy_table in master actor ( #7835 )
2020-04-02 14:44:53 -07:00
Edward Oakes
cbe494ab13
[flaky test] Fix flaky test_heartbeats_single ( #7857 )
2020-04-02 16:23:28 -05:00
ijrsvt
9bfc2c4b54
Moving Local Mode to C++ ( #7670 )
2020-04-01 15:50:57 -05:00
mehrdadn
65054a2c7c
Python 3.8 compatibility ( #7754 )
2020-04-01 10:03:23 -07:00
Richard Liaw
24bf6ad607
[raysgd] Improve raysgd examples ( #7818 )
...
* better_example
* test
* improve some usability things
* submit
* fix
* flake
* Update python/ray/util/sgd/torch/training_operator.py
* trythis
* fix
* fix
* smoke
* fail
* fix
* fix
2020-04-01 08:58:39 -07:00
Edward Oakes
f4239d27fa
[serve] Create all other actors in master actor ( #7791 )
2020-04-01 10:15:04 -05:00
Robert Nishihara
b011c604d7
Remove ray.tasks() from API. ( #7807 )
2020-04-01 10:10:40 -05:00
SangBin Cho
c23e56ce9a
Metrics Export Service ( #7809 )
2020-03-30 23:28:32 -07:00
mehrdadn
8958728139
Windows bug fixes ( #7740 )
2020-03-30 20:39:23 -05:00
Simon Mo
dc9b62e007
Deserialize Args in Event Loop Thread ( #7806 )
2020-03-30 18:28:13 -07:00
Richard Liaw
fbf02fa7f7
[Hotfix] Lint for Documentation ( #7817 )
2020-03-30 11:49:05 -07:00
Richard Liaw
18327254b6
[docs] Fix readthedocs rendering ( #7810 )
2020-03-30 11:40:08 -07:00
Richard Liaw
86cff17e7e
[tune/raysgd] Tune API for TorchTrainer + Fix State Restoration ( #7547 )
2020-03-30 12:58:49 -05:00
Edward Oakes
3a53ea60d9
[Serve] Push route table updates to HTTP proxy ( #7774 )
2020-03-30 09:53:05 -07:00
Philipp Moritz
eb61036ba2
Revert "Pyarrow Segfault Regression Test ( #7568 )" ( #7805 )
...
This reverts commit 57599f075c.
2020-03-29 20:59:05 -07:00
ijrsvt
57599f075c
Pyarrow Segfault Regression Test ( #7568 )
2020-03-29 16:15:24 -07:00
Simon Mo
353d7e107f
[Serve] Improve Serialization ( #7688 )
2020-03-29 14:57:19 -07:00
mehrdadn
fc23f79f82
Windows process issues ( #7739 )
2020-03-29 12:48:32 -07:00
Edward Oakes
d87563937e
Revert "[Dashboard] Metrics Export Service. ( #7728 )" ( #7789 )
2020-03-28 19:27:34 -07:00
Maksim Smolin
7b27ce2b23
[RaySGD] Convert the head worker to a local model ( #7746 )
...
Why are these changes needed?
Running a worker on head (locally, not as a Ray actor) allows for easier handling of stateful stuff like logging and for easier debugging.
2020-03-27 20:19:15 -07:00
Mitchell Stern
090a8474b0
[Dashboard] Update dependencies and add linting rules ( #7779 )
2020-03-27 16:53:49 -07:00
SangBin Cho
86e19959a5
[Dashboard] Tune dashboard bug fix ( #7766 )
...
* Figured out why Tune was unavailable.
* Minor fix.
2020-03-27 09:02:30 -07:00
SangBin Cho
7a0befb0a7
[Dashboard] Metrics Export Service. ( #7728 )
2020-03-26 14:03:00 -07:00
hhoke
af3a5705ca
--redis-address -> --address ( #7760 )
...
The exception tells the user to use --redis-address, but that flag is deprecated. This change tells the user to use the current --address flag instead.
2020-03-26 13:52:39 -07:00
Cloud Han
c1b05b720d
Calling register_custom_serializer requires Ray to be initialized ( #7752 )
2020-03-26 10:24:06 -07:00
fangfengbin
e196fcdbaf
Add gcs_service_enabled function to avoid getting environment variable directly ( #7742 )
2020-03-26 22:02:53 +08:00
Richard Liaw
ca6eabc9cb
[tune] Fail Fast ( #7528 )
...
* pytest
* init cancel
* testing
* Update python/ray/tune/tests/test_tune_server.py
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
* change-test
* Apply suggestions from code review
* Apply suggestions from code review
* finished
* set_finished
* tune
* fix
Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-26 00:04:09 -07:00
Eric Liang
23b6fdcda1
ray memory should collect statistics from all nodes ( #7721 )
2020-03-25 16:31:31 -07:00
Stephanie Wang
46404d8a0b
[core] Pin lineage of plasma objects that are still in scope ( #7690 )
...
* Fix deadlock in DrainAndShutdown
* Revert "[core] Revert lineage pinning (#7499 ) (#7692 )"
This reverts commit ba86a02b37.
* debug rllib
* debug rllib
* turn on all rllib tests again
* debug rllib
* Fix drain bug, check number of pending tasks
* revert rllib debug
* remove todo
* Trigger rllib tests
* revert rllib debug commit
2020-03-25 09:29:32 -07:00
Richard Liaw
82b792be33
[tune] IP Check, Flatten Results for TBX ( #7705 )
...
* support_flattened
* loggers
* Format logger changes
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-25 09:18:03 +00:00
Maksim Smolin
e95455b7d7
[RaySGD] Add tqdm logging to TorchTrainer ( #7588 )
...
* Update issue templates
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* first_pass
* add overrides
* override
* fixing up operators
* format
* sgd
* constants
* rm
* revert
* Checkpoint the basics
* End of day checkpoint
* Checkpoint log-to-head implementation
* Checkpoint
* Add actor-based batch log reporting, currently segfaults
* Work around progress segfault
* Fix some stuff in quicktorch
* Make things more customizable
* Quality of life fixes
* More quality of life
* Move tqdm logic to training_operator
* Update examples
* Fix some minor bugs
* Fix merge
* Fix small things, add pbar to dcgan
* Run format.sh
* Fix missing epoch number for batch pbar
* Address PR comments
* Fix float is not subscriptable
* Add train_loss to pbar by default
* Isolate tqdm code into a handler system
* Format
* Remove the batch_logs_reporter from distributed runner as well
* Check if the train_loss is available before using it
* Enable tqdm in the dcgan example
* Fix a crash in no-handler trainers
* Fix
* Allow not calling set_reporters for tests
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-03-24 23:43:56 -07:00
Richard Liaw
54a892bb84
[tune] Cancel Experiment via Client ( #7719 )
...
* init cancel
* testing
* Update python/ray/tune/tests/test_tune_server.py
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
* Apply suggestions from code review
* Apply suggestions from code review
* finished
* set_finished
Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-24 20:30:12 -07:00
Simon Mo
a519b4f2a9
[Serve] Enhancement in HTTP Methods and Multi-route support ( #7709 )
2020-03-24 20:25:05 -07:00
Xianyang Liu
cc0490b55b
Several small fixes for function_manager ( #7685 )
2020-03-24 14:28:15 -07:00
fangfengbin
bf866de6fd
Enable GCS Service by default ( #7541 )
2020-03-24 14:20:23 +08:00
mehrdadn
b4030cdbbe
File HANDLE/descriptor translation layer for Windows ( #7657 )
...
* Use TCP sockets on Windows with custom HANDLE <-> FD translation layer
* Get Plasma working on Windows
Co-authored-by: Mehrdad <noreply@github.com>
2020-03-23 21:08:25 -07:00
Robert Nishihara
2b80310e6f
Remove setup.py dependence on packaging. ( #7714 )
2020-03-23 16:21:17 -07:00
Edward Oakes
9318b29f5e
Remove is_direct logic from the raylet ( #7698 )
2020-03-23 17:09:35 -05:00
Stephanie Wang
7f38cc1d03
Debug statements and increase timeout for test array ( #7713 )
2020-03-23 13:02:14 -07:00
aannadi
8adc84ccb9
[Dashboard] Add sorted columns and TensorBoard to Tune tab ( #7140 )
2020-03-23 12:30:51 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. ( #7482 )
2020-03-23 12:19:30 -07:00
Robert Nishihara
ee8c9ff732
Remove six and cloudpickle from setup.py. ( #7694 )
2020-03-23 11:42:05 -07:00
Robert Nishihara
1a0c9228d0
Remove pytest from setup.py and other minor changes. ( #7700 )
2020-03-23 08:46:56 -07:00
Simon Mo
afad0ed085
[Serve] Add async, multi methods support for serve actors ( #7682 )
2020-03-23 00:45:26 -07:00
Robert Nishihara
8b4c2b7e88
Remove unnecessary handling of setproctitle and psutil. ( #7702 )
2020-03-22 22:06:42 -07:00
Robert Nishihara
4d722bf003
Remove dependence on funcsigs. ( #7701 )
2020-03-22 21:37:24 -07:00
Edward Oakes
8b4f5a9431
Remove non-direct-call code from core worker ( #7625 )
2020-03-22 19:20:08 -05:00
Richard Liaw
81d311031b
[tune] Update API Reference Page ( #7671 )
...
* widerdocs
* init
* docs
* fix
* moveit
* mix
* better_docs
* remove
* Apply suggestions from code review
Co-Authored-By: Sven Mika <sven@anyscale.io>
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-03-22 16:42:20 -07:00
Eric Liang
288933ec6b
[rllib] Fix shared metrics context in parallel iterators ( #7666 )
...
* debug
* build
* update
* wip
* wpi
* update
* recursive sync
* comment
* stream
* fix
* Update .travis.yml
2020-03-22 14:15:01 -07:00
Eric Liang
86f89fc3b3
[tune] Higher timeout for progress reporter test ( #7679 )
...
* wip
* medium size
2020-03-22 13:47:08 -07:00