Philipp Moritz
eb61036ba2
Revert "Pyarrow Segfault Regression Test ( #7568 )" ( #7805 )
This reverts commit 57599f075c.
2020-03-29 20:59:05 -07:00
ijrsvt
57599f075c
Pyarrow Segfault Regression Test ( #7568 )
2020-03-29 16:15:24 -07:00
Simon Mo
353d7e107f
[Serve] Improve Serialization ( #7688 )
2020-03-29 14:57:19 -07:00
mehrdadn
fc23f79f82
Windows process issues ( #7739 )
2020-03-29 12:48:32 -07:00
fangfengbin
6ce8b63bb6
fix TestTaskLeaseRenewal test failure ( #7765 )
2020-03-29 11:18:47 +08:00
Edward Oakes
d87563937e
Revert "[Dashboard] Metrics Export Service. ( #7728 )" ( #7789 )
2020-03-28 19:27:34 -07:00
Eric Liang
d6255c3395
Fix build breakage due to soft torch import ( #7790 )
2020-03-28 19:08:31 -07:00
Sven Mika
e4bd5db4d8
[RLlib] Minimal ParamNoise PR. ( #7772 )
2020-03-28 16:16:30 -07:00
Eric Liang
5cebee68d6
[rllib] Add scaling guide to documentation, improve bandit docs ( #7780 )
* update
* reword
* update
* ms
* multi node sgd
* reorder
* improve bandit docs
* contrib
* update
* ref
* improve refs
* fix build
* add pillow dep
* add pil
* update pil
* pillow
* remove false
2020-03-27 22:05:43 -07:00
Maksim Smolin
7b27ce2b23
[RaySGD] Convert the head worker to a local model ( #7746 )
Why are these changes needed?
Running a worker on head (locally, not as a Ray actor) allows for easier handling of stateful stuff like logging and for easier debugging.
2020-03-27 20:19:15 -07:00
Richard Liaw
875309fc48
Revert wide docs ( #7782 )
2020-03-27 17:46:08 -07:00
Richard Liaw
e10dc91821
Fix doc build ( #7781 )
2020-03-27 17:39:38 -07:00
Mitchell Stern
090a8474b0
[Dashboard] Update dependencies and add linting rules ( #7779 )
2020-03-27 16:53:49 -07:00
Carl Balmer
0cfb6488a7
changed get_agent_class to from get_trainable_cls ( #7758 )
2020-03-27 12:17:16 -07:00
Simon Mo
838c1e854f
Add results from 0.8.3 release ( #7745 )
2020-03-27 11:14:15 -07:00
SangBin Cho
86e19959a5
[Dashboard] Tune dashboard bug fix ( #7766 )
* Figured out why Tune was unavailable.
* Minor fix.
2020-03-27 09:02:30 -07:00
Kai Yang
6a3503c494
Fix reusing the cached hash of nil ID ( #7753 )
2020-03-27 23:40:03 +08:00
SongGuyang
c195dc8f88
Basic C++ worker implementation ( #6125 )
2020-03-27 23:01:08 +08:00
Sven Mika
93b5c38b7d
[RLlib] Noisy layers in DQN throw different errors (issue #7635 ). ( #7750 )
* Rollback.
* Fix issue 7635.
* Fix issue 7635.
* LINT and bug fix.
2020-03-26 22:08:34 -07:00
Sven Mika
369a3417c4
[RLlib] Add tf-graph by default when doing Policy.export_model(). ( #7759 )
* Rollback.
* WIP.
* WIP.
* Fix.
* LINT.
2020-03-26 22:07:10 -07:00
SangBin Cho
7a0befb0a7
[Dashboard] Metrics Export Service. ( #7728 )
2020-03-26 14:03:00 -07:00
hhoke
af3a5705ca
--redis-address -> --address ( #7760 )
The exception tells the user to use --redis-address, but that flag is deprecated. This change tells the user to use the current --address instead.
2020-03-26 13:52:39 -07:00
Saurabh Gupta
6ddf84b019
Contextual Bandit algorithms (WIP) ( #7642 )
2020-03-26 13:41:16 -07:00
Cloud Han
c1b05b720d
calling register_custom_serializer requires ray to be initialized ( #7752 )
2020-03-26 10:24:06 -07:00
Sven Mika
bcf963a53b
[RLlib] Bug default policy overrides torch policy. ( #7756 )
* Rollback.
* Bug fix!
2020-03-26 10:03:20 -07:00
fangfengbin
e196fcdbaf
Add gcs_service_enabled function to avoid getting environment variable directly ( #7742 )
2020-03-26 22:02:53 +08:00
Richard Liaw
ca6eabc9cb
[tune] Fail Fast ( #7528 )
* pytest
* init cancel
* testing
* Update python/ray/tune/tests/test_tune_server.py
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
* change-test
* Apply suggestions from code review
* Apply suggestions from code review
* finished
* set_finished
* tune
* fix
Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-26 00:04:09 -07:00
hubcity
3d0a8662b3
#7246 - Fixing broken links ( #7247 )
* #7246 - Fixing broken links
* Apply suggestions from code review
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-25 21:46:13 -07:00
Eric Liang
23b6fdcda1
ray memory should collect statistics from all nodes ( #7721 )
2020-03-25 16:31:31 -07:00
Stephanie Wang
46404d8a0b
[core] Pin lineage of plasma objects that are still in scope ( #7690 )
* Fix deadlock in DrainAndShutdown
* Revert "[core] Revert lineage pinning (#7499 ) (#7692 )"
This reverts commit ba86a02b37.
* debug rllib
* debug rllib
* turn on all rllib tests again
* debug rllib
* Fix drain bug, check number of pending tasks
* revert rllib debug
* remove todo
* Trigger rllib tests
* revert rllib debug commit
2020-03-25 09:29:32 -07:00
Richard Liaw
82b792be33
[tune] IP Check, Flatten Results for TBX ( #7705 )
* support_flattened
* loggers
* Format logger changes
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-25 09:18:03 +00:00
Maksim Smolin
e95455b7d7
[RaySGD] Add tqdm logging to TorchTrainer ( #7588 )
* Update issue templates
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint'
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* first_pass
* add overrides
* override
* fixing up operators
* format
* sgd
* constants
* rm
* revert
* Checkpoint the basics
* End of day checkpoint
* Checkpoint log-to-head implementation
* Checkpoint
* Add actor-based batch log reporting, currently segfaults
* Work around progress segfault
* Fix some stuff in quicktorch
* Make things more customizable
* Quality of life fixes
* More quality of life
* Move tqdm logic to training_operator
* Update examples
* Fix some minor bugs
* Fix merge
* Fix small things, add pbar to dcgan
* Run format.sh
* Fix missing epoch number for batch pbar
* Address PR comments
* Fix float is not subscriptable
* Add train_loss to pbar by default
* Isolate tqdm code into a handler system
* Format
* Remove the batch_logs_reporter from distributed runner as well
* Check if the train_loss is available before using it
* Enable tqdm in the dcgan example
* Fix a crash in no-handler trainers
* Fix
* Allow not calling set_reporters for tests
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-03-24 23:43:56 -07:00
Richard Liaw
54a892bb84
[tune] Cancel Experiment via Client ( #7719 )
* init cancel
* testing
* Update python/ray/tune/tests/test_tune_server.py
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
* Apply suggestions from code review
* Apply suggestions from code review
* finished
* set_finished
Co-authored-by: ijrsvt <ian.rodney@gmail.com>
2020-03-24 20:30:12 -07:00
Simon Mo
a519b4f2a9
[Serve] Enhancement in HTTP Methods and Multi-route support ( #7709 )
2020-03-24 20:25:05 -07:00
Stephanie Wang
a1cee6af7b
Revert "New scheduler local node ( #7441 )" ( #7732 )
This reverts commit 6141fdab95.
2020-03-24 18:32:16 -07:00
Xianyang Liu
cc0490b55b
Several small fixes for function_manager ( #7685 )
2020-03-24 14:28:15 -07:00
Ion
6141fdab95
New scheduler local node ( #7441 )
2020-03-24 13:59:50 -05:00
fangfengbin
bf866de6fd
Enable GCS Service by default ( #7541 )
2020-03-24 14:20:23 +08:00
mehrdadn
b4030cdbbe
File HANDLE/descriptor translation layer for Windows ( #7657 )
* Use TCP sockets on Windows with custom HANDLE <-> FD translation layer
* Get Plasma working on Windows
Co-authored-by: Mehrdad <noreply@github.com>
2020-03-23 21:08:25 -07:00
Robert Nishihara
2b80310e6f
Remove setup.py dependence on packaging. ( #7714 )
2020-03-23 16:21:17 -07:00
Edward Oakes
9318b29f5e
Remove is_direct logic from the raylet ( #7698 )
2020-03-23 17:09:35 -05:00
Richard Liaw
3fa2e4a346
[docs] Fix import breaking docs build ( #7715 )
* psutil missing
* ok
2020-03-23 13:21:39 -07:00
Stephanie Wang
7f38cc1d03
Debug statements and increase timeout for test array ( #7713 )
2020-03-23 13:02:14 -07:00
Eric Liang
9a590ac6a5
[rllib] Fix custom model metrics in multi-device case ( #7640 )
* fix example
* add example test
* lin
2020-03-23 12:40:22 -07:00
aannadi
8adc84ccb9
[Dashboard] Add sorted columns and TensorBoard to Tune tab ( #7140 )
2020-03-23 12:30:51 -07:00
Richard Liaw
e311013afd
[tune] Reformat Sections of API Reference ( #7706 )
* moveit
* moveit
* docstrings to ref
* Update tune-usage.rst
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-03-23 12:23:21 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. ( #7482 )
2020-03-23 12:19:30 -07:00
Robert Nishihara
ee8c9ff732
Remove six and cloudpickle from setup.py. ( #7694 )
2020-03-23 11:42:05 -07:00
Robert Nishihara
1a0c9228d0
Remove pytest from setup.py and other minor changes. ( #7700 )
2020-03-23 08:46:56 -07:00
ZhuSenlin
74825db804
Fix TestGcsRedisFailureDetector ( #7710 )
* fix test_gcs_redis_failure_detector
* fix test_gcs_redis_failure_detector
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
2020-03-23 22:48:53 +08:00