Richard Liaw
26a724c5e6
[core] Support kwargs and positionals in Ray remote calls ( #5606 )
2019-10-20 22:40:54 -07:00
Edward Oakes
fc56872012
Send active object IDs to the raylet ( #5803 )
...
* Send active object IDs to the raylet
* comment
* comments
* dedup
* signed int in config
* comments
* Remove object ID from monitor
* Fix test
* re-add check
* fix cast
* check if core worker
* Add comment
* Reservoir sampling
* Fix lint
* Pointer return
* tmp
* Fix merge
* Initialize object ids properly
* Fix lint
2019-10-20 22:05:28 -07:00
Zhuohan Li
f286356e06
[docs] add pages about examples on training language models with fairseq ( #5755 )
...
* add pages about examples on training language models with fairseq and ray autoscaler
* better format
* update ray_train.sh
* Move EFS to the autoscaler file
* nits
* add comments to the code & use a new way to implement checkpoint hook
* small bug fix
* polish the doc
* fix formatting
* yaml
* update docs
* fix the bugs and add preprocess.sh
* fix lint
* Reduce batch size & fix lint
* shorttitle
2019-10-20 20:28:16 -07:00
Simon Mo
6b36ef1138
[Serve] Ensure strict traffic splitting ( #5929 )
...
* [Serve] Ensure strict traffic splitting
* Fix test
2019-10-20 20:18:14 -07:00
Stephanie Wang
bc4a0de4da
Fix multiple drivers for named actors and add test ( #5956 )
2019-10-20 16:04:21 -07:00
Richard Liaw
74852c80cb
[docs] Improve more serialization Errors ( #5658 )
2019-10-20 14:06:00 -07:00
Richard Liaw
91acecc9f9
[tune][minor] gpu warning ( #5948 )
...
* gpu
* format
* defaults
* format_and_check
* better registration
* fix
* fix
* trial
* format
* tune
2019-10-19 17:09:48 -07:00
Philipp Moritz
d23696de17
Introduce flag to use pickle for serialization ( #5805 )
2019-10-18 22:29:36 -07:00
Philipp Moritz
29eee7f970
Forward multiple ports for autoscaler ( #5893 )
2019-10-18 16:50:46 -07:00
Richard Liaw
48ba484640
[tune] Test TF2.0, TF1.14, TF1.12 Tensorboard support ( #5931 )
2019-10-18 13:50:42 -07:00
Stephanie Wang
697f765efc
Refactor CoreWorker to remove TaskInterface ( #5924 )
...
* Remove TaskInterface
* Remove Status return value
* Remove CActorHandle, some return values, TaskSubmitter
* lint
* doc
* doc
* fix build
* lint
* Return Status, guarded by annotation, fail tasks for RECONSTRUCTING actors
* fix
* move annotation
* revert
* Fix core worker test
* nits
2019-10-18 00:03:57 -04:00
Stephanie Wang
3ac8592dcf
Remove actor handle IDs ( #5889 )
...
* Remove actor handle ID from main ActorHandle constructor
* Set the actor caller ID when calling submit task instead of in the actor handle
* Remove ActorHandle::Fork, remove actor handle ID from protobuf
* Make inner actor handle const, remove new_actor_handles
* Move caller ID into the common task spec, start refactoring raylet
* Some fixes for forking actor handles
* Store ActorHandle state in CoreWorker, only expose actor ID to Python
* Remove some unused fields
* lint
* doc
* fix merge
* Remove ActorHandleID from python/cpp
* doc
* Fix core worker test
* Move actor table subscription to CoreWorker, reset actor handles on actor failure
* lint
* Remove GCS client from direct actor
* fix tests
* Fix
* Fix tests for raylet codepath
* Fix local mode
* Fix multithreaded test
* Fix AsyncSubscribe issue...
* doc
* fix serve
* Revert bazel
2019-10-17 12:36:34 -04:00
Stefan Otte
d70abcfd70
Fix typo in examples/centralized_critic.py ( #5943 )
...
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Alexander Scammon
4d08d3c188
Add dependencies for dashboard to installation.rst ( #5942 )
...
Updating the docs to include pip installing `aiohttp` and `psutil`, both of which the dashboard requires. Since the whole dashboard section is optional, I thought I'd just add them to the docs rather than make them explicit requirements of the project. Tell me if you'd prefer them as requirements in `setup.py`, though.
2019-10-17 00:39:56 -07:00
Philipp Moritz
32b2907457
Update max resource label and give better error message ( #5916 )
2019-10-16 22:37:01 -07:00
Peter Schafhalter
6c11b534c8
[Autoscaler] Update AWS Deep Learning AMI to version 24.3 ( #5932 )
2019-10-16 16:50:54 -07:00
Richard Liaw
d52a4983af
Update TF documentation ( #5918 )
2019-10-16 01:31:27 -07:00
Richard Liaw
9f23620412
[tune] tf2.0 mnist example ( #5898 )
...
* tfmnistexample
* tfmnist
* add_to_ci
* format
* exampledownload
* fix
2019-10-15 22:25:01 -07:00
Eric Liang
6843a01a7f
Automatically create custom node id resource ( #5882 )
...
* node id
* comment
* comments
* fix tests
2019-10-15 21:31:11 -07:00
Richard Liaw
c52bb0621d
[tune] Support TF2.0 on Keras Callback ( #5912 )
2019-10-15 10:49:50 -07:00
Eric Liang
69d5c1b53a
remove evil redirects ( #5919 )
2019-10-14 19:41:04 -07:00
Philipp Moritz
5382a26c2e
Deactivate bazel caching for linux wheels ( #5915 )
2019-10-14 15:48:23 -07:00
Camille Couturier
320cba313f
[tune] Explicitly set scheduler in run() ( #5871 )
...
* Explicitly set scheduler in run()
* Better formatting/indentation (after running format.sh)
* Remove accidental paste in parameters definitions.
* format
2019-10-14 15:44:59 -07:00
Richard Liaw
7f4141df4e
[docs] Pictures for all the Examples ( #5859 )
...
* image
* plot resnet
* hyperparam
* fixup_pictures
* custom_direct
2019-10-14 14:18:52 -07:00
Philipp Moritz
8fd23c0c3f
Add back TensorFlow test ( #5885 )
2019-10-14 11:26:02 -07:00
Richard Liaw
20c0cdee4f
[autoscaler] Worker-Head termination + Better Scale-up message ( #5909 )
2019-10-14 10:37:50 -07:00
Edward Oakes
abbfe7392f
Bump dev version to 0.8.0.dev6 ( #5906 )
2019-10-14 11:36:13 +01:00
Richard Liaw
1650f7b174
[tune] Remove TF MNIST example + add TrialRunner hook to execut… ( #5868 )
...
* remove test
* add trial runner
* removerestore
* Remove other mnist examples
* tunetest
* revert
* v1
* Revert "v1"
This reverts commit c8bddaf2db7a8270c43c02021cac0e75df15ed20.
* Revert "revert"
This reverts commit b58f56884a0c288d3a6f997d149ab4d496ddd7a3.
* errors
* format
2019-10-13 20:33:56 -07:00
Richard Liaw
52e5c9b22d
[tune] CPU-Only Head Node support ( #5900 )
...
* trialqueue
* add tests
2019-10-13 20:31:42 -07:00
Eric Liang
2cbc67f3d5
Fix test_dying_worker_get ( #5908 )
2019-10-13 18:06:28 -07:00
Richard Liaw
0f24509c30
[autoscaler] uptime redirect fix ( #5907 )
...
* small change
* comment
2019-10-13 23:25:15 +01:00
Edward Oakes
6eaa8e31fa
[autoscaler] Revert to double-spawning updater threads ( #5903 )
...
* [autoscaler] Revert to double-spawning threads
* Use log prefix
* add comment
2019-10-13 20:00:06 +01:00
Simon Mo
97a786cf11
[Serve] Remove handle passing in tail recursion ( #5894 )
...
* Remove handle pass in tail recursion
* Quick fix
* Fix worker timeout issue
2019-10-12 20:13:20 -07:00
Matthew A. Wright
0110941de5
rllib: use pytorch's fn to see if gpu is available ( #5890 )
2019-10-12 00:13:00 -07:00
Richard Liaw
898652837c
[minor][docs] Remove example link ( #5880 )
2019-10-11 11:49:18 -07:00
Eric Liang
0e8c3c0346
Don't wrap RayError with RayTaskError ( #5870 )
2019-10-11 11:00:08 -07:00
Edward Oakes
779f91523b
[autoscaler] Fix quoting ( #5891 )
2019-10-11 00:40:26 -07:00
Simon Mo
4b99cb429e
[Serve] Hotfix: Fix actor handle hashing in metric monitoring ( #5886 )
2019-10-11 00:31:42 -07:00
Robert Nishihara
523c764c25
Python 2 compatibility. ( #5887 )
2019-10-10 19:09:25 -07:00
Eric Liang
c3b2ae26c5
Fix str of RayTaskError ( #5878 )
...
* fix key error
* fix
2019-10-10 16:53:18 -07:00
Philipp Moritz
1100556ba2
Fix linux wheel build ( #5881 )
2019-10-10 16:15:26 -07:00
Mitchell Stern
195ca43e9c
[Dashboard] Improve handling of logs and errors in dashboard backend ( #5857 )
...
* Improve handling of logs and errors in dashboard backend
* Update nested dict comprehension for clarity
2019-10-10 11:59:54 -07:00
Eric Liang
1a8ac3db46
Implement fair task queueing to prevent task starvation ( #5851 )
...
* initial commit
* lint
* clarify
* add feature flag
* comment
* add timeout to test
* fix print
* comment
* use id for scheduling class
* lint
* dad warn
* flake
2019-10-08 21:04:25 -07:00
Richard Liaw
1181924077
[tune][minor] formatting examples, fix travis ( #5869 )
...
* formatting
* formatting
2019-10-08 17:58:43 -07:00
Ujval Misra
a851d7eb87
[tune] Readable trial progress output ( #5822 )
...
* Cleaner, tabulated progress output.
* Minor HTML changes, trial ID instead of name
* Revert basic variant changes
* Cleanup, address richard's comments, add progress_reporter.py
* Add tabulate dependency
* Added more info to table, auto-hide columns with no data.
* lint
* Address comments
* Replace experiment tag w/ trial ID
* Fixed tests.
* Fixed test
* Added requirement
* Fix formatting
2019-10-08 16:38:39 -07:00
Philipp Moritz
24b79fd0a6
temporarily remove tensorflow test ( #5866 )
2019-10-08 14:13:54 -07:00
zhu-eric
3845c97dd0
[doc] Hyperparameter Tuning Gallery Entry ( #5786 )
...
* mod_table
* Example fix for gallery
* lint
* nit
* nit
* fix
* gallery
* remove table for now
* training, object store, tune, actors, advanced
* start tf code
* first cut tf
* yapf
* pytorch
* add torch example
* torch
* parallel
* tune
* tuning
* reviewsready
* finetune
* fix
* move_code
* update conf
* compile
* init hyperparameter
* Start images
* overview
* extra
* fix
* works
* update-ps-example
* param_actor
* fix
* examples
* simple
* simplify_pong
* flake8 and run hyperopt
* add comments
* add comments
* add suggestion
* add suggestion
* suggestions
* add suggestion
* add suggestions
* fixed in wrong area
* last edit
* finish changes
* add line
* hyperparameter
2019-10-08 14:13:17 -07:00
Matthew A. Wright
4aa06918ae
Qmix on gpu and with non-stacked-obs environment state support ( #5751 )
2019-10-08 13:18:07 -07:00
Edward Oakes
42dd0fae96
Fix actor ID collision in local mode ( #5863 )
...
* Fixed local mode actor id
* Update python/ray/actor.py
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* Added hyphen to match comments
* Added tests to test_local_mode
* Helloworld
* Better test naming
* lint
2019-10-08 13:07:42 -07:00
Ujval Misra
375852af23
[tune] Check node liveness before result fetch ( #5844 )
...
* Check if trial's node is alive before trying to fetch result
* Added function for failed trials to trial_executor interface
* Address comments, add test.
2019-10-08 11:41:01 -07:00