Richard Liaw
e311013afd
[tune] Reformat Sections of API Reference ( #7706 )
...
* moveit
* moveit
* docstrings to ref
* Update tune-usage.rst
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-03-23 12:23:21 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. ( #7482 )
2020-03-23 12:19:30 -07:00
Robert Nishihara
1a0c9228d0
Remove pytest from setup.py and other minor changes. ( #7700 )
2020-03-23 08:46:56 -07:00
SangBin Cho
79767fe425
Fix wording in dashboard documentation. ( #7703 )
2020-03-22 22:16:40 -07:00
Robert Nishihara
8b4c2b7e88
Remove unnecessary handling of setproctitle and psutil. ( #7702 )
2020-03-22 22:06:42 -07:00
Robert Nishihara
4d722bf003
Remove dependence on funcsigs. ( #7701 )
2020-03-22 21:37:24 -07:00
Richard Liaw
81d311031b
[tune] Update API Reference Page ( #7671 )
...
* widerdocs
* init
* docs
* fix
* moveit
* mix
* better_docs
* remove
* Apply suggestions from code review
Co-Authored-By: Sven Mika <sven@anyscale.io>
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-03-22 16:42:20 -07:00
Eric Liang
288933ec6b
[rllib] Fix shared metrics context in parallel iterators ( #7666 )
...
* debug
* build
* update
* wip
* wpi
* update
* recurisve sync
* comment
* stream
* fix
* Update .travis.yml
2020-03-22 14:15:01 -07:00
SangBin Cho
1b90196bef
[doc] Dashboard documentation ( #7304 )
...
* Completed the first half of dashboard documentation.
* Dashboard document initial versions.
* Formatting.
* Fixed tune note is not visible.
* Half of comments from code reivew are handled.
* Fixed based on code review.
* Improved memory usage page.
* Addressed code review.
* Fixed image not found issue.
* Add gitkeep again.
* Refactored document.
* Addressed Robert's feedback.
* Addressed code reviews.
* Addressed last comments.
* Update doc/source/ray-dashboard.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-20 22:00:33 -07:00
Edward Oakes
31845f17a5
[docs] Add documentation for reference counting and 'ray memory' ( #7661 )
2020-03-20 15:47:00 -05:00
Eric Liang
9392cdbf74
[rllib] Add high-performance external application connector ( #7641 )
2020-03-20 12:43:57 -07:00
Eric Liang
745b9d643d
First pass at ray memory
command for memory debugging ( #7589 )
2020-03-17 20:45:07 -07:00
Landcold7
e6a045df48
Fix typo in asyncio documentation ( #7602 )
2020-03-17 10:37:37 -05:00
Scott Graham
37e4d29f87
[autoscaler] Adding Azure Support ( #7080 )
...
* adding directory and node_provider entry for azure autoscaler
* adding initial cut at azure autoscaler functionality, needs testing and node_provider methods need updating
* adding todos and switching to auth file for service principal authentication
* adding role / scope to service principal
* resolving issues with app credentials
* adding retry for setting service principal role
* typo and adding retry to nic creation
* adding nsg to config, moving nic/public ip to node provider, cleanup node_provider, leaving in NodeProvider stub for testing
* linting
* updating cleanup and fixing bugs
* adding directory and node_provider entry for azure autoscaler
* adding initial cut at azure autoscaler functionality, needs testing and node_provider methods need updating
* adding todos and switching to auth file for service principal authentication
* adding role / scope to service principal
* resolving issues with app credentials
* adding retry for setting service principal role
* typo and adding retry to nic creation
* adding nsg to config, moving nic/public ip to node provider, cleanup node_provider, leaving in NodeProvider stub for testing
* linting
* updating cleanup and fixing bugs
* minor fixes
* first working version :)
* added tag support
* added msi identity intermediate
* enable MSI through user managed identity
* updated schema
* extend yaml schema
remove service principal code
add re-use of managed user identity
* fix rg_id
* fix logging
* replace manual cluster yaml validation with json schema
- improved error message
- support for intellisense in VSCode (or other IDEs)
* run linting
* updating yaml configs and formatting
* updating yaml configs and formatting
* typo in example config
* pulling default config from example-full
* resetting min, init worker prop
* adding docs for azure autoscaler and fixing status
* add azure to docs, fix config for spot instances, update azure provider to avoid caching issues during deployment
* fix for default subscription in azure node provider
* vm dev image build
* minor change
* keeping example-full.yaml in autoscaler/azure, updating azure example config
* linting azure config
* extending retries on azure config
* lint
* support for internal ips, fix to azure docs, and new azure gpu example config
* linting
* Update python/ray/autoscaler/azure/node_provider.py
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
* revert_this
* remove_schema
* updating configs and removing ssh keygen, tweak azure node provider terminate
* minor tweaks
Co-authored-by: Markus Cozowicz <marcozo@microsoft.com>
Co-authored-by: Ubuntu <marcozo@mc-ray-jumpbox.chcbtljllnieveqhw3e4c1ducc.xx.internal.cloudapp.net>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-15 14:48:27 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length ( #7503 )
...
* bulk rename
* deprecation warn
* update doc
* update fig
* line length
* rename
* make pytest comptaible
* fix test
* fi sys
* rename
* wip
* fix more
* lint
* update svg
* comments
* lint
* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang
52cf77f5a9
[rllib] SAC no_done_at_end should default to False ( #7594 )
...
* update
* update doc
* stochastic
* cleanu
2020-03-14 11:16:54 -07:00
Markus Cozowicz
ba1b081477
Azure Portal cluster deployment | Support spot instances ( #7558 )
...
* added priority option
* added head node priority
* upgrade api version
2020-03-11 18:46:11 -07:00
Richard Liaw
b38ed4be71
[raysgd] Fix More Docs ( #7565 )
2020-03-11 14:17:47 -07:00
Richard Liaw
d046faeb9c
[sgd] Readme fix ( #7564 )
...
* readme fix
* replicas
2020-03-11 13:40:18 -07:00
Richard Liaw
b70f31339c
[sgd] Benchmark Fixes ( #7553 )
...
* fix
* fix
2020-03-11 13:08:27 -07:00
Richard Liaw
fbac256982
[sgd] Add benchmarks ( #7454 )
...
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint'
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* first_pass
* add overrides
* override
* fixing up operators
* format
* sgd
* constants
* rm
* revert
* save
* failures
* fixes
* trainer
* run test
* operator
* code
* op
* ok done
* operator
* sgd test fixes
* ok
* trainer
* format
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* Update doc/source/raysgd/raysgd_pytorch.rst
* docstring
* dcgan
* doc
* commits
* nit
* testing
* revert
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* benchmarks
* rename
* remove some args
* better metrics output
* fix up the benchmark
* benchmark-yaml
* horovod-benchmark
* benchmarks
* Remove benchmark code for cleanups
* benchmark-code
* nits
* benchmark yamls
* benchmark yaml
* ok
* ok
* ok
* benchmark
* nit
* finish_bench
* makedatacreator
* relax
* metrics
* autosetsampler
* profile
* movements
* OK
* smoothen
* fix
* nitdocs
* loss
* envflag
* comments
* nit
* format
* visible
* images
* move_images
* fix
* rernder
* rrender
* rest
* multgpu
* fix
* nit
* finish
* extrra
* setup
* revert
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Maksim Smolin <maximsmol@gmail.com>
2020-03-11 01:09:08 -07:00
Richard Liaw
d192ef0611
[raysgd] Cleanup User API ( #7384 )
...
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint'
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* first_pass
* add overrides
* override
* fixing up operators
* format
* sgd
* constants
* rm
* revert
* save
* failures
* fixes
* trainer
* run test
* operator
* code
* op
* ok done
* operator
* sgd test fixes
* ok
* trainer
* format
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* Update doc/source/raysgd/raysgd_pytorch.rst
* docstring
* dcgan
* doc
* commits
* nit
* testing
* revert
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* benchmarks
* rename
* remove some args
* better metrics output
* fix up the benchmark
* benchmark-yaml
* horovod-benchmark
* benchmarks
* Remove benchmark code for cleanups
* makedatacreator
* relax
* metrics
* autosetsampler
* profile
* movements
* OK
* smoothen
* fix
* nitdocs
* loss
* comments
* fix
* fix
* runner_tests
* codes
* example
* fix_test
* fix
* tests
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Maksim Smolin <maximsmol@gmail.com>
2020-03-10 08:41:42 -07:00
Anthony Yu
89ec4adb72
[tune] Dragonfly Optimizer ( #5955 )
...
* Add sample example
* Copy relevant lines of ask from inherited Optimizer
* Ignore strategy
* Additional changes
* Add DragonflySearch for tune connector for Dragonfly
* Add example and fix small errors
* lint
* Remove skopt references
* Update example based off of Dragonfly changes
* Edit example for final Dragonfly edits
* Formatting and documentation edits
* Add documentation and add to test pipeline
* Address PR comments
* Fix Jenkins test
* Adjust Dragonfly to PR#7366
* Lint
* fix_tests
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-10 08:40:36 -07:00
Edward Oakes
4ab80eafb9
Deprecate use_pickle flag ( #7474 )
2020-03-09 16:03:56 -07:00
Markus Cozowicz
e03259455f
[autoscaler] azure init script path ( #7515 )
2020-03-09 09:49:07 -07:00
Markus Cozowicz
145ebe14c7
added Azure Resource Manager (ARM) template ( #7494 )
...
* added Azure Resource Manager (ARM) template
* removed Azure doc (moved to separate PR)
* nit
* fixpaths
* nit
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-08 22:29:10 -07:00
Richard Liaw
115468de2c
[tune] Repeated evals ( #7366 )
...
* easyrepeat
* done
* suggest
* doc
* ok
* commit
* Apply suggestions from code review
Co-Authored-By: Ujval Misra <misraujval@gmail.com>
* Apply suggestions from code review
Co-Authored-By: Ujval Misra <misraujval@gmail.com>
* Apply suggestions from code review
* ok
* docs
Co-authored-by: Ujval Misra <misraujval@gmail.com>
2020-03-07 11:08:23 -08:00
Sven Mika
510c850651
[RLlib] SAC add discrete action support. ( #7320 )
...
* Exploration API (+EpsilonGreedy sub-class).
* Exploration API (+EpsilonGreedy sub-class).
* Cleanup/LINT.
* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).
* Add `error` option to deprecation_warning().
* WIP.
* Bug fix: Get exploration-info for tf framework.
Bug fix: Properly deprecate some DQN config keys.
* WIP.
* LINT.
* WIP.
* Split PerWorkerEpsilonGreedy out of EpsilonGreedy.
Docstrings.
* Fix bug in sampler.py in case Policy has self.exploration = None
* Update rllib/agents/dqn/dqn.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Update rllib/agents/trainer.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Change requests.
* LINT
* In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set
* Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).
* Update rllib/evaluation/worker_set.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Review fixes.
* Fix default value for DQN's exploration spec.
* LINT
* Fix recursion bug (wrong parent c'tor).
* Do not pass timestep to get_exploration_info.
* Update tf_policy.py
* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.
* Bug fix tf-action-dist
* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).
* Switch off exploration when getting action probs from off-policy-estimator's policy.
* LINT
* Fix test_checkpoint_restore.py.
* Deprecate all SAC exploration (unused) configs.
* Properly use `model.last_output()` everywhere. Instead of `model._last_output`.
* WIP.
* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).
* WIP.
* Trigger re-test (flaky checkpoint-restore test).
* WIP.
* WIP.
* Add test case for deterministic action sampling in PPO.
* bug fix.
* Added deterministic test cases for different Agents.
* Fix problem with TupleActions in dynamic-tf-policy.
* Separate supported_spaces tests so they can be run separately for easier debugging.
* LINT.
* Fix autoregressive_action_dist.py test case.
* Re-test.
* Fix.
* Remove duplicate py_test rule from bazel.
* LINT.
* WIP.
* WIP.
* SAC fix.
* SAC fix.
* WIP.
* WIP.
* WIP.
* FIX 2 examples tests.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Renamed test file.
* WIP.
* Add unittest.main.
* Make action_dist_class mandatory.
* fix
* FIX.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix explorations test case (contextlib cannot find its own nullcontext??).
* Force torch to be installed for QMIX.
* LINT.
* Fix determine_tests_to_run.py.
* Fix determine_tests_to_run.py.
* WIP
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Rename some stuff.
* Rename some stuff.
* WIP.
* update.
* WIP.
* Gumbel Softmax Dist.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP
* WIP.
* WIP.
* Hypertune.
* Hypertune.
* Hypertune.
* Lock-in.
* Cleanup.
* LINT.
* Fix.
* Update rllib/policy/eager_tf_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Fix items from review comments.
* Add dm_tree to RLlib dependencies.
* Add dm_tree to RLlib dependencies.
* Fix DQN test cases ((Torch)Categorical).
* Fix wrong pip install.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-06 10:37:12 -08:00
Eric Liang
476b5c6196
[Parallel Iterators] Allow for operator chaining after repartition ( #7268 )
...
* bug fix repartition
* change add_transform from private to inner
* formatting
* addressing comments
* formatting
2020-03-04 14:42:52 -08:00
Richard Liaw
c7f0b303f3
Mention that calling some_function.remote() is non-blocking ( #7417 )
...
* Mention that calling some_function.remote() is non-blocking.
* Apply suggestions from code review
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-04 13:35:46 -08:00
Richard Liaw
beddaf65b4
Small correction in documentation ( #7453 )
...
* corrected import statement in docs
* Update doc/source/tune-usage.rst
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-04 13:28:28 -08:00
Philipp Moritz
0d7ef46c83
Bazel improvements ( #7427 )
...
* Make wget quiet
* Make sphinx-build quiet
* Remove -q from pip install in CI script as config already takes care of it
* Add documentation on custom dependencies
* formatting
* python
2020-03-04 13:13:21 -08:00
Maksim Smolin
3a134c7224
[RaySGD] Rename PyTorch API endpoints to start with Torch ( #7425 )
...
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* rename
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-03 16:44:42 -08:00
Philipp Moritz
c2c6d96490
Fix install documentation on readthedocs ( #7423 )
2020-03-03 11:03:18 -08:00
Richard Liaw
48cdca843f
[raysgd] Custom training operator ( #7211 )
2020-03-01 21:22:48 -08:00
Sven Mika
2d97650b1e
[RLlib] Add Exploration API documentation. ( #7373 )
...
* Add Exploration API documentation.
* Add Exploration API documentation.
* Add Exploration API documentation.
* Update exporation docs.
2020-03-01 16:55:41 -08:00
Sven Mika
83e06cd30a
[RLlib] DDPG refactor and Exploration API action noise classes. ( #7314 )
...
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix
* WIP.
* Add TD3 quick Pendulum regresison.
* Cleanup.
* Fix.
* LINT.
* Fix.
* Sort quick_learning test cases, add TD3.
* Sort quick_learning test cases, add TD3.
* Revert test_checkpoint_restore.py (debugging) changes.
* Fix old soft_q settings in documentation and test configs.
* More doc fixes.
* Fix test case.
* Fix test case.
* Lower test load.
* WIP.
2020-03-01 11:53:35 -08:00
SangBin Cho
50145e668d
Fix the problem that ray.remote reference is not visible at a document. ( #7311 )
2020-02-28 14:03:08 -08:00
Edward Oakes
93fe4b0b58
Change actor.__ray_kill__() to ray.kill(actor) ( #7360 )
2020-02-28 11:55:13 -06:00
Simon Mo
29b08ddc09
Improve release process from 0.8.2 ( #7303 )
2020-02-24 21:18:53 -08:00
ijrsvt
325fc24afa
Removing unused Pyarrow Info ( #7207 )
2020-02-21 17:07:26 -08:00
Simon Mo
b804d40c04
Stop vendoring pyarrow ( #7233 )
2020-02-19 19:01:26 -08:00
Simon Mo
7bef7031c2
Revert "Revert "Revert "Removing Pyarrow dependency ( #7146 )" ( #7209 ) ( #7214 )" ( #7232 )
2020-02-19 13:35:29 -08:00
Simon Mo
e8941b1b79
Revert "Revert "Removing Pyarrow dependency ( #7146 )" ( #7209 ) ( #7214 )
2020-02-19 10:08:52 -08:00
Eric Liang
0aa9373d62
Revert "Removing Pyarrow dependency ( #7146 )" ( #7209 )
...
This reverts commit 2116fd3bca
.
2020-02-18 14:12:06 -08:00
Eric Liang
5df801605e
Add ray.util package and move libraries from experimental ( #7100 )
2020-02-18 13:43:19 -08:00
ijrsvt
2116fd3bca
Removing Pyarrow dependency ( #7146 )
2020-02-17 18:00:13 -08:00
Richard Liaw
94e2fcea2e
[sgd] fp16 (apex) and scheduler support + move examples page ( #7061 )
...
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint'
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* fix tests'
* testmode
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-02-16 19:04:08 -08:00
Edward Oakes
dc5a27dac0
Move ray.experimental.multiprocessing to ray.util.multiprocessing ( #7149 )
2020-02-14 16:17:05 -08:00
Richard Liaw
fc9352c588
[docs] Make walkthrough and starting Ray materials clear ( #7099 )
...
* make starting ray a separate page
* concept
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* more fics
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-02-11 23:17:30 -08:00