Landcold7
e6a045df48
Fix typo in asyncio documentation ( #7602 )
2020-03-17 10:37:37 -05:00
Scott Graham
37e4d29f87
[autoscaler] Adding Azure Support ( #7080 )
...
* adding directory and node_provider entry for azure autoscaler
* adding initial cut at azure autoscaler functionality, needs testing and node_provider methods need updating
* adding todos and switching to auth file for service principal authentication
* adding role / scope to service principal
* resolving issues with app credentials
* adding retry for setting service principal role
* typo and adding retry to nic creation
* adding nsg to config, moving nic/public ip to node provider, cleanup node_provider, leaving in NodeProvider stub for testing
* linting
* updating cleanup and fixing bugs
* adding directory and node_provider entry for azure autoscaler
* adding initial cut at azure autoscaler functionality, needs testing and node_provider methods need updating
* adding todos and switching to auth file for service principal authentication
* adding role / scope to service principal
* resolving issues with app credentials
* adding retry for setting service principal role
* typo and adding retry to nic creation
* adding nsg to config, moving nic/public ip to node provider, cleanup node_provider, leaving in NodeProvider stub for testing
* linting
* updating cleanup and fixing bugs
* minor fixes
* first working version :)
* added tag support
* added msi identity intermediate
* enable MSI through user managed identity
* updated schema
* extend yaml schema
remove service principal code
add re-use of managed user identity
* fix rg_id
* fix logging
* replace manual cluster yaml validation with json schema
- improved error message
- support for intellisense in VSCode (or other IDEs)
* run linting
* updating yaml configs and formatting
* updating yaml configs and formatting
* typo in example config
* pulling default config from example-full
* resetting min, init worker prop
* adding docs for azure autoscaler and fixing status
* add azure to docs, fix config for spot instances, update azure provider to avoid caching issues during deployment
* fix for default subscription in azure node provider
* vm dev image build
* minor change
* keeping example-full.yaml in autoscaler/azure, updating azure example config
* linting azure config
* extending retries on azure config
* lint
* support for internal ips, fix to azure docs, and new azure gpu example config
* linting
* Update python/ray/autoscaler/azure/node_provider.py
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
* revert_this
* remove_schema
* updating configs and removing ssh keygen, tweak azure node provider terminate
* minor tweaks
Co-authored-by: Markus Cozowicz <marcozo@microsoft.com>
Co-authored-by: Ubuntu <marcozo@mc-ray-jumpbox.chcbtljllnieveqhw3e4c1ducc.xx.internal.cloudapp.net>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-15 14:48:27 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length ( #7503 )
...
* bulk rename
* deprecation warn
* update doc
* update fig
* line length
* rename
* make pytest compatible
* fix test
* fi sys
* rename
* wip
* fix more
* lint
* update svg
* comments
* lint
* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang
52cf77f5a9
[rllib] SAC no_done_at_end should default to False ( #7594 )
...
* update
* update doc
* stochastic
* cleanup
2020-03-14 11:16:54 -07:00
Richard Liaw
b38ed4be71
[raysgd] Fix More Docs ( #7565 )
2020-03-11 14:17:47 -07:00
Richard Liaw
d046faeb9c
[sgd] Readme fix ( #7564 )
...
* readme fix
* replicas
2020-03-11 13:40:18 -07:00
Richard Liaw
b70f31339c
[sgd] Benchmark Fixes ( #7553 )
...
* fix
* fix
2020-03-11 13:08:27 -07:00
Richard Liaw
fbac256982
[sgd] Add benchmarks ( #7454 )
...
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint'
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* first_pass
* add overrides
* override
* fixing up operators
* format
* sgd
* constants
* rm
* revert
* save
* failures
* fixes
* trainer
* run test
* operator
* code
* op
* ok done
* operator
* sgd test fixes
* ok
* trainer
* format
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* Update doc/source/raysgd/raysgd_pytorch.rst
* docstring
* dcgan
* doc
* commits
* nit
* testing
* revert
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* benchmarks
* rename
* remove some args
* better metrics output
* fix up the benchmark
* benchmark-yaml
* horovod-benchmark
* benchmarks
* Remove benchmark code for cleanups
* benchmark-code
* nits
* benchmark yamls
* benchmark yaml
* ok
* ok
* ok
* benchmark
* nit
* finish_bench
* makedatacreator
* relax
* metrics
* autosetsampler
* profile
* movements
* OK
* smoothen
* fix
* nitdocs
* loss
* envflag
* comments
* nit
* format
* visible
* images
* move_images
* fix
* rernder
* rrender
* rest
* multgpu
* fix
* nit
* finish
* extrra
* setup
* revert
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Maksim Smolin <maximsmol@gmail.com>
2020-03-11 01:09:08 -07:00
Richard Liaw
d192ef0611
[raysgd] Cleanup User API ( #7384 )
...
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint'
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* first_pass
* add overrides
* override
* fixing up operators
* format
* sgd
* constants
* rm
* revert
* save
* failures
* fixes
* trainer
* run test
* operator
* code
* op
* ok done
* operator
* sgd test fixes
* ok
* trainer
* format
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* Update doc/source/raysgd/raysgd_pytorch.rst
* docstring
* dcgan
* doc
* commits
* nit
* testing
* revert
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* benchmarks
* rename
* remove some args
* better metrics output
* fix up the benchmark
* benchmark-yaml
* horovod-benchmark
* benchmarks
* Remove benchmark code for cleanups
* makedatacreator
* relax
* metrics
* autosetsampler
* profile
* movements
* OK
* smoothen
* fix
* nitdocs
* loss
* comments
* fix
* fix
* runner_tests
* codes
* example
* fix_test
* fix
* tests
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Maksim Smolin <maximsmol@gmail.com>
2020-03-10 08:41:42 -07:00
Anthony Yu
89ec4adb72
[tune] Dragonfly Optimizer ( #5955 )
...
* Add sample example
* Copy relevant lines of ask from inherited Optimizer
* Ignore strategy
* Additional changes
* Add DragonflySearch for tune connector for Dragonfly
* Add example and fix small errors
* lint
* Remove skopt references
* Update example based off of Dragonfly changes
* Edit example for final Dragonfly edits
* Formatting and documentation edits
* Add documentation and add to test pipeline
* Address PR comments
* Fix Jenkins test
* Adjust Dragonfly to PR#7366
* Lint
* fix_tests
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-10 08:40:36 -07:00
Markus Cozowicz
145ebe14c7
added Azure Resource Manager (ARM) template ( #7494 )
...
* added Azure Resource Manager (ARM) template
* removed Azure doc (moved to separate PR)
* nit
* fixpaths
* nit
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-08 22:29:10 -07:00
Richard Liaw
115468de2c
[tune] Repeated evals ( #7366 )
...
* easyrepeat
* done
* suggest
* doc
* ok
* commit
* Apply suggestions from code review
Co-Authored-By: Ujval Misra <misraujval@gmail.com>
* Apply suggestions from code review
Co-Authored-By: Ujval Misra <misraujval@gmail.com>
* Apply suggestions from code review
* ok
* docs
Co-authored-by: Ujval Misra <misraujval@gmail.com>
2020-03-07 11:08:23 -08:00
Sven Mika
510c850651
[RLlib] SAC add discrete action support. ( #7320 )
...
* Exploration API (+EpsilonGreedy sub-class).
* Exploration API (+EpsilonGreedy sub-class).
* Cleanup/LINT.
* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).
* Add `error` option to deprecation_warning().
* WIP.
* Bug fix: Get exploration-info for tf framework.
Bug fix: Properly deprecate some DQN config keys.
* WIP.
* LINT.
* WIP.
* Split PerWorkerEpsilonGreedy out of EpsilonGreedy.
Docstrings.
* Fix bug in sampler.py in case Policy has self.exploration = None
* Update rllib/agents/dqn/dqn.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Update rllib/agents/trainer.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Change requests.
* LINT
* In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set
* Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).
* Update rllib/evaluation/worker_set.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Review fixes.
* Fix default value for DQN's exploration spec.
* LINT
* Fix recursion bug (wrong parent c'tor).
* Do not pass timestep to get_exploration_info.
* Update tf_policy.py
* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.
* Bug fix tf-action-dist
* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).
* Switch off exploration when getting action probs from off-policy-estimator's policy.
* LINT
* Fix test_checkpoint_restore.py.
* Deprecate all SAC exploration (unused) configs.
* Properly use `model.last_output()` everywhere. Instead of `model._last_output`.
* WIP.
* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).
* WIP.
* Trigger re-test (flaky checkpoint-restore test).
* WIP.
* WIP.
* Add test case for deterministic action sampling in PPO.
* bug fix.
* Added deterministic test cases for different Agents.
* Fix problem with TupleActions in dynamic-tf-policy.
* Separate supported_spaces tests so they can be run separately for easier debugging.
* LINT.
* Fix autoregressive_action_dist.py test case.
* Re-test.
* Fix.
* Remove duplicate py_test rule from bazel.
* LINT.
* WIP.
* WIP.
* SAC fix.
* SAC fix.
* WIP.
* WIP.
* WIP.
* FIX 2 examples tests.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Renamed test file.
* WIP.
* Add unittest.main.
* Make action_dist_class mandatory.
* fix
* FIX.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix explorations test case (contextlib cannot find its own nullcontext??).
* Force torch to be installed for QMIX.
* LINT.
* Fix determine_tests_to_run.py.
* Fix determine_tests_to_run.py.
* WIP
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Rename some stuff.
* Rename some stuff.
* WIP.
* update.
* WIP.
* Gumbel Softmax Dist.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP
* WIP.
* WIP.
* Hypertune.
* Hypertune.
* Hypertune.
* Lock-in.
* Cleanup.
* LINT.
* Fix.
* Update rllib/policy/eager_tf_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Fix items from review comments.
* Add dm_tree to RLlib dependencies.
* Add dm_tree to RLlib dependencies.
* Fix DQN test cases ((Torch)Categorical).
* Fix wrong pip install.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-06 10:37:12 -08:00
Eric Liang
476b5c6196
[Parallel Iterators] Allow for operator chaining after repartition ( #7268 )
...
* bug fix repartition
* change add_transform from private to inner
* formatting
* addressing comments
* formatting
2020-03-04 14:42:52 -08:00
Richard Liaw
c7f0b303f3
Mention that calling some_function.remote() is non-blocking ( #7417 )
...
* Mention that calling some_function.remote() is non-blocking.
* Apply suggestions from code review
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-04 13:35:46 -08:00
Richard Liaw
beddaf65b4
Small correction in documentation ( #7453 )
...
* corrected import statement in docs
* Update doc/source/tune-usage.rst
Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-04 13:28:28 -08:00
Philipp Moritz
0d7ef46c83
Bazel improvements ( #7427 )
...
* Make wget quiet
* Make sphinx-build quiet
* Remove -q from pip install in CI script as config already takes care of it
* Add documentation on custom dependencies
* formatting
* python
2020-03-04 13:13:21 -08:00
Maksim Smolin
3a134c7224
[RaySGD] Rename PyTorch API endpoints to start with Torch ( #7425 )
...
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* rename
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-03 16:44:42 -08:00
Philipp Moritz
c2c6d96490
Fix install documentation on readthedocs ( #7423 )
2020-03-03 11:03:18 -08:00
Richard Liaw
48cdca843f
[raysgd] Custom training operator ( #7211 )
2020-03-01 21:22:48 -08:00
Sven Mika
2d97650b1e
[RLlib] Add Exploration API documentation. ( #7373 )
...
* Add Exploration API documentation.
* Add Exploration API documentation.
* Add Exploration API documentation.
* Update exploration docs.
2020-03-01 16:55:41 -08:00
Sven Mika
83e06cd30a
[RLlib] DDPG refactor and Exploration API action noise classes. ( #7314 )
...
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix
* WIP.
* Add TD3 quick Pendulum regression.
* Cleanup.
* Fix.
* LINT.
* Fix.
* Sort quick_learning test cases, add TD3.
* Sort quick_learning test cases, add TD3.
* Revert test_checkpoint_restore.py (debugging) changes.
* Fix old soft_q settings in documentation and test configs.
* More doc fixes.
* Fix test case.
* Fix test case.
* Lower test load.
* WIP.
2020-03-01 11:53:35 -08:00
SangBin Cho
50145e668d
Fix the problem that ray.remote reference is not visible at a document. ( #7311 )
2020-02-28 14:03:08 -08:00
Edward Oakes
93fe4b0b58
Change actor.__ray_kill__() to ray.kill(actor) ( #7360 )
2020-02-28 11:55:13 -06:00
ijrsvt
325fc24afa
Removing unused Pyarrow Info ( #7207 )
2020-02-21 17:07:26 -08:00
Eric Liang
5df801605e
Add ray.util package and move libraries from experimental ( #7100 )
2020-02-18 13:43:19 -08:00
Richard Liaw
94e2fcea2e
[sgd] fp16 (apex) and scheduler support + move examples page ( #7061 )
...
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint'
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* fix tests'
* testmode
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-02-16 19:04:08 -08:00
Edward Oakes
dc5a27dac0
Move ray.experimental.multiprocessing to ray.util.multiprocessing ( #7149 )
2020-02-14 16:17:05 -08:00
Richard Liaw
fc9352c588
[docs] Make walkthrough and starting Ray materials clear ( #7099 )
...
* make starting ray a separate page
* concept
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* more fixes
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-02-11 23:17:30 -08:00
Simon Mo
039d2cde88
Change log level for OMP warning ( #7114 )
2020-02-11 14:15:38 -08:00
Eric Liang
026f6884b5
[rllib] Add Decentralized DDPPO trainer and documentation ( #7088 )
2020-02-10 15:28:27 -08:00
Alex Wu
3f99be8dad
Add 'ray dashboard' command ( #6959 )
2020-02-10 12:55:21 -08:00
Eric Liang
48e2adbc21
[tune] Remove unused TF loggers ( #7090 )
2020-02-09 13:58:24 -08:00
fyrestone
0648bd28ef
[xlang] Cross language Python support ( #6709 )
2020-02-08 13:01:28 +08:00
Sven Mika
0e3960893a
[RLlib] Add rainbow config hint to algo-documentation. ( #7052 )
2020-02-05 12:01:43 -08:00
Dean Wampler
9b9c7f86f7
Improved contributor instructions ( #7026 )
...
* Added small section on installation when using Anaconda. Also fixed an obsolete link to Anaconda.
* Delete more temporary directories when running the doc "make clean".
* Added a link to the docs for building and contributing
* Minor comment
2020-02-04 14:07:51 -08:00
Eric Liang
dc7e78deb4
[rllib] Add -v and --torch flags to first page of docs ( #7032 )
...
* add verbose doc
* torch
2020-02-04 10:17:51 -08:00
Eric Liang
fbc545c03b
[rllib] Support parallel, parameterized evaluation ( #6981 )
...
* eval api
* update
* sync eval filters
* sync fix
* docs
* update
* docs
* update
* link
* nit
* doc updates
* format
2020-02-01 22:12:12 -08:00
Yutai Zhou
9b6794cbb0
[rllib] updated policy definition link ( #6989 )
2020-01-31 16:22:11 -08:00
Edward Oakes
4a78b60cf7
Remove link to meetup RSVP from docs ( #6995 )
2020-01-31 11:32:50 -08:00
Ameer Haj Ali
b8135da122
Adding dependencies for scikit-learn in travis ( #6969 )
...
* Revert "Revert "Support of scikit-learn with ray joblib backend (#6925 )" (#6957 )"
This reverts commit 86100bc119.
* adding scikit-learn to dependencies
2020-01-30 09:46:54 -08:00
Simon Mo
1e3a34b223
Rewrite the async api documentation ( #6936 )
...
* Rewrite the async api documentation
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* clarify comment
* Add quickstart
* Add reference for async in ray.get ray.wait docstring
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-01-30 09:34:09 -08:00
Richard Liaw
5ab395236b
[tune] Experiment stopping API ( #6886 )
2020-01-30 00:34:08 -08:00
Simon Mo
5bdfc50bf6
Update the macos wheel name ( #6961 )
2020-01-29 15:23:43 -08:00
Eric Liang
86100bc119
Revert "Support of scikit-learn with ray joblib backend ( #6925 )" ( #6957 )
...
This reverts commit a7ecda6017.
2020-01-29 14:56:09 -08:00
Richard Liaw
037aa2b961
[sgd] Refactor PyTorch SGD Documentation. ( #6910 )
...
* Refactor documentation and directory structure
* update loss
* more examples
* fix comments
* more code
* svgs
* formatting
* more_docs
* more writing
* comments ready
* move
* whitespace
* examples
* fix
* bold
* pytorch
* batch
* fix
* fix test
* Apply suggestions from code review
* quarantinegp
* tests/
* fix missing
2020-01-29 08:51:01 -08:00
Ameer Haj Ali
81238945b9
Update index.rst ( #6935 )
2020-01-27 18:35:48 -06:00
Eric Liang
e659699ca9
[tune] Fix directory naming regression ( #6839 )
2020-01-27 15:53:40 -08:00
Richard Liaw
e0078a0d78
[autoscaler][minor] default -> latest_dlami ( #6922 )
...
* config
* latest
* Update python/ray/autoscaler/aws/config.py
2020-01-27 14:34:07 -08:00
Ameer Haj Ali
a7ecda6017
Support of scikit-learn with ray joblib backend ( #6925 )
2020-01-27 15:00:00 -06:00