Eric Liang
9d012626e5
[rllib] Distributed exec workflow for impala ( #8321 )
2020-05-11 20:24:43 -07:00
mehrdadn
66b3edccb9
Prefer built-in system compilers over Clang download ( #8355 )
...
Co-authored-by: Mehrdad <noreply@github.com>
2020-05-11 11:53:35 -05:00
Sven Mika
2b0817cbd3
[RLlib] Retry pip installs (after waiting n seconds) in install-dependencies.sh ( #8354 )
2020-05-07 17:39:35 +02:00
Simon Mo
c5a5a5de89
[Serve] Refactor Metric System: Counter + Measure Support ( #8114 )
2020-05-06 17:44:02 -07:00
mehrdadn
4bdef78e2e
Various CI fixes and cleanup ( #8289 )
2020-05-05 10:47:49 -07:00
Maksim Smolin
c2acb7ffe2
[SGD] Add imagenet example CI ( #8150 )
2020-05-02 16:48:35 -07:00
mehrdadn
ff68fb8c7c
Try to fix tests running all the time ( #8280 )
...
Co-authored-by: Mehrdad <noreply@github.com>
2020-05-02 15:37:52 -05:00
Edward Oakes
22cab930cd
Retry actor failures in serve failure test ( #8282 )
2020-05-02 10:19:44 -05:00
Edward Oakes
019030cb4d
Add long-running serve failure test ( #8277 )
2020-05-01 21:07:14 -05:00
mehrdadn
bf074073e7
Deploy Windows wheels to Amazon S3 ( #8237 )
...
* Deploy to Amazon S3
* Install specifically requested Python version
Co-authored-by: Mehrdad <noreply@github.com>
2020-05-01 14:08:57 -07:00
Edward Oakes
13f718846d
[serve] Always use internal KV store ( #8270 )
2020-05-01 14:18:18 -05:00
Edward Oakes
421b3c9d8b
Fix serve long running test ( #8268 )
2020-05-01 11:54:27 -05:00
mehrdadn
254b1ec370
Set up testing and wheels for Windows on GitHub Actions ( #8131 )
...
* Move some Java tests into ci.sh
* Move C++ worker tests into ci.sh
* Define run()
* Prepare to move Python tests into ci.sh
* Fix issues in install-dependencies.sh
* Reload environment for GitHub Actions
* Move wheels to ci.sh and fix related issues
* Don't bypass failures in install-ray.sh anymore
* Make CI a little quieter
* Move linting into ci.sh
* Add vitals test right after build
* Fix os.uname() unavailability on Windows
Co-authored-by: Mehrdad <noreply@github.com>
2020-04-29 21:19:02 -07:00
Simon Mo
1b1fe0cc5b
Fix Serve long running test ( #8223 )
2020-04-29 09:32:39 -07:00
Sven Mika
eb91619175
Fix release 0.8.5 tests for PPO torch Breakout. ( #8226 )
2020-04-29 10:36:41 +02:00
Simon Mo
101255f782
[Serve] RayServe TF, PyTorch, Sklearn Examples ( #8156 )
2020-04-28 22:24:55 -07:00
Richard Liaw
87557a00fa
[tune] Refactor search algorithms ( #7037 )
...
* start refactoring of search algorithms
* format
* needs tests
* fix
* suggestions
* Fix PBT
* lint
* refactoring
* hyperopt_working
* dragonfly
* hyperopt
* change_half_of_algs
* save
* code-removed
* remove_lots_of_unneccessary
* changes
* formatting
* suggest
* reset
* rm
* tests
* search-change
* exception
* refactor-doc
* search
* py
* moredocs
* Update doc/source/tune-searchalg.rst
* concurrency
* max
* tune
* betterwarning
* bohb
* tests
* test-change
Co-authored-by: ujvl <misraujval@gmail.com>
2020-04-27 08:51:13 -07:00
mehrdadn
0a54407961
[CI] Factor out more Travis code and update GitHub Actions ( #8085 )
2020-04-21 09:53:08 -07:00
Richard Liaw
9f3e9e7e9f
[tune] Add more intensive tests ( #7667 )
...
* make_heavier_tests
* help
2020-04-20 11:14:44 -07:00
Richard Liaw
6545534805
[tune/sgd] DCGAN example self-contained, turn example into modu… ( #8012 )
...
* ok
* done
* run_benchmarks
* should_make_examples_usable
2020-04-16 17:55:27 -07:00
mehrdadn
42f88ecf9d
Hotfix CI Export Tests to Skip ( #8058 )
...
Co-authored-by: Mehrdad <noreply@github.com>
2020-04-16 15:23:00 -07:00
Servon
5c274fe631
[Tune] Add ZOOpt search algorithm ( #7960 )
...
* add zoopt
* add zoopt search algo
* add zoopt
* fix zoopt
* add zoopt requirements
* fix zoopt
* remove generated guides
* Apply suggestions from code review
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-04-15 21:13:29 -07:00
mehrdadn
956ea7c944
Hotfix CI determine_tests_to_run ( #8039 )
2020-04-15 17:00:38 -07:00
mehrdadn
ba00c29b67
Factor out Travis 'install' sections for use with GitHub Actions ( #7988 )
2020-04-15 08:10:22 -07:00
mehrdadn
4aa68b82fa
[CI] Various Improvements to Travis Scripts ( #7956 )
...
* Delete LINT section of install-ray.sh since it appears unused
* Delete install.sh since it appears unused
* Delete run_test.sh since it appears unused
* Put environment variables on separate lines in .travis.yml
* Move --jobs 50 out of install-ray.sh
* Delete upgrade-syn.sh since it appears unused
* Move CI bazel flags to .bazelrc via --config
* Make installations quieter
* Get rid of verbose Maven messages
* Install Bazel system-wide for CI so that there's no need to update PATH
* Recognize Windows as valid platform
Co-authored-by: Mehrdad <noreply@github.com>
2020-04-10 13:26:28 -07:00
Sven Mika
0a5b6d1f57
[Testing] Do not run any non-RLlib/core tests if only RLlib affected (except wheels). ( #7892 )
...
* Do not run any non-RLlib/core tests if only RLlib affected, except for generating the 2 wheels (OSX and Linux).
* Test noop RLlib change.
* Test noop RLlib change.
* Fix broken RLlib tests in master.
* Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier).
* Fix error_outputs option in BAZEL for RLlib regression tests.
* Fix.
* Test.
* WIP.
* Add env flag RAY_CI_ONLY_RLLIB_AFFECTED to refrain from testing most ray-core stuff (except wheels) if only RLlib changed.
* Test RLlib-only change.
2020-04-09 14:36:06 -07:00
Simon Mo
59867dad75
Move Jenkins test to Github action ( #7342 )
2020-04-09 10:27:19 -07:00
mehrdadn
65054a2c7c
Python 3.8 compatibility ( #7754 )
2020-04-01 10:03:23 -07:00
Richard Liaw
24bf6ad607
[raysgd] Improve raysgd examples ( #7818 )
...
* better_example
* test
* improve some usability things
* submit
* fix
* flake
* Update python/ray/util/sgd/torch/training_operator.py
* trythis
* fix
* fix
* smoke
* fail
* fix
* fix
2020-04-01 08:58:39 -07:00
mehrdadn
f86e623095
Fix & improve GitHub Actions CI builds ( #7784 )
2020-03-30 16:29:54 -07:00
Edward Oakes
d87563937e
Revert "[Dashboard] Metrics Export Service. ( #7728 )" ( #7789 )
2020-03-28 19:27:34 -07:00
Simon Mo
838c1e854f
Add results from 0.8.3 release ( #7745 )
2020-03-27 11:14:15 -07:00
SongGuyang
c195dc8f88
Basic C++ worker implementation ( #6125 )
2020-03-27 23:01:08 +08:00
SangBin Cho
7a0befb0a7
[Dashboard] Metrics Export Service. ( #7728 )
2020-03-26 14:03:00 -07:00
Robert Nishihara
1a0c9228d0
Remove pytest from setup.py and other minor changes. ( #7700 )
2020-03-23 08:46:56 -07:00
Robert Nishihara
8b4c2b7e88
Remove unnecessary handling of setproctitle and psutil. ( #7702 )
2020-03-22 22:06:42 -07:00
tison
ffeab5d2bf
Support configurable python executable in format.sh ( #7513 )
2020-03-14 12:27:41 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length ( #7503 )
...
* bulk rename
* deprecation warn
* update doc
* update fig
* line length
* rename
* make pytest compatible
* fix test
* fi sys
* rename
* wip
* fix more
* lint
* update svg
* comments
* lint
* fix use of batch steps
2020-03-14 12:05:04 -07:00
Ujval Misra
6022eb53c4
[tune] Use newest checkpoint in normal operation ( #7563 )
...
* Use persistent checkpoint for failures
* Fix test
* Add unpause test
* move test
* Fix tests
* remove debug statement
* Mark test as flaky
2020-03-12 22:21:42 -07:00
Richard Liaw
d192ef0611
[raysgd] Cleanup User API ( #7384 )
...
* Init fp16
* fp16 and schedulers
* scheduler linking and fp16
* to fp16
* loss scaling and documentation
* more documentation
* add tests, refactor config
* moredocs
* more docs
* fix logo, add test mode, add fp16 flag
* fix tests
* fix scheduler
* fix apex
* improve safety
* fix tests
* fix tests
* remove pin memory default
* rm
* fix
* Update doc/examples/doc_code/raysgd_torch_signatures.py
* fix
* migrate changes from other PR
* ok thanks
* pass
* signatures
* lint
* Update python/ray/experimental/sgd/pytorch/utils.py
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* should address most comments
* comments
* fix this ci
* first_pass
* add overrides
* override
* fixing up operators
* format
* sgd
* constants
* rm
* revert
* save
* failures
* fixes
* trainer
* run test
* operator
* code
* op
* ok done
* operator
* sgd test fixes
* ok
* trainer
* format
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* Update doc/source/raysgd/raysgd_pytorch.rst
* docstring
* dcgan
* doc
* commits
* nit
* testing
* revert
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* benchmarks
* rename
* remove some args
* better metrics output
* fix up the benchmark
* benchmark-yaml
* horovod-benchmark
* benchmarks
* Remove benchmark code for cleanups
* makedatacreator
* relax
* metrics
* autosetsampler
* profile
* movements
* OK
* smoothen
* fix
* nitdocs
* loss
* comments
* fix
* fix
* runner_tests
* codes
* example
* fix_test
* fix
* tests
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Maksim Smolin <maximsmol@gmail.com>
2020-03-10 08:41:42 -07:00
Anthony Yu
89ec4adb72
[tune] Dragonfly Optimizer ( #5955 )
...
* Add sample example
* Copy relevant lines of ask from inherited Optimizer
* Ignore strategy
* Additional changes
* Add DragonflySearch for tune connector for Dragonfly
* Add example and fix small errors
* lint
* Remove skopt references
* Update example based off of Dragonfly changes
* Edit example for final Dragonfly edits
* Formatting and documentation edits
* Add documentation and add to test pipeline
* Address PR comments
* Fix Jenkins test
* Adjust Dragonfly to PR#7366
* Lint
* fix_tests
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-10 08:40:36 -07:00
Landcold7
beb9b02dbd
Add numba test ( #7298 ) ( #7487 )
2020-03-07 11:12:25 -08:00
Sven Mika
510c850651
[RLlib] SAC add discrete action support. ( #7320 )
...
* Exploration API (+EpsilonGreedy sub-class).
* Exploration API (+EpsilonGreedy sub-class).
* Cleanup/LINT.
* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).
* Add `error` option to deprecation_warning().
* WIP.
* Bug fix: Get exploration-info for tf framework.
Bug fix: Properly deprecate some DQN config keys.
* WIP.
* LINT.
* WIP.
* Split PerWorkerEpsilonGreedy out of EpsilonGreedy.
Docstrings.
* Fix bug in sampler.py in case Policy has self.exploration = None
* Update rllib/agents/dqn/dqn.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Update rllib/agents/trainer.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* WIP.
* Change requests.
* LINT
* In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set
* Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).
* Update rllib/evaluation/worker_set.py
Co-Authored-By: Eric Liang <ekhliang@gmail.com>
* Review fixes.
* Fix default value for DQN's exploration spec.
* LINT
* Fix recursion bug (wrong parent c'tor).
* Do not pass timestep to get_exploration_info.
* Update tf_policy.py
* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.
* Bug fix tf-action-dist
* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).
* Switch off exploration when getting action probs from off-policy-estimator's policy.
* LINT
* Fix test_checkpoint_restore.py.
* Deprecate all SAC exploration (unused) configs.
* Properly use `model.last_output()` everywhere. Instead of `model._last_output`.
* WIP.
* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).
* WIP.
* Trigger re-test (flaky checkpoint-restore test).
* WIP.
* WIP.
* Add test case for deterministic action sampling in PPO.
* bug fix.
* Added deterministic test cases for different Agents.
* Fix problem with TupleActions in dynamic-tf-policy.
* Separate supported_spaces tests so they can be run separately for easier debugging.
* LINT.
* Fix autoregressive_action_dist.py test case.
* Re-test.
* Fix.
* Remove duplicate py_test rule from bazel.
* LINT.
* WIP.
* WIP.
* SAC fix.
* SAC fix.
* WIP.
* WIP.
* WIP.
* FIX 2 examples tests.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Renamed test file.
* WIP.
* Add unittest.main.
* Make action_dist_class mandatory.
* fix
* FIX.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix explorations test case (contextlib cannot find its own nullcontext??).
* Force torch to be installed for QMIX.
* LINT.
* Fix determine_tests_to_run.py.
* Fix determine_tests_to_run.py.
* WIP
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Rename some stuff.
* Rename some stuff.
* WIP.
* update.
* WIP.
* Gumbel Softmax Dist.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP
* WIP.
* WIP.
* Hypertune.
* Hypertune.
* Hypertune.
* Lock-in.
* Cleanup.
* LINT.
* Fix.
* Update rllib/policy/eager_tf_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/agents/sac/sac_policy.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Update rllib/models/tf/tf_action_dist.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
* Fix items from review comments.
* Add dm_tree to RLlib dependencies.
* Add dm_tree to RLlib dependencies.
* Fix DQN test cases ((Torch)Categorical).
* Fix wrong pip install.
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-03-06 10:37:12 -08:00
Stephanie Wang
7c174d0ffe
Make the ref counting test more stressful ( #7473 )
2020-03-05 20:51:24 -08:00
Maksim Smolin
3a134c7224
[RaySGD] Rename PyTorch API endpoints to start with Torch ( #7425 )
...
* Start renaming pytorch to torch
* Rename PyTorchTrainer to TorchTrainer
* Rename PyTorch runners to Torch runners
* Finish renaming API
* Rename to torch in tests
* Finish renaming docs + tests
* Run format + fix DeprecationWarning
* fix
* move tests up
* rename
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-03-03 16:44:42 -08:00
mehrdadn
44aded5272
Bazel mirrors ( #7385 )
...
* Switch to mirrors.bazel.build where possible
* Switch from .zip to .tar.gz for smaller downloads (it's also the default download on UNIX)
* Use direct GitHub URLs in Bazel files for clarity
* Don't pass patches to local_repository
* Remove github_repository()
* Switch to GitHub actions/checkout@v2 which is faster
* Use faster extraction method for LLVM on Windows
* Move LLVM_VERSION_WINDOWS to the shell script since it's not a CI-specific value
* Change GITHUB_TOKEN to GITHUB
* Don't show timestamps for GitHub Actions
* Factor out some options from GitHub Actions
* Tell Bazel to stay on the same volume in GitHub Actions
* Display progress output when downloading toolchains
Co-authored-by: GitHub Web Flow <noreply@github.com>
2020-03-01 14:04:06 -08:00
Edward Oakes
ee0f71e398
Add __commit__ field to ray package in wheels ( #7305 )
2020-02-26 17:54:22 -08:00
mehrdadn
bcecf8b46b
Bazel improvements ( #7170 )
2020-02-26 12:28:13 -08:00
Simon Mo
29b08ddc09
Improve release process from 0.8.2 ( #7303 )
2020-02-24 21:18:53 -08:00
chaokunyang
8b6784de06
[Streaming] Streaming Python API ( #6755 )
2020-02-25 10:33:33 +08:00