* Update tf-example-sgd dependencies, AMI, and instance type
* Make PyTorch dependency optional
* Re-implement optional torch import
* Update tensorflow_train_example
* Set up tf-example-sgd config for SGD development
* Document the MultiWorkerMirroredStrategy behavior
* Run scripts/format
* Undo GPU default for CI
* Remove dev deploy file_mounts
* Update docs on tf_runner and tf_trainer
* Fix formatting
* Remove the debug file-mounts again
* Disable cifar example GPU usage by default so CI runs properly
* Mark failing PyTorch test as flaky
* Clarify the tf SGD sanity check
* Run format script
* Update tf-example-sgd.yaml
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Added histogram functionality to custom metrics infrastructure (shown in a separate TensorBoard tab)
* updated example to include histogram metric
* added histograms to TBXLogger
* add episode rewards
* lint
Co-authored-by: Eric Liang <ekhliang@gmail.com>
* adding local shuffle and corresponding tests
* fix quotes
* addressing comments and adding seed argument
* formatting
* fix formatting issues
* change test size from small to medium
* addressing comments
* Revert "Revert "Support of scikit-learn with ray joblib backend (#6925)" (#6957)"
This reverts commit 86100bc119.
* adding scikit-learn to dependencies
* Rewrite the async api documentation
* Apply suggestions from code review
Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com>
* clarify comment
* Add quickstart
* Add reference for async in ray.get ray.wait docstring
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
* Add `RandomEnv` example to examples folder.
Convert the warning into an error when using an LSTM in a non-shared-vf network (previously the program crashed right after the warning).
* LINT.
* Fix issue #6884. LSTM + non-shared vf NN + PPO crashes when using a Tuple action space.
* LINT
* Change warning message for Model: shared_vf=False, LSTM=True cases.
* Bug fix.
* Add examples/random_env.py test to Jenkins.
* Prevent MEMORY checkpoints from breaking FT
* Add save/pause/resume/restore test
* change checkpoint return value based on status
* Fix test_checkpoint_manager_tests.
* Fix test + checkpoint manager bug
* lint
* Add docstring
* Add docstring to checkpoint_manager constructor
* Change variable name for clarity
* Revert on_checkpoint docstring wording
* Break after success
* nit: more informative warning
* Quarantine test