Mirror of https://github.com/vale981/ray (synced 2025-03-08 19:41:38 -05:00).
Latest commit: Exploration API (+EpsilonGreedy sub-class)

* Exploration API (+EpsilonGreedy sub-class); see the epsilon-greedy sketch after this list.
* Cleanup/LINT.
* Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents).
* Add `error` option to deprecation_warning() (sketched after this list).
* WIP.
* Bug fix: Get exploration-info for tf framework. Bug fix: Properly deprecate some DQN config keys.
* WIP.
* LINT.
* WIP.
* Split PerWorkerEpsilonGreedy out of EpsilonGreedy. Docstrings.
* Fix bug in sampler.py in case Policy has self.exploration = None.
* Update rllib/agents/dqn/dqn.py (Co-Authored-By: Eric Liang <ekhliang@gmail.com>).
* WIP.
* Update rllib/agents/trainer.py (Co-Authored-By: Eric Liang <ekhliang@gmail.com>).
* WIP.
* Address change requests.
* LINT.
* In tune/utils/util.py::deep_update(): only keep deep-updating if both original and value are dicts; if value is not a dict, set it directly (merge rule sketched after this list).
* Completely obsolete sync_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps).
* Update rllib/evaluation/worker_set.py (Co-Authored-By: Eric Liang <ekhliang@gmail.com>).
* Review fixes.
* Fix default value for DQN's exploration spec.
* LINT.
* Fix recursion bug (wrong parent constructor).
* Do not pass timestep to get_exploration_info.
* Update tf_policy.py.
* Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs.
* Bug fix in tf-action-dist.
* DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG).
* Switch off exploration when getting action probs from off-policy-estimator's policy.
* LINT.
* Fix test_checkpoint_restore.py.
* Deprecate all (unused) SAC exploration configs.
* Properly use `model.last_output()` everywhere instead of `model._last_output`.
* WIP.
* Take out set_epsilon from multi-agent-env test (not needed, decays anyway).
* WIP.
* Trigger re-test (flaky checkpoint-restore test).
* WIP.
* WIP.
* Add test case for deterministic action sampling in PPO.
* Bug fix.
* Add deterministic test cases for different Agents.
* Fix problem with TupleActions in dynamic-tf-policy.
* Separate supported_spaces tests so they can be run separately for easier debugging.
* LINT.
* Fix autoregressive_action_dist.py test case.
* Re-test.
* Fix.
* Remove duplicate py_test rule from bazel.
* LINT.
* WIP.
* WIP.
* SAC fix.
* SAC fix.
* WIP.
* WIP.
* WIP.
* Fix 2 examples tests.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Fix.
* LINT.
* Renamed test file.
* WIP.
* Add unittest.main.
* Make action_dist_class mandatory.
* Fix.
* Fix.
* WIP.
* WIP.
* Fix.
* Fix.
* Fix explorations test case (contextlib cannot find its own nullcontext??).
* Force torch to be installed for QMIX.
* LINT.
* Fix determine_tests_to_run.py.
* Fix determine_tests_to_run.py.
* WIP.
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function).
* Rename some stuff.
* Rename some stuff.
* WIP.
* Update.
* WIP.
* Gumbel Softmax Dist. (sampling trick sketched after this list).
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* Hypertune.
* Hypertune.
* Hypertune.
* Lock-in.
* Cleanup.
* LINT.
* Fix.
* Update rllib/policy/eager_tf_policy.py (Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>).
* Update rllib/agents/sac/sac_policy.py (Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>).
* Update rllib/agents/sac/sac_policy.py (Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>).
* Update rllib/models/tf/tf_action_dist.py (Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>).
* Update rllib/models/tf/tf_action_dist.py (Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>).
* Fix items from review comments.
* Add dm_tree to RLlib dependencies.
* Add dm_tree to RLlib dependencies.
* Fix DQN test cases ((Torch)Categorical).
* Fix wrong pip install.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
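The PR's centerpiece is the new Exploration API with an EpsilonGreedy sub-class. As a hedged illustration of the technique only (the class and method names below are hypothetical, not RLlib's actual interface), a minimal epsilon-greedy selector with a linearly decaying schedule might look like:

```python
import random

class EpsilonGreedy:
    """Minimal epsilon-greedy sketch (hypothetical names, not RLlib's API).

    Picks a uniformly random action with probability epsilon, otherwise the
    greedy (argmax-Q) action; epsilon decays linearly from initial_eps to
    final_eps over epsilon_timesteps environment steps.
    """

    def __init__(self, initial_eps=1.0, final_eps=0.02, epsilon_timesteps=10000):
        self.initial_eps = initial_eps
        self.final_eps = final_eps
        self.epsilon_timesteps = epsilon_timesteps

    def epsilon(self, timestep):
        # Linear interpolation from initial_eps down to final_eps.
        frac = min(timestep / self.epsilon_timesteps, 1.0)
        return self.initial_eps + frac * (self.final_eps - self.initial_eps)

    def select_action(self, q_values, timestep, deterministic=False):
        # With deterministic=True (cf. the new generic Trainer config key),
        # exploration is switched off and we always act greedily.
        if deterministic or random.random() > self.epsilon(timestep):
            return max(range(len(q_values)), key=lambda a: q_values[a])
        return random.randrange(len(q_values))
```

Under the same reading, the PerWorkerEpsilonGreedy variant split out in the PR would give each rollout worker its own epsilon (a fixed per-worker value in the spirit of Ape-X) rather than one shared decaying schedule.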
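The `error` option added to deprecation_warning() plausibly escalates a soft warning into a hard failure. A sketch under that assumption (the signature and error type here are guesses, not Ray's actual utility):

```python
import warnings

def deprecation_warning(old, new=None, error=False):
    # Hypothetical sketch: with error=True, deprecated usage raises
    # (here a ValueError) instead of merely emitting a DeprecationWarning.
    msg = "`{}` has been deprecated.".format(old)
    if new is not None:
        msg += " Use `{}` instead.".format(new)
    if error:
        raise ValueError(msg)
    warnings.warn(msg, DeprecationWarning)
```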
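The deep_update() change above boils down to one rule: recurse only while both the existing entry and the incoming value are dicts; anything else simply overwrites. A self-contained sketch of that merge rule (illustrative, not the actual tune/utils/util.py code):

```python
def deep_update(original, new_dict):
    """Recursively merge new_dict into original (sketch of the rule above).

    Keeps recursing only while both the existing value and the new value
    are dicts; any non-dict value overwrites the existing entry.
    """
    for key, value in new_dict.items():
        if isinstance(original.get(key), dict) and isinstance(value, dict):
            deep_update(original[key], value)
        else:
            original[key] = value
    return original

# Example: nested dicts are merged, scalars are overwritten.
config = {"exploration": {"type": "EpsilonGreedy", "initial_eps": 1.0}}
deep_update(config, {"exploration": {"initial_eps": 0.5}, "lr": 1e-3})
assert config["exploration"]["type"] == "EpsilonGreedy"
assert config["exploration"]["initial_eps"] == 0.5
```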
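The "Gumbel Softmax Dist." entries refer to the Gumbel-softmax (a.k.a. concrete) distribution, which yields approximately one-hot, differentiable samples from a categorical. The sampling trick itself, in illustrative NumPy rather than the PR's TF code:

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature=1.0, rng=None):
    """Draw one relaxed one-hot sample via the Gumbel-softmax trick.

    Adds Gumbel(0, 1) noise to the logits, then applies a temperature-scaled
    softmax; as temperature -> 0, samples approach hard one-hot vectors.
    """
    if rng is None:
        rng = np.random.default_rng()
    u = rng.uniform(low=1e-10, high=1.0, size=np.shape(logits))
    gumbel = -np.log(-np.log(u))  # Gumbel(0, 1) noise
    y = (np.asarray(logits) + gumbel) / temperature
    y = y - y.max()  # for numerical stability of the softmax
    e = np.exp(y)
    return e / e.sum()

# Example: a low temperature pushes the sample toward a hard one-hot.
print(gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=0.1))
```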
Directory contents:
_static
_templates
images
raysgd
a2c-arch.svg
actors.rst
advanced.rst
apex-arch.svg
apex.png
async_api.rst
autoscaler-status.png
autoscaling.rst
cluster-index.rst
conf.py
configure.rst
custom_directives.py
custom_metric.png
ddppo-arch.svg
deploy-on-kubernetes.rst
deploy-on-yarn.rst
deploying-on-slurm.rst
development.rst
dqn-arch.svg
es.png
fault-tolerance.rst
getting-involved.rst
impala-arch.svg
impala.png
index.rst
installation.rst
iter.rst
joblib.rst
memory-management.rst
multi-agent.svg
multi-flat.svg
multiprocessing.rst
offline-q.png
package-ref.rst
pandas_on_ray.rst
pbt.png
ppo-arch.svg
ppo.png
profiling.rst
projects.rst
pytorch.png
ray-tune-parcoords.png
ray-tune-tensorboard.png
ray-tune-viskit.png
rllib-algorithms.rst
rllib-api.svg
rllib-components.svg
rllib-concepts.rst
rllib-config.svg
rllib-dev.rst
rllib-env.rst
rllib-envs.svg
rllib-examples.rst
rllib-models.rst
rllib-offline.rst
rllib-package-ref.rst
rllib-stack.svg
rllib-toc.rst
rllib-training.rst
rllib.rst
rock-paper-scissors.png
serialization.rst
serve.rst
sgd.png
starting-ray.rst
tensorflow.png
throughput.png
timeline.png
troubleshooting.rst
tune-advanced-tutorial.rst
tune-contrib.rst
tune-design.rst
tune-distributed.rst
tune-examples.rst
tune-package-ref.rst
tune-schedulers.rst
tune-searchalg.rst
tune-tutorial.rst
tune-usage.rst
tune.rst
using-ray-on-a-cluster.rst
using-ray-with-gpus.rst
using-ray-with-pytorch.rst
using-ray-with-tensorflow.rst
using-ray.rst
walkthrough.rst