RLlib Examples
==============

This page is an index of examples for the various use cases and features of RLlib.

If any example is broken, or if you'd like to add an example to this page, feel free to raise an issue on our GitHub repository.

Tuned Examples
--------------

- `Tuned examples <https://github.com/ray-project/ray/blob/master/python/ray/rllib/tuned_examples>`__:
   Collection of tuned algorithm hyperparameters.
- `Atari benchmarks <https://github.com/ray-project/rl-experiments>`__:
   Collection of reasonably optimized Atari results.

Training Workflows
------------------

- `Custom training workflows <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/custom_train_fn.py>`__:
   Example of how to use Tune's support for custom training functions to implement custom training workflows (a sketch follows this list).
- `Curriculum learning <rllib-training.html#example-curriculum-learning>`__:
   Example of how to adjust the configuration of an environment over time.
- `Custom metrics <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/custom_metrics_and_callbacks.py>`__:
   Example of how to output custom training metrics to TensorBoard.
- `Using rollout workers directly for control over the whole training workflow <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/rollout_worker_custom_workflow.py>`__:
   Example of how to use RLlib's lower-level building blocks to implement a fully customized training workflow.
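
For orientation, here is a minimal sketch of the custom-training-function pattern from the first example above. The two-phase loop, the ``(config, reporter)`` function signature, and the ``PPOTrainer`` import path are assumptions about a typical Ray/RLlib install; the linked ``custom_train_fn.py`` is the maintained version.

.. code-block:: python

    # Sketch: drive RLlib trainers from a custom Tune training function.
    import ray
    from ray import tune
    from ray.rllib.agents.ppo import PPOTrainer


    def my_train_fn(config, reporter):
        # Phase 1: train with a larger learning rate (illustrative values).
        agent = PPOTrainer(env="CartPole-v0", config=dict(config, lr=1e-3))
        for _ in range(5):
            reporter(**agent.train())
        checkpoint = agent.save()
        agent.stop()

        # Phase 2: restore the weights and continue with a smaller learning rate.
        agent = PPOTrainer(env="CartPole-v0", config=dict(config, lr=1e-4))
        agent.restore(checkpoint)
        for _ in range(5):
            reporter(**agent.train())
        agent.stop()


    if __name__ == "__main__":
        ray.init()
        tune.run(my_train_fn, config={"num_workers": 0})
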
Custom Envs and Models
----------------------

- `Registering a custom env and model <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/custom_env.py>`__:
   Example of defining and registering a gym env and model for use with RLlib (see the sketch after this list).
- `Registering a custom model with supervised loss <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/custom_loss.py>`__:
   Example of defining and registering a custom model with a supervised loss.
- `Subprocess environment <https://github.com/ray-project/ray/blob/master/python/ray/rllib/tests/test_env_with_subprocess.py>`__:
   Example of how to ensure subprocesses spawned by envs are killed when RLlib exits.
- `Batch normalization <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/batch_norm_model.py>`__:
   Example of adding batch norm layers to a custom model.
- `Parametric actions <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/parametric_action_cartpole.py>`__:
   Example of how to handle variable-length or parametric action spaces.
- `Eager execution <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/eager_execution.py>`__:
   Example of how to leverage TensorFlow eager execution to simplify debugging and the design of custom models and policies.
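
As a rough illustration of the env-registration pattern behind the first example above, here is a minimal sketch. The corridor environment, its ``env_config`` keys, and the stopping criteria are illustrative assumptions, not RLlib APIs; ``custom_env.py`` is the maintained version.

.. code-block:: python

    # Sketch: define a custom gym env, register it, and train a PPO agent on it.
    import gym
    from gym.spaces import Box, Discrete

    import ray
    from ray import tune
    from ray.tune.registry import register_env


    class SimpleCorridor(gym.Env):
        """Walk right along a corridor; the episode ends at the far end."""

        def __init__(self, env_config):
            self.end_pos = env_config.get("corridor_length", 5)
            self.cur_pos = 0
            self.action_space = Discrete(2)  # 0 = left, 1 = right
            self.observation_space = Box(0.0, float(self.end_pos), shape=(1,))

        def reset(self):
            self.cur_pos = 0
            return [self.cur_pos]

        def step(self, action):
            if action == 0 and self.cur_pos > 0:
                self.cur_pos -= 1
            elif action == 1:
                self.cur_pos += 1
            done = self.cur_pos >= self.end_pos
            return [self.cur_pos], 1.0 if done else -0.1, done, {}


    if __name__ == "__main__":
        ray.init()
        # Register the env under a name referenced by the "env" config key.
        register_env("corridor", lambda cfg: SimpleCorridor(cfg))
        tune.run(
            "PPO",
            stop={"training_iteration": 5},
            config={
                "env": "corridor",
                "env_config": {"corridor_length": 5},
                # A custom model registered via ModelCatalog.register_custom_model
                # could be selected here with {"model": {"custom_model": "..."}}.
            })
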
Serving and Offline
-------------------

- `CartPole server <https://github.com/ray-project/ray/tree/master/python/ray/rllib/examples/serving>`__:
   Example of online serving of predictions for a simple CartPole policy.
- `Saving experiences <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/saving_experiences.py>`__:
   Example of how to externally generate experience batches in RLlib-compatible format (a sketch follows this list).
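
To illustrate the "Saving experiences" entry, here is a minimal sketch of writing externally generated batches in RLlib's JSON offline format. The ``SampleBatchBuilder`` and ``JsonWriter`` import paths and the exact batch field names are assumptions that may differ across Ray versions; ``saving_experiences.py`` is the maintained reference.

.. code-block:: python

    # Sketch: record experiences from a hand-rolled (here: random) policy
    # into JSON files that RLlib's offline "input" config can read back.
    import gym

    from ray.rllib.evaluation.sample_batch_builder import SampleBatchBuilder
    from ray.rllib.offline.json_writer import JsonWriter

    if __name__ == "__main__":
        batch_builder = SampleBatchBuilder()
        writer = JsonWriter("/tmp/demo-out")  # directory for the JSON output

        env = gym.make("CartPole-v0")
        for eps_id in range(10):
            obs = env.reset()
            done = False
            t = 0
            while not done:
                action = env.action_space.sample()  # the "external" policy
                new_obs, reward, done, info = env.step(action)
                batch_builder.add_values(
                    t=t,
                    eps_id=eps_id,
                    obs=obs,
                    actions=action,
                    rewards=reward,
                    dones=done,
                    infos=info,
                    new_obs=new_obs)
                obs = new_obs
                t += 1
            # One SampleBatch per episode is written out.
            writer.write(batch_builder.build_and_reset())
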
Multi-Agent and Hierarchical
----------------------------

- `Two-step game <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/twostep_game.py>`__:
   Example of the two-step game from the `QMIX paper <https://arxiv.org/pdf/1803.11485.pdf>`__.
- `Hand-coded policy <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/multiagent_custom_policy.py>`__:
   Example of running a custom hand-coded policy alongside trainable policies.
- `Weight sharing between policies <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/multiagent_cartpole.py>`__:
   Example of how to define weight-sharing layers between two different policies (see the sketch after this list for the basic multi-agent setup).
- `Multiple trainers <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/multiagent_two_trainers.py>`__:
   Example of alternating training between a DQN trainer and a PPO trainer.
- `Hierarchical training <https://github.com/ray-project/ray/blob/master/python/ray/rllib/examples/hierarchical_training.py>`__:
   Example of hierarchical training using the multi-agent API.
- `PPO with centralized value function <https://github.com/ray-project/ray/pull/3642/files>`__:
   Example of customizing PPO to include a centralized value function, including a runnable script that demonstrates cooperative CartPole.
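
For readers new to the multi-agent API used throughout this section, here is a minimal sketch of training two PPO policies on a shared environment. The two-CartPole env, the policy names, and the ``"policies"`` / ``"policy_mapping_fn"`` config keys are assumptions about a typical RLlib version; the linked examples are the maintained references.

.. code-block:: python

    # Sketch: two agents, each mapped to its own PPO policy.
    import gym

    import ray
    from ray.rllib.agents.ppo import PPOTrainer
    from ray.rllib.env import MultiAgentEnv
    from ray.tune.registry import register_env


    class TwoCartPoles(MultiAgentEnv):
        """Two independent CartPole copies, one per agent (illustrative only)."""

        def __init__(self, env_config):
            self.agents = {
                "agent_0": gym.make("CartPole-v0"),
                "agent_1": gym.make("CartPole-v0"),
            }

        def reset(self):
            return {aid: env.reset() for aid, env in self.agents.items()}

        def step(self, action_dict):
            obs, rew, done, info = {}, {}, {}, {}
            for aid, action in action_dict.items():
                obs[aid], rew[aid], done[aid], info[aid] = self.agents[aid].step(action)
            done["__all__"] = all(done.values())
            return obs, rew, done, info


    if __name__ == "__main__":
        ray.init()
        register_env("two_cartpoles", lambda cfg: TwoCartPoles(cfg))

        single = gym.make("CartPole-v0")
        obs_space, act_space = single.observation_space, single.action_space

        trainer = PPOTrainer(
            env="two_cartpoles",
            config={
                "multiagent": {
                    # (None, ...) means "use the trainer's default policy class".
                    "policies": {
                        "policy_0": (None, obs_space, act_space, {}),
                        "policy_1": (None, obs_space, act_space, {}),
                    },
                    "policy_mapping_fn": (
                        lambda agent_id: "policy_0"
                        if agent_id == "agent_0" else "policy_1"),
                },
            })
        for _ in range(3):
            print(trainer.train()["episode_reward_mean"])
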
Community Examples
------------------

- `Traffic Flow <https://berkeleyflow.readthedocs.io/en/latest/flow_setup.html>`__:
   Example of optimizing mixed-autonomy traffic simulations with RLlib / multi-agent.
- `Roboschool / SageMaker <https://github.com/awslabs/amazon-sagemaker-examples/tree/master/reinforcement_learning/rl_roboschool_ray>`__:
   Example of training robotic control policies in SageMaker with RLlib.
- `StarCraft2 <https://github.com/oxwhirl/smac>`__:
   Example of training in StarCraft2 maps with RLlib / multi-agent.
- `NeuroCuts <https://github.com/neurocuts/neurocuts>`__:
   Example of building packet classification trees using RLlib / multi-agent in a bandit-like setting.
- `Sequential Social Dilemma Games <https://github.com/eugenevinitsky/sequential_social_dilemma_games>`__:
   Example of using the multi-agent API to model several `social dilemma games <https://arxiv.org/abs/1702.03037>`__.