.. _custom-policies-reference-docs:

Building Custom Policy Classes
==============================

.. warning::

    As of Ray >= 1.9, it is no longer recommended to use the ``build_policy_class()`` or
    ``build_tf_policy()`` utility functions to create custom Policy sub-classes.
    Instead, follow the simple guidelines here and sub-class directly from one of the
    built-in types:
    :py:class:`~ray.rllib.policy.dynamic_tf_policy.DynamicTFPolicy`
    or
    :py:class:`~ray.rllib.policy.torch_policy.TorchPolicy`.

In order to create a custom Policy, sub-class :py:class:`~ray.rllib.policy.policy.Policy` (for a generic,
framework-agnostic policy),
:py:class:`~ray.rllib.policy.torch_policy.TorchPolicy`
(for a PyTorch-specific policy), or
:py:class:`~ray.rllib.policy.dynamic_tf_policy.DynamicTFPolicy`
(for a TensorFlow-specific policy) and override one or more of their methods, in particular:

* :py:meth:`~ray.rllib.policy.policy.Policy.compute_actions_from_input_dict`
* :py:meth:`~ray.rllib.policy.policy.Policy.postprocess_trajectory`
* :py:meth:`~ray.rllib.policy.policy.Policy.loss`

`See here for an example of how to override TorchPolicy <https://github.com/ray-project/ray/blob/master/rllib/agents/ppo/ppo_torch_policy.py>`_.
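
The sub-class-and-override pattern can be sketched as follows. To keep the sketch
self-contained (no Ray installation required), ``Policy`` below is a *hypothetical,
simplified stand-in* for :py:class:`~ray.rllib.policy.policy.Policy`; the real base
class takes ``gym`` spaces and a full config dict, and
:py:meth:`~ray.rllib.policy.policy.Policy.compute_actions_from_input_dict` returns
additional fetches and RNN state outputs that are omitted here.

.. code-block:: python

    import random


    class Policy:
        """Hypothetical stand-in for ray.rllib.policy.policy.Policy (sketch only)."""

        def __init__(self, observation_space, action_space, config):
            self.observation_space = observation_space
            self.action_space = action_space
            self.config = config

        def compute_actions_from_input_dict(self, input_dict, **kwargs):
            raise NotImplementedError


    class MyRandomPolicy(Policy):
        """Custom policy that overrides ``compute_actions_from_input_dict``
        to pick uniformly random actions (action_space given as an int
        number of discrete actions in this sketch)."""

        def compute_actions_from_input_dict(self, input_dict, **kwargs):
            # One action per observation in the input batch.
            batch_size = len(input_dict["obs"])
            actions = [
                random.randrange(self.action_space) for _ in range(batch_size)
            ]
            # Mirror the (actions, state_outs, extra_fetches) return shape;
            # empty here because this sketch is stateless.
            return actions, [], {}

A real sub-class would follow the same shape but build on the actual base class and,
typically, also override ``postprocess_trajectory`` and ``loss`` to implement learning.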