# ray/rllib/agents/impala/tests/test_impala.py

import unittest

import ray
import ray.rllib.agents.impala as impala
from ray.rllib.utils.framework import try_import_tf
from ray.rllib.utils.test_utils import check_compute_single_action, \
    framework_iterator

tf1, tf, tfv = try_import_tf()


class TestIMPALA(unittest.TestCase):
    @classmethod
    def setUpClass(cls) -> None:
        ray.init()

    @classmethod
    def tearDownClass(cls) -> None:
        ray.shutdown()

    def test_impala_compilation(self):
        """Test whether an ImpalaTrainer can be built with all frameworks."""
        config = impala.DEFAULT_CONFIG.copy()
        num_iterations = 1

        for _ in framework_iterator(config):
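            # Copy the nested "model" dict too: `config.copy()` is shallow, so
            # the edits below would otherwise leak into DEFAULT_CONFIG.
            local_cfg = dict(config, model=dict(config["model"]))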
            for env in ["Pendulum-v0", "CartPole-v0"]:
                print("Env={}".format(env))
                print("w/o LSTM")
                # Test w/o LSTM.
                local_cfg["model"]["use_lstm"] = False
                local_cfg["num_aggregation_workers"] = 0
                trainer = impala.ImpalaTrainer(config=local_cfg, env=env)
                for _ in range(num_iterations):
                    print(trainer.train())
                check_compute_single_action(trainer)
                trainer.stop()

                # Test w/ LSTM.
                print("w/ LSTM")
                local_cfg["model"]["use_lstm"] = True
                local_cfg["model"]["lstm_use_prev_action"] = True
                local_cfg["model"]["lstm_use_prev_reward"] = True
                local_cfg["num_aggregation_workers"] = 2
                trainer = impala.ImpalaTrainer(config=local_cfg, env=env)
                for _ in range(num_iterations):
                    print(trainer.train())
                check_compute_single_action(
                    trainer,
                    include_state=True,
                    include_prev_action_reward=True)
                trainer.stop()

    def test_impala_lr_schedule(self):
        """Test whether the learning rate decreases under an lr_schedule."""
        config = impala.DEFAULT_CONFIG.copy()
        config["lr_schedule"] = [
            [0, 0.0005],
            [10000, 0.000001],
        ]
        local_cfg = config.copy()
        trainer = impala.ImpalaTrainer(config=local_cfg, env="CartPole-v0")

        def get_lr(result):
            return result["info"]["learner"]["default_policy"]["cur_lr"]

        try:
            r1 = trainer.train()
            r2 = trainer.train()
            assert get_lr(r2) < get_lr(r1), (r1, r2)
        finally:
            trainer.stop()


if __name__ == "__main__":
    import pytest
    import sys
    sys.exit(pytest.main(["-v", __file__]))