mirror of https://github.com/vale981/ray synced 2025-03-12 22:26:39 -04:00

14 commits

Author SHA1 Message Date
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ()
See  and  for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
62dbf26394
[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). ()
2021-12-21 08:39:05 +01:00
Kai Fricke
3e6ba5d6d2
Revert "Revert [RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py." ()
* Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. ()" ()"
This reverts commit 246787cdd9.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 12:26:47 +01:00
Kai Fricke
246787cdd9
Revert "[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. ()" ()
This reverts commit 6f85af435f.
2021-11-12 13:09:43 +00:00
Sven Mika
6f85af435f
[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. ()
2021-11-11 12:16:20 +01:00
Sven Mika
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). ()
2021-10-25 15:00:00 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. ()
2020-09-02 14:03:01 +02:00
Sven Mika
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. ()
2020-08-20 17:05:57 +02:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes ()
2020-08-07 16:49:49 -07:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. ()
2020-07-11 22:06:35 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). ()
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. ()
2020-06-16 08:52:20 +02:00
roireshef
3c60caa448
[rllib] implemented compute_advantages without gae ()
2020-01-31 22:25:45 -08:00
Sven
f1b56fa5ee
PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). ()
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).

* Fix LINT line-len errors.

* Fix LINT errors.

* Fix `tf_pg_policy` imports (formerly: `pg_policy`).

* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).

* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
  then built into the Bazel/Travis test suite.

* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.

* Fix remaining import errors for agents/pg/...

* Fix circular dependency in pg imports.

* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00