27 commits

Author SHA1 Message Date
Balaji Veeramani
31ed9e5d02
[CI] Replace YAPF disables with Black disables (#21982) 2022-02-08 16:29:25 -08:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
b10d5533be
[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. (#21452) 2022-01-10 11:19:40 +01:00
Sven Mika
853d10871c
[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376) 2022-01-05 18:22:33 +01:00
Sven Mika
b4790900f5
[RLlib] Sub-class Trainer (instead of build_trainer()): All remaining classes; soft-deprecate build_trainer. (#20725) 2021-12-04 22:05:26 +01:00
gjoliver
e7f9e8ceec
[RLlib] Report total_train_steps correctly for offline agents like CQL. (#20541)
* Fix trainer timestep reporting for offline agents like CQL.

* wip.

* extend timesteps_total to 200K for learning_tests_pendulum_cql test

Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-22 21:46:45 +01:00
Artur Niederfahrenhorst
d07e50e957
[RLlib] Replay buffer API (cleanups; docstrings; renames; move into rllib/execution/buffers dir) (#20552) 2021-11-19 11:57:37 +01:00
gjoliver
99a0088233
[RLlib] Unify the way we create local replay buffer for all agents (#19627)
* [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents.

This change
1. Get rid of the try...except clause when we call execution_plan(),
   and get rid of the Deprecation warning as a result.
2. Fix the execution_plan() call in Trainer._try_recover() too.
3. Most importantly, makes it much easier to create and use different types
   of local replay buffers for all our agents.
   E.g., allow us to easily create a reservoir sampling replay buffer for
   APPO agent for Riot in the near future.
* Introduce explicit configuration for replay buffer types.
* Fix is_training key error.
* actually deprecate buffer_size field.
2021-10-26 20:56:02 +02:00
gjoliver
89fbfc00f8
[RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). (#19623) 2021-10-25 09:42:39 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) 2021-09-30 16:39:05 +02:00
Sven Mika
ba1c489b79
[RLlib Testing] Lower --smoke-test "time_total_s" to make sure it doesn't time out. (#18670) 2021-09-16 18:22:23 +02:00
Sven Mika
8a00154038
[RLlib] Bump tf version in ML docker to tf==2.5.0; add tfp to ML-docker. (#18544) 2021-09-15 08:46:37 +02:00
Sven Mika
4888d7c9af
[RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999) 2021-08-31 12:21:49 +02:00
Sven Mika
7f2b3c0824
[RLlib] Issue 17667: CQL-torch + GPU not working (due to simple_optimizer=False; must use simple optimizer!). (#17742) 2021-08-11 18:30:21 +02:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 2021-07-20 14:58:13 -04:00
Julius Frost
a88b217d3f
[rllib] Enhancements to Input API for customizing offline datasets (#16957)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-10 15:05:25 -07:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
Sven Mika
3d4dc60e2e
[RLlib] CQL iteration count fixes: Remove dummy buffer and unnecessary store op from exec_plan. (#16332) 2021-06-10 07:49:17 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support (#15841) 2021-05-18 11:10:46 +02:00
Michael Luo
4cbe13cdfd
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603)
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-05-04 19:06:19 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 (#15527)
* formatting

* format util

* format release

* format rllib/agents

* format rllib/env

* format rllib/execution

* format rllib/evaluation

* format rllib/examples

* format rllib/policy

* format rllib utils and tests

* format streaming

* more formatting

* update requirements files

* fix rllib type checking

* updates

* update

* fix circular import

* Update python/ray/tests/test_runtime_env.py

* noqa
2021-05-03 14:23:28 -07:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. (#15335) 2021-04-15 19:19:51 +02:00
Michael Luo
020c9439dd
[RLlib] CQL Documentation + Tests (#14531) 2021-03-11 18:51:39 +01:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) 2021-03-08 15:41:27 +01:00
Michael Luo
ec2c10309b
[RLlib] CQL for HalfCheetah-Random-v0 + Hopper-Random-v0 + CQL Bug Fixes (#14243) 2021-02-22 17:30:18 +01:00
Michael Luo
587f207c2f
[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550) 2021-01-21 16:43:55 +01:00
Michael Luo
42cd414e5b
[RLlib] New Offline RL Algorithm: CQL (based on SAC) (#13118) 2020-12-30 10:11:57 -05:00