Sven Mika
62dbf26394
[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). ( #20984 )
2021-12-21 08:39:05 +01:00
Sven Mika
db058d0fb3
[RLlib] Rename metrics_smoothing_episodes into metrics_num_episodes_for_smoothing for clarity. ( #20983 )
2021-12-11 20:33:35 +01:00
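For context on the rename above, a minimal sketch of a 2021-era dict-style Trainer config using the new key name (the env choice and the value 100 are illustrative, not taken from the commit):

    # Hedged sketch: dict-style RLlib Trainer config around late 2021.
    # The old key "metrics_smoothing_episodes" was renamed to
    # "metrics_num_episodes_for_smoothing" per the commit above.
    config = {
        "env": "CartPole-v0",  # illustrative env choice
        "metrics_num_episodes_for_smoothing": 100,  # was: "metrics_smoothing_episodes"
    }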
Sven Mika
49cd7ea6f9
[RLlib] Trainer sub-class PPO/DDPPO (instead of build_trainer()). ( #20571 )
2021-11-23 23:01:05 +01:00
gjoliver
e7f9e8ceec
[RLlib] Report total_train_steps correctly for offline agents like CQL. ( #20541 )
* Fix trainer timestep reporting for offline agents like CQL.
* wip.
* extend timesteps_total to 200K for learning_tests_pendulum_cql test
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-22 21:46:45 +01:00
Artur Niederfahrenhorst
d07e50e957
[RLlib] Replay buffer API (cleanups; docstrings; renames; move into rllib/execution/buffers dir) ( #20552 )
2021-11-19 11:57:37 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. ( #19981 )
2021-11-05 16:10:00 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00
gjoliver
9226f9bddc
[RLlib] Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. ( #19264 )
* Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained.
* Trigger Build
* lint
2021-10-12 16:03:41 +02:00
Sven Mika
c3e3fc7637
[RLlib] Issue 18280: A3C/IMPALA multi-agent not working. ( #19100 )
2021-10-07 23:57:53 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
Sven Mika
05a55a9335
[RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). ( #18942 )
2021-09-30 08:30:20 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
3803e796ff
[RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. ( #18540 )
2021-09-13 19:27:53 +02:00
Sven Mika
1520c3d147
[RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker-option to Trainer.add_policy() ( #18428 )
2021-09-09 07:10:06 +02:00
Sven Mika
4888d7c9af
[RLlib] Replay buffers: Add config option to store contents in checkpoints. ( #17999 )
2021-08-31 12:21:49 +02:00
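For reference, a hedged sketch of how the checkpoint option from the commit above might be set; the key name "store_buffer_in_checkpoints" is an assumption, as the log entry does not state it:

    # Hedged sketch: enabling replay-buffer persistence in checkpoints for an
    # off-policy algo's dict-style config. The key "store_buffer_in_checkpoints"
    # is assumed, not confirmed by the log entry above.
    config = {
        "env": "CartPole-v0",  # illustrative env choice
        "store_buffer_in_checkpoints": True,
    }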
Chris Bamford
58a73821fb
[RLlib] IMPALA sample throughput calculation and full queue slowdown fixes ( #17822 )
2021-08-17 14:01:41 +02:00
Sven Mika
924f11cd45
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). ( #17371 )
2021-08-03 11:35:49 -04:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. ( #17169 )
2021-07-20 14:58:13 -04:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. ( #17031 )
2021-07-19 13:16:03 -04:00
Sven Mika
55a90e670a
[RLlib] Trainer.add_policy() not working for tf, if added policy is trained afterwards. ( #16927 )
2021-07-11 23:41:38 +02:00
Sven Mika
7eb1a29426
[RLlib] Fix ModelV2 custom metrics for torch. ( #16734 )
2021-07-01 13:01:40 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes ( #16531 )
2021-06-30 12:32:11 +02:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. ( #16569 )
2021-06-21 13:46:01 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )" ( #16543 )
This reverts commit e78ec370a9.
2021-06-18 12:21:49 -07:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )
2021-06-18 10:31:30 +02:00
Sven Mika
3d4dc60e2e
[RLlib] CQL iteration count fixes: Remove dummy buffer and unnecessary store op from exec_plan. ( #16332 )
2021-06-10 07:49:17 +02:00
Sven Mika
4b8dadccbd
[RLlib] Fix PR 16162: Having added sleep to _NextValueNotReady causes TD3 tests to become flakey. ( #16309 )
2021-06-08 07:27:02 -07:00
Sven Mika
e2be41b407
[RLlib] MARWIL + BC: Various fixes and enhancements. ( #16218 )
2021-06-03 22:29:00 +02:00
Chris Bamford
1e3721ef4a
[RLlib] Remove bad spinlocks to allow pytorch GPU scheduler to interrupt. ( #16162 )
2021-06-01 16:40:28 +02:00
Sven Mika
2d34216660
[RLlib] APEX-DQN: Bug fix for torch and add learning test. ( #15762 )
2021-05-20 09:27:03 +02:00
Sven Mika
2303851c3c
[RLlib] Torch multi-GPU + LSTM/RNN bug fix. ( #15492 )
2021-05-18 11:51:05 +02:00
Michael Luo
4cbe13cdfd
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. ( #15603 )
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-05-04 19:06:19 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 ( #15527 )
* formatting
* format util
* format release
* format rllib/agents
* format rllib/env
* format rllib/execution
* format rllib/evaluation
* format rllib/examples
* format rllib/policy
* format rllib utils and tests
* format streaming
* more formatting
* update requirements files
* fix rllib type checking
* updates
* update
* fix circular import
* Update python/ray/tests/test_runtime_env.py
* noqa
2021-05-03 14:23:28 -07:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. ( #14709 )
2021-04-16 09:16:24 +02:00
Sven Mika
dfc116ea27
[RLlib] Discussion 681: Metrics prepends newest episodes instead of appending. ( #15236 )
2021-04-11 15:31:43 +02:00
Chris Bamford
cd89f0dc55
[RLLib] Episode media logging support ( #14767 )
2021-03-19 09:17:09 +01:00
Sven Mika
c3a15ecc0f
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. ( #14033 )
2021-03-18 20:27:41 +01:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. ( #13393 )
2021-03-08 15:41:27 +01:00
Sven Mika
7718ec70fb
[RLlib] Remove old SegmentTree from tests dir and unflake respective segment tree test. ( #14450 )
2021-03-03 14:31:30 +01:00
Sven Mika
8000258333
[RLlib] R2D2 Implementation. ( #13933 )
2021-02-25 12:18:11 +01:00
Sven Mika
775e685531
[RLlib] Issue #13824: compress_observations=True crashes for all algos not using a replay buffer. ( #14034 )
2021-02-18 21:36:32 +01:00
Sven Mika
eb0038612f
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. ( #13584 )
2021-02-08 15:02:19 +01:00
Sven Mika
d001af3e59
[RLlib] Allow rllib rollout to run distributed via evaluation workers. ( #13718 )
2021-02-08 12:05:16 +01:00
Michael Luo
a2d1215200
[RLlib] Execution Annotation ( #13036 )
2020-12-24 09:30:33 -05:00
Edward Oakes
cde711aaf1
Revert "[RLLib] Execution-Folder Type Annotations ( #12760 )" ( #12886 )
This reverts commit becca1424d.
2020-12-15 11:03:02 -08:00
Michael Luo
becca1424d
[RLLib] Execution-Folder Type Annotations ( #12760 )
2020-12-14 19:16:44 +01:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. ( #12420 )
2020-12-08 16:41:45 -08:00
Sven Mika
fb318addcb
[RLlib] Curiosity exploration module: tf/tf2.x/tf-eager support. ( #11945 )
2020-11-29 12:31:24 +01:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). ( #11747 )
2020-11-12 16:27:34 +01:00
Eric Liang
8f79b4e45e
[rllib] Replay buffer size inaccurate with replay_seq_len option ( #10988 )
* support replay seq len
* update
* fix warn
* add test
* test
2020-09-25 13:47:23 -07:00