Sven Mika
92f030331e
[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. ( #21420 )
2022-01-10 11:22:55 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. ( #21008 )
2021-12-13 12:04:23 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. ( #20918 )
2021-12-11 14:57:58 +01:00
Sven Mika
f82880eda1
Revert "Revert [RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 ) ( #20399 )" ( #20417 )
...
This reverts commit 90dc5460d4.
2021-11-16 14:49:41 +01:00
Amog Kamsetty
90dc5460d4
Revert "[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 )" ( #20399 )
...
This reverts commit 5b1c8e46e1.
2021-11-15 16:11:35 -08:00
Sven Mika
6ff4061f3a
[RLlib] Issue 20269: Offline RL example not working due to new_obs not being written to file. ( #20366 )
...
* wip.
* Apply suggestions from code review
2021-11-15 16:41:08 +01:00
Sven Mika
5b1c8e46e1
[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 )
2021-11-15 10:41:54 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. ( #19981 )
2021-11-05 16:10:00 +01:00
Sven Mika
f3397b6f48
[RLlib] Minor fixes/cleanups; chop_into_sequences now handles nested data. ( #19408 )
2021-11-05 14:39:28 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation ( #19783 )
2021-10-29 12:03:56 +02:00
Sven Mika
f2cb2ed203
[RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. ( #19759 )
2021-10-27 19:14:39 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
Sven Mika
828f5d26b7
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action and compute_actions_from_input_dict. ( #18921 )
2021-09-30 15:03:37 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
a96dbd885b
[RLlib] Reinstate trajectory view API tests. ( #18809 )
2021-09-23 08:31:51 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 ( #18367 )
2021-09-09 08:10:42 +02:00
Sven Mika
4888d7c9af
[RLlib] Replay buffers: Add config option to store contents in checkpoints. ( #17999 )
2021-08-31 12:21:49 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. ( #17928 )
2021-08-21 17:05:48 +02:00
simonsays1980
7b33dc21dc
[RLlib] Fix update model view requirements from init state for bare-metal policies with custom view-reqs. ( #17867 )
...
* Changed '_update_model_view_requirements_from_init_state()' to adopt the 'shift' in view_requirements from a user-defined policy that inherits directly from Policy.
* Added slightly modified version of Sven's suggestion. Like this any user-defined attributes of the ViewRequirement of the state get conserved.
* I saw that the code in _update_model_view_requirements_from_init_state() had changed and is not identical to my locally installed version. In the new version view_requirements from the model and the policy get united and therefore a loop runs through this unified list. Code should run now in the present version
* Apply suggestions from code review
2021-08-17 11:49:24 +02:00
Sven Mika
5107d16ae5
[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. ( #17530 )
2021-08-03 18:30:02 -04:00
Sven Mika
58da5c1c9b
[RLlib] Discussion 3001: Fix comment on internal state shape (must be [B x S=state dim]). ( #17341 )
2021-07-27 21:41:53 -04:00
Chris Bamford
29768a7c01
[RLLib] (P1 regression) Fixing view requirements in compute actions ( #15856 )
2021-07-25 14:25:07 -04:00
Sven Mika
7bc4376466
[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). ( #17077 )
2021-07-22 10:59:13 -04:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. ( #17169 )
2021-07-20 14:58:13 -04:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. ( #17031 )
2021-07-19 13:16:03 -04:00
Sven Mika
649580d735
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). ( #17046 )
2021-07-15 05:51:24 -04:00
Sven Mika
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). ( #17014 )
2021-07-13 14:01:30 -04:00
Amog Kamsetty
38b5b6d24c
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). ( #16565 )" ( #17036 )
...
This reverts commit e4123fff27.
2021-07-13 09:57:15 -07:00
Kai Fricke
27d80c4c88
[RLlib] ONNX export for tensorflow (1.x) and torch ( #16805 )
2021-07-13 12:38:11 -04:00
Sven Mika
e4123fff27
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). ( #16565 )
2021-07-13 06:38:14 -04:00
Amog Kamsetty
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action." ( #17002 )
...
This reverts commit 7862dd64ea.
2021-07-12 11:09:14 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. ( #16774 )
2021-07-08 17:31:34 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes ( #16531 )
2021-06-30 12:32:11 +02:00
Sven Mika
d0014cd351
[RLlib] Policies get/set_state fixes and enhancements. ( #16354 )
2021-06-15 13:08:43 +02:00
Sven Mika
03c7c530a9
[RLlib] Issue 15483: Wrong init states (should be non-zero if ModelV2.get_initial_state returns non-zero values). ( #15733 )
2021-05-20 09:28:09 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 ( #15527 )
...
* formatting
* format util
* format release
* format rllib/agents
* format rllib/env
* format rllib/execution
* format rllib/evaluation
* format rllib/examples
* format rllib/policy
* format rllib utils and tests
* format streaming
* more formatting
* update requirements files
* fix rllib type checking
* updates
* update
* fix circular import
* Update python/ray/tests/test_runtime_env.py
* noqa
2021-05-03 14:23:28 -07:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. ( #15273 )
2021-04-30 19:26:30 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). ( #14684 )
2021-04-27 10:44:54 +02:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. ( #14709 )
2021-04-16 09:16:24 +02:00
Sven Mika
9c5a0cfd7a
[RLlib] Issue 14385: Policy.compute_actions_from_input_dict does not properly track accessed fields for Policy's view requirements. ( #14386 )
2021-04-11 18:20:04 +02:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. ( #14860 )
2021-03-23 09:50:18 -07:00
Sven Mika
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. ( #13065 )
2021-03-17 08:18:15 +01:00
Sven Mika
95ef04b71a
[RLlib] Implement TorchPolicy.export_model. ( #13989 )
2021-02-22 17:09:40 +01:00
Sven Mika
4db86404ad
[RLlib] Issue #13507 : Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. ( #14037 )
2021-02-11 18:58:46 +01:00
Sven Mika
d7301a51f4
[RLlib]: Trajectory View API: Keep env infos (e.g. for postprocessing callbacks), no matter what. ( #13555 )
2021-02-09 17:05:26 +01:00
Stanislav Chekmenev
b9c15a2551
[RLlib] Issue #13761 : Fix get action shape ( #13764 )
2021-02-02 13:13:43 +01:00
Sven Mika
d49c3fae0b
[RLlib] Trajectory View API: Atari framestacking. ( #13315 )
2021-01-13 08:53:34 +01:00
Sven Mika
a5b39ef8e2
[RLlib] Fix missing "info_batch" arg (None) in compute_actions calls. ( #13237 )
2021-01-07 21:25:02 +01:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. ( #12718 )
2020-12-30 17:32:21 -08:00