hiro/ray

Mirror of https://github.com/vale981/ray (synced 2025-03-06 10:31:39 -05:00)

Author	SHA1	Message	Date
Simon Mo	9f23affdc0	[Hotfix] Unbreak lint in master (#24794 )	2022-05-13 15:05:05 -07:00
kourosh hakhamaneshi	ffcbb30552	[RLlib] Move from `agents` to `algorithms` - CQL, MARWIL, AlphaStar, MAML, Dreamer, MBMPO. (#24739 )	2022-05-13 18:43:36 +02:00
Sven Mika	f54557073e	[RLlib] Remove `execution_plan` API code no longer needed. (#24501 )	2022-05-06 12:29:53 +02:00
Sven Mika	1bc6419e0e	[RLlib] R2D2 training iteration fn AND switch off `execution_plan` API by default. (#24165 )	2022-05-03 07:59:26 +02:00
Sven Mika	a3d4fc74a6	[RLlib] MARWIL: Move to training_iteration API. (#23798 )	2022-04-11 19:28:32 +02:00
Balaji Veeramani	31ed9e5d02	[CI] Replace YAPF disables with Black disables (#21982 )	2022-02-08 16:29:25 -08:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975 ) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Jun Gong	55f3bcfb2d	[RLlib] Add a logstd term to MARWIL's loss func to encourage exploration. (#21493 )	2022-01-26 16:00:17 +01:00
Sven Mika	b10d5533be	[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. (#21452 )	2022-01-10 11:19:40 +01:00
Sven Mika	b4790900f5	[RLlib] Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`. (#20725 )	2021-12-04 22:05:26 +01:00
Sven Mika	60b2219d72	[RLlib] Allow for evaluation to run by `timesteps` (alternative to `episodes`) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757 )	2021-12-04 13:26:33 +01:00
Artur Niederfahrenhorst	d07e50e957	[RLlib] Replay buffer API (cleanups; docstrings; renames; move into `rllib/execution/buffers` dir) (#20552 )	2021-11-19 11:57:37 +01:00
Sven Mika	f82880eda1	Revert "Revert [RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 ) (#20399 )" (#20417 ) This reverts commit `90dc5460d4`.	2021-11-16 14:49:41 +01:00
Amog Kamsetty	90dc5460d4	Revert "[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )" (#20399 ) This reverts commit `5b1c8e46e1`.	2021-11-15 16:11:35 -08:00
Sven Mika	5b1c8e46e1	[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )	2021-11-15 10:41:54 +01:00
Avnish Narayan	026bf01071	[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535 ) * Fix QMix, SAC, and MADDPG too. * Unpin gym and deprecate pendulum v0. Many tests in rllib depended on pendulum v0; however, in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so we will potentially have to rerun all of the pendulum v1 benchmarks, or use another environment instead. The same applies to frozen lake v0 and frozen lake v1. Lastly, all of the RLlib tests have been moved to python 3.7. * Add gym installation based on python version. Pin python<= 3.6 to gym 0.19 due to install issues with atari roms in gym 0.20. * Reformatting * Fixing tests * Move atari-py install conditional to req.txt * Migrate to new ale install method * Make parametric_actions_cartpole return float32 actions/obs * Adding type conversions if obs/actions don't match space * Add utils to make elements match gym space dtypes Co-authored-by: Jun Gong <jungong@anyscale.com> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-03 16:24:00 +01:00
Sven Mika	cf21c634a3	[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982 )	2021-11-03 10:00:46 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829 )	2021-11-01 21:46:02 +01:00
Sven Mika	9c73871da0	[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783 )	2021-10-29 12:03:56 +02:00
gjoliver	99a0088233	[RLlib] Unify the way we create local replay buffer for all agents (#19627 ) * [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents. This change 1. Get rid of the try...except clause when we call execution_plan(), and get rid of the Deprecation warning as a result. 2. Fix the execution_plan() call in Trainer._try_recover() too. 3. Most importantly, makes it much easier to create and use different types of local replay buffers for all our agents. E.g., allow us to easily create a reservoir sampling replay buffer for APPO agent for Riot in the near future. * Introduce explicit configuration for replay buffer types. * Fix is_training key error. * actually deprecate buffer_size field.	2021-10-26 20:56:02 +02:00
Sven Mika	b213565783	[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693 )	2021-10-25 15:00:00 +02:00
gjoliver	89fbfc00f8	[RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). (#19623 )	2021-10-25 09:42:39 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
Sven Mika	e2be41b407	[RLlib] MARWIL + BC: Various fixes and enhancements. (#16218 )	2021-06-03 22:29:00 +02:00
Sven Mika	f6302d81be	[RLlib] Discussion 2210: BC algo broken, if "advantages" missing in offline data. (#16019 )	2021-05-25 08:47:17 +02:00
Sven Mika	eaa7f6696d	[RLlib] Issue 15887: MARWIL adv norm update mismatch for tf (static-graph) vs torch versions. (#15898 )	2021-05-19 15:44:11 -07:00
Michael Luo	474f04e322	[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup (#14707 )	2021-05-19 16:32:29 +02:00
Sven Mika	839fc59224	[RLlib] CQL TensorFlow support (#15841 )	2021-05-18 11:10:46 +02:00
Sven Mika	c4a3e1589b	[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. (#15761 )	2021-05-13 09:17:23 +02:00
Michael Luo	4cbe13cdfd	[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603 ) Co-authored-by: Sven Mika <sven@anyscale.io> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-05-04 19:06:19 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684 )	2021-04-27 10:44:54 +02:00
Sven Mika	8b3554e37e	[RLlib] Remove all (already soft-deprecated) `SampleBatch.data` from code. (#15335 )	2021-04-15 19:19:51 +02:00
Sven Mika	1bb70e4907	[RLlib] Issue 14523: Torch + py3.8 leads to GPU device error. (#15014 )	2021-03-30 21:43:11 +02:00
Sven Mika	6708211b59	[RLlib] JSONReader: Mix files if > 1 at beginning (each worker should start with different file). (#14865 )	2021-03-24 16:07:40 +01:00
Sven Mika	04bc0a9828	[RLlib] Remove all non-trajectory view API code. (#14860 )	2021-03-23 09:50:18 -07:00
Sven Mika	69202c6a7d	[RLlib] Obsolete usage tracking dict via sample batch. (#13065 )	2021-03-17 08:18:15 +01:00
Sven Mika	732197e23a	[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393 )	2021-03-08 15:41:27 +01:00
Sven Mika	eb0038612f	[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584 )	2021-02-08 15:02:19 +01:00
Sven Mika	d629292d63	[RLlib] Add grad_clip config option to MARWIL and stabilize grad clipping against inf global_norms. (#13634 )	2021-01-22 19:36:02 +01:00
Sven Mika	a65ee92b69	[RLlib] MARWIL loss function test case and cleanup. (#13455 )	2021-01-19 09:51:05 +01:00
Sven Mika	c524f86785	[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064 )	2020-12-27 09:46:03 -05:00
Sven Mika	99ae7bae05	[RLlib] JAXPolicy prep. PR #1 . (#13077 )	2020-12-26 20:14:18 -05:00
Sven Mika	e40b14d255	[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420 )	2020-12-08 16:41:45 -08:00
Sven Mika	0df55a139c	[RLlib] Attention Net prep PR #1 : Smaller cleanups. (#12447 ) * WIP. * Fix. * Fix. * Fix.	2020-11-27 16:25:47 -08:00
Sven Mika	bfc4f95e01	[RLlib] Fix test_bc.py test case. (#11722 ) * Fix large json test file. * Fix large json test file. * WIP.	2020-10-31 00:16:09 -07:00
Sven Mika	ce96b03b07	[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033 )	2020-10-06 20:28:16 +02:00
Eric Liang	ecdaaffc67	add large data warning (#10957 )	2020-09-23 15:46:06 -07:00
Julius Frost	e72838c03d	[RLLib] Add missing .to() for MARWIL on PyTorch (#10685 ) There was a missing .to() that caused a device mismatch error on PyTorch with MARWIL.	2020-09-09 18:52:55 -07:00
Sven Mika	4b278c36fc	[RLlib] Behavioral Cloning (from MARWIL). (#10619 )	2020-09-09 17:33:21 +02:00
Barak Michener	8e76796fd0	ci: Redo `format.sh --all` script & backfill lint fixes (#9956 )	2020-08-07 16:49:49 -07:00

69 commits