hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	6f4d988713	[RLlib] Issue 15556: Fix R2D2 using chunks from previous episodes in the "burn-in" window. (#15737)	2021-05-18 11:05:42 +02:00
Michael Luo	4cbe13cdfd	[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603) Co-authored-by: Sven Mika <sven@anyscale.io> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-05-04 19:06:19 +02:00
Sven Mika	bdda73e2dd	[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421) Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!	2021-04-22 11:29:42 +02:00
Michael Luo	b84575c092	[RLlib] 2 RLlib Flaky Tests (#14930)	2021-03-30 19:21:13 +02:00
Sven Mika	e98808ce11	[RLlib] Fix 2 flakey test cases. (#14892)	2021-03-29 17:20:29 +02:00
Eric Liang	af8a93f2a4	Deflake some RLlib tests (#14947) * fix * update * 100 * flake	2021-03-26 11:45:17 -07:00
Sven Mika	4e17f95927	[RLlib] Unflake 2 test cases (SAC cont. cartpole). (#14620)	2021-03-15 14:03:54 +01:00
Sven Mika	8000258333	[RLlib] R2D2 Implementation. (#13933)	2021-02-25 12:18:11 +01:00
Michael Luo	ec2c10309b	[RLlib] CQL for HalfCheetah-Random-v0 + Hopper-Random-v0 + CQL Bug Fixes (#14243)	2021-02-22 17:30:18 +01:00
Sven Mika	52c94b7ee9	[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522)	2021-02-02 13:05:58 +01:00
Michael Luo	587f207c2f	[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550)	2021-01-21 16:43:55 +01:00
Sven Mika	2e3655e8a9	[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238)	2021-01-19 14:22:36 +01:00
Sven Mika	e74947cc94	[RLlib] Env directory cleanup and tests. (#13082)	2021-01-19 10:09:39 +01:00
Sven Mika	93c0a5549b	[RLlib] Deprecate `vf_share_layers` in top-level PPO/MAML/MB-MPO configs. (#13397)	2021-01-19 09:51:35 +01:00
Michael Luo	42cd414e5b	[RLlib] New Offline RL Algorithm: CQL (based on SAC) (#13118)	2020-12-30 10:11:57 -05:00
Sven Mika	deb33bce84	[RLlib] Add DQN SoftQ learning test case. (#12712)	2020-12-10 14:55:19 +01:00
Sven Mika	bb03e2499b	[RLlib] PyBullet Env native support via env str-specifier (if installed). (#12209)	2020-11-30 12:41:24 +01:00
Sven Mika	4afaa46028	[RLlib] Increase the scope of RLlib's regression tests. (#12200)	2020-11-24 22:18:31 +01:00
Edward Oakes	32d159a2ed	Fix release directory & RELEASE_PROCESS.md (#12269)	2020-11-23 14:28:59 -06:00
Sven Mika	b6b54f1c81	[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827)	2020-11-16 10:54:35 -08:00
Michael Luo	59bc1e6c09	[RLLib] MAML extension for all models except RNNs (#11337)	2020-11-12 16:51:40 -08:00
Michael Luo	6e6c680f14	MBMPO Cartpole (#11832) * MBMPO Cartpole Done * Added doc	2020-11-12 10:30:41 -08:00
Sven Mika	5b788ccb13	[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717)	2020-11-03 12:53:34 -08:00
Sven Mika	ce96b03b07	[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033)	2020-10-06 20:28:16 +02:00
Michael Luo	47b499d899	Cartpole MAML + Discrete (#11028)	2020-10-02 12:56:34 +02:00
Sven Mika	4b278c36fc	[RLlib] Behavioral Cloning (from MARWIL). (#10619)	2020-09-09 17:33:21 +02:00
Michael Luo	8e613652af	[RLLib] MBMPO Fixes (#10296)	2020-09-09 09:34:34 +02:00
Sven Mika	8a891b3c30	[RLlib] SAC n_step > 1. (#10567)	2020-09-05 22:26:42 +02:00
Michael Luo	4e9888ce2f	[RLlib] Dreamer (#10172)	2020-08-26 13:24:05 +02:00
Michael Luo	4d7bd8c892	[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409)	2020-08-02 18:12:09 +02:00
Sven Mika	617eb8f279	[RLlib] Issue 9402 MARWIL producing nan rewards. (#9429)	2020-07-14 05:07:16 +02:00
Sven Mika	b4c0b942fe	[RLlib] Remove requirement for dataclasses in rllib (not supported in py3.5) (#9237)	2020-07-01 17:31:44 +02:00
Michael Luo	cf0894d396	[rllib] MAML Agent (#8862) * Halfway done with transferring MAML to new Ray * MAML Beta Out * Debugging MAML atm * Distributed Execution * Pendulum Mass Working * All experiments complete * Cleaned up codebase * Travis CI * Travis CI * Tests * Merged conflicts * Fixed variance bug conflict * Comment resolved * Apply suggestions from code review fixed test_maml * Update rllib/agents/maml/tests/test_maml.py * asdf * Fix testing Co-authored-by: Sven Mika <sven@anyscale.io>	2020-06-23 09:48:23 -07:00
Sven Mika	2589309cf0	[RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785)	2020-06-20 00:05:19 +02:00
Sven Mika	7008902cff	[RLlib] Minor `rllib.utils` cleanup. (#8932)	2020-06-16 08:52:20 +02:00
Sven Mika	8d1ccfd0f7	[RLlib] Issue 8889: action clipping bug ppo not learning mujoco (#8898)	2020-06-11 19:17:43 +02:00
Sven Mika	a90cd0fcbb	[RLlib] Unity3d soccer benchmarks (#8834)	2020-06-11 14:29:57 +02:00
Sven Mika	c74dc58f8b	[RLlib] Fix `use_lstm` flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. (#8734)	2020-06-05 15:40:30 +02:00
Sven Mika	97d524c075	[RLlib] Issue 8769 broken OOM tests_dir cases (R & S). (#8770)	2020-06-05 08:34:21 +02:00
Sven Mika	2746fc0476	[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520)	2020-05-27 16:19:13 +02:00
Sven Mika	baa053496a	[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)	2020-05-26 11:10:27 +02:00
Sven Mika	3a234ed9e3	[RLlib] Error: "Unknown trainable [some rllib algo name]" (#8525)	2020-05-21 08:59:32 +02:00
Eric Liang	9d012626e5	[rllib] Distributed exec workflow for impala (#8321)	2020-05-11 20:24:43 -07:00
Sven Mika	166bb5d690	[RLlib] IMPALA PyTorch (#8287) This PR adds an IMPALA PyTorch implementation. - adds compilation tests for LSTM and w/o LSTM. - adds learning test for CartPole.	2020-05-03 13:44:25 +02:00
Sven Mika	b23b6addfc	[RLlib] Stabilize Pendulum-v0 regression test cases. (#8232) Stabilize Pendulum regression test cases.	2020-04-30 15:48:11 +02:00
Sven Mika	499ad5fbe4	[RLlib] PyTorch version of APPO. (#8120) - Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases. - Add learning test cases for APPO torch (both w/ and w/o v-trace). - Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).	2020-04-23 09:11:12 +02:00
Sven Mika	d15609ba2a	[RLlib] PyTorch version of ARS (Augmented Random Search). (#8106) This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.	2020-04-21 09:47:52 +02:00
Sven Mika	3812bfedda	[RLlib] PyTorch version of ES (Evolution Strategies). (#8104) PyTorch version of Evolution Strategies (ES) Algo.	2020-04-20 21:47:28 +02:00
Sven Mika	d6cb7d865e	[RLlib] Torch DQN (APEX) TD-Error/prio. replay fixes. (#8082) PyTorch APEX_DQN with Prioritized Replay enabled would not work properly due to the td_error not being retrievable by the AsyncReplayOptimizer.	2020-04-20 10:03:25 +02:00
Sven Mika	f7e4dae852	[RLlib] DQN and SAC Atari benchmark fixes. (#7962) * Add Atari SAC-discrete (learning MsPacman in 40k ts up to 780 rewards). * SAC loss function test case fix.	2020-04-17 08:49:15 +02:00

80 commits