hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-08 11:31:40 -05:00

Author	SHA1	Message	Date
Sven Mika	55a90e670a	[RLlib] Trainer.add_policy() not working for tf, if added policy is trained afterwards. (#16927)	2021-07-11 23:41:38 +02:00
Julius Frost	a88b217d3f	[rllib] Enhancements to Input API for customizing offline datasets (#16957) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-07-10 15:05:25 -07:00
Sven Mika	7862dd64ea	[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`. (#16774)	2021-07-08 17:31:34 +02:00
Sven Mika	53206dd440	[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531)	2021-06-30 12:32:11 +02:00
AnnaKosiorek	1e709771b2	[rllib][minor] clarification of the softmax axis in dqn_torch_policy (#16311) pytorch nn.functional.softmax (unlike tf.nn.softmax) calculates softmax along zeroth dimension by default	2021-06-26 11:19:54 -07:00
Sven Mika	c95dea51e9	[RLlib] External env enhancements + more examples. (#16583)	2021-06-23 09:09:01 +02:00
Sven Mika	be6db06485	[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569)	2021-06-21 13:46:01 +02:00
Sven Mika	169ddabae7	[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429)	2021-06-19 22:42:00 +02:00
Amog Kamsetty	bd3cbfc56a	Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543) This reverts commit `e78ec370a9`.	2021-06-18 12:21:49 -07:00
Sven Mika	2900a06dd7	[RLlib] Issue 14503: SAC not allowing custom action distributions. (#16427)	2021-06-18 17:27:29 +02:00
Sven Mika	e78ec370a9	[RLlib] Allow policies to be added/deleted on the fly. (#16359)	2021-06-18 10:31:30 +02:00
Sven Mika	d0014cd351	[RLlib] Policies get/set_state fixes and enhancements. (#16354)	2021-06-15 13:08:43 +02:00
Chris Bamford	fd1a97e39f	[RLlib] Memory leak docs (#15908)	2021-06-10 18:10:21 +02:00
Sven Mika	3d4dc60e2e	[RLlib] CQL iteration count fixes: Remove dummy buffer and unnecessary store op from exec_plan. (#16332)	2021-06-10 07:49:17 +02:00
Sven Mika	e2be41b407	[RLlib] MARWIL + BC: Various fixes and enhancements. (#16218)	2021-06-03 22:29:00 +02:00
Sven Mika	5fe34862ce	[RLlib] DDPG torch GPU bug. (#16133)	2021-05-28 22:09:25 +02:00
Sven Mika	33a69135cb	[RLlib] Issue 16117: DQN/APEX torch not working on GPU. (#16118)	2021-05-28 09:12:53 +02:00
Sven Mika	f6302d81be	[RLlib] Discussion 2210: BC algo broken, if "advantages" missing in offline data. (#16019)	2021-05-25 08:47:17 +02:00
Sven Mika	e80095591c	[RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937)	2021-05-20 18:15:10 +02:00
Sven Mika	2d34216660	[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762)	2021-05-20 09:27:03 +02:00
Sven Mika	eaa7f6696d	[RLlib] Issue 15887: MARWIL adv norm update mismatch for tf (static-graph) vs torch versions. (#15898)	2021-05-19 15:44:11 -07:00
Michael Luo	474f04e322	[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup (#14707)	2021-05-19 16:32:29 +02:00
Chris Bamford	0be83d9a95	[RLlib] Fixing Memory Leak In Multi-Agent environments. Adding tooling for finding memory leaks in workers. (#15815)	2021-05-18 13:23:00 +02:00
Sven Mika	2303851c3c	[RLlib] Torch multi-GPU + LSTM/RNN bug fix. (#15492)	2021-05-18 11:51:05 +02:00
Sven Mika	839fc59224	[RLlib] CQL TensorFlow support (#15841)	2021-05-18 11:10:46 +02:00
Sven Mika	d89fb82bfb	[RLlib] Add simple curriculum learning API and example script. (#15740)	2021-05-16 17:35:10 +02:00
Sven Mika	469f5227da	[RLlib] CQL bug fix: Normalize actions for atanh in BC part of the CQL loss. (#15814)	2021-05-16 15:21:06 +02:00
Sven Mika	bc09e75b78	[RLlib] Fix 3 flakey test cases. (#15785)	2021-05-16 12:20:33 +02:00
Sven Mika	c4a3e1589b	[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. (#15761)	2021-05-13 09:17:23 +02:00
Sven Mika	16ddab49f5	[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. (#15591)	2021-05-12 12:16:00 +02:00
Sven Mika	a495759f06	[RLlib] Discussion 2022: PPO should auto-adjust `rollout_fragment_length` if other settings do not align with `train_batch_size`. (#15611)	2021-05-10 16:16:02 +02:00
Sven Mika	461d73ddf1	[RLlib] `simple_optimizer` should not be used by default for tf+MA. (#15365)	2021-05-10 16:10:44 +02:00
Sven Mika	c7563a32ed	[RLlib] DD-PPO not supported on Win (add meaningful error message). (#15631)	2021-05-04 19:26:17 +02:00
Michael Luo	4cbe13cdfd	[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603) Co-authored-by: Sven Mika <sven@anyscale.io> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-05-04 19:06:19 +02:00
Sven Mika	4b3add0066	[RLlib] Discussion 2021: PPO does not learn vf, iff use_gae=False (ignores use_critic setting). (#15610)	2021-05-04 14:17:00 +02:00
Antoine Galataud	ce1c001b1d	[RLlib] DQN: Place LearningRateSchedule mixin at the right moment (#15558)	2021-05-04 13:21:40 +02:00
Amog Kamsetty	ebc44c3d76	[CI] Upgrade flake8 to 3.9.1 (#15527) * formatting * format util * format release * format rllib/agents * format rllib/env * format rllib/execution * format rllib/evaluation * format rllib/examples * format rllib/policy * format rllib utils and tests * format streaming * more formatting * update requirements files * fix rllib type checking * updates * update * fix circular import * Update python/ray/tests/test_runtime_env.py * noqa	2021-05-03 14:23:28 -07:00
Sven Mika	e973b726c2	[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273)	2021-04-30 19:26:30 +02:00
Sven Mika	78b776942f	[RLlib] Discussion 1928: Initial lr wrong if schedule used that includes ts=0 (both tf and torch). (#15538)	2021-04-27 17:19:52 +02:00
SebastianBo1995	f5be8d8f74	[Rllib] Offline Learning Bug, different shapes (#15132)	2021-04-27 17:18:17 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684)	2021-04-27 10:44:54 +02:00
Kai Fricke	2c11a1aff1	[RLlib] Evaluation parallel to training check, key-error hotfix (#15345)	2021-04-27 08:38:10 +02:00
mvindiola1	9330403200	[RLlib] Mask out padded values for A3C loss with recurrent policy (#15525)	2021-04-27 08:36:04 +02:00
Sven Mika	354c960fff	[RLlib] Fix test_dependency_torch and fix custom logger support for RLlib. (#15120)	2021-04-24 08:13:41 +02:00
Sven Mika	bdda73e2dd	[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421) Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!	2021-04-22 11:29:42 +02:00
Sven Mika	7318439c3d	[RLlib] DQN native_ratio (for training intensity) incorrect (discussion 1763). (#15436) Thanks @Manuscrit !	2021-04-22 11:06:29 +02:00
Fabien Couthouis	fe06642df0	[RLlib] Report mean losses instead of sum in IMPALA (discussion 1709) (#15427)	2021-04-21 10:59:06 +02:00
Sven Mika	7ff27dfe07	[RLlib] Remove atari dependency for RLlib (in favor of detailed error message). (#15292)	2021-04-20 08:46:58 +02:00
Sven Mika	41968512ca	[RLlib] Partial GPU examples (for learner and workers). (#15334)	2021-04-20 08:46:05 +02:00
Sven Mika	cecfc3b43b	[RLlib] Multi-GPU support for Torch algorithms. (#14709)	2021-04-16 09:16:24 +02:00

631 commits