hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	92f030331e	[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420 )	2022-01-10 11:22:55 +01:00
Matti Picus	5aef1e1708	remove deprecated unittest aliases (#21455 ) In a [recent review](https://discuss.python.org/t/experience-with-python-3-11-in-fedora/12911) of the experience of the Fedora team porting packages to the upcoming Python 3.11, they remarked that most of the work was in removing deprecated aliases in unittest. I came across a few of these when looking at unrelated test failures; the DeprecationWarnings caught my eye. So I made a quick sweep of the code, using `git grep` to find occurrences of the deprecated aliases (old -> new): assertEquals -> assertEqual, assertNotEquals -> assertNotEqual, assertRaisesRegexp -> assertRaisesRegex	2022-01-09 20:29:54 -08:00
Sven Mika	bec719d823	[RLlib] Trainer sub-class IMPALA (instead of using `build_trainer()`). (#20570 )	2021-11-30 19:08:36 +01:00
Sven Mika	49cd7ea6f9	[RLlib] Trainer sub-class PPO/DDPPO (instead of `build_trainer()`). (#20571 )	2021-11-23 23:01:05 +01:00
Sven Mika	a931076f59	[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981 )	2021-11-05 16:10:00 +01:00
Sven Mika	e6ae08f416	[RLlib] Optionally don't drop last ts in v-trace calculations (APPO and IMPALA). (#19601 )	2021-11-03 10:01:34 +01:00
Sven Mika	cf21c634a3	[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982 )	2021-11-03 10:00:46 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829 )	2021-11-01 21:46:02 +01:00
gjoliver	99a0088233	[RLlib] Unify the way we create local replay buffer for all agents (#19627 ) * [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents. This change 1. Get rid of the try...except clause when we call execution_plan(), and get rid of the Deprecation warning as a result. 2. Fix the execution_plan() call in Trainer._try_recover() too. 3. Most importantly, makes it much easier to create and use different types of local replay buffers for all our agents. E.g., allow us to easily create a reservoir sampling replay buffer for APPO agent for Riot in the near future. * Introduce explicit configuration for replay buffer types. * Fix is_training key error. * actually deprecate buffer_size field.	2021-10-26 20:56:02 +02:00
Sven Mika	b213565783	[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693 )	2021-10-25 15:00:00 +02:00
gjoliver	9226f9bddc	[RLlib] Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. (#19264 ) * Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. * Trigger Build * lint	2021-10-12 16:03:41 +02:00
Sven Mika	b4300dd532	[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937 )	2021-10-04 13:29:00 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
Sven Mika	b99943806e	[RLlib] Add support for IMPALA to handle more than one loss/optimizer (analogous to recent enhancement for APPO). (#18971 )	2021-09-29 21:30:04 +02:00
Sven Mika	698b4eeed3	[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669 )	2021-09-21 22:00:14 +02:00
Sven Mika	3803e796ff	[RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. (#18540 )	2021-09-13 19:27:53 +02:00
Sven Mika	ea4a22249c	[RLlib] Add simple action-masking example script/env/model (tf and torch). (#18494 )	2021-09-11 23:08:09 +02:00
Sven Mika	ba58f5edb1	[RLlib] Strictly run `evaluation_num_episodes` episodes each evaluation run (no matter the other eval config settings). (#18335 )	2021-09-05 15:37:05 +02:00
Sven Mika	599e589481	[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065 )	2021-08-31 14:56:53 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928 )	2021-08-21 17:05:48 +02:00
Sven Mika	5107d16ae5	[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530 )	2021-08-03 18:30:02 -04:00
Sven Mika	924f11cd45	[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371 )	2021-08-03 11:35:49 -04:00
Sven Mika	8a844ff840	[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use `ray.get_gpu_ids()` (b/c no GPUs assigned by ray). (#17444 )	2021-08-02 17:29:59 -04:00
Sven Mika	5a313ba3d6	[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169 )	2021-07-20 14:58:13 -04:00
Sven Mika	18d173b172	[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031 )	2021-07-19 13:16:03 -04:00
Sven Mika	169ddabae7	[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429 )	2021-06-19 22:42:00 +02:00
Sven Mika	839fc59224	[RLlib] CQL TensorFlow support (#15841 )	2021-05-18 11:10:46 +02:00
Sven Mika	78b776942f	[RLlib] Discussion 1928: Initial lr wrong if schedule used that includes ts=0 (both tf and torch). (#15538 )	2021-04-27 17:19:52 +02:00
Sven Mika	bdda73e2dd	[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421 ) Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!	2021-04-22 11:29:42 +02:00
Fabien Couthouis	fe06642df0	[RLlib] Report mean losses instead of sum in IMPALA (discussion 1709) (#15427 )	2021-04-21 10:59:06 +02:00
Sven Mika	c90de315e5	[RLlib] APEX returns incorrect default resources (PlacementGroupFactory) colocated missing replay actors. (#15295 )	2021-04-15 16:50:42 +01:00
Sven Mika	ef0f163d16	[RLlib] Discussion 1709: IMPALA (tf and torch) reports sum of entropy (over batch) in stats. Should report mean instead. (#15290 )	2021-04-14 11:44:25 +02:00
Sven Mika	c3a15ecc0f	[RLlib] Issue #13802 : Enhance metrics for `multiagent->count_steps_by=agent_steps` setting. (#14033 )	2021-03-18 20:27:41 +01:00
Sven Mika	69202c6a7d	[RLlib] Obsolete usage tracking dict via sample batch. (#13065 )	2021-03-17 08:18:15 +01:00
Sven Mika	ef944bc5f0	[RLlib] Re-enable placement group support for RLlib. (#14384 )	2021-03-05 08:16:24 +01:00
Eric Liang	9db000ff2c	Auto report object store memory usage; remove some deprecated code (#14260 )	2021-03-01 13:19:44 -08:00
Richard Liaw	a2d2275ee1	Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289 )" (#14360 ) This reverts commit `6cd0cd3bd9`.	2021-02-25 14:27:35 -08:00
Sven Mika	6cd0cd3bd9	[RLlib + Tune] Add placement group support to RLlib. (#14289 )	2021-02-25 16:01:31 +01:00
Sven Mika	2e3655e8a9	[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238 )	2021-01-19 14:22:36 +01:00
Sven Mika	c524f86785	[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064 )	2020-12-27 09:46:03 -05:00
Sven Mika	99ae7bae05	[RLlib] JAXPolicy prep. PR #1 . (#13077 )	2020-12-26 20:14:18 -05:00
Corey Lowman	668ea0bc26	Fix typo RMSProp -> RMSprop (#13063 )	2020-12-23 13:37:46 -08:00
Sven Mika	e40b14d255	[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420 )	2020-12-08 16:41:45 -08:00
Sven Mika	19c8033df2	[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366 ) * WIP. * LINT and fixes. MB-MPO and MAML not working yet. * wip * update * update * remove * remove dep * higher * Update requirements_rllib.txt * Update requirements_rllib.txt * relpos * no mbmpo Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-12-01 17:41:10 -08:00
Sven Mika	592c161032	[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397 ) * WIP. * Fix and LINT.	2020-11-25 11:27:46 -08:00
Sven Mika	62c7ab5182	[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747 )	2020-11-12 16:27:34 +01:00
Sven Mika	d9f1874e34	[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609 )	2020-10-27 10:00:24 +01:00
Sven Mika	805dad3bc4	[RLlib] SAC algo cleanup. (#10825 )	2020-09-20 11:27:02 +02:00
Sven Mika	ef18893fb5	[RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420 )	2020-09-02 14:03:01 +02:00
Barak Michener	8e76796fd0	ci: Redo `format.sh --all` script & backfill lint fixes (#9956 )	2020-08-07 16:49:49 -07:00

80 commits