hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	daa4304a91	[RLlib] Switch off preprocessors by default for PGTrainer. (#21008 )	2021-12-13 12:04:23 +01:00
Sven Mika	db058d0fb3	[RLlib] Rename `metrics_smoothing_episodes` into `metrics_num_episodes_for_smoothing` for clarity. (#20983 )	2021-12-11 20:33:35 +01:00
Sven Mika	596c8e2772	[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918 )	2021-12-11 14:57:58 +01:00
Eric Liang	6f93ea437e	Remove the flaky test tag (#21006 )	2021-12-11 01:03:17 -08:00
Sven Mika	f814c2af89	[RLlib; Docs] Docs API reference pages: `rllib/execution`, `rllib/evaluation`, `rllib/models`, `rllib/offline`. (#20538 )	2021-12-10 09:41:29 +01:00
kk-55	9acf2f954d	[RLlib] Example containing a proposal for computing an adapted (time-dependent) GAE used by the PPO algorithm (via callback on_postprocess_trajectory) (#20850 )	2021-12-09 14:48:56 +01:00
Tomasz Wrona	39c202fa66	[RLlib] Allow extra keys in info in multi-agent (#20793 )	2021-12-09 14:44:33 +01:00
Carlo Grisetti	a8286c55af	[RLLib] Fix deprecated convert_to_non_torch_type (#20751 )	2021-12-09 14:42:12 +01:00
Avnish Narayan	6996eaa986	[RLlib] Add necessary fields to Base Envs, and BaseEnv wrapper classes (#20832 )	2021-12-09 14:40:40 +01:00
Sven Mika	63db0e3a7c	[RLlib] Fix SAC learning test flakiness introduced in PR: "Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`." (#20985 )	2021-12-09 14:24:27 +01:00
Ishant Mrinal	2868d1a2cf	[RLlib] Support for RE3 exploration algorithm (for tf) (#19551 )	2021-12-07 13:26:34 +01:00
Avnish Narayan	b8c64480d8	[RLlib] Change return type of try_reset to MultiEnvDict (#20868 )	2021-12-06 14:15:33 +01:00
Sven Mika	b4790900f5	[RLlib] Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`. (#20725 )	2021-12-04 22:05:26 +01:00
Sven Mika	60b2219d72	[RLlib] Allow for evaluation to run by `timesteps` (alternative to `episodes`) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757 )	2021-12-04 13:26:33 +01:00
Amog Kamsetty	611bfc1352	[ML] Move `find_free_port` to `ml_utils` (#20828 ) Small refactoring of common utility used by Train, Tune, and Rllib.	2021-12-03 13:38:42 -08:00
Sven Mika	0de41e4a6b	[RLlib] Trainer sub-class QMIX/MAML/MB-MPO (instead of `build_trainer`). (#20639 )	2021-12-02 13:17:10 +01:00
Jun Gong	2317c693cf	[RLlib] Use SampleBatch instead of input dict whenever possible (#20746 )	2021-12-02 13:11:26 +01:00
Jun Gong	65bd8e29f8	[RLlib] Update a few things to get rid of the `remote_vector_env` deprecation warning. (#20753 )	2021-12-02 13:10:44 +01:00
Sven Mika	9e38f6f613	[RLlib] Trainer sub-class DDPG/TD3/APEX-DDPG (instead of `build_trainer`). (#20636 )	2021-12-01 10:52:12 +01:00
Avnish Narayan	74dd0e4085	[RLlib] Make `to_base_env()` a method of all RLlib-supported Env classes (#20811 )	2021-12-01 09:01:02 +01:00
Avnish Narayan	3ddc09544d	[rllib] Env to base env refactor (#20785 )	2021-11-30 17:02:10 -08:00
Sven Mika	bec719d823	[RLlib] Trainer sub-class IMPALA (instead of using `build_trainer()`). (#20570 )	2021-11-30 19:08:36 +01:00
Sven Mika	3d2e27485b	[RLlib] Trainer sub-class DQN/SimpleQ/APEX-DQN/R2D2 (instead of using `build_trainer`). (#20633 )	2021-11-30 18:05:44 +01:00
Carlo Grisetti	514ed27f63	[RLlib] Fix deprecation message for `rllib.env.remote_vector_env` (now `RemoteBaseEnv`) and migrate import (#20750 )	2021-11-30 18:01:21 +01:00
mvindiola1	8cee0c03bf	[RLlib] Update `max_seq_len` in pad_batch_to_sequences_of_same_size (#20743 )	2021-11-30 18:00:07 +01:00
mvindiola1	eadc7669c5	[RLlib] SampleBatch.concat_samples fix incorrect max_seq_len calculation (#20704 )	2021-11-29 12:01:40 +01:00
Sven Mika	e37afe0425	[RLlib; Docs] Auto API reference pages overhaul: `rllib/policy` and `rllib/agents` packages. (#20537 )	2021-11-25 09:35:19 +01:00
Sven Mika	c07d8c4c22	[RLlib] Trainer sub-class A2C/A3C (instead of `build_trainer`). (#20635 )	2021-11-24 22:07:13 +01:00
Sven Mika	49cd7ea6f9	[RLlib] Trainer sub-class PPO/DDPPO (instead of `build_trainer()`). (#20571 )	2021-11-23 23:01:05 +01:00
Sven Mika	9d2fe5756c	[RLlib] Trainer sub-class for APPO (instead of using `build_trainer()`). (#20424 )	2021-11-22 22:14:21 +01:00
gjoliver	e7f9e8ceec	[RLlib] Report total_train_steps correctly for offline agents like CQL. (#20541 ) * Fix trainer timestep reporting for offline agents like CQL. * wip. * extend timesteps_total to 200K for learning_tests_pendulum_cql test Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-22 21:46:45 +01:00
Artur Niederfahrenhorst	d07e50e957	[RLlib] Replay buffer API (cleanups; docstrings; renames; move into `rllib/execution/buffers` dir) (#20552 )	2021-11-19 11:57:37 +01:00
gjoliver	18862f9f44	[RLlib] Add a comment in the doc string of `on_learn_on_batch` callback function. (#20456 )	2021-11-19 10:49:07 +01:00
Avnish Narayan	b6077a36d4	[RLlib; Pre-checks/better failure behavior]: Env Checker for Gym Environments (#20481 )	2021-11-19 09:41:03 +01:00
Sven Mika	7a585fb275	[RLlib; Documentation] RLlib README overhaul. (#20249 )	2021-11-18 18:08:40 +01:00
Sven Mika	56619b955e	[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250 )	2021-11-17 21:40:16 +01:00
gjoliver	724a140795	[rllib] Make sure json can serialize result dict (#20439 ) We may have fields in the result dict that are or None. Make sure our results are json serializable.	2021-11-17 10:27:00 -08:00
Avnish Narayan	dc17f0a241	Add error messages for missing tf and torch imports (#20205 ) Co-authored-by: Sven Mika <sven@anyscale.io> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-16 16:30:53 -08:00
Kai Fricke	05d21497db	[rllib/tune] Fix durable trainable in trainer template, add release test (#20422 )	2021-11-16 20:52:42 +00:00
gjoliver	6e787f70e0	[Rllib/release] Disable throughput check (#20387 ) Throughput check was enabled by `d8a61f801f` prematurely. E.g., see state before the commit: `a931076f59/rllib/utils/test_utils.py (L740-L741)`	2021-11-16 11:05:51 -08:00
Sven Mika	f82880eda1	Revert "Revert [RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 ) (#20399 )" (#20417 ) This reverts commit `90dc5460d4`.	2021-11-16 14:49:41 +01:00
Stefan Schneider	2b3d0c691f	[RLlib] Document and extend action mask example. (#20390 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Sven Mika <sven@anyscale.io> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-16 13:20:41 +01:00
Kai Fricke	3e6ba5d6d2	Revert "Revert [RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`." (#20285 ) * Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)" (#20284)" This reverts commit `246787cdd9`. Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-16 12:26:47 +01:00
Amog Kamsetty	90dc5460d4	Revert "[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )" (#20399 ) This reverts commit `5b1c8e46e1`.	2021-11-15 16:11:35 -08:00
Sven Mika	6ff4061f3a	[RLlib] Issue 20269: Offline RL example not working due to new_obs not being written to file. (#20366 ) * wip. * Apply suggestions from code review	2021-11-15 16:41:08 +01:00
Sven Mika	5b1c8e46e1	[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )	2021-11-15 10:41:54 +01:00
xwjiang2010	cdf70c2900	[Tune] Remove legacy resources implementations in Runner and Executor. (#19773 )	2021-11-12 12:33:39 -08:00
Sven Mika	38c456b6f4	[RLlib; Tune] Fix rllib/train.py script after tune.Experiment c'tor change. (#20283 )	2021-11-12 15:25:50 +01:00
Kai Fricke	246787cdd9	Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055 )" (#20284 ) This reverts commit `6f85af435f`.	2021-11-12 13:09:43 +00:00
Sven Mika	70fe25055a	[RLlib] Issue: Get single step input dict incorrect. (#20217 )	2021-11-12 08:38:51 +01:00

1112 commits