hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	92f030331e	[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420 )	2022-01-10 11:22:55 +01:00
Sven Mika	4eaf70942d	[RLlib] Issue 21297: Ignore PPO KL-loss term completely if kl-coeff == 0.0 to avoid NaN values due to some discrete action probs==0.0 (#21456 )	2022-01-10 11:22:40 +01:00
Sven Mika	35af30a446	[RLlib] Issue 21109: Action unsquashing causes inf/NaN actions for unbounded action spaces. (#21110 )	2022-01-10 11:20:37 +01:00
Sven Mika	b10d5533be	[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. (#21452 )	2022-01-10 11:19:40 +01:00
Matti Picus	5aef1e1708	remove deprecated unittest aliases (#21455 ) In a [recent review](https://discuss.python.org/t/experience-with-python-3-11-in-fedora/12911) of the Fedora team's experience porting packages to the upcoming Python 3.11, they remarked that most of the work was in removing deprecated aliases in unittest. I came across a few of these when looking at unrelated test failures; the DeprecationWarnings caught my eye. So I made a quick sweep of the code, using `git grep` to find occurrences of the deprecated aliases (old -> new): assertEquals -> assertEqual, assertNotEquals -> assertNotEqual, assertRaisesRegexp -> assertRaisesRegex	2022-01-09 20:29:54 -08:00
Sven Mika	34cee199b1	[RLlib] from remote_vector_env import ... -> from remote_base_env import ... (avoid deprecation warning). (#21460 )	2022-01-08 17:13:04 +01:00
Sven Mika	3a3d0a4a2b	[RLlib] Issue 21340: SampleBatch __init__ docstring wrong. (#21447 )	2022-01-07 15:48:14 +01:00
Avnish Narayan	39f8072eac	[RLlib] [MultiAgentEnv Refactor #2 ] Change space types for `BaseEnvs` and `MultiAgentEnvs` (#21063 )	2022-01-06 14:34:20 -08:00
Avnish Narayan	f7a5fc36eb	[rllib] Give rnnsac_stateless cartpole gpu, increase timeout (#21407 ) Increase test_preprocessors runtimes.	2022-01-06 11:54:19 -08:00
Sven Mika	853d10871c	[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376 )	2022-01-05 18:22:33 +01:00
Sven Mika	9e6b871739	[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330 )	2022-01-05 11:29:44 +01:00
Sven Mika	c01245763e	[RLlib] Revert "Revert "updated pettingzoo wrappers, env versions, urls"" (#21339 )	2022-01-04 18:30:26 +01:00
Sven Mika	abd3bef63b	[RLlib] QMIX better defaults + added to CI learning tests (#21332 )	2022-01-04 08:54:41 +01:00
Kai Fricke	489e6945a6	Revert "[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113 )" (#21338 ) This reverts commit `327eb84154`.	2022-01-03 10:21:25 +00:00
Benjamin Black	327eb84154	[RLlib] Updated pettingzoo wrappers, env versions, urls (#20113 )	2022-01-02 21:29:09 +01:00
Balaji Veeramani	c263008c07	[RLlib] Move `__grouping_doc_end__` (#21321 ) These changes are needed for two reasons. (1) `__grouping_doc_end__` is in the wrong place: if you look at the part of the Ray documentation where the tag is referenced, you'll read "You can use the MultiAgentEnv.with_agent_groups() method to define these groups:". However, if you look at the code snippet below it, you'll see the implementation of `to_base_env` in addition to the implementation of `with_agent_groups`. To remove `to_base_env` from the code snippet, we need to move `__grouping_doc_end__`. (2) Black cannot format `multi_agent_env.py`: for some reason, Black errors while formatting `multi_agent_env.py`. However, if we move `__grouping_doc_end__` up, the issue is resolved.	2022-01-01 20:11:06 -08:00
Akash Patel	cbcd03b779	Upgrade cython to 0.29.26 for py310 (#21244 )	2021-12-26 20:26:08 -08:00
Sven Mika	62dbf26394	[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984 )	2021-12-21 08:39:05 +01:00
Avnish Narayan	85a368c720	[RLlib] Expand Base env API to add necessary methods for testing. (#21027 )	2021-12-16 10:19:49 +01:00
brulu	8b77fc0aef	[RLlib] Updating Repeated space. Allowing numpy arrays and adding representation. (#20799 )	2021-12-16 08:27:55 +01:00
Sven Mika	e485aa846a	[RLlib; Docs overhaul] Overhaul of auto-API reference pages (via sphinx autoclass/automodule). (#19786 )	2021-12-15 22:32:52 +01:00
simonsays1980	1a8aa2da1f	[RLlib] Added `tensorlib=numpy` to `restore_original_dimensions()` such that … (#20342 )	2021-12-15 14:03:18 +01:00
Alexis DUBURCQ	6c3e63bc9c	[RLlib] Fix view requirements. (#21043 )	2021-12-15 11:59:04 +01:00
Jun Gong	767f78eaf8	[RLlib] Always attach latest eval metrics. (#21011 )	2021-12-15 11:42:53 +01:00
WanXing Wang	72bd2d7e09	[Core] Support back pressure for actor tasks. (#20894 ) Resubmit the PR https://github.com/ray-project/ray/pull/19936 I've figured out that the test case `//rllib:tests/test_gpus::test_gpus_in_local_mode` failed due to a deadlock in local mode. In local mode, if the user code submits another task while the current task is executing, the `CoreWorker::actor_task_mutex_` may cause a deadlock. The solution is quite simple: release the lock before executing the task in local mode. In the commit `7c2f61c76c`: 1. Release the lock in local mode to fix the bug. @scv119 2. `test_local_mode_deadlock` added to cover the case. @rkooo567 3. Left a trivial change in `rllib/tests/test_gpus.py` to make `RAY_CI_RLLIB_DIRECTLY_AFFECTED` take effect.	2021-12-13 23:56:07 -08:00
Sven Mika	daa4304a91	[RLlib] Switch off preprocessors by default for PGTrainer. (#21008 )	2021-12-13 12:04:23 +01:00
Sven Mika	db058d0fb3	[RLlib] Rename `metrics_smoothing_episodes` into `metrics_num_episodes_for_smoothing` for clarity. (#20983 )	2021-12-11 20:33:35 +01:00
Sven Mika	596c8e2772	[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918 )	2021-12-11 14:57:58 +01:00
Eric Liang	6f93ea437e	Remove the flaky test tag (#21006 )	2021-12-11 01:03:17 -08:00
Sven Mika	f814c2af89	[RLlib; Docs] Docs API reference pages: `rllib/execution`, `rllib/evaluation`, `rllib/models`, `rllib/offline`. (#20538 )	2021-12-10 09:41:29 +01:00
kk-55	9acf2f954d	[RLlib] Example containing a proposal for computing an adapted (time-dependent) GAE used by the PPO algorithm (via callback on_postprocess_trajectory) (#20850 )	2021-12-09 14:48:56 +01:00
Tomasz Wrona	39c202fa66	[RLlib] Allow extra keys in info in multi-agent (#20793 )	2021-12-09 14:44:33 +01:00
Carlo Grisetti	a8286c55af	[RLlib] Fix deprecated convert_to_non_torch_type (#20751 )	2021-12-09 14:42:12 +01:00
Avnish Narayan	6996eaa986	[RLlib] Add necessary fields to Base Envs, and BaseEnv wrapper classes (#20832 )	2021-12-09 14:40:40 +01:00
Sven Mika	63db0e3a7c	[RLlib] Fix SAC learning test flakiness introduced in PR: "Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`." (#20985 )	2021-12-09 14:24:27 +01:00
Ishant Mrinal	2868d1a2cf	[RLlib] Support for RE3 exploration algorithm (for tf) (#19551 )	2021-12-07 13:26:34 +01:00
Avnish Narayan	b8c64480d8	[RLlib] Change return type of try_reset to MultiEnvDict (#20868 )	2021-12-06 14:15:33 +01:00
Sven Mika	b4790900f5	[RLlib] Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`. (#20725 )	2021-12-04 22:05:26 +01:00
Sven Mika	60b2219d72	[RLlib] Allow for evaluation to run by `timesteps` (alternative to `episodes`) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757 )	2021-12-04 13:26:33 +01:00
Amog Kamsetty	611bfc1352	[ML] Move `find_free_port` to `ml_utils` (#20828 ) Small refactoring of common utility used by Train, Tune, and Rllib.	2021-12-03 13:38:42 -08:00
Sven Mika	0de41e4a6b	[RLlib] Trainer sub-class QMIX/MAML/MB-MPO (instead of `build_trainer`). (#20639 )	2021-12-02 13:17:10 +01:00
Jun Gong	2317c693cf	[RLlib] Use SampleBatch instead of input dict whenever possible (#20746 )	2021-12-02 13:11:26 +01:00
Jun Gong	65bd8e29f8	[RLlib] Update a few things to get rid of the `remote_vector_env` deprecation warning. (#20753 )	2021-12-02 13:10:44 +01:00
Sven Mika	9e38f6f613	[RLlib] Trainer sub-class DDPG/TD3/APEX-DDPG (instead of `build_trainer`). (#20636 )	2021-12-01 10:52:12 +01:00
Avnish Narayan	74dd0e4085	[RLlib] Make `to_base_env()` a method of all RLlib-supported Env classes (#20811 )	2021-12-01 09:01:02 +01:00
Avnish Narayan	3ddc09544d	[rllib] Env to base env refactor (#20785 )	2021-11-30 17:02:10 -08:00
Sven Mika	bec719d823	[RLlib] Trainer sub-class IMPALA (instead of using `build_trainer()`). (#20570 )	2021-11-30 19:08:36 +01:00
Sven Mika	3d2e27485b	[RLlib] Trainer sub-class DQN/SimpleQ/APEX-DQN/R2D2 (instead of using `build_trainer`). (#20633 )	2021-11-30 18:05:44 +01:00
Carlo Grisetti	514ed27f63	[RLlib] Fix deprecation message for `rllib.env.remote_vector_env` (now `RemoteBaseEnv`) and migrate import (#20750 )	2021-11-30 18:01:21 +01:00
mvindiola1	8cee0c03bf	[RLlib] Update `max_seq_len` in pad_batch_to_sequences_of_same_size (#20743 )	2021-11-30 18:00:07 +01:00

1187 commits