Commit graph

63 commits

Author SHA1 Message Date
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031) 2021-07-19 13:16:03 -04:00
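The commit above bounds how many policies a RolloutWorker keeps in memory by treating its policy map as an LRU cache. Below is a minimal sketch of that idea using collections.OrderedDict; the LRUPolicyMap class, its capacity argument, and the stash dict are hypothetical stand-ins for illustration, not RLlib's actual PolicyMap.

```python
from collections import OrderedDict

class LRUPolicyMap:
    """Hypothetical sketch: keep at most `capacity` policies in RAM,
    evicting the least-recently-used one when a new policy is added."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self._cache = OrderedDict()   # policy_id -> policy object (in RAM)
        self._stashed = {}            # evicted policies (stand-in for disk/object store)

    def __getitem__(self, policy_id):
        if policy_id in self._cache:
            self._cache.move_to_end(policy_id)       # mark as most recently used
            return self._cache[policy_id]
        # Cache miss: restore a previously evicted policy.
        policy = self._stashed.pop(policy_id)
        self._insert(policy_id, policy)
        return policy

    def __setitem__(self, policy_id, policy):
        self._insert(policy_id, policy)

    def _insert(self, policy_id, policy):
        self._cache[policy_id] = policy
        self._cache.move_to_end(policy_id)
        if len(self._cache) > self.capacity:
            evicted_id, evicted = self._cache.popitem(last=False)  # drop LRU entry
            self._stashed[evicted_id] = evicted

pmap = LRUPolicyMap(capacity=2)
for pid in ("p1", "p2", "p3"):
    pmap[pid] = object()              # adding "p3" stashes "p1" (least recently used)
```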
Sven Mika
a5831f9429
[RLlib] Fix bandit example scripts and add all scripts to CI testing suite. 2021-06-15 13:30:31 +02:00
mvindiola1
170366fbf1
[RLlib] contrib/MADDPG: Make get_weights and set_weights use dictionaries rather than lists. (#14903)
Co-authored-by: Manny Vindiola <manuel.m.vindiola.civ@mail.mil>
2021-05-04 13:26:39 +02:00
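The change above keys MADDPG weights by policy ID instead of relying on list position. A hedged before/after sketch of the two shapes; the MultiAgentTrainer container below is hypothetical, not the contrib/MADDPG code:

```python
class MultiAgentTrainer:
    """Hypothetical container for several per-policy weight sets."""

    def __init__(self, policies):
        # policies: dict mapping policy_id -> object with .get_weights()/.set_weights()
        self.policies = policies

    # Old style (order-dependent): return a list, caller must know the ordering.
    def get_weights_as_list(self):
        return [p.get_weights() for p in self.policies.values()]

    # New style: key weights by policy ID, so ordering no longer matters.
    def get_weights(self):
        return {pid: p.get_weights() for pid, p in self.policies.items()}

    def set_weights(self, weights):
        for pid, w in weights.items():
            self.policies[pid].set_weights(w)
```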
Yeachan-Heo
0552f6e886
[RLlib] Update alpha_zero_policy.py (#15042) 2021-05-04 13:20:24 +02:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. (#15335) 2021-04-15 19:19:51 +02:00
Sven Mika
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. (#13065) 2021-03-17 08:18:15 +01:00
Sven Mika
d001af3e59
[RLlib] Allow rllib rollout to run distributed via evaluation workers. (#13718) 2021-02-08 12:05:16 +01:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238) 2021-01-19 14:22:36 +01:00
Sven Mika
99ae7bae05
[RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
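In multi-agent setups one env step can yield several agent steps, so a batch size counted in env steps fills at a different rate than one counted in agent steps. A small sketch of the distinction, assuming a hypothetical count_mode switch and a simplified episode format (not the actual RLlib config):

```python
def collect_batch(episode_steps, batch_size, count_mode="env_steps"):
    """Collect transitions until `batch_size` is reached, counting either
    env steps (one per env.step() call) or agent steps (one per agent
    acting in that env step).

    episode_steps: iterable of dicts {agent_id: transition}, one per env step.
    """
    batch, env_steps, agent_steps = [], 0, 0
    for per_agent in episode_steps:
        batch.extend(per_agent.values())
        env_steps += 1
        agent_steps += len(per_agent)
        count = agent_steps if count_mode == "agent_steps" else env_steps
        if count >= batch_size:
            break
    return batch

# With 3 agents acting every step, counting agent steps fills a batch of 6
# after 2 env steps (6 transitions) instead of after 6 env steps (18 transitions).
steps = [{f"agent_{i}": ("obs", "act", "rew") for i in range(3)} for _ in range(10)]
assert len(collect_batch(steps, 6, count_mode="agent_steps")) == 6
assert len(collect_batch(steps, 6, count_mode="env_steps")) == 18
```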
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* remove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
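Much of the PR above ports algorithms to the trajectory view API, under which a model declares which slices of the trajectory it needs (e.g. the previous action or reward) and the sampler assembles exactly those inputs. The sketch below only illustrates that general idea; the VIEW_REQUIREMENTS dict and build_input_dict helper are hypothetical, not RLlib's API:

```python
import numpy as np

# Hypothetical view-requirement spec: each model input is a trajectory column
# plus a time shift (0 = current step, -1 = previous step, etc.).
VIEW_REQUIREMENTS = {
    "obs":         {"column": "obs",     "shift": 0},
    "prev_action": {"column": "actions", "shift": -1},
    "prev_reward": {"column": "rewards", "shift": -1},
}

def build_input_dict(trajectory, t, view_reqs=VIEW_REQUIREMENTS):
    """Assemble the model's input for timestep t from raw trajectory columns,
    materializing only what the view requirements ask for."""
    inputs = {}
    for key, req in view_reqs.items():
        idx = t + req["shift"]
        col = trajectory[req["column"]]
        # Pad with zeros before the episode start instead of indexing t = -1.
        inputs[key] = col[idx] if idx >= 0 else np.zeros_like(col[0])
    return inputs

traj = {
    "obs": np.arange(12).reshape(4, 3).astype(float),
    "actions": np.array([0, 1, 0, 1]),
    "rewards": np.array([0.0, 1.0, 0.5, 2.0]),
}
print(build_input_dict(traj, t=0))  # prev_action / prev_reward padded with zeros
print(build_input_dict(traj, t=2))  # uses action / reward from t=1
```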
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Eric Liang
9b8218aabd
[docs] Move all /latest links to /master (#11897)
* use master link

* rename

* revert non-ray

* more

* more
2020-11-10 10:53:28 -08:00
Lara Codeca
e735add268
[RLlib] Integration with SUMO Simulator (#11710) 2020-11-03 09:45:03 +01:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 2020-10-27 10:00:24 +01:00
Eric Liang
ecdaaffc67
add large data warning (#10957) 2020-09-23 15:46:06 -07:00
Sven Mika
28ab797cf5
[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544) 2020-09-06 10:58:00 +02:00
Sven Mika
78dfed2683
[RLlib] Issue 8384: QMIX doesn't learn anything. (#9527) 2020-07-17 12:14:34 +02:00
Piotr Januszewski
155cc81e40
Clarify training intensity configuration docstring (#9244) (#9306) 2020-07-05 20:07:27 -07:00
Richard Liaw
d35f0e40d0
[tune] Use public methods for trainable (#9184) 2020-07-01 11:00:00 -07:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Tanay Wakhare
efcee9f1de
[RLlib] MADDPG bug fix (issue https://github.com/ray-project/ray/issues/8483) (#9110)
* Bug fix for https://github.com/ray-project/ray/issues/8483

We need to pass in a framework explicitly with the new defaults. Further, the actual bug was that policies were being sorted alphabetically in the MADDPG init(), which led to incorrect initialization.

* Linting
2020-06-30 00:27:32 -07:00
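The commit body above attributes the bug to an ordering assumption: policy IDs were sorted alphabetically in the MADDPG init() while other per-policy data kept its original order. A toy illustration of how that class of bug misaligns data (not the actual MADDPG code):

```python
# If per-policy data is stored positionally, sorting the policy IDs silently
# pairs each policy with the wrong entry.
policy_ids = ["agent_10", "agent_2", "agent_1"]          # insertion order
obs_spaces = {"agent_10": 10, "agent_2": 2, "agent_1": 1}

# Buggy: sort the IDs, then zip against a list built in insertion order.
sorted_ids = sorted(policy_ids)                          # agent_1, agent_10, agent_2
obs_list = [obs_spaces[pid] for pid in policy_ids]       # [10, 2, 1]
buggy_pairs = dict(zip(sorted_ids, obs_list))            # agent_1 -> 10 (wrong!)

# Fix: key everything by policy ID so no ordering assumption is needed.
correct_pairs = {pid: obs_spaces[pid] for pid in policy_ids}
assert buggy_pairs != correct_pairs
```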
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. (#8932) 2020-06-16 08:52:20 +02:00
Eric Liang
34bae27ac7
[rllib] Flexible multi-agent replay modes and replay_sequence_length (#8893) 2020-06-12 20:17:27 -07:00
Sven Mika
a90cd0fcbb
[RLlib] Unity3d soccer benchmarks (#8834) 2020-06-11 14:29:57 +02:00
Dean Wampler
53712d2ef7
Fix typo in docs for LinearDiscreteEnv (#8891) 2020-06-11 08:34:35 +02:00
Sven Mika
ad695a818b
Bug fix in the contextual bandit's linear_regression.py model. (#8815) 2020-06-06 22:47:42 +02:00
Sven Mika
d8a081a185
[RLlib] Unity3D integration (n Unity3D clients vs learning server). (#8590) 2020-05-30 22:48:34 +02:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
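The commit above replaces the boolean use_pytorch flag with a single framework setting. A minimal before/after config sketch showing only the changed key (other keys omitted; exact defaults may differ):

```python
# Before: framework chosen via a boolean flag.
old_config = {
    "use_pytorch": True,
    "num_workers": 2,
}

# After: one explicit `framework` key (e.g. "tf", "tf2", or "torch").
new_config = {
    "framework": "torch",
    "num_workers": 2,
}
```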
Paco Nathan
067bbb6710
resolved NameError in ray.tune() call (#8494) 2020-05-27 10:55:56 +02:00
Sven Mika
0422e9c5a8
[RLlib] Add 2 Transformer learning test cases on StatelessCartPole (PPO and IMPALA). (#8624) 2020-05-27 10:19:47 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Eric Liang
aa7a58e92f
[rllib] Support training intensity for dqn / apex (#8396) 2020-05-20 11:22:30 -07:00
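Training intensity here means the ratio of replayed (trained-on) timesteps to sampled environment timesteps. The helper below is a hypothetical sketch of how such a ratio could translate into a number of SGD updates, not RLlib's implementation:

```python
def updates_to_run(sampled_env_steps, train_batch_size, training_intensity,
                   trained_steps_so_far=0):
    """How many SGD updates keep the ratio of trained timesteps to sampled
    timesteps near `training_intensity`.

    training_intensity ~= (timesteps replayed for training) / (timesteps sampled)
    """
    target_trained = training_intensity * sampled_env_steps
    deficit = target_trained - trained_steps_so_far
    return max(0, int(deficit // train_batch_size))

# With intensity 4 and 32-sample train batches, sampling 1000 env steps should
# be matched by roughly 4000 trained timesteps, i.e. 125 updates.
assert updates_to_run(1000, 32, 4.0) == 125
```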
Sven Mika
57544b1ff9
[RLlib] Examples folder restructuring (Model examples; final part). (#8278)
- This PR completes any previously missing PyTorch Model counterparts to TFModels in examples/models.
- It also makes sure that all example scripts in the rllib/examples folder are tested for both frameworks and learn the given task (which is currently often not checked), using an --as-test flag in connection with a --stop-reward.
2020-05-12 08:23:10 +02:00
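The --as-test / --stop-reward combination above turns an example script into a CI check: train until the stop reward is reached and, in test mode, fail if it never is. A hedged sketch of how such flags might be wired; the flag names come from the commit text, while the training loop and result dict below are stand-ins:

```python
import argparse

def run_one_training_iteration():
    # Stand-in for trainer.train(); returns a result dict with a mean reward.
    return {"episode_reward_mean": 200.0}

parser = argparse.ArgumentParser()
parser.add_argument("--as-test", action="store_true",
                    help="Fail (raise) if --stop-reward is never reached.")
parser.add_argument("--stop-reward", type=float, default=150.0)
parser.add_argument("--stop-iters", type=int, default=50)
args = parser.parse_args()

best_reward = float("-inf")
for _ in range(args.stop_iters):
    result = run_one_training_iteration()
    best_reward = max(best_reward, result["episode_reward_mean"])
    if best_reward >= args.stop_reward:
        break

if args.as_test and best_reward < args.stop_reward:
    raise ValueError(f"Did not reach reward {args.stop_reward}, got {best_reward}.")
```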
A Kharitonov
304e31b7e5
Fixed: contrib/MADDPG MADDPGTFPolicy missing self.config assignment (#8343) 2020-05-08 12:05:06 -07:00
Eric Liang
2c599dbf05
[rllib] Port QMIX, MADDPG to new execution API (#8344) 2020-05-07 23:41:10 -07:00
Eric Liang
b14cc16616
[rllib] Enable functional execution workflow API by default (#8221) 2020-05-05 12:36:42 -07:00
Sven Mika
42991d723f
[RLlib] rllib/examples folder restructuring (#8250)
Cleans up the rllib/examples folder by moving all example Envs into rllib/examples/env (so they can be used by other scripts and tests as well).
2020-05-01 22:59:34 +02:00
Sven Mika
c593fb09b7
[RLlib] Remove all f-strings to keep py3.5 compatibility. 2020-04-30 11:10:16 -07:00
Sven Mika
bf25aee392
[RLlib] Deprecate all Model(v1) usage. (#8146)
Deprecate all Model(v1) usage.
2020-04-29 12:12:59 +02:00
roireshef
dbcad35022
[RLlib] Added DefaultCallbacks which replaces old callbacks dict interface (#6972) 2020-04-16 16:06:42 -07:00
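The new interface replaces a dict of bare callback functions with a class whose methods are overridden as needed. A simplified sketch of the two styles; the method names follow the class-based idea described above, but the signatures and config keys here are illustrative stand-ins, not the exact RLlib ones:

```python
# Old style: a plain dict mapping event names to functions.
def my_episode_end(info):
    print("episode return:", info["episode_return"])

old_callbacks_config = {"on_episode_end": my_episode_end}

# New style: subclass a default implementation and override only what you need.
class DefaultCallbacks:
    def on_episode_start(self, info): pass
    def on_episode_end(self, info): pass

class MyCallbacks(DefaultCallbacks):
    def on_episode_end(self, info):
        print("episode return:", info["episode_return"])

new_callbacks_config = {"callbacks": MyCallbacks}
```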
Sven Mika
428516056a
[RLlib] SAC Torch (incl. Atari learning) (#7984)
* Policy-classes cleanup and torch/tf unification.
- Make Policy abstract.
- Add `action_dist` to call to `extra_action_out_fn` (necessary for PPO torch).
- Move some methods and vars to base Policy
  (from TFPolicy): num_state_tensors, ACTION_PROB, ACTION_LOGP and some more.

* Fix `clip_action` import from Policy (should probably be moved into utils altogether).

* - Move `is_recurrent()` and `num_state_tensors()` into TFPolicy (from DynamicTFPolicy).
- Add config to all Policy c'tor calls (as 3rd arg after obs and action spaces).

* Add `config` to c'tor call to TFPolicy.

* Add missing `config` to c'tor call to TFPolicy in marvil_policy.py.

* Fix test_rollout_worker.py::MockPolicy and BadPolicy classes (Policy base class is now abstract).

* Fix LINT errors in Policy classes.

* Implement StatefulPolicy abstract methods in test cases: test_multi_agent_env.py.

* policy.py LINT errors.

* Create a simple TestPolicy to sub-class from when testing Policies (reduces code in some test cases).

* policy.py
- Remove abstractmethod from `apply_gradients` and `compute_gradients` (these are not required iff `learn_on_batch` implemented).
- Fix docstring of `num_state_tensors`.

* Make QMIX torch Policy a child of TorchPolicy (instead of Policy).

* QMixPolicy add empty implementations of abstract Policy methods.

* Store Policy's config in self.config in base Policy c'tor.

* - Make only compute_actions in the base Policy an abstractmethod and provide a pass
implementation for all other methods if not defined.
- Fix state_batches=None (most Policies don't have internal states).

* Cartpole tf learning.

* Cartpole tf AND torch learning (in ~ same ts).

* Cartpole tf AND torch learning (in ~ same ts). 2

* Cartpole tf (torch syntax-broken) learning (in ~ same ts). 3

* Cartpole tf AND torch learning (in ~ same ts). 4

* Cartpole tf AND torch learning (in ~ same ts). 5

* Cartpole tf AND torch learning (in ~ same ts). 6

* Cartpole tf AND torch learning (in ~ same ts). Pendulum tf learning.

* WIP.

* WIP.

* SAC torch learning Pendulum.

* WIP.

* SAC torch and tf learning Pendulum and Cartpole after cleanup.

* WIP.

* LINT.

* LINT.

* SAC: Move policy.target_model to policy.device as well.

* Fixes and cleanup.

* Fix data-format of tf keras Conv2d layers (broken for some tf-versions which have data_format="channels_first" as default).

* Fixes and LINT.

* Fixes and LINT.

* Fix and LINT.

* WIP.

* Test fixes and LINT.

* Fixes and LINT.

Co-authored-by: Sven Mika <sven@Svens-MacBook-Pro.local>
2020-04-15 13:25:16 +02:00
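The bullet points above describe the resulting Policy base-class shape: config becomes the third constructor argument and is stored on the base class, only compute_actions stays abstract, and compute_gradients/apply_gradients get no-op defaults since they are only required when learn_on_batch is not overridden. A condensed, hedged restatement of that shape in code (following the bullets, not the actual source file):

```python
from abc import ABCMeta, abstractmethod

class Policy(metaclass=ABCMeta):
    """Sketch of the base-class layout described in the commit body."""

    def __init__(self, observation_space, action_space, config):
        self.observation_space = observation_space
        self.action_space = action_space
        self.config = config  # stored on the base class, per the commit notes

    @abstractmethod
    def compute_actions(self, obs_batch, state_batches=None, **kwargs):
        raise NotImplementedError

    # Not abstract: only needed if learn_on_batch() is not overridden.
    def compute_gradients(self, batch):
        pass

    def apply_gradients(self, gradients):
        pass

    def learn_on_batch(self, batch):
        pass
```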
Robert Nishihara
d985d7537e
Replace all instances of ray.readthedocs.io with ray.io (#7994) 2020-04-13 16:17:05 -07:00
Sven Mika
d2b5c171cb
[RLlib] Add pytorch sigils to toc and add links to algo overview table. (#7950)
* Add torch sigils to toc-tree for DQN/APEX.

* WIP.
2020-04-09 10:40:18 -07:00
Sven Mika
22ccc43670
[RLlib] DQN torch version. (#7597)
* Fix.

* Rollback.

* WIP.

* Fix.

* WIP.

* WIP.

* Fix.

* Test case fixes.

* Test case fixes and LINT.

* Test case fixes and LINT.

* Rollback.

* WIP.

* WIP.

* Test case fixes.

* Fix.

* Fix.

* Fix.

* Add regression test for DQN w/ param noise.

* Fixes and LINT.

* Comment

* Regression test case.

* WIP.

* WIP.

* LINT.

* LINT.

* WIP.

* Fix.

* Fix.

* Fix.

* LINT.

* Fix (SAC does not currently support eager).

* Fix.

* WIP.

* LINT.

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* WIP.

* Fix.

* LINT.

* LINT.

* Fix and LINT.

* WIP.

* Fix.

* LINT.

* Fix.

* Fix and LINT.

* Update rllib/utils/exploration/exploration.py

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Fixes.

* WIP.

* LINT.

* Fixes and LINT.

* LINT and fixes.

* LINT.

* Move action_dist back into torch extra_action_out_fn and LINT.

* Working SimpleQ learning cartpole on both torch AND tf.

* Working Rainbow learning cartpole on tf.

* WIP.

* LINT.

* LINT.

* Update docs and add torch to APEX test.

* LINT.

* Fix.

* LINT.

* Fix.

* Fix.

* Fix and docstrings.

* Fix broken RLlib tests in master.

* Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier).

* Fix error_outputs option in BAZEL for RLlib regression tests.

* Fix.

* Tune param-noise tests.

* LINT.

* Fix.

* Fix.

* test

* Fix.

* Fix.

* WIP.

* LINT.

* WIP.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-06 11:56:16 -07:00
Sven Mika
5537fe13b0
[RLlib] Exploration API: ParamNoise Integration into DQN; working example/test cases. (#7814) 2020-04-03 10:44:25 -07:00
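Parameter-space noise explores by perturbing the policy's weights (e.g. once per episode) and acting greedily with the noisy weights, rather than adding per-step noise to the actions. The helper below is a hypothetical sketch of that core operation, not the RLlib ParamNoise exploration class:

```python
import numpy as np

def perturb_weights(weights, stddev=0.05, rng=None):
    """Return a copy of `weights` with Gaussian noise added to every tensor.

    weights: dict mapping parameter names to numpy arrays.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    return {name: w + rng.normal(0.0, stddev, size=w.shape)
            for name, w in weights.items()}

weights = {"hidden/kernel": np.zeros((4, 16)), "out/kernel": np.zeros((16, 2))}
noisy = perturb_weights(weights)       # act with these until the next perturbation
assert noisy["hidden/kernel"].shape == (4, 16)
```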
Sven Mika
e153e3179f
[RLlib] Exploration API: Policy changes needed for forward pass noisifications. (#7798)
* Rollback.

* WIP.

* WIP.

* LINT.

* WIP.

* Fix.

* Fix.

* Fix.

* LINT.

* Fix (SAC does not currently support eager).

* Fix.

* WIP.

* LINT.

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* WIP.

* Fix.

* LINT.

* LINT.

* Fix and LINT.

* WIP.

* Fix.

* LINT.

* Fix.

* Fix and LINT.

* Update rllib/utils/exploration/exploration.py

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Fixes.

* LINT.

* WIP.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-01 00:43:21 -07:00
Sven Mika
e356e97eb2
[RLlib] Assert correct policy class being used in Worker. (#7769) 2020-03-30 14:03:29 -07:00
Eric Liang
5cebee68d6
[rllib] Add scaling guide to documentation, improve bandit docs (#7780)
* update

* reword

* update

* ms

* multi node sgd

* reorder

* improve bandit docs

* contrib

* update

* ref

* improve refs

* fix build

* add pillow dep

* add pil

* update pil

* pillow

* remove false
2020-03-27 22:05:43 -07:00