hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-08 11:31:40 -05:00

Author	SHA1	Message	Date
Richard Liaw	a2d2275ee1	Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360) This reverts commit `6cd0cd3bd9`.	2021-02-25 14:27:35 -08:00
Sven Mika	6cd0cd3bd9	[RLlib + Tune] Add placement group support to RLlib. (#14289)	2021-02-25 16:01:31 +01:00
Sven Mika	8000258333	[RLlib] R2D2 Implementation. (#13933)	2021-02-25 12:18:11 +01:00
QuantumMecha	0c93bb77cb	[RLlib] Update Documentation for Curiosity's support of continuous actions (#13784) Only (Multi)Discrete action spaces are supported so far according to https://github.com/ray-project/ray/blob/master/rllib/utils/exploration/curiosity.py	2021-02-02 13:10:09 +01:00
Sven Mika	9dd9f72111	[RLlib] Add more detailed Documentation on Model building API (#13261)	2021-01-09 12:38:29 +01:00
Michael Luo	67229bf350	[RLlib] SlateQ Documentation (#13266)	2021-01-09 11:21:51 +01:00
Sven Mika	391cdfae8c	[RLlib] Trajectory view API docs. (#12718)	2020-12-30 17:32:21 -08:00
Michael Luo	6e6c680f14	MBMPO Cartpole (#11832) * MBMPO Cartpole Done * Added doc	2020-11-12 10:30:41 -08:00
Eric Liang	9b8218aabd	[docs] Move all /latest links to /master (#11897) * use master link * remae * revert non-ray * more * mre	2020-11-10 10:53:28 -08:00
Yutai Zhou	6999db93cb	Un-indent multiagent section (#11310) * Un-indent multiagent section MARL section used to be nested inside bandits, which we probably don't want. Maybe give it its own section instead?	2020-10-29 16:12:48 +01:00
huyz-git	64e3c9741a	Update rllib-algorithms.rst (#11642)	2020-10-28 15:07:10 -07:00
Sven Mika	f91c455527	[RLlib] Curiosity documentation. (#11066)	2020-09-29 09:39:22 +02:00
Sven Mika	805dad3bc4	[RLlib] SAC algo cleanup. (#10825)	2020-09-20 11:27:02 +02:00
Eric Liang	f7d5aa46a3	[hotfix] Fix table formatting (#10687)	2020-09-09 16:08:54 -07:00
Justin Terry	8a1caf6279	rename centralized critic to shared critic (#10610)	2020-09-09 15:49:32 -07:00
Sven Mika	4b278c36fc	[RLlib] Behavioral Cloning (from MARWIL). (#10619)	2020-09-09 17:33:21 +02:00
Michael Luo	8e613652af	[RLLib] MBMPO Fixes (#10296)	2020-09-09 09:34:34 +02:00
Simon Mo	5a38a76c83	[Doc] Use sphinx_book_theme (#10379)	2020-09-08 16:25:23 -07:00
Justin Terry	352718610d	Multi-agent Algorithm Documentation Updates (#9722)	2020-09-03 22:37:46 -07:00
Michael Luo	4e9888ce2f	[RLlib] Dreamer (#10172)	2020-08-26 13:24:05 +02:00
Matthew Strawbridge	7a5af7e744	Fix links to ddpg tuned examples (#9713)	2020-08-25 11:30:13 -07:00
Sven Mika	d14b501692	[RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115)	2020-08-20 17:05:57 +02:00
Sven Mika	fe0bdb23ff	[RLlib] Attention Net/Transformers docs improvement.	2020-08-17 13:07:17 -07:00
Justin Terry	0d67602051	Update rllib-algorithms.rst (#9640)	2020-07-24 19:35:28 -07:00
Sven Mika	78dfed2683	[RLlib] Issue 8384: QMIX doesn't learn anything. (#9527)	2020-07-17 12:14:34 +02:00
Michael Luo	851d02463b	[Doc] RLlib Algorithms Documentation: MAML + PyTorch MAML (#9189)	2020-07-03 11:05:15 -07:00
Sven Mika	a90cd0fcbb	[RLlib] Unity3d soccer benchmarks (#8834)	2020-06-11 14:29:57 +02:00
Chapman Siu	04cffb7e65	[docs] `rllib-models.rst` - QMIX +parametric (#8868) Updating docs to show that QMIX supports parametric action space, as per SMAC environments. This is reflected in the code here: https://github.com/ray-project/ray/blob/master/rllib/agents/qmix/qmix_policy.py#L179 and consistent with QMIX being an extension of DQN	2020-06-09 21:56:16 -07:00
Sven Mika	2746fc0476	[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520)	2020-05-27 16:19:13 +02:00
Eric Liang	9a83908c46	[rllib] Deprecate policy optimizers (#8345)	2020-05-21 10:16:18 -07:00
Sven Mika	b95e28faea	[RLlib] APEX_DDPG (PyTorch) test case and docs. (#8288) APEX_DDPG (PyTorch) test case and docs.	2020-05-04 09:36:27 +02:00
Sven Mika	166bb5d690	[RLlib] IMPALA PyTorch (#8287) This PR adds an IMPALA PyTorch implementation. - adds compilation tests for LSTM and w/o LSTM. - adds learning test for CartPole.	2020-05-03 13:44:25 +02:00
Sven Mika	499ad5fbe4	[RLlib] PyTorch version of APPO. (#8120) - Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases. - Add learning test cases for APPO torch (both w/ and w/o v-trace). - Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).	2020-04-23 09:11:12 +02:00
Sven Mika	d15609ba2a	[RLlib] PyTorch version of ARS (Augmented Random Search). (#8106) This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.	2020-04-21 09:47:52 +02:00
Sven Mika	3812bfedda	[RLlib] PyTorch version of ES (Evolution Strategies). (#8104) PyTorch version of Evolution Strategies (ES) Algo.	2020-04-20 21:47:28 +02:00
Sven Mika	d0fab84e4d	[RLlib] DDPG PyTorch version. (#7953) The DDPG/TD3 algorithms currently do not have a PyTorch implementation. This PR adds PyTorch support for DDPG/TD3 to RLlib. This PR: - Depends on the re-factor PR for DDPG (Functional Algorithm API). - Adds learning regression tests for the PyTorch version of DDPG and a DDPG (torch) - Updates the documentation to reflect that DDPG and TD3 now support PyTorch. * Learning Pendulum-v0 on torch version (same config as tf). Wall time a little slower (~20% than tf). * Fix GPU target model problem.	2020-04-16 10:20:01 +02:00
Sven Mika	d2b5c171cb	[RLlib] Add pytorch sigils to toc and add links to algo overview table. (#7950) * Add torch sigils to toc-tree for DQN/APEX. * WIP.	2020-04-09 10:40:18 -07:00
Sven Mika	22ccc43670	[RLlib] DQN torch version. (#7597) * Fix. * Rollback. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * Fix. * Fix. * Fix. * Fix. * Fix. * WIP. * WIP. * Fix. * Test case fixes. * Test case fixes and LINT. * Test case fixes and LINT. * Rollback. * WIP. * WIP. * Test case fixes. * Fix. * Fix. * Fix. * Add regression test for DQN w/ param noise. * Fixes and LINT. * Fixes and LINT. * Fixes and LINT. * Fixes and LINT. * Fixes and LINT. * Comment * Regression test case. * WIP. * WIP. * LINT. * LINT. * WIP. * Fix. * Fix. * Fix. * LINT. * Fix (SAC does currently not support eager). * Fix. * WIP. * LINT. * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * WIP. * Fix. * LINT. * LINT. * Fix and LINT. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Fix. * Fix and LINT. * Update rllib/utils/exploration/exploration.py * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Fixes. * WIP. * LINT. * Fixes and LINT. * LINT and fixes. * LINT. * Move action_dist back into torch extra_action_out_fn and LINT. * Working SimpleQ learning cartpole on both torch AND tf. * Working Rainbow learning cartpole on tf. * Working Rainbow learning cartpole on tf. * WIP. * LINT. * LINT. * Update docs and add torch to APEX test. * LINT. * Fix. * LINT. * Fix. * Fix. * Fix and docstrings. * Fix broken RLlib tests in master. * Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier). * Fix error_outputs option in BAZEL for RLlib regression tests. * Fix. * Tune param-noise tests. * LINT. * Fix. * Fix. * test * test * test * Fix. * Fix. * WIP. * WIP. * WIP. * WIP. * LINT. * WIP. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-04-06 11:56:16 -07:00
Eric Liang	5cebee68d6	[rllib] Add scaling guide to documentation, improve bandit docs (#7780) * update * reword * update * ms * multi node sgd * reorder * improve bandit docs * contrib * update * ref * improve refs * fix build * add pillow dep * add pil * update pil * pillow * remove false	2020-03-27 22:05:43 -07:00
Saurabh Gupta	6ddf84b019	Contextual Bandit algorithms (WIP) (#7642)	2020-03-26 13:41:16 -07:00
hubcity	3d0a8662b3	#7246 - Fixing broken links (#7247) * #7246 - Fixing broken links * Apply suggestions from code review Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-03-25 21:46:13 -07:00
Eric Liang	dd70720578	[rllib] Rename sample_batch_size => rollout_fragment_length (#7503) * bulk rename * deprecation warn * update doc * update fig * line length * rename * make pytest comptaible * fix test * fi sys * rename * wip * fix more * lint * update svg * comments * lint * fix use of batch steps	2020-03-14 12:05:04 -07:00
Eric Liang	026f6884b5	[rllib] Add Decentralized DDPPO trainer and documentation (#7088)	2020-02-10 15:28:27 -08:00
Sven Mika	0e3960893a	[RLlib] Add rainbow config hint to algo-documentation. (#7052)	2020-02-05 12:01:43 -08:00
Eric Liang	6bb30c9f1b	fix links (#6883)	2020-01-22 01:06:07 -08:00
Eric Liang	14016535a5	[rllib] Add TF and Torch icons to show which are available for each algo (#6869)	2020-01-20 15:22:21 -08:00
Sven Mika	7659cae3ba	[RLlib] Add PG torch regression test (#6828) * Add PG torch regression test to tuned_examples/regression_tests dir. * Rename cartpole-pg.yaml into cartpole-pg-tf.yaml * cartpole-pg-tf.yaml: Change cartpole-pg name of tuned_example to cartpole-pg-tf.	2020-01-18 15:57:12 -08:00
Justin Terry	97bf79917c	[RLlib] Update MADDPG example repo to maintained fork (#6831)	2020-01-18 13:08:27 -08:00
Michael Luo	e5dded917c	SAC site changes (#6759)	2020-01-09 18:13:42 -08:00
Michael Luo	1cb335487e	SAC for Mujoco Environments (#6642)	2019-12-31 00:16:54 -08:00

84 commits