Sven Mika
deb33bce84
[RLlib] Add DQN SoftQ learning test case. ( #12712 )
2020-12-10 14:55:19 +01:00
Sven Mika
ea25482f6a
WIP. ( #12706 )
2020-12-09 11:49:21 -08:00
Sven Mika
f6241302a8
[RLlib] Fix issue 12678: MultiAgentBatch has no attribute total. ( #12704 )
2020-12-09 16:41:13 +01:00
Sven Mika
28108c905b
[RLlib] Tf-eager policy bug fix: Duplicate model call in compute_gradients. ( #12682 )
2020-12-09 08:03:58 +01:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. ( #12420 )
2020-12-08 16:41:45 -08:00
Felipe Antunes
4c0f0ce3a9
[RLlib] In OffPolicyEstimators (Offline RL): Include last step of trajectory ( #12619 )
2020-12-08 12:39:40 +01:00
Sven Mika
340b1e99fc
[RLlib] Fix JAX import bug. ( #12621 )
2020-12-07 11:05:08 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. ( #12450 )
2020-12-07 13:08:17 +01:00
Kai Fricke
219c445648
[tune] verbosity refactor second attempt ( #12571 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-04 13:56:26 -08:00
Sven Mika
3f4bc16276
[RLlib] Add a minimal JAX ModelV2 (FCNet) to RLlib. ( #12502 )
2020-12-03 15:51:30 +01:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. ( #12366 )
...
* WIP.
* LINT and fixes. MB-MPO and MAML not working yet.
* wip
* update
* update
* remove
* remove dep
* higher
* Update requirements_rllib.txt
* Update requirements_rllib.txt
* relpos
* no mbmpo
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika
9021f15b2a
[RLlib] Fix setup-dev.py error when creating a softlink for new_dashboard. ( #12442 )
2020-12-01 11:46:59 +01:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. ( #12449 )
2020-12-01 08:21:45 +01:00
Amog Kamsetty
f9a99f20dd
Revert "Re-Revert "[Core] zero-copy serializer for pytorch ( #12344 )" ( #12478 )" ( #12515 )
...
This reverts commit 3f22448834.
2020-11-30 19:05:55 -08:00
Siyuan (Ryans) Zhuang
3f22448834
Re-Revert "[Core] zero-copy serializer for pytorch ( #12344 )" ( #12478 )
...
* [Core] zero-copy serializer for pytorch (#12344 )
* zero-copy serializer for pytorch
* address possible bottleneck
* add tests & device support
(cherry picked from commit 0a505ca83d)
* add environmental variables
* update doc
2020-11-30 11:43:03 -08:00
Sven Mika
bb03e2499b
[RLlib] PyBullet Env native support via env str-specifier (if installed). ( #12209 )
2020-11-30 12:41:24 +01:00
Sven Mika
fb318addcb
[RLlib] Curiosity exploration module: tf/tf2.x/tf-eager support. ( #11945 )
2020-11-29 12:31:24 +01:00
Pierre TASSEL
60a545ab57
[RLLib] Fix HyperOptSearch tuple to list conversion ( #12462 )
...
Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
2020-11-28 10:07:54 -08:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. ( #12447 )
...
* WIP.
* Fix.
* Fix.
* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
6475297bd3
[RLlib] Torch LR schedule not working. Fix and added test case. ( #12396 )
2020-11-26 13:14:11 +01:00
Sven Mika
b7dbbfbf41
[RLlib] Issue 11591: SAC loss does not use PR-weights in critic loss term. ( #12394 )
...
* WIP.
* Fix and LINT.
2020-11-25 11:28:46 -08:00
Sven Mika
592c161032
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. ( #12397 )
...
* WIP.
* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika
841d93d366
[RLlib] Issue 12233: shared tf layers example not really shared (only works for tf1.x, not tf2.x). ( #12399 )
2020-11-25 11:27:19 -08:00
Sven Mika
95175a822f
[RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. ( #12407 )
...
* WIP.
* Fix and LINT.
2020-11-25 11:26:29 -08:00
karstenddwx
09d5413f70
[RLlib] rollout batch, handle rewards that are None (unknown) in a multi-agent env ( #11858 ) ( #11911 )
2020-11-25 13:39:22 +01:00
danuo
c009c178f6
[RLlib] Closes #11924 : Add support for custom/ray environments in rollouts.py for agents without workers ( #11926 )
...
* Closes #11924
Formerly, rollout.py would only load environments from gym (via gym.make()) when an agent without workers was used (such as ES or ARS), which resulted in an error for custom environments. This PR adds the ability to load environments from the Ray registry while maintaining support for gym environments (see the sketch after this entry).
* Update rllib/rollout.py
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-11-25 08:43:17 +01:00
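A minimal sketch (not taken from the PR itself) of how a custom environment can be registered with Ray so that it is resolvable by name for worker-less agents; register_env is Ray's registry API, while the env class and the "my_custom_env" name are illustrative assumptions:

```python
import gym
from ray.tune.registry import register_env


class MyCustomEnv(gym.Env):
    """Toy environment used purely for illustration."""

    def __init__(self, env_config=None):
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(4,))
        self.action_space = gym.spaces.Discrete(2)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        # Single-step episode: return obs, reward, done, info.
        return self.observation_space.sample(), 0.0, True, {}


# Register under a string name; with this change, rollout.py can resolve
# "my_custom_env" from the Ray registry even for worker-less agents such as
# ES or ARS, instead of failing on gym.make().
register_env("my_custom_env", lambda cfg: MyCustomEnv(cfg))
```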
Tomasz Wrona
82852f0ed2
[RLlib] Add ResetOnExceptionWrapper with tests for unstable 3rd party envs ( #12353 )
2020-11-25 08:41:58 +01:00
Sven Mika
4afaa46028
[RLlib] Increase the scope of RLlib's regression tests. ( #12200 )
2020-11-24 22:18:31 +01:00
Edward Oakes
32d159a2ed
Fix release directory & RELEASE_PROCESS.md ( #12269 )
2020-11-23 14:28:59 -06:00
Sven Mika
f6b84cb2f7
[RLlib] Fix offline logp vs prob bug in OffPolicyEstimator class. ( #12158 )
2020-11-20 08:59:43 +01:00
Raoul Khouri
d07ffc152b
[rllib] Rrk/12079 custom filters ( #12095 )
...
* travis reformatted
2020-11-19 13:20:20 -08:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. ( #12063 )
2020-11-19 19:01:14 +01:00
Sven Mika
6da4342822
[RLlib] Add on_learn_on_batch (Policy) callback to DefaultCallbacks. ( #12070 )
2020-11-18 15:39:23 +01:00
Sven Mika
b6b54f1c81
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ ( #11827 )
2020-11-16 10:54:35 -08:00
Michael Luo
59bc1e6c09
[RLLib] MAML extension for all models except RNNs ( #11337 )
2020-11-12 16:51:40 -08:00
Sven Mika
0bd69edd71
[RLlib] Trajectory view API: enable by default for ES and ARS ( #11826 )
2020-11-12 10:33:10 -08:00
Michael Luo
6e6c680f14
MBMPO Cartpole ( #11832 )
...
* MBMPO Cartpole Done
* Added doc
2020-11-12 10:30:41 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). ( #11747 )
2020-11-12 16:27:34 +01:00
Michael Luo
59ccbc0fc7
[RLlib] Model Annotations: Tensorflow ( #11964 )
2020-11-12 12:18:50 +01:00
Michael Luo
b2984d1c34
[RLlib] Model Annotations to Torch Models ( #9749 )
2020-11-12 12:16:12 +01:00
Sven Mika
72fc79740c
[RLlib] Issue with pickle versions (breaks rollout test cases in RLlib). ( #11939 )
2020-11-11 21:52:21 +01:00
Sven Mika
291c172d83
[RLlib] Support Simplex action spaces for SAC (torch and tf). ( #11909 )
2020-11-11 18:45:28 +01:00
Eric Liang
9b8218aabd
[docs] Move all /latest links to /master ( #11897 )
...
* use master link
* remae
* revert non-ray
* more
* mre
2020-11-10 10:53:28 -08:00
Benjamin Black
1999266bba
Updated pettingzoo env to accommodate API changes and fixes ( #11873 )
...
* Updated pettingzoo env to accommodate API changes and fixes
* fixed test failure
* fixed linting issue
* fixed test failure
2020-11-09 16:09:49 -08:00
Eric Liang
6b7a4dfaa0
[rllib] Forgot to pass ioctx to child json readers ( #11839 )
...
* fix ioctx
* fix
2020-11-05 22:07:57 -08:00
Sven Mika
d6c7c7c675
[RLlib] Make sure DQN torch actions are of type=long before the torch.nn.functional.one_hot() op. ( #11800 )
2020-11-04 18:04:03 +01:00
heng2j
9073e6507c
WIP: Update to support the Food Collector environment ( #11373 )
...
* Update to support the Food Collector environment
Recently, I have been trying out ML-Agents with Ray and wanted to use the Food Collector environment. Since the observation space and action space were not defined in unity3d_env.py, this change adds support for Food Collector. I have tried this env with the [unity3d_env_local example](https://github.com/ray-project/ray/blob/master/rllib/examples/unity3d_env_local.py). Please let me know if this is the proper adjustment; even though it is only a few lines of code, I would like to know how to make a proper contribution. (A minimal sketch follows this entry.)
* Apply suggestions from code review
2020-11-04 12:29:16 +01:00
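A minimal sketch, assuming RLlib's Unity3DEnv per-game policy-config helper (as used in the linked unity3d_env_local example) and that this change registers the Food Collector spaces under a "FoodCollector" key; the import path and the key name are assumptions and may differ across Ray versions:

```python
# Sketch only: fetch the observation/action spaces that unity3d_env.py would
# define for the Food Collector game after this change. The import path and
# the exact "FoodCollector" key are assumptions, not confirmed by the commit.
from ray.rllib.env.unity3d_env import Unity3DEnv

policies, policy_mapping_fn = Unity3DEnv.get_policy_configs_for_game("FoodCollector")

# `policies` maps policy ids to (policy_cls, obs_space, action_space, config)
# entries that can be plugged into a Trainer's "multiagent" config.
print(policies)
```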
Pierre TASSEL
66605cfcbd
[RLLib] Random Parametric Trainer ( #11366 )
2020-11-04 11:12:51 +01:00
mvindiola1
4518fe790f
[RLLIB] Convert torch state arrays to tensors during compute log likelihoods ( #11708 )
2020-11-04 09:33:56 +01:00
Sven Mika
5b788ccb13
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) ( #11717 )
2020-11-03 12:53:34 -08:00