danuo
c009c178f6
[RLlib] Closes #11924 : Add support for custom/ray environments in rollouts.py for agents without workers ( #11926 )
...
* Closes #11924
Formerly, rollout.py would only load environments from gym (via
gym.make()) if an agent without workers was employed (such as ES or
ARS). This resulted in an error if a custom environment was used. This
PR adds the ability to load environments from the Ray registry,
while maintaining support for gym environments.
* Update rllib/rollout.py
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-11-25 08:43:17 +01:00
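The fallback described in that commit body can be sketched as follows. This is an illustrative, self-contained sketch of the logic, not the actual rollout.py code; the registry here is a plain dict standing in for Ray's env-creator registry, and all names are assumptions:

```python
# Illustrative sketch: prefer an env creator registered with the (Ray-style)
# registry, and only fall back to gym.make() for plain gym environments.

_ENV_REGISTRY = {}  # stand-in for ray.tune.registry's env-creator table


def register_env(name, creator):
    """Register a custom env creator under a name (mimics ray.tune.register_env)."""
    _ENV_REGISTRY[name] = creator


def create_env(name, env_config=None):
    """Return a registered custom env if one exists, otherwise defer to gym."""
    if name in _ENV_REGISTRY:
        return _ENV_REGISTRY[name](env_config or {})
    import gym  # only needed on the fallback path
    return gym.make(name)
```

With this ordering, a registered custom environment shadows any gym environment of the same name, and agents without workers (ES, ARS) can resolve both kinds of env names through one call.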
Tomasz Wrona
82852f0ed2
[RLlib] Add ResetOnExceptionWrapper with tests for unstable 3rd party envs ( #12353 )
2020-11-25 08:41:58 +01:00
Sven Mika
4afaa46028
[RLlib] Increase the scope of RLlib's regression tests. ( #12200 )
2020-11-24 22:18:31 +01:00
Edward Oakes
32d159a2ed
Fix release directory & RELEASE_PROCESS.md ( #12269 )
2020-11-23 14:28:59 -06:00
Sven Mika
f6b84cb2f7
[RLlib] Fix offline logp vs prob bug in OffPolicyEstimator class. ( #12158 )
2020-11-20 08:59:43 +01:00
Raoul Khouri
d07ffc152b
[rllib] Rrk/12079 custom filters ( #12095 )
...
* travis reformatted
2020-11-19 13:20:20 -08:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. ( #12063 )
2020-11-19 19:01:14 +01:00
Sven Mika
6da4342822
[RLlib] Add on_learn_on_batch (Policy) callback to DefaultCallbacks. ( #12070 )
2020-11-18 15:39:23 +01:00
Sven Mika
b6b54f1c81
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ ( #11827 )
2020-11-16 10:54:35 -08:00
Michael Luo
59bc1e6c09
[RLLib] MAML extension for all models except RNNs ( #11337 )
2020-11-12 16:51:40 -08:00
Sven Mika
0bd69edd71
[RLlib] Trajectory view API: enable by default for ES and ARS ( #11826 )
2020-11-12 10:33:10 -08:00
Michael Luo
6e6c680f14
MBMPO Cartpole ( #11832 )
...
* MBMPO Cartpole Done
* Added doc
2020-11-12 10:30:41 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). ( #11747 )
2020-11-12 16:27:34 +01:00
Michael Luo
59ccbc0fc7
[RLlib] Model Annotations: Tensorflow ( #11964 )
2020-11-12 12:18:50 +01:00
Michael Luo
b2984d1c34
[RLlib] Model Annotations to Torch Models ( #9749 )
2020-11-12 12:16:12 +01:00
Sven Mika
72fc79740c
[RLlib] Issue with pickle versions (breaks rollout test cases in RLlib). ( #11939 )
2020-11-11 21:52:21 +01:00
Sven Mika
291c172d83
[RLlib] Support Simplex action spaces for SAC (torch and tf). ( #11909 )
2020-11-11 18:45:28 +01:00
Eric Liang
9b8218aabd
[docs] Move all /latest links to /master ( #11897 )
...
* use master link
* rename
* revert non-ray
* more
* more
2020-11-10 10:53:28 -08:00
Benjamin Black
1999266bba
Updated pettingzoo env to accommodate API changes and fixes ( #11873 )
...
* Updated pettingzoo env to accommodate API changes and fixes
* fixed test failure
* fixed linting issue
* fixed test failure
2020-11-09 16:09:49 -08:00
Eric Liang
6b7a4dfaa0
[rllib] Forgot to pass ioctx to child json readers ( #11839 )
...
* fix ioctx
* fix
2020-11-05 22:07:57 -08:00
Sven Mika
d6c7c7c675
[RLlib] Make sure, DQN torch actions are of type=long before torch.nn.functional.one_hot() op. ( #11800 )
2020-11-04 18:04:03 +01:00
heng2j
9073e6507c
WIP: Update to support the Food Collector environment ( #11373 )
...
* Update to support the Food Collector environment
Recently, I have been trying out ML-Agents with Ray, using the Food Collector environment. Since the observation space and action space haven't been defined in unity3d_env.py, I propose these changes to add support for Food Collector. I have tried this env in the [unity3d_env_local example](https://github.com/ray-project/ray/blob/master/rllib/examples/unity3d_env_local.py ). Please let me know if this is the proper adjustment. Even though these are just a few lines of code, please let me know how I can make a proper contribution.
* Apply suggestions from code review
2020-11-04 12:29:16 +01:00
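The change described above adds per-game observation/action space definitions to unity3d_env.py. A hedged, self-contained sketch of that kind of lookup table follows; the shapes below are placeholders for illustration, not the real Food Collector dimensions, and the helper name is an assumption:

```python
# Illustrative per-game space table, mimicking the kind of entry this PR
# adds to unity3d_env.py. Shapes are placeholders, not the real ones.
GAME_SPACES = {
    "3DBall": {"obs_shape": (8,), "action_shape": (2,)},
    # Hypothetical new entry for the Food Collector environment:
    "FoodCollector": {"obs_shape": (49,), "action_shape": (3,)},
}


def get_spaces(game):
    """Look up the observation/action shapes registered for a Unity3D game."""
    if game not in GAME_SPACES:
        raise ValueError(f"No spaces defined for game: {game}")
    spec = GAME_SPACES[game]
    return spec["obs_shape"], spec["action_shape"]
```

Without such an entry, an agent connecting to an unlisted Unity3D game has no way to build its policy's input/output layers, which is why the PR adds the Food Collector spaces explicitly.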
Pierre TASSEL
66605cfcbd
[RLLib] Random Parametric Trainer ( #11366 )
2020-11-04 11:12:51 +01:00
mvindiola1
4518fe790f
[RLLIB] Convert torch state arrays to tensors during compute log likelihoods ( #11708 )
2020-11-04 09:33:56 +01:00
Sven Mika
5b788ccb13
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) ( #11717 )
2020-11-03 12:53:34 -08:00
desktable
5af745c90d
[RLlib] Implement the SlateQ algorithm ( #11450 )
2020-11-03 09:52:04 +01:00
Lara Codeca
e735add268
[RLlib] Integration with SUMO Simulator ( #11710 )
2020-11-03 09:45:03 +01:00
dHannasch
8346dedc3a
Fix the linter failure. ( #11755 )
2020-11-02 18:02:15 +01:00
bcahlit
26176ec570
[RLlib] Fix epsilon_greedy on nested_action_spaces only in pytorch ( #11453 )
...
* [RLlib] Fix epsilon_greedy on nested_action_spaces only in pytorch
* epsilon_greedy on Continuous action
* format
* Fix error
* fix format
* fix bug
* increase speed
* Update rllib/utils/exploration/epsilon_greedy.py
* Update rllib/utils/exploration/epsilon_greedy.py
* Update rllib/utils/exploration/epsilon_greedy.py
Co-authored-by: Sven Mika <sven@anyscale.io>
2020-11-02 12:22:33 +01:00
Sven Mika
54d85a6c2a
[RLlib] Fix RNN learning for tf-eager/tf2.x. ( #11720 )
2020-11-02 11:18:41 +01:00
Sven Mika
bfc4f95e01
[RLlib] Fix test_bc.py test case. ( #11722 )
...
* Fix large json test file.
* Fix large json test file.
* WIP.
2020-10-31 00:16:09 -07:00
Jiajie Xiao
0b07af374a
allow tuple action space ( #11429 )
...
Co-authored-by: Jiajie Xiao <jj@Jiajies-MBP-2.attlocal.net>
2020-10-29 16:05:38 +01:00
mvindiola1
9e68b77796
[RLLIB] Wait for remote_workers to finish closing environments before terminating ( #11476 )
2020-10-28 14:23:06 -07:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). ( #11609 )
2020-10-27 10:00:24 +01:00
Kingsley Kuan
d1dd5d578e
[RLlib] Fix PyTorch A3C / A2C loss function using mixed reduced sum / mean ( #11449 )
2020-10-22 12:39:34 -07:00
Philsik Chang
ede9347127
[rllib] Add torch_distributed_backend flag for DDPPO ( #11362 ) ( #11425 )
2020-10-21 18:30:42 -07:00
Eric Liang
e8c77e2847
Remove memory quota enforcement from actors ( #11480 )
...
* wip
* fix
* deprecate
2020-10-21 14:29:03 -07:00
Sven Mika
2aec77e305
[RLlib] Fix two test cases that only fail on Travis. ( #11435 )
2020-10-16 13:53:30 -05:00
Sven Mika
414041c6dd
[RLlib] Do not create env on driver iff num_workers > 0. ( #11307 )
2020-10-15 18:21:30 +02:00
Sven Mika
a6a94d3206
[RLlib] Fix test_env_with_subprocess.py. ( #11356 )
2020-10-13 12:42:20 -07:00
Sven Mika
1ebcdf236f
[RLlib] Add support for custom MultiActionDistributions. ( #11311 )
2020-10-12 13:50:43 -07:00
Sven Mika
0c0f67c14d
[RLlib] ARS/ES eval workers not working: Issue 9933. ( #11308 )
2020-10-12 13:49:48 -07:00
Sven Mika
8ea1bc5ff9
[RLlib] Allow for more than 2^31 policy timesteps. ( #11301 )
2020-10-12 13:49:11 -07:00
Sven Mika
f5e2cda68a
[RLlib] SAC: log_alpha not being learnt when on GPU. ( #11298 )
2020-10-12 13:48:44 -07:00
Julius Frost
7dcfd258cd
[RLlib] Assert LongTensor in SAC Discrete PyTorch ( #11245 )
2020-10-12 13:47:21 -07:00
Sven Mika
d3bc20b727
[RLlib] ConvTranspose2D module ( #11231 )
2020-10-12 15:00:42 +02:00
Sven Mika
957877ad3f
Tf version of VisionNet (ray/rllib/model/tf/vision_net.py) crashes iff len(conv-filters)=1. ( #11330 )
2020-10-11 12:49:47 +02:00
Thomas Tumiel
587319debc
[tune] move _SCHEDULERS to tune.schedulers and add all available schedulers ( #11218 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-08 16:10:23 -07:00
desktable
8af9ff6dc2
[RLlib] Add MultiAgentEnv wrapper for Kaggle's football environment ( #11249 )
...
* [RLlib] Add MultiAgentEnv wrapper for Kaggle's football environment
* Add unit tests to BUILD
* Add gfootball dependency
* Revert the last two commits
2020-10-08 10:57:58 -07:00
desktable
f9621ce23c
[RLlib] Add recsim_wrapper unit test to BUILD ( #11225 )
2020-10-08 08:23:27 +02:00