hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-08 19:41:38 -05:00

Author	SHA1	Message	Date
Sven Mika	e2edca45d4	[RLlib] PPO torch memory leak and unnecessary torch.Tensor creation and gc'ing. (#7238 ) * Take out stats to analyze memory leak in torch PPO. * WIP * WIP * WIP * WIP * WIP * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * LINT. * Fix determine_tests_to_run.py. * minor change to re-test after determine_tests_to_run.py. * LINT. * update comments. * WIP * WIP * WIP * FIX. * Fix sequence_mask being dependent on torch being installed. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix strange ray-core tf-error in test_memory_scheduling test case.	2020-02-22 11:02:31 -08:00
Sven Mika	d537e9f0d8	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155)	2020-02-19 12:18:45 -08:00
Sven Mika	2e60f0d4d8	[RLlib] Move all jenkins RLlib-tests into bazel (rllib/BUILD). (#7178) * commit * comment	2020-02-15 14:50:44 -08:00
Eric Liang	026f6884b5	[rllib] Add Decentralized DDPPO trainer and documentation (#7088)	2020-02-10 15:28:27 -08:00
Sven Mika	6e1c3ea824	[RLlib] Exploration API (+EpsilonGreedy sub-class). (#6974)	2020-02-10 15:22:07 -08:00
roireshef	3c60caa448	[rllib] implemented compute_advantages without gae (#6941)	2020-01-31 22:25:45 -08:00
Eric Liang	2fb53396ad	[rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918)	2020-01-25 22:36:43 -08:00
Sven Mika	c957ed58ed	[RLlib] Implement PPO torch version. (#6826)	2020-01-20 23:06:50 -08:00
Sven Mika	303547f119	[RLlib] Policy-classes cleanup and torch/tf unification. (#6770)	2020-01-17 22:26:28 -08:00
Sven Mika	e6227082bd	[RLlib] Add `torch` flag to train.py (#6807)	2020-01-17 18:48:44 -08:00
Sven	60d4d5e1aa	Remove future imports (#6724) * Remove all __future__ imports from RLlib. * Remove (object) again from tf_run_builder.py::TFRunBuilder. * Fix 2xLINT warnings. * Fix broken appo_policy import (must be appo_tf_policy) * Remove future imports from all other ray files (not just RLlib). * Remove future imports from all other ray files (not just RLlib). * Remove future import blocks that contain `unicode_literals` as well. Revert appo_tf_policy.py to appo_policy.py (belongs to another PR). * Add two empty lines before Schedule class. * Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.	2020-01-09 00:15:48 -08:00
Robert Nishihara	39a3459886	Remove (object) from class declarations. (#6658)	2020-01-02 17:42:13 -08:00
Eric Liang	8fc2272f43	[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181)	2019-11-18 10:39:07 -08:00
Eric Liang	16891e9379	[rllib] Don't use flat weights in non-eager mode (#6001)	2019-10-31 15:16:02 -07:00
Ashwinee Panda	946ebfaa3c	[rllib] Validate that entropy coeff is not an integer (#5687) * Validate that entropy coeff is not an integer Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float. * Cast to float instead Also move this check after the negative value check	2019-09-11 14:35:42 -07:00
Eric Liang	bc6a95deb0	[rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent (#5683)	2019-09-11 12:15:34 -07:00
gehring	b520f6141e	[rllib] Adds eager support with a generic `TFEagerPolicy` class (#5436)	2019-08-23 14:21:11 +08:00
Eric Liang	a1d2e17623	[rllib] Autoregressive action distributions (#5304)	2019-08-10 14:05:12 -07:00
Matthew A. Wright	e3c9f7e83a	Custom action distributions (#5164) * custom action dist wip * Test case for custom action dist * ActionDistribution.get_parameter_shape_for_action_space pattern * Edit exception message to also suggest using a custom action distribution * Clean up ModelCatalog.get_action_dist * Pass model config to ActionDistribution constructors * Update custom action distribution test case * Name fix * Autoformatter * parameter shape static methods for torch distributions * Fix docstring * Generalize fake array for graph initialization * Fix action dist constructors * Correct parameter shape static methods for multicategorical and gaussian * Make suggested changes to custom action dist's * Correct instances of not passing model config to action dist * Autoformatter * fix tuple distribution constructor * bugfix	2019-08-06 11:13:16 -07:00
Eric Liang	5d7afe8092	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00

20 commits