hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	b37a162076	[RLlib] Make envs specifiable in configs by their class path. (#8750 )	2020-06-03 08:14:29 +02:00
Sven Mika	2746fc0476	[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520 )	2020-05-27 16:19:13 +02:00
Eric Liang	9a83908c46	[rllib] Deprecate policy optimizers (#8345 )	2020-05-21 10:16:18 -07:00
mehrdadn	ebf060d484	Make more tests run on Windows (#8446 ) * Remove worker Wait() call due to SIGCHLD being ignored * Port _pid_alive to Windows * Show PID as well as TID in glog * Update TensorFlow version for Python 3.8 on Windows * Handle missing Pillow on Windows * Work around dm-tree PermissionError on Windows * Fix some lint errors on Windows with Python 3.8 * Simplify torch requirements * Quiet git clean * Handle finalizer issues * Exit with the signal number * Get rid of wget * Fix some Windows compatibility issues with tests Co-authored-by: Mehrdad <noreply@github.com>	2020-05-20 12:25:04 -07:00
Eric Liang	9d012626e5	[rllib] Distributed exec workflow for impala (#8321 )	2020-05-11 20:24:43 -07:00
Eric Liang	b14cc16616	[rllib] Enable functional execution workflow API by default (#8221 )	2020-05-05 12:36:42 -07:00
Eric Liang	f48da50e1c	[rllib] observation function api for multi-agent (#8236 )	2020-05-04 22:13:49 -07:00
Eric Liang	baadbdf8d4	[rllib] Execute PPO using training workflow (#8206 ) * wip * add kl * kl * works now * doc update * reorg * add ddppo * add stats * fix fetch * comment * fix learner stat regression * test fixes * fix test	2020-04-30 01:18:09 -07:00
Eric Liang	2298f6fb40	[rllib] Port DQN/Ape-X to training workflow api (#8077 )	2020-04-23 12:39:19 -07:00
roireshef	dbcad35022	[RLlib] Added DefaultCallbacks which replaces old callbacks dict interface (#6972 )	2020-04-16 16:06:42 -07:00
Xianyang Liu	e1d3f7eba6	[rllib]Add config for rllib to support set python environments (#8026 ) * support set extra python environments * wrap value with str * Apply suggestions from code review Co-Authored-By: Eric Liang <ekhliang@gmail.com> * addresses comments * fix lint errors * remove unrelated changes due to format.sh * remove unrelated changes due to format.sh Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-04-16 01:13:45 -07:00
Sven Mika	428516056a	[RLlib] SAC Torch (incl. Atari learning) (#7984 ) * Policy-classes cleanup and torch/tf unification. - Make Policy abstract. - Add `action_dist` to call to `extra_action_out_fn` (necessary for PPO torch). - Move some methods and vars to base Policy (from TFPolicy): num_state_tensors, ACTION_PROB, ACTION_LOGP and some more. * Fix `clip_action` import from Policy (should probably be moved into utils altogether). * - Move `is_recurrent()` and `num_state_tensors()` into TFPolicy (from DynamicTFPolicy). - Add config to all Policy c'tor calls (as 3rd arg after obs and action spaces). * Add `config` to c'tor call to TFPolicy. * Add missing `config` to c'tor call to TFPolicy in marvil_policy.py. * Fix test_rollout_worker.py::MockPolicy and BadPolicy classes (Policy base class is now abstract). * Fix LINT errors in Policy classes. * Implement StatefulPolicy abstract methods in test cases: test_multi_agent_env.py. * policy.py LINT errors. * Create a simple TestPolicy to sub-class from when testing Policies (reduces code in some test cases). * policy.py - Remove abstractmethod from `apply_gradients` and `compute_gradients` (these are not required iff `learn_on_batch` implemented). - Fix docstring of `num_state_tensors`. * Make QMIX torch Policy a child of TorchPolicy (instead of Policy). * QMixPolicy add empty implementations of abstract Policy methods. * Store Policy's config in self.config in base Policy c'tor. * - Make only compute_actions in base Policy's an abstractmethod and provide pass implementation to all other methods if not defined. - Fix state_batches=None (most Policies don't have internal states). * Cartpole tf learning. * Cartpole tf AND torch learning (in ~ same ts). * Cartpole tf AND torch learning (in ~ same ts). 2 * Cartpole tf (torch syntax-broken) learning (in ~ same ts). 3 * Cartpole tf AND torch learning (in ~ same ts). 4 * Cartpole tf AND torch learning (in ~ same ts). 5 * Cartpole tf AND torch learning (in ~ same ts). 6 * Cartpole tf AND torch learning (in ~ same ts). Pendulum tf learning. * WIP. * WIP. * SAC torch learning Pendulum. * WIP. * SAC torch and tf learning Pendulum and Cartpole after cleanup. * WIP. * LINT. * LINT. * SAC: Move policy.target_model to policy.device as well. * Fixes and cleanup. * Fix data-format of tf keras Conv2d layers (broken for some tf-versions which have data_format="channels_first" as default). * Fixes and LINT. * Fixes and LINT. * Fix and LINT. * WIP. * Test fixes and LINT. * Fixes and LINT. Co-authored-by: Sven Mika <sven@Svens-MacBook-Pro.local>	2020-04-15 13:25:16 +02:00
Sven Mika	e4bd5db4d8	[RLlib] Minimal ParamNoise PR. (#7772 )	2020-03-28 16:16:30 -07:00
Sven Mika	1138f2ebed	[RLlib] Issue 7046 cannot restore keras model from h5 file. (#7482 )	2020-03-23 12:19:30 -07:00
Robert Nishihara	ee8c9ff732	Remove six and cloudpickle from setup.py. (#7694 )	2020-03-23 11:42:05 -07:00
Eric Liang	dd70720578	[rllib] Rename sample_batch_size => rollout_fragment_length (#7503 ) * bulk rename * deprecation warn * update doc * update fig * line length * rename * make pytest comptaible * fix test * fi sys * rename * wip * fix more * lint * update svg * comments * lint * fix use of batch steps	2020-03-14 12:05:04 -07:00
Eric Liang	c3a8ba399f	[rllib] Enable distributed exec api for A2C, A3C, PG by default (#7580 )	2020-03-13 18:48:41 -07:00
Eric Liang	f5d12a958b	[rllib] Port Ape-X to distributed execution API (#7497 )	2020-03-12 00:54:08 -07:00
Sven Mika	7faf0d8f89	[RLlib] Make rollout always use `evaluation_config`. (#7396 )	2020-03-03 17:20:35 -08:00
Eric Liang	0f88444686	[rllib] Support multi-agent training in pipeline impls, add easy flag to enable (#7338 )	2020-03-02 15:16:37 -08:00
Sven Mika	83e06cd30a	[RLlib] DDPG refactor and Exploration API action noise classes. (#7314 ) * WIP. * WIP. * WIP. * WIP. * WIP. * Fix * WIP. * Add TD3 quick Pendulum regresison. * Cleanup. * Fix. * LINT. * Fix. * Sort quick_learning test cases, add TD3. * Sort quick_learning test cases, add TD3. * Revert test_checkpoint_restore.py (debugging) changes. * Fix old soft_q settings in documentation and test configs. * More doc fixes. * Fix test case. * Fix test case. * Lower test load. * WIP.	2020-03-01 11:53:35 -08:00
Eric Liang	3c6b94f3f5	[rllib] Enable performance metrics reporting for RLlib pipelines, add A3C (#7299 )	2020-02-28 16:44:17 -08:00
Sven Mika	d537e9f0d8	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155 )	2020-02-19 12:18:45 -08:00
Eric Liang	399424c418	[rllib] Fix broken check in eval mode for IMPALA #7217	2020-02-19 11:54:30 -08:00
Eric Liang	42aea966ff	[rllib] Convert torch state arrays to tensors during compute actions (#7162 ) * convert to tensor * normalize fix	2020-02-17 10:26:58 -08:00
Sven Mika	6e1c3ea824	[RLlib] Exploration API (+EpsilonGreedy sub-class). (#6974 )	2020-02-10 15:22:07 -08:00
Eric Liang	fbc545c03b	[rllib] Support parallel, parameterized evaluation (#6981 ) * eval api * update * sync eval filters * sync fix * docs * update * docs * update * link * nit * doc updates * format	2020-02-01 22:12:12 -08:00
Eric Liang	e659699ca9	[tune] Fix directory naming regression (#6839 )	2020-01-27 15:53:40 -08:00
Sven Mika	e6227082bd	[RLlib] Add `torch` flag to train.py (#6807 )	2020-01-17 18:48:44 -08:00
Sven	60d4d5e1aa	Remove future imports (#6724 ) * Remove all __future__ imports from RLlib. * Remove (object) again from tf_run_builder.py::TFRunBuilder. * Fix 2xLINT warnings. * Fix broken appo_policy import (must be appo_tf_policy) * Remove future imports from all other ray files (not just RLlib). * Remove future imports from all other ray files (not just RLlib). * Remove future import blocks that contain `unicode_literals` as well. Revert appo_tf_policy.py to appo_policy.py (belongs to another PR). * Add two empty lines before Schedule class. * Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.	2020-01-09 00:15:48 -08:00
Sven	8b16847c02	Get utils ready for better Agent torch support. (#6561 )	2019-12-30 12:27:32 -08:00
Michael Luo	548df014ec	SAC Performance Fixes (#6295 ) * SAC Performance Fixes * Small Changes * Update sac_model.py * fix normalize wrapper * Update test_eager_support.py Co-authored-by: Eric Liang <ekhliang@gmail.com>	2019-12-20 10:51:25 -08:00
Eric Liang	8fc2272f43	[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181 )	2019-11-18 10:39:07 -08:00
Eric Liang	e4565c9cc6	Reduce RLlib log verbosity (#6154 )	2019-11-13 18:50:45 -08:00
Vince Jankovics	7e214fd95e	[tune] TensorBoard HParams for TF2.0 (#5678 )	2019-09-21 11:06:34 -07:00
Kilian Batzner	79b9c70ad6	Add local_tf_session_args to unknown subkeys whitelist (#5742 ) * Add local_tf_session_args to unknown subkeys whitelist * Remove trailing whitespace	2019-09-20 10:32:49 -07:00
gehring	8903bcd0c3	[rllib] Tracing for eager tensorflow policies with `tf.function` (#5705 ) * Added tracing of eager policies with `tf.function` * lint * add config option * add docs * wip * tracing now works with a3c * typo * none * file doc * returns * syntax error * syntax error	2019-09-17 01:44:20 -07:00
Philipp Moritz	747daff2cb	Fix impala stress test (#5596 )	2019-08-31 01:20:53 -07:00
Eric Liang	03a1b75852	[rllib] Fix some eager execution regressions with 1.13 (#5537 ) * fix bugs with 1.13 * allow disable	2019-08-26 23:23:35 -07:00
Eric Liang	97ccd75952	[rllib] Enable object store memory limit by default (#5534 )	2019-08-26 01:37:28 -07:00
gehring	b520f6141e	[rllib] Adds eager support with a generic `TFEagerPolicy` class (#5436 )	2019-08-23 14:21:11 +08:00
Eric Liang	e2e30ca507	Ray, Tune, and RLlib support for memory, object_store_memory options (#5226 )	2019-08-21 23:01:10 -07:00
Eric Liang	5d7afe8092	[rllib] Try moving RLlib to top level dir (#5324 )	2019-08-05 23:25:49 -07:00

43 commits