hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00

Author	SHA1	Message	Date
Sven Mika	f41a9b9813	[RLlib] Fix KL method of MultiCategorial tf distribution (issue #7009). (#7119) * Fix KL method of MultiCategorial tf distribution. * Fix KL method of MultiCategorial tf distribution. * Merge AsyncReplayOptimizer fixes into this branch.	2020-02-12 12:46:15 -08:00
Sven Mika	2a0e4d94aa	[RLlib] Fix AsyncReplayOptimizer bug where it swallows all good worker tasks … (#7111)	2020-02-11 12:51:44 -08:00
Eric Liang	026f6884b5	[rllib] Add Decentralized DDPPO trainer and documentation (#7088)	2020-02-10 15:28:27 -08:00
Sven Mika	6e1c3ea824	[RLlib] Exploration API (+EpsilonGreedy sub-class). (#6974)	2020-02-10 15:22:07 -08:00
Sven Mika	5ac5ac9560	[RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021) * WIP. * Fix float32 conversion in OneHot preprocessor (would cause float64 in eager, then NN-matmul-failure). Add proper seq-len + state-in construction in eager_tf_policy.py::_compute_gradients(). * LINT. * eager_tf_policy.py: Only set samples["seq_lens"] if RNN. Otherwise, eager-tracing will throw flattened-dict key-mismatch error. * Move issue code to examples folder. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-02-06 09:44:08 -08:00
Eric Liang	fbc545c03b	[rllib] Support parallel, parameterized evaluation (#6981) * eval api * update * sync eval filters * sync fix * docs * update * docs * update * link * nit * doc updates * format	2020-02-01 22:12:12 -08:00
Sven Mika	b9ad79d66f	Add cartpole PPO torch to regression (besides tf). (#7005)	2020-02-01 17:41:38 -08:00
roireshef	3c60caa448	[rllib] implemented compute_advantages without gae (#6941)	2020-01-31 22:25:45 -08:00
Jaroslaw Rzepecki	67319bc887	[RLlib] Update MARWIL to use tf policy template (#6975) * update MARWIL to use tf policy template * formatting fixes	2020-01-31 12:57:52 -08:00
Sven Mika	211a9be9a5	[RLlib] Bug fix: PR anneals beta parameter beyond final given value. (#6973) * Bug fix: PR anneals beta parameter beyond final given value. * LINT. * Trigger travis re-test.	2020-01-31 09:55:03 -08:00
Sven Mika	2ccf08ad10	[RLlib] Bug fix: DQN goes into negative epsilon values after reaching explora… (#6971) * Bug fix: DQN goes into negative epsilon values after reaching exploration percentage. * Add `epsilon_initial_eps` to SAC to pass test_nested_spaces.py. * Add `exploration_initial_eps` to QMIX default config.	2020-01-31 09:54:12 -08:00
roireshef	dc7a555260	[rllib] Feature/histograms in tensorboard (#6942) * Added histogram functionality to custom metrics infrastructure (another tab in tensorboard) * updated example to include histogram metric * added histograms to TBXLogger * add episode rewards * lint Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-01-30 22:02:53 -08:00
Sven Mika	136ada5fb9	[RLlib] Experiment with py_func as a means to further unify tf and torch (Schedule classes). (#6951)	2020-01-30 11:27:57 -08:00
Sven Mika	4c97348cb6	[RLlib] Schedule-classes multi-framework support. (#6926)	2020-01-28 11:07:55 -08:00
Eric Liang	e659699ca9	[tune] Fix directory naming regression (#6839)	2020-01-27 15:53:40 -08:00
Eric Liang	2fb53396ad	[rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918)	2020-01-25 22:36:43 -08:00
Sven Mika	446cbdf2e0	[RLlib] Fix issue (bug): LSTM + non-shared vf + PPO + tuple actions (#6890) * Add `RandomEnv` example to examples folder. Convert warning into Error message when using an LSTM in a non-shared-vf network (after the warning, the program would crash). * LINT. * Fix issue #6884. LSTM + non-shared vf NN + PPO crashes when using a Tuple action space. * LINT * Change warning message for Model: shared_vf=False, LSTM=True cases. * Bug fix. * Add examples/random_env.py test to Jenkins.	2020-01-24 10:29:35 -08:00
AnanthHari	aa2a0cb6da	Fixes empty `state` argument in compute_single_action method (#6894) * Fixes empty `state` parameter in compute_single_action method * Fixed style	2020-01-23 00:42:52 -08:00
Sven Mika	ae9a3a2237	[RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865)	2020-01-22 17:02:58 -08:00
Sven Mika	c957ed58ed	[RLlib] Implement PPO torch version. (#6826)	2020-01-20 23:06:50 -08:00
Eric Liang	a229bdf272	[rllib] Deprecate custom preprocessors (#6833) * deprecation warnings * add log warn * fix test	2020-01-18 23:30:09 -08:00
Sven Mika	7659cae3ba	[RLlib] Add PG torch regression test (#6828) * Add PG torch regression test to tuned_examples/regression_tests dir. * Rename cartpole-pg.yaml into cartpole-pg-tf.yaml * cartpole-pg-tf.yaml: Change cartpole-pg name of tuned_example to cartpole-pg-tf.	2020-01-18 15:57:12 -08:00
Justin Terry	97bf79917c	[RLlib] Update MADDPG example repo to maintained fork (#6831)	2020-01-18 13:08:27 -08:00
Sven Mika	303547f119	[RLlib] Policy-classes cleanup and torch/tf unification. (#6770)	2020-01-17 22:26:28 -08:00
Sven Mika	e6227082bd	[RLlib] Add `torch` flag to train.py (#6807)	2020-01-17 18:48:44 -08:00
Sven Mika	2bcf72e306	DQN distributional model: Replace all legacy tf.contrib imports with tf.keras.layers.xyz or tf.initializers.xyz. (#6772) - This fixes a test case in test_evaluators.py.	2020-01-13 21:48:16 -08:00
Sven	60d4d5e1aa	Remove future imports (#6724) * Remove all __future__ imports from RLlib. * Remove (object) again from tf_run_builder.py::TFRunBuilder. * Fix 2xLINT warnings. * Fix broken appo_policy import (must be appo_tf_policy) * Remove future imports from all other ray files (not just RLlib). * Remove future imports from all other ray files (not just RLlib). * Remove future import blocks that contain `unicode_literals` as well. Revert appo_tf_policy.py to appo_policy.py (belongs to another PR). * Add two empty lines before Schedule class. * Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.	2020-01-09 00:15:48 -08:00
Ujval Misra	20ba7ef647	[tune] Move util to utils package (#6682) * Move util.py to utils * Fix import	2020-01-06 18:11:02 -08:00
Robert Nishihara	39a3459886	Remove (object) from class declarations. (#6658)	2020-01-02 17:42:13 -08:00
Sven	f1b56fa5ee	PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). (#6650) * Unifying the code for PGTrainer/Policy wrt tf vs torch. Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch). * Fix LINT line-len errors. * Fix LINT errors. * Fix `tf_pg_policy` imports (formerly: `pg_policy`). * Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer. Retire `PGAgent` class (use PGTrainer instead). * - Move PG test into agents/pg/tests directory. - All test cases will be located near the classes that are tested and then built into the Bazel/Travis test suite. * Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c the function is not a tf-specific one. * Fix remaining import errors for agents/pg/... * Fix circular dependency in pg imports. * Add pg tests to Jenkins test suite.	2020-01-02 16:08:03 -08:00
Robert Nishihara	480206eef8	Remove some Python 2 compatibility code. (#6624)	2019-12-31 17:14:58 -08:00
Michael Luo	1cb335487e	SAC for Mujoco Environments (#6642)	2019-12-31 00:16:54 -08:00
Sven	8b16847c02	Get utils ready for better Agent torch support. (#6561)	2019-12-30 12:27:32 -08:00
Eric Liang	7c1e0e5715	Implement wait_local for wait (#6524)	2019-12-28 17:40:49 -08:00
Eric Liang	022954ac09	[rllib] Tuple action dist tensors not reduced properly in eager mode (#6615)	2019-12-28 09:51:09 -08:00
Eric Liang	3af84ada47	Revert "[rllib] remove exists call (#6168)" (#6616) This reverts commit `a68cda0a33`.	2019-12-26 22:44:26 -08:00
Zhongxia Yan	98689bd263	Changed foreach_policy to foreach_trainable_policy (#6564) Changed foreach_policy to foreach_trainable_policy in DQN when disabling exploration. This makes it consistent with the rest of the file	2019-12-26 19:50:48 -08:00
gehring	b40869d0e4	Wrapper for the dm_env interface (#6468)	2019-12-26 13:22:17 -08:00
Michael Luo	548df014ec	SAC Performance Fixes (#6295) * SAC Performance Fixes * Small Changes * Update sac_model.py * fix normalize wrapper * Update test_eager_support.py Co-authored-by: Eric Liang <ekhliang@gmail.com>	2019-12-20 10:51:25 -08:00
Eyal Sela	7b955881f3	Initializing default saver inside the function (#6540)	2019-12-19 12:29:45 -08:00
Eric Liang	2530eb90dc	Move tf.test.is_gpu_available() to after session init (#6515) * move to after session init * script fixes	2019-12-17 14:55:39 -08:00
Eugene Vinitsky	3cb499632e	(Bug Fix): Remove the extra 0.5 in the Diagonal Gaussian entropy (#6475)	2019-12-13 14:42:30 -08:00
Eric Liang	be5dd8eb5e	Enable direct calls by default (#6367) * wip * add * timeout fix * const ref * comments * fix * fix * Move actor state into actor handle * comments 2 * enable by default * temp reorder * some fixes * add debug code * tmp * fix * wip * remove dbg * fix compile * fix * fix check * remove non direct tests * Increment ref count before resolving value * rename * fix another bug * tmp * tmp * Fix object pinning * build change * lint * ActorManager * tmp * ActorManager * fix test component failures * Remove old code * Remove unused * fix * fix * fix resources * fix advanced * eric's diff * blacklist * blacklist * cleanup * annotate * disable tests for now * remove * fix * fix * clean up verbosity * fix test * fix concurrency test * Update .travis.yml * Update .travis.yml * Update .travis.yml * split up analysis suite * split up trial runner suite * fix detached direct actors * fix * split up advanced tesT * lint * fix core worker test hang * fix bad check fail which breaks test_cluster.py in tune * fix some minor diffs in test_cluster * less workers * make less stressful * split up test * retry flaky tests * remove old test flags * fixes * lint * Update worker_pool.cc * fix race * fix * fix bugs in node failure handling * fix race condition * fix bugs in node failure handling * fix race condition * nits * fix test * disable heartbeatS * disable heartbeatS * fix * fix * use worker id * fix max fail * debug exit * fix merge, and apply [PATCH] fix concurrency test * [patch] fix core worker test hang * remove NotifyActorCreation, and return worker on completion of actor creation task * remove actor diied callback * Update core_worker.cc * lint * use task manager * fix merge * fix deadlock * wip * merge conflits * fix * better sysexit handling * better sysexit handling * better sysexit handling * check id * better debug * task failed msg * task failed msg * retry failed tasks with delay * retry failed tasks with delay * clip deps * fix * fix core worker tests * fix task manager test * fix all tests * cleanup * set to 0 for direct tests * dont check worker id for ownership rpc * dont check worker id for ownership rpc * debug messages * add comment * remove debug statements * nit * check worker id * fix test * owner * fix tests	2019-12-13 13:58:04 -08:00
Zack Polizzi	9e9c524823	Update pong-apex tuned example (#6462)	2019-12-12 10:57:55 -08:00
Victor Le	4e24c805ee	AlphaZero and Ranked reward implementation (#6385)	2019-12-07 12:08:40 -08:00
Eric Liang	4c6739476b	[rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365)	2019-12-05 10:13:54 -08:00
Stephanie Wang	da41180dc0	[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py (#6306) * multinode failures direct * Add number of retries allowed for tasks * Retry tasks * Add failing test for object reconstruction * Handle return status and debug * update * Retry task unit test * update * update * todo * Fix max_retries decorator, fix test * Fix test that flaked * lint * comments	2019-12-02 10:20:57 -08:00
Eric Liang	77b5098e7d	[rllib] Warn about dict action spaces	2019-11-27 12:57:38 -08:00
Eric Liang	ddc8855f41	Fix wrap (#6293)	2019-11-26 17:47:47 -08:00
Ameer Haj Ali	71316fa8d0	wrap models with DistributionalQModel when running DQN (#6258) * wrap models with DistributionalQModel when running DQN * wrap only for tensorflow models * Update custom_keras_model.py	2019-11-25 00:11:24 -08:00

356 commits