hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	83e06cd30a	[RLlib] DDPG refactor and Exploration API action noise classes. (#7314 ) * WIP. * WIP. * WIP. * WIP. * WIP. * Fix * WIP. * Add TD3 quick Pendulum regression. * Cleanup. * Fix. * LINT. * Fix. * Sort quick_learning test cases, add TD3. * Sort quick_learning test cases, add TD3. * Revert test_checkpoint_restore.py (debugging) changes. * Fix old soft_q settings in documentation and test configs. * More doc fixes. * Fix test case. * Fix test case. * Lower test load. * WIP.	2020-03-01 11:53:35 -08:00
Eric Liang	3c6b94f3f5	[rllib] Enable performance metrics reporting for RLlib pipelines, add A3C (#7299 )	2020-02-28 16:44:17 -08:00
Sven Mika	0c9e5db9cb	Fix SAC bug (twin Q not used for min'ing over both Q-nets in loss func). (#7354 )	2020-02-27 12:49:08 -08:00
Sven Mika	357232d124	[Core/RLlib] Move `log_once` from rllib to ray.util. (#7273 ) * Move log_once from rllib to tune. * Move log_once from rllib to tune. * LINT. * Move to ray.util.debug.	2020-02-27 10:40:44 -08:00
Sven Mika	44ac0ead34	[RLlib] rollout.py; make video-recording options more intuitive and add warnings/errors (issue 7121). (#7347 )	2020-02-27 10:39:02 -08:00
Eric Liang	58073f7260	[rllib] Fix multiagent example crash due to undefined abstract method (#7329 ) * fix multiagent example * 0 workers	2020-02-26 22:54:40 -08:00
Sven Mika	aec03656d5	[RLlib] TupleActions cannot be exported by Policy: Fixes issues 7231 and 5593. #7333	2020-02-26 15:22:54 -08:00
Matthew Brulhardt	75f683eec6	[rllib] Fix error in shape calculation. (#7301 )	2020-02-25 14:16:29 -08:00
Sven Mika	e1fc8368d4	[RLlib] SAC refactor with new SquashedGaussian distribution class. (#7272 )	2020-02-23 16:10:20 -08:00
Eric Liang	1660b52751	[rllib] Fix torch GPU / yaml load warning (#7278 ) * fix * safe load * reduce num buffer shards	2020-02-23 13:13:43 -08:00
Sven Mika	0db2046b0a	[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107 ) (#7124 ) * Exploration API (+EpsilonGreedy sub-class). * Exploration API (+EpsilonGreedy sub-class). * Cleanup/LINT. * Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents). * Add `error` option to deprecation_warning(). * WIP. * Bug fix: Get exploration-info for tf framework. Bug fix: Properly deprecate some DQN config keys. * WIP. * LINT. * WIP. * Split PerWorkerEpsilonGreedy out of EpsilonGreedy. Docstrings. * Fix bug in sampler.py in case Policy has self.exploration = None * Update rllib/agents/dqn/dqn.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Update rllib/agents/trainer.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Change requests. * LINT * In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set * Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps). * Update rllib/evaluation/worker_set.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Review fixes. * Fix default value for DQN's exploration spec. * LINT * Fix recursion bug (wrong parent c'tor). * Do not pass timestep to get_exploration_info. * Update tf_policy.py * Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs. * Bug fix tf-action-dist * DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG). * Switch off exploration when getting action probs from off-policy-estimator's policy. * LINT * Fix test_checkpoint_restore.py. * Deprecate all SAC exploration (unused) configs. * Properly use `model.last_output()` everywhere. Instead of `model._last_output`. * WIP. * Take out set_epsilon from multi-agent-env test (not needed, decays anyway). * WIP. 
* Trigger re-test (flaky checkpoint-restore test). * WIP. * WIP. * Add test case for deterministic action sampling in PPO. * bug fix. * Added deterministic test cases for different Agents. * Fix problem with TupleActions in dynamic-tf-policy. * Separate supported_spaces tests so they can be run separately for easier debugging. * LINT. * Fix autoregressive_action_dist.py test case. * Re-test. * Fix. * Remove duplicate py_test rule from bazel. * LINT. * WIP. * WIP. * SAC fix. * SAC fix. * WIP. * WIP. * WIP. * FIX 2 examples tests. * WIP. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Renamed test file. * WIP. * Add unittest.main. * Make action_dist_class mandatory. * fix * FIX. * WIP. * WIP. * Fix. * Fix. * Fix explorations test case (contextlib cannot find its own nullcontext??). * Force torch to be installed for QMIX. * LINT. * Fix determine_tests_to_run.py. * Fix determine_tests_to_run.py. * WIP * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Rename some stuff. * Rename some stuff. * WIP. * WIP. * Fix SAC. * Fix SAC. * Fix strange tf-error in ray core tests. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix test_io.py. * LINT. * Update SAC yaml files' config. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-02-22 14:19:49 -08:00
Sven Mika	e2edca45d4	[RLlib] PPO torch memory leak and unnecessary torch.Tensor creation and gc'ing. (#7238 ) * Take out stats to analyze memory leak in torch PPO. * WIP * WIP * WIP * WIP * WIP * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * LINT. * Fix determine_tests_to_run.py. * minor change to re-test after determine_tests_to_run.py. * LINT. * update comments. * WIP * WIP * WIP * FIX. * Fix sequence_mask being dependent on torch being installed. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix strange ray-core tf-error in test_memory_scheduling test case.	2020-02-22 11:02:31 -08:00
Sven Mika	cbc808bc6b	[Tests] determine_tests_to_run.sh has a bug affecting RLlib testing to be skipped sometimes. (#7243 )	2020-02-20 19:02:17 -08:00
Sven Mika	6043ce710d	Fix old exploration configs. (#7240 )	2020-02-20 08:39:16 -08:00
Simon Mo	b804d40c04	Stop vendoring pyarrow (#7233 )	2020-02-19 19:01:26 -08:00
Eric Liang	46af992efd	[rllib] [experimental] custom RL training pipelines (PG_pl, A2C_pl) (#7213 )	2020-02-19 16:07:37 -08:00
Simon Mo	7bef7031c2	Revert "Revert "Revert "Removing Pyarrow dependency (#7146 )" (#7209 ) (#7214 )" (#7232 )	2020-02-19 13:35:29 -08:00
Sven Mika	d537e9f0d8	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155 )	2020-02-19 12:18:45 -08:00
Eric Liang	399424c418	[rllib] Fix broken check in eval mode for IMPALA #7217	2020-02-19 11:54:30 -08:00
Simon Mo	e8941b1b79	Revert "Revert "Removing Pyarrow dependency (#7146 )" (#7209 ) (#7214 )	2020-02-19 10:08:52 -08:00
Eric Liang	0aa9373d62	Revert "Removing Pyarrow dependency (#7146 )" (#7209 ) This reverts commit `2116fd3bca`.	2020-02-18 14:12:06 -08:00
Eric Liang	5df801605e	Add ray.util package and move libraries from experimental (#7100 )	2020-02-18 13:43:19 -08:00
ijrsvt	2116fd3bca	Removing Pyarrow dependency (#7146 )	2020-02-17 18:00:13 -08:00
mehrdadn	3bd82d0bcd	Fix various issues/warnings that come up on Jenkins (#7147 ) * Avoid warning about swap being unlimited Currently we get the following message on Jenkins: "Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap." Since we're not limiting swap anyway, we might as well avoid trying to. https://docs.docker.com/config/containers/resource_constraints/#--memory-swap-details * Fix escaping in re.search() * Fix escaping in _noisy_layer() * Raise a more descriptive error when dashboard data isn't found * Don't error on dashboard files not being found when webui isn't required * Change dashboard error to a warning instead	2020-02-17 16:08:55 -08:00
Eric Liang	42aea966ff	[rllib] Convert torch state arrays to tensors during compute actions (#7162 ) * convert to tensor * normalize fix	2020-02-17 10:26:58 -08:00
Eric Liang	b6233dff3c	[rllib] Fix bad sample count assert	2020-02-15 17:22:23 -08:00
Sven Mika	2e60f0d4d8	[RLlib] Move all jenkins RLlib-tests into bazel (rllib/BUILD). (#7178 ) * commit * comment	2020-02-15 14:50:44 -08:00
Adrian O'Grady	fe6ce714a0	[rllib] - TaskPool.completed_prefetch() no longer returns stale object ids after an error (#7139 )	2020-02-13 22:30:44 -08:00
Sven Mika	5518a738b3	[RLlib] Fix erroneous use of LinearSchedule (in DDPG's exploration annealing). (#7125 ) * Fix erroneous use of LinearSchedule (in DDPG's exploration annealing). Erase schedules_obsoleted.py. * Trigger re-test. * Re-test.	2020-02-12 23:46:49 -08:00
Sven Mika	f41a9b9813	[RLlib] Fix KL method of MultiCategorical tf distribution (issue #7009 ). (#7119 ) * Fix KL method of MultiCategorical tf distribution. * Fix KL method of MultiCategorical tf distribution. * Merge AsyncReplayOptimizer fixes into this branch.	2020-02-12 12:46:15 -08:00
Sven Mika	2a0e4d94aa	[RLlib] Fix AsyncReplayOptimizer bug where it swallows all good worker tasks … (#7111 )	2020-02-11 12:51:44 -08:00
Eric Liang	026f6884b5	[rllib] Add Decentralized DDPPO trainer and documentation (#7088 )	2020-02-10 15:28:27 -08:00
Sven Mika	6e1c3ea824	[RLlib] Exploration API (+EpsilonGreedy sub-class). (#6974 )	2020-02-10 15:22:07 -08:00
Sven Mika	5ac5ac9560	[RLlib] Fix broken example: tf-eager with custom-RNN (#6732 ). (#7021 ) * WIP. * Fix float32 conversion in OneHot preprocessor (would cause float64 in eager, then NN-matmul-failure). Add proper seq-len + state-in construction in eager_tf_policy.py::_compute_gradients(). * LINT. * eager_tf_policy.py: Only set samples["seq_lens"] if RNN. Otherwise, eager-tracing will throw flattened-dict key-mismatch error. * Move issue code to examples folder. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-02-06 09:44:08 -08:00
Eric Liang	fbc545c03b	[rllib] Support parallel, parameterized evaluation (#6981 ) * eval api * update * sync eval filters * sync fix * docs * update * docs * update * link * nit * doc updates * format	2020-02-01 22:12:12 -08:00
Sven Mika	b9ad79d66f	Add cartpole PPO torch to regression (besides tf). (#7005 )	2020-02-01 17:41:38 -08:00
roireshef	3c60caa448	[rllib] implemented compute_advantages without gae (#6941 )	2020-01-31 22:25:45 -08:00
Jaroslaw Rzepecki	67319bc887	[RLlib] Update MARWIL to use tf policy template (#6975 ) * update MARWIL to use tf policy template * formatting fixes	2020-01-31 12:57:52 -08:00
Sven Mika	211a9be9a5	[RLlib] Bug fix: PR anneals beta parameter beyond final given value. (#6973 ) * Bug fix: PR anneals beta parameter beyond final given value. * LINT. * Trigger travis re-test.	2020-01-31 09:55:03 -08:00
Sven Mika	2ccf08ad10	[RLlib] Bug fix: DQN goes into negative epsilon values after reaching explora… (#6971 ) * Bug fix: DQN goes into negative epsilon values after reaching exploration percentage. * Add `epsilon_initial_eps` to SAC to pass test_nested_spaces.py. * Add `exploration_initial_eps` to QMIX default config.	2020-01-31 09:54:12 -08:00
roireshef	dc7a555260	[rllib] Feature/histograms in tensorboard (#6942 ) * Added histogram functionality to custom metrics infrastructure (another tab in tensorboard) * updated example to include histogram metric * added histograms to TBXLogger * add episode rewards * lint Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-01-30 22:02:53 -08:00
Sven Mika	136ada5fb9	[RLlib] Experiment with py_func as a means to further unify tf and torch (Schedule classes). (#6951 )	2020-01-30 11:27:57 -08:00
Sven Mika	4c97348cb6	[RLlib] Schedule-classes multi-framework support. (#6926 )	2020-01-28 11:07:55 -08:00
Eric Liang	e659699ca9	[tune] Fix directory naming regression (#6839 )	2020-01-27 15:53:40 -08:00
Eric Liang	2fb53396ad	[rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918 )	2020-01-25 22:36:43 -08:00
Sven Mika	446cbdf2e0	[RLlib] Fix issue (bug): LSTM + non-shared vf + PPO + tuple actions (#6890 ) * Add `RandomEnv` example to examples folder. Convert warning into Error message when using an LSTM in a non-shared-vf network (after the warning, the program would crash). * LINT. * Fix issue #6884. LSTM + non-shared vf NN + PPO crashes when using a Tuple action space. * LINT * Change warning message for Model: shared_vf=False, LSTM=True cases. * Bug fix. * Add examples/random_env.py test to Jenkins.	2020-01-24 10:29:35 -08:00
AnanthHari	aa2a0cb6da	Fixes empty `state` argument in compute_single_action method (#6894 ) * Fixes empty `state` parameter in compute_single_action method * Fixed style	2020-01-23 00:42:52 -08:00
Sven Mika	ae9a3a2237	[RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865 )	2020-01-22 17:02:58 -08:00
Sven Mika	c957ed58ed	[RLlib] Implement PPO torch version. (#6826 )	2020-01-20 23:06:50 -08:00
Eric Liang	a229bdf272	[rllib] Deprecate custom preprocessors (#6833 ) * deprecation warnings * add log warn * fix test	2020-01-18 23:30:09 -08:00

1385 commits