hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	bb6c675231	[RLlib] Bug fix: Copy `is_exploring` placeholder for multi-GPU tower generation. (#7846 )	2020-04-03 10:44:58 -07:00
Sven Mika	5537fe13b0	[RLlib] Exploration API: ParamNoise Integration into DQN; working example/test cases. (#7814 )	2020-04-03 10:44:25 -07:00
Sven Mika	7b08db9f8c	[RLlib] Remove all instances of tf.contrib.layers. ... from RLlib code (deprecated). (#7851 )	2020-04-01 18:03:14 -07:00
Sven Mika	e153e3179f	[RLlib] Exploration API: Policy changes needed for forward pass noisifications. (#7798 ) * Rollback. * WIP. * WIP. * LINT. * WIP. * Fix. * Fix. * Fix. * LINT. * Fix (SAC does currently not support eager). * Fix. * WIP. * LINT. * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * WIP. * Fix. * LINT. * LINT. * Fix and LINT. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Fix. * Fix and LINT. * Update rllib/utils/exploration/exploration.py * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Fixes. * LINT. * WIP. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-04-01 00:43:21 -07:00
Sven Mika	66df8b8c35	[RLlib] Working/learning example: PPO + torch + LSTM. (#7797 )	2020-03-31 22:00:28 -07:00
Sven Mika	e356e97eb2	[RLlib] Assert correct policy class being used in Worker. (#7769 )	2020-03-30 14:03:29 -07:00
Eric Liang	d6255c3395	Fix build breakage due to soft torch import (#7790 )	2020-03-28 19:08:31 -07:00
Sven Mika	e4bd5db4d8	[RLlib] Minimal ParamNoise PR. (#7772 )	2020-03-28 16:16:30 -07:00
Eric Liang	5cebee68d6	[rllib] Add scaling guide to documentation, improve bandit docs (#7780 ) * update * reword * update * ms * multi node sgd * reorder * improve bandit docs * contrib * update * ref * improve refs * fix build * add pillow dep * add pil * update pil * pillow * remove false	2020-03-27 22:05:43 -07:00
Carl Balmer	0cfb6488a7	changed get_agent_class to from get_trainable_cls (#7758 )	2020-03-27 12:17:16 -07:00
Sven Mika	93b5c38b7d	[RLlib] Noisy layers in DQN throw different errors (issue #7635 ). (#7750 ) * Rollback. * Fix issue 7635. * Fix issue 7635. * LINT and bug fix.	2020-03-26 22:08:34 -07:00
Sven Mika	369a3417c4	[RLlib] Add tf-graph by default when doing `Policy.export_model()`. (#7759 ) * Rollback. * WIP. * WIP. * Fix. * LINT.	2020-03-26 22:07:10 -07:00
Saurabh Gupta	6ddf84b019	Contextual Bandit algorithms (WIP) (#7642 )	2020-03-26 13:41:16 -07:00
Sven Mika	bcf963a53b	[RLlib] Bug default policy overrides torch policy. (#7756 ) * Rollback. * Bug fix!	2020-03-26 10:03:20 -07:00
Eric Liang	9a590ac6a5	[rllib] Fix custom model metrics in multi-device case (#7640 ) * fix example * add example test * lin	2020-03-23 12:40:22 -07:00
Sven Mika	1138f2ebed	[RLlib] Issue 7046 cannot restore keras model from h5 file. (#7482 )	2020-03-23 12:19:30 -07:00
Robert Nishihara	ee8c9ff732	Remove six and cloudpickle from setup.py. (#7694 )	2020-03-23 11:42:05 -07:00
Eric Liang	288933ec6b	[rllib] Fix shared metrics context in parallel iterators (#7666 ) * debug * build * update * wip * wpi * update * recurisve sync * comment * stream * fix * Update .travis.yml	2020-03-22 14:15:01 -07:00
Sven Mika	2fb219a658	[Ray RLlib] Fix tree import (#7662 ) * Rollback. * Fix import tree error by adding meaningful error and replacing by tf.nest wherever possible. * LINT. * LINT. * Fix. * Fix log-likelihood test case failing on travis.	2020-03-22 13:51:24 -07:00
Eric Liang	7ebc6783e4	[rllib] Add back get_policy_output method for SAC model (#7604 )	2020-03-20 12:44:04 -07:00
Eric Liang	9392cdbf74	[rllib] Add high-performance external application connector (#7641 )	2020-03-20 12:43:57 -07:00
mehrdadn	a0700e2f86	Change /tmp to platform-specific temporary directory (#7529 )	2020-03-16 18:10:14 -07:00
Eric Liang	797e6cfc2a	[rllib][tune] fix some nans (#7611 )	2020-03-16 11:19:58 -07:00
Eric Liang	dd70720578	[rllib] Rename sample_batch_size => rollout_fragment_length (#7503 ) * bulk rename * deprecation warn * update doc * update fig * line length * rename * make pytest comptaible * fix test * fi sys * rename * wip * fix more * lint * update svg * comments * lint * fix use of batch steps	2020-03-14 12:05:04 -07:00
Eric Liang	52cf77f5a9	[rllib] SAC no_done_at_end should default to False (#7594 ) * update * update doc * stochastic * cleanu	2020-03-14 11:16:54 -07:00
Eric Liang	c3a8ba399f	[rllib] Enable distributed exec api for A2C, A3C, PG by default (#7580 )	2020-03-13 18:48:41 -07:00
Sven Mika	552cfb37ea	[RLlib] Fix bugs and speed up SegmentTree	2020-03-13 01:03:07 -07:00
Sven Mika	f165766813	[RLlib] Bug: If trainer config `horizon` is provided, should try to increase env steps to that value. (#7531 )	2020-03-12 11:03:37 -07:00
Sven Mika	80d314ae5e	[RLlib] Add all agents to `rllib rollout` tests. (#7534 )	2020-03-12 11:02:51 -07:00
Eric Liang	f5d12a958b	[rllib] Port Ape-X to distributed execution API (#7497 )	2020-03-12 00:54:08 -07:00
Sven Mika	20ef4a8603	[RLlib] Cleanup/unify all test cases. (#7533 )	2020-03-11 20:39:47 -07:00
Sven Mika	dded5b6d22	[RLlib] ES `env_config` is not a EnvContext object (e.g. does not contain `worker_index`). (#7560 )	2020-03-11 20:33:20 -07:00
Sven Mika	bc120730e5	[RLlib] PPO(torch) on CartPole not tuned well enough for consistent learning (#7556 )	2020-03-11 20:31:27 -07:00
Eric Liang	be48e1964b	[rllib] Fix per-worker exploration in Ape-X; make more kwargs required for future safety (#7504 ) * fix sched * lintc * lint * fix * add unit test * fix * format * fix test * fix test	2020-03-10 11:14:14 -07:00
Sven Mika	f08687f550	[RLlib] `rllib train` crashes when using torch PPO/PG/A2C. (#7508 ) * Fix. * Rollback. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST.	2020-03-08 13:03:18 -07:00
Eric Liang	a644060daa	[rllib] First pass at pipeline implementation of DQN (#7433 ) * wip iters * add test * speed up * update docs * document it * support serial sampling * add test * spacing * annotate it * update * rename to pipeline * comment * iter2 wip * update * update * context test * update * fix * fix * a3c pipeline * doc * update * move timer * comment * add piepline test * fix * clean up * document * iter s * wip dqn * wip * wip * metrics * metrics rename * metrics ctx * wip * constants * add todo * suppport .union * wip * support union * remove prints * add todo * remove auto timer * fix up * fix pipeline test * typing * fix breakage * remove bad assert * wip * fix multiagent example * fixapply * update a3c * remove a2c pl * 0 workers * wip * wip * share metrics * wip * wip * doc * fix weight sync and global var updates * mode * fix * fix * doc * fix	2020-03-07 14:47:58 -08:00
Sven Mika	876a1ba5bd	[RLlib] Issue 7421: can't convert cuda tensor to numpy in torch ppo. (#7445 )	2020-03-06 12:45:30 -08:00
Sven Mika	510c850651	[RLlib] SAC add discrete action support. (#7320 ) * Exploration API (+EpsilonGreedy sub-class). * Exploration API (+EpsilonGreedy sub-class). * Cleanup/LINT. * Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents). * Add `error` option to deprecation_warning(). * WIP. * Bug fix: Get exploration-info for tf framework. Bug fix: Properly deprecate some DQN config keys. * WIP. * LINT. * WIP. * Split PerWorkerEpsilonGreedy out of EpsilonGreedy. Docstrings. * Fix bug in sampler.py in case Policy has self.exploration = None * Update rllib/agents/dqn/dqn.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Update rllib/agents/trainer.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Change requests. * LINT * In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set * Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps). * Update rllib/evaluation/worker_set.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Review fixes. * Fix default value for DQN's exploration spec. * LINT * Fix recursion bug (wrong parent c'tor). * Do not pass timestep to get_exploration_info. * Update tf_policy.py * Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs. * Bug fix tf-action-dist * DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG). * Switch off exploration when getting action probs from off-policy-estimator's policy. * LINT * Fix test_checkpoint_restore.py. * Deprecate all SAC exploration (unused) configs. * Properly use `model.last_output()` everywhere. Instead of `model._last_output`. * WIP. * Take out set_epsilon from multi-agent-env test (not needed, decays anyway). * WIP. * Trigger re-test (flaky checkpoint-restore test). * WIP. * WIP. 
* Add test case for deterministic action sampling in PPO. * bug fix. * Added deterministic test cases for different Agents. * Fix problem with TupleActions in dynamic-tf-policy. * Separate supported_spaces tests so they can be run separately for easier debugging. * LINT. * Fix autoregressive_action_dist.py test case. * Re-test. * Fix. * Remove duplicate py_test rule from bazel. * LINT. * WIP. * WIP. * SAC fix. * SAC fix. * WIP. * WIP. * WIP. * FIX 2 examples tests. * WIP. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Renamed test file. * WIP. * Add unittest.main. * Make action_dist_class mandatory. * fix * FIX. * WIP. * WIP. * Fix. * Fix. * Fix explorations test case (contextlib cannot find its own nullcontext??). * Force torch to be installed for QMIX. * LINT. * Fix determine_tests_to_run.py. * Fix determine_tests_to_run.py. * WIP * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Rename some stuff. * Rename some stuff. * WIP. * update. * WIP. * Gumbel Softmax Dist. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP * WIP. * WIP. * Hypertune. * Hypertune. * Hypertune. * Lock-in. * Cleanup. * LINT. * Fix. * Update rllib/policy/eager_tf_policy.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/agents/sac/sac_policy.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/agents/sac/sac_policy.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/models/tf/tf_action_dist.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/models/tf/tf_action_dist.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Fix items from review comments. * Add dm_tree to RLlib dependencies. * Add dm_tree to RLlib dependencies. * Fix DQN test cases ((Torch)Categorical). 
* Fix wrong pip install. Co-authored-by: Eric Liang <ekhliang@gmail.com> Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>	2020-03-06 10:37:12 -08:00
Eric Liang	1989eed3bf	[RLlib] Issue 7136: rollout not working for ES and ARS. (#7444 ) * Fix. * Fix issue #7136. * ARS fix.	2020-03-04 23:57:44 -08:00
Eric Liang	596b39e36a	[rllib] Make timestep a required arg for exploration classes (#7380 )	2020-03-04 13:00:37 -08:00
Eric Liang	fddeb6809c	[RLlib] Issue 7401: In eval mode (if evaluation_episodes > 0), agent hangs if Env does not terminate. (#7448 ) * Fix. * Rollback. * Fix issue 7421. * Fix.	2020-03-04 12:58:34 -08:00
Eric Liang	c38224d8e5	[RLlib] Issue 7438 evaluation not working in pytorch. (#7443 )	2020-03-04 12:53:04 -08:00
Eric Liang	aa4861c2a0	Checkpoint Adam momenta for DDPG (#7449 )	2020-03-04 10:03:41 -08:00
Sven Mika	4198db5038	Torch multicat support (7419)	2020-03-04 00:41:40 -08:00
Sven Mika	7faf0d8f89	[RLlib] Make rollout always use `evaluation_config`. (#7396 )	2020-03-03 17:20:35 -08:00
Eric Liang	0f88444686	[rllib] Support multi-agent training in pipeline impls, add easy flag to enable (#7338 )	2020-03-02 15:16:37 -08:00
Sven Mika	d8eeb96413	Fix issue with torch PPO not handling action spaces of shape=(>1,). (#7398 )	2020-03-02 10:53:19 -08:00
Sven Mika	2d97650b1e	[RLlib] Add Exploration API documentation. (#7373 ) * Add Exploration API documentation. * Add Exploration API documentation. * Add Exploration API documentation. * Update exporation docs.	2020-03-01 16:55:41 -08:00
Sven Mika	83e06cd30a	[RLlib] DDPG refactor and Exploration API action noise classes. (#7314 ) * WIP. * WIP. * WIP. * WIP. * WIP. * Fix * WIP. * Add TD3 quick Pendulum regresison. * Cleanup. * Fix. * LINT. * Fix. * Sort quick_learning test cases, add TD3. * Sort quick_learning test cases, add TD3. * Revert test_checkpoint_restore.py (debugging) changes. * Fix old soft_q settings in documentation and test configs. * More doc fixes. * Fix test case. * Fix test case. * Lower test load. * WIP.	2020-03-01 11:53:35 -08:00
Eric Liang	3c6b94f3f5	[rllib] Enable performance metrics reporting for RLlib pipelines, add A3C (#7299 )	2020-02-28 16:44:17 -08:00

333 commits