hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Stephanie Wang	fdb528514b	[core] Ref counting for actor handles (#7434 ) * tmp * Move Exit handler into CoreWorker, exit once owner's ref count goes to 0 * fix build * Remove __ray_terminate__ and add test case for distributed ref counting * lint * Remove unused * Fixes for detached actor, duplicate actor handles * Remove unused * Remove creation return ID * Remove ObjectIDs from python, set references in CoreWorker * Fix crash * Fix memory crash * Fix tests * fix * fixes * fix tests * fix java build * fix build * fix * check status * check status	2020-03-10 17:45:07 -07:00
Edward Oakes	119a303ea0	Remove static concurrency limit from gRPC server (#7544 )	2020-03-10 16:27:02 -07:00
Edward Oakes	dbbf0c0e70	Add Apache 2 license to C++ files (#7520 )	2020-03-10 16:07:17 -07:00
Eric Liang	be48e1964b	[rllib] Fix per-worker exploration in Ape-X; make more kwargs required for future safety (#7504 ) * fix sched * lintc * lint * fix * add unit test * fix * format * fix test * fix test	2020-03-10 11:14:14 -07:00
Richard Liaw	d192ef0611	[raysgd] Cleanup User API (#7384 ) * Init fp16 * fp16 and schedulers * scheduler linking and fp16 * to fp16 * loss scaling and documentation * more documentation * add tests, refactor config * moredocs * more docs * fix logo, add test mode, add fp16 flag * fix tests * fix scheduler * fix apex * improve safety * fix tests * fix tests * remove pin memory default * rm * fix * Update doc/examples/doc_code/raysgd_torch_signatures.py * fix * migrate changes from other PR * ok thanks * pass * signatures * lint' * Update python/ray/experimental/sgd/pytorch/utils.py * Apply suggestions from code review Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * should address most comments * comments * fix this ci * first_pass * add overrides * override * fixing up operators * format * sgd * constants * rm * revert * save * failures * fixes * trainer * run test * operator * code * op * ok done * operator * sgd test fixes * ok * trainer * format * Apply suggestions from code review Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Update doc/source/raysgd/raysgd_pytorch.rst * docstring * dcgan * doc * commits * nit * testing * revert * Start renaming pytorch to torch * Rename PyTorchTrainer to TorchTrainer * Rename PyTorch runners to Torch runners * Finish renaming API * Rename to torch in tests * Finish renaming docs + tests * Run format + fix DeprecationWarning * fix * move tests up * benchmarks * rename * remove some args * better metrics output * fix up the benchmark * benchmark-yaml * horovod-benchmark * benchmarks * Remove benchmark code for cleanups * makedatacreator * relax * metrics * autosetsampler * profile * movements * OK * smoothen * fix * nitdocs * loss * comments * fix * fix * runner_tests * codes * example * fix_test * fix * tests Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com> Co-authored-by: Maksim Smolin <maximsmol@gmail.com>	2020-03-10 08:41:42 -07:00
Anthony Yu	89ec4adb72	[tune] Dragonfly Optimizer (#5955 ) * Add sample example * Copy relevant lines of ask from inherited Optimizer * Ignore strategy * Additional changes * Add DragonflySearch for tune connector for Dragonfly * Add example and fix small errors * lint * Remove skopt references * Update example based off of Dragonfly changes * Edit example for final Dragonfly edits * Formatting and documentation edits * Add documentation and add to test pipeline * Address PR comments * Fix Jenkins test * Adjust Dragonfly to PR#7366 * Lint * fix_tests Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-03-10 08:40:36 -07:00
fangfengbin	fa785a2ad2	ServiceBasedGcsClient support detect gcs server availability and retry (#7292 )	2020-03-10 21:01:07 +08:00
mehrdadn	fc76586518	Redis on Windows (#7509 ) * Switch hiredis on Windows to that of the Windows port of Redis * Use boost::asio::ip::tcp::socket::native_handle_type * Use normal hiredis instead of Windows-specific one * Finish up using normal hiredis Co-authored-by: Mehrdad <noreply@github.com>	2020-03-09 18:49:54 -07:00
Eric Liang	90e23a5c43	[iterators] Add duplicate() call and fix broken test case (#7510 )	2020-03-09 17:18:52 -07:00
Edward Oakes	883ee4912d	Return reconcile.Result{}, not nil (#7521 )	2020-03-09 16:27:15 -07:00
Edward Oakes	4ab80eafb9	Deprecate use_pickle flag (#7474 )	2020-03-09 16:03:56 -07:00
Edward Oakes	0c254295b0	Remove experimental.signal API (#7477 ) * Remove experimental.signal API * fix test	2020-03-09 16:03:36 -07:00
Ujval Misra	023d4c02a9	[tune] Prevent deletion of checkpoint from user-initiated resto… (#7501 ) * Fix restore bug * Add test * Lint * Indent	2020-03-09 15:53:10 -07:00
Edward Oakes	08d4cb3822	[operator] Minor cleanup (#7498 )	2020-03-09 11:23:46 -07:00
Edward Oakes	b4e2d5317e	Remove experimental.NoReturn (#7475 )	2020-03-09 11:09:36 -07:00
Edward Oakes	27b4ffa98e	Improve k8s operator documentation (#7496 )	2020-03-09 11:09:06 -07:00
Stephanie Wang	95bb0c5357	Upgrade plasma to latest version, use synchronous Seal (#7470 ) * Upgrade arrow to master * fix build * todo * lint * Fix hanging test	2020-03-09 10:30:44 -07:00
Markus Cozowicz	e03259455f	[autoscaler] azure init script path (#7515 )	2020-03-09 09:49:07 -07:00
Markus Cozowicz	145ebe14c7	added Azure Resource Manager (ARM) template (#7494 ) * added Azure Resource Manager (ARM) template * removed Azure doc (moved to separate PR) * nit * fixpaths * nit Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-03-08 22:29:10 -07:00
Eric Liang	e7bc5c612d	Add testing strategy to PR template (#7505 )	2020-03-08 15:16:49 -07:00
Sven Mika	f08687f550	[RLlib] `rllib train` crashes when using torch PPO/PG/A2C. (#7508 ) * Fix. * Rollback. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST. * TEST.	2020-03-08 13:03:18 -07:00
Sven Mika	bc637a2546	[Tune Jenkins tests] Add dm_tree to docker. (#7500 ) * Fix. * Rollback. * Add dm_tree to docker examples and tune_test containers.	2020-03-07 23:16:00 -08:00
Eric Liang	a644060daa	[rllib] First pass at pipeline implementation of DQN (#7433 ) * wip iters * add test * speed up * update docs * document it * support serial sampling * add test * spacing * annotate it * update * rename to pipeline * comment * iter2 wip * update * update * context test * update * fix * fix * a3c pipeline * doc * update * move timer * comment * add piepline test * fix * clean up * document * iter s * wip dqn * wip * wip * metrics * metrics rename * metrics ctx * wip * constants * add todo * suppport .union * wip * support union * remove prints * add todo * remove auto timer * fix up * fix pipeline test * typing * fix breakage * remove bad assert * wip * fix multiagent example * fixapply * update a3c * remove a2c pl * 0 workers * wip * wip * share metrics * wip * wip * doc * fix weight sync and global var updates * mode * fix * fix * doc * fix	2020-03-07 14:47:58 -08:00
Landcold7	beb9b02dbd	Add numba test (#7298 ) (#7487 )	2020-03-07 11:12:25 -08:00
Richard Liaw	115468de2c	[tune] Repeated evals (#7366 ) * easyrepeat * done * suggest * doc * ok * commit * Apply suggestions from code review Co-Authored-By: Ujval Misra <misraujval@gmail.com> * Apply suggestions from code review Co-Authored-By: Ujval Misra <misraujval@gmail.com> * Apply suggestions from code review * ok * docs Co-authored-by: Ujval Misra <misraujval@gmail.com>	2020-03-07 11:08:23 -08:00
mehrdadn	a8bda9b551	Fix incorrect handling of command-lines (#7439 )	2020-03-06 15:51:49 -08:00
Sven Mika	876a1ba5bd	[RLlib] Issue 7421: can't convert cuda tensor to numpy in torch ppo. (#7445 )	2020-03-06 12:45:30 -08:00
Sven Mika	510c850651	[RLlib] SAC add discrete action support. (#7320 ) * Exploration API (+EpsilonGreedy sub-class). * Exploration API (+EpsilonGreedy sub-class). * Cleanup/LINT. * Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents). * Add `error` option to deprecation_warning(). * WIP. * Bug fix: Get exploration-info for tf framework. Bug fix: Properly deprecate some DQN config keys. * WIP. * LINT. * WIP. * Split PerWorkerEpsilonGreedy out of EpsilonGreedy. Docstrings. * Fix bug in sampler.py in case Policy has self.exploration = None * Update rllib/agents/dqn/dqn.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Update rllib/agents/trainer.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Change requests. * LINT * In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set * Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps). * Update rllib/evaluation/worker_set.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Review fixes. * Fix default value for DQN's exploration spec. * LINT * Fix recursion bug (wrong parent c'tor). * Do not pass timestep to get_exploration_info. * Update tf_policy.py * Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs. * Bug fix tf-action-dist * DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG). * Switch off exploration when getting action probs from off-policy-estimator's policy. * LINT * Fix test_checkpoint_restore.py. * Deprecate all SAC exploration (unused) configs. * Properly use `model.last_output()` everywhere. Instead of `model._last_output`. * WIP. * Take out set_epsilon from multi-agent-env test (not needed, decays anyway). * WIP. * Trigger re-test (flaky checkpoint-restore test). * WIP. * WIP. 
* Add test case for deterministic action sampling in PPO. * bug fix. * Added deterministic test cases for different Agents. * Fix problem with TupleActions in dynamic-tf-policy. * Separate supported_spaces tests so they can be run separately for easier debugging. * LINT. * Fix autoregressive_action_dist.py test case. * Re-test. * Fix. * Remove duplicate py_test rule from bazel. * LINT. * WIP. * WIP. * SAC fix. * SAC fix. * WIP. * WIP. * WIP. * FIX 2 examples tests. * WIP. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Renamed test file. * WIP. * Add unittest.main. * Make action_dist_class mandatory. * fix * FIX. * WIP. * WIP. * Fix. * Fix. * Fix explorations test case (contextlib cannot find its own nullcontext??). * Force torch to be installed for QMIX. * LINT. * Fix determine_tests_to_run.py. * Fix determine_tests_to_run.py. * WIP * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Rename some stuff. * Rename some stuff. * WIP. * update. * WIP. * Gumbel Softmax Dist. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP * WIP. * WIP. * Hypertune. * Hypertune. * Hypertune. * Lock-in. * Cleanup. * LINT. * Fix. * Update rllib/policy/eager_tf_policy.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/agents/sac/sac_policy.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/agents/sac/sac_policy.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/models/tf/tf_action_dist.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Update rllib/models/tf/tf_action_dist.py Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com> * Fix items from review comments. * Add dm_tree to RLlib dependencies. * Add dm_tree to RLlib dependencies. * Fix DQN test cases ((Torch)Categorical). 
* Fix wrong pip install. Co-authored-by: Eric Liang <ekhliang@gmail.com> Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>	2020-03-06 10:37:12 -08:00
Qing Wang	7a33a6ea3c	[Java] Enable skipped direct call cases (#7363 ) * Comment out * Refine * Revert	2020-03-06 16:22:08 +08:00
Stephanie Wang	7c174d0ffe	Make the ref counting test more stressful (#7473 )	2020-03-05 20:51:24 -08:00
Edward Oakes	e29f2ef788	[operator] Small bugfixes (#7459 )	2020-03-05 10:57:56 -08:00
Eric Liang	1989eed3bf	[RLlib] Issue 7136: rollout not working for ES and ARS. (#7444 ) * Fix. * Fix issue #7136. * ARS fix.	2020-03-04 23:57:44 -08:00
Eric Liang	476b5c6196	[Parallel Iterators] Allow for operator chaining after repartition (#7268 ) * bug fix repartition * change add_transform from private to inner * formatting * addressing comments * formatting	2020-03-04 14:42:52 -08:00
Richard Liaw	c7f0b303f3	Mention that calling some_function.remote() is non-blocking (#7417 ) * Mention that calling some_function.remote() is non-blocking. * Apply suggestions from code review Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-03-04 13:35:46 -08:00
Richard Liaw	beddaf65b4	Small correction in documentation (#7453 ) * corrected import statement in docs * Update doc/source/tune-usage.rst Co-Authored-By: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-03-04 13:28:28 -08:00
Philipp Moritz	0d7ef46c83	Bazel improvements (#7427 ) * Make wget quiet * Make sphinx-build quiet * Remove -q from pip install in CI script as config already takes care of it * Add documentation on custom dependencies * formatting * python	2020-03-04 13:13:21 -08:00
Eric Liang	596b39e36a	[rllib] Make timestep a required arg for exploration classes (#7380 )	2020-03-04 13:00:37 -08:00
Eric Liang	fddeb6809c	[RLlib] Issue 7401: In eval mode (if evaluation_episodes > 0), agent hangs if Env does not terminate. (#7448 ) * Fix. * Rollback. * Fix issue 7421. * Fix.	2020-03-04 12:58:34 -08:00
Eric Liang	c38224d8e5	[RLlib] Issue 7438 evaluation not working in pytorch. (#7443 )	2020-03-04 12:53:04 -08:00
Philipp Moritz	de0c99876e	Fix fate_share not being passed to Redis shards (#7432 )	2020-03-04 11:29:45 -08:00
Edward Oakes	0abcca258f	Add entries to in-memory store on Put() (#7085 )	2020-03-04 10:17:27 -08:00
Eric Liang	aa4861c2a0	Checkpoint Adam momenta for DDPG (#7449 )	2020-03-04 10:03:41 -08:00
Hao Chen	fe7820fec9	[Java] New Java actor API (#7414 )	2020-03-04 22:39:23 +08:00
Sven Mika	4198db5038	Torch multicat support (7419)	2020-03-04 00:41:40 -08:00
Philipp Moritz	fb1c1e2d27	Revert "Keep cloudpickle up-to-date with the upstream (#7406 )" (#7437 ) This reverts commit `f6883bf725`.	2020-03-03 18:36:15 -08:00
Sven Mika	7faf0d8f89	[RLlib] Make rollout always use `evaluation_config`. (#7396 )	2020-03-03 17:20:35 -08:00
Maksim Smolin	3a134c7224	[RaySGD] Rename PyTorch API endpoints to start with Torch (#7425 ) * Start renaming pytorch to torch * Rename PyTorchTrainer to TorchTrainer * Rename PyTorch runners to Torch runners * Finish renaming API * Rename to torch in tests * Finish renaming docs + tests * Run format + fix DeprecationWarning * fix * move tests up * rename Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-03-03 16:44:42 -08:00
Siyuan (Ryans) Zhuang	f6883bf725	Keep cloudpickle up-to-date with the upstream (#7406 )	2020-03-03 13:52:54 -08:00
Edward Oakes	b0bf5450c2	Fix flaky multiprocessing tests (#7413 )	2020-03-03 15:07:59 -06:00
ijrsvt	fb76092d75	Re-route asyncio plasma code path through raylet instead of direct plasma connection (#7234 )	2020-03-03 15:43:46 -05:00

4323 commits