hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Edward Oakes	d69fe54f6d	Temporarily skip testEndToEndReporting (#7402 )	2020-03-02 18:27:34 -06:00
Siyuan (Ryans) Zhuang	0792b5cb93	Fix the numpy ndarray subclass serialization bug (#7392 )	2020-03-01 23:05:59 -08:00
Richard Liaw	48cdca843f	[raysgd] Custom training operator (#7211 )	2020-03-01 21:22:48 -08:00
Eric Liang	3c6b94f3f5	[rllib] Enable performance metrics reporting for RLlib pipelines, add A3C (#7299 )	2020-02-28 16:44:17 -08:00
Richard Liaw	fb73d51d4d	[tune] fix hparams for tbx (#7312 ) * fix * test_hist * remove unnecessary value check * pbt * queue * skip_for_now * Apply suggestions from code review	2020-02-28 11:51:56 -08:00
Richard Liaw	ca40b0fcc6	[tune][minor] Avoid throwing error when gpu check fails (#7362 )	2020-02-28 11:32:44 -08:00
Edward Oakes	f321eaec9b	Working but not passing test (#7358 )	2020-02-28 12:57:28 -06:00
mehrdadn	fb0bc7b947	Partially revert "[Core/RLlib] Move `log_once` from rllib to ray.util. (#7273 )" (#7361 ) This partially reverts commit `357232d124`. The addition of python/__init__.py broke the build on Windows. However, this is difficult to notice because Bazel doesn't seem to notice this dependency. You first have to go to a commit that fails on this issue, and then try to re-build this commit, so that Bazel actually performs a rebuild. A useful command-line for triggering the exact build i: bazel build --compile_one_dependency //:python/ray/_raylet.pyx	2020-02-28 10:27:45 -08:00
Edward Oakes	93fe4b0b58	Change actor.__ray_kill__() to ray.kill(actor) (#7360 )	2020-02-28 11:55:13 -06:00
Richard Liaw	3fc162f93c	[tune] Add Unit Test for nested PBT + Jenkins (#7324 )	2020-02-27 18:17:11 -08:00
mehrdadn	8730996682	Windows changes (#7315 )	2020-02-27 15:14:10 -08:00
Edward Oakes	ced062319d	Decrease test_object_manager put size to avoid OOMs in CI (#7355 )	2020-02-27 11:08:10 -08:00
Edward Oakes	cbf55d69a6	Remove serialized from_random object ids in tests (#7340 )	2020-02-27 11:04:06 -08:00
Edward Oakes	bd9411f849	Call TriggerGlobalGC when the plasma store is full (#7337 )	2020-02-27 11:01:49 -08:00
Sven Mika	357232d124	[Core/RLlib] Move `log_once` from rllib to ray.util. (#7273 ) * Move log_once from rllib to tune. * Move log_once from rllib to tune. * LINT. * Move to ray.util.debug.	2020-02-27 10:40:44 -08:00
Edward Oakes	d9027acaf2	Deprecate non-direct-call API (#7336 )	2020-02-27 10:37:23 -08:00
Edward Oakes	55ccfb6089	Fix asyncio actor race condition (#7335 )	2020-02-27 10:16:04 -08:00
Edward Oakes	ee0f71e398	Add __commit__ field to ray package in wheels (#7305 )	2020-02-26 17:54:22 -08:00
Edward Oakes	2ad9bc5684	Move plasma retry logic into plasma store provider (#7328 )	2020-02-26 16:57:02 -08:00
Eric Liang	b310661338	Add internal_api.global_gc() method, which triggers gc.collect() on all workers (#7327 )	2020-02-26 14:09:29 -08:00
Stephanie Wang	9964657815	Fix plasma bug (#7322 )	2020-02-25 18:15:28 -08:00
Edward Oakes	44b4394afa	Remove unused AddContainedObjectIDs (#7323 )	2020-02-25 16:42:20 -08:00
Richard Liaw	226fcd5aff	Add Dashboard and Util to setup-dev (#7321 )	2020-02-25 15:25:09 -08:00
Eric Liang	1ea05a2c08	[tune] Fix a number of reporter regressions and add end-to-end tests (#7274 )	2020-02-25 14:31:56 -08:00
Eric Liang	f14b6e477b	Raise gRPC message size limit to 100MB (#7269 )	2020-02-24 23:22:49 -08:00
Edward Oakes	f2faf8d26e	Fix passing duplicate by-reference arguments (#7306 )	2020-02-24 19:18:16 -08:00
chaokunyang	8b6784de06	[Streaming] Streaming Python API (#6755 )	2020-02-25 10:33:33 +08:00
Mitchell Stern	669bb403c3	Add TypeScript and HTML linting to Travis lint job (#7294 )	2020-02-24 11:12:07 -08:00
Eric Liang	0ae4fe020d	revert omp threads fix (#7288 )	2020-02-23 21:26:49 -08:00
fangfengbin	e7d0ec9531	Enable GCS server when running python unit tests (#7101 ) * Enable GCS server when running python unit tests * restart ci * restart ci * fix code style * restart ci * restart ci * restart ci * restart ci * restart ci * Define RAY_GCS_SERVICE_ENABLED as a constant * fix review comments * fix code style * fix code style * fix code style * fix code style * fix review comments * add gcs service python testcase * fix TESTSUITE name bug	2020-02-24 09:48:40 +08:00
Sven Mika	0db2046b0a	[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107 ) (#7124 ) * Exploration API (+EpsilonGreedy sub-class). * Exploration API (+EpsilonGreedy sub-class). * Cleanup/LINT. * Add `deterministic` to generic Trainer config (NOTE: this is still ignored by most Agents). * Add `error` option to deprecation_warning(). * WIP. * Bug fix: Get exploration-info for tf framework. Bug fix: Properly deprecate some DQN config keys. * WIP. * LINT. * WIP. * Split PerWorkerEpsilonGreedy out of EpsilonGreedy. Docstrings. * Fix bug in sampler.py in case Policy has self.exploration = None * Update rllib/agents/dqn/dqn.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Update rllib/agents/trainer.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * Change requests. * LINT * In tune/utils/util.py::deep_update() Only keep deep_updat'ing if both original and value are dicts. If value is not a dict, set * Completely obsolete syn_replay_optimizer.py's parameters schedule_max_timesteps AND beta_annealing_fraction (replaced with prioritized_replay_beta_annealing_timesteps). * Update rllib/evaluation/worker_set.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Review fixes. * Fix default value for DQN's exploration spec. * LINT * Fix recursion bug (wrong parent c'tor). * Do not pass timestep to get_exploration_info. * Update tf_policy.py * Fix some remaining issues with test cases and remove more deprecated DQN/APEX exploration configs. * Bug fix tf-action-dist * DDPG incompatibility bug fix with new DQN exploration handling (which is imported by DDPG). * Switch off exploration when getting action probs from off-policy-estimator's policy. * LINT * Fix test_checkpoint_restore.py. * Deprecate all SAC exploration (unused) configs. * Properly use `model.last_output()` everywhere. Instead of `model._last_output`. * WIP. * Take out set_epsilon from multi-agent-env test (not needed, decays anyway). * WIP. 
* Trigger re-test (flaky checkpoint-restore test). * WIP. * WIP. * Add test case for deterministic action sampling in PPO. * bug fix. * Added deterministic test cases for different Agents. * Fix problem with TupleActions in dynamic-tf-policy. * Separate supported_spaces tests so they can be run separately for easier debugging. * LINT. * Fix autoregressive_action_dist.py test case. * Re-test. * Fix. * Remove duplicate py_test rule from bazel. * LINT. * WIP. * WIP. * SAC fix. * SAC fix. * WIP. * WIP. * WIP. * FIX 2 examples tests. * WIP. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Renamed test file. * WIP. * Add unittest.main. * Make action_dist_class mandatory. * fix * FIX. * WIP. * WIP. * Fix. * Fix. * Fix explorations test case (contextlib cannot find its own nullcontext??). * Force torch to be installed for QMIX. * LINT. * Fix determine_tests_to_run.py. * Fix determine_tests_to_run.py. * WIP * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Add Random exploration component to tests (fixed issue with "static-graph randomness" via py_function). * Rename some stuff. * Rename some stuff. * WIP. * WIP. * Fix SAC. * Fix SAC. * Fix strange tf-error in ray core tests. * Fix strange ray-core tf-error in test_memory_scheduling test case. * Fix test_io.py. * LINT. * Update SAC yaml files' config. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-02-22 14:19:49 -08:00
Stephanie Wang	4c2de7be54	[core] Ref counting for returning object IDs created by a different process (#7221 ) * Add regression tests * Refactor, split RemoveSubmittedTaskReferences into submitted and finished paths * Add nested return IDs to UpdateFinishedTaskRefs, rename WrapObjectIds * Basic unit tests pass * Fix unit test and add an out-of-order regression test * Add stored_in_objects to ObjectReferenceCount, regression test now passes * Add an Address to the ReferenceCounter so we can determine ownership * Set the nested return IDs from the TaskManager * Add another test * Simplify * Update src/ray/core_worker/reference_count_test.cc Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Apply suggestions from code review Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * comments * Add python test Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>	2020-02-22 13:29:48 -08:00
Amog Kamsetty	1737a113be	[Parallel Iterators] Repartition functionality (#7163 ) * repartition and tests * blacklist lib/ files from import checks * addressing comments and splitting up tests * code readability * adding explicit ref for parent iterator * formatting	2020-02-21 13:20:18 -08:00
mehrdadn	c6f50ecc51	setpgrp fix (#7250 )	2020-02-21 13:15:11 -08:00
Edward Oakes	d190e73727	Use our own implementation of parallel_memcopy (#7254 )	2020-02-21 11:03:50 -08:00
Kai Yang	007333b960	[Java] Support direct call for normal tasks (#7193 )	2020-02-21 10:03:34 +08:00
Edward Oakes	6c80071a7d	Remove gc.collect() calls from reference counting tests (#7218 )	2020-02-20 10:51:02 -08:00
Edward Oakes	16e37416cd	Fix raylet pinning race condition (#7235 )	2020-02-20 10:41:36 -08:00
Siyuan (Ryans) Zhuang	0d210a99c3	Ensure deserialized numpy arrays are immutable (#7181 ) * ensure numpy arrays are immutable when deserialized from the memory buffer	2020-02-19 23:30:10 -08:00
Simon Mo	b804d40c04	Stop vendoring pyarrow (#7233 )	2020-02-19 19:01:26 -08:00
Siyuan (Ryans) Zhuang	48c06f5042	Enhance the serialization refcount test for dynamic classes (#7222 ) * enhance the test for dynamic classes	2020-02-19 18:35:35 -08:00
Simon Mo	7bef7031c2	Revert "Revert "Revert "Removing Pyarrow dependency (#7146 )" (#7209 ) (#7214 )" (#7232 )	2020-02-19 13:35:29 -08:00
Sven Mika	d537e9f0d8	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155 )	2020-02-19 12:18:45 -08:00
Simon Mo	e8941b1b79	Revert "Revert "Removing Pyarrow dependency (#7146 )" (#7209 ) (#7214 )	2020-02-19 10:08:52 -08:00
Stephanie Wang	f76ce836b2	Distributed ref counting for serialized ObjectIDs (#6945 ) * Skeleton plus a unit test for simple borrower case * First unit test passes - forward an ID and task returns with 1 submitted task pending on the inner ID * Invariant for contained_in * Unit test passes for testing task return without creating a borrower * Wrap ref count functionality in test case * Fix bad delete * Unit test and fix for borrowers creating more borrowers * Unit test and fix for simple borrowing, but owner sends call after borrower's ref count goes to 0 * Refactor: - keep a sentinel ref count for task argument IDs - keep contained_in_borrowed in addition to contained_in_owned * Unit test for nested IDs passes * Refactor so that an object ID can only be contained in 1 borrowed ID at a time * Add check * Fix * Unit test (passes) to test nesting object IDs but no borrowers created * Unit test for nested objects from different owners passes, refactor to unset contained_in when popping refs * Unit tests for borrowers receiving an ObjectID from multiple sources, skip adding ownership info if we already have it to handle duplicate refs * Unit test for returning object ID passes * More unit tests for returning object IDs pass * Add serialized ID tests * fix serialization issue * remove swap * It builds! 
* debugging and some fixes: - register handler for WaitForRefRemoved - don't create a python reference for arg IDs - pass in client factory into ReferenceCounter - fix bad decrement in PopBorrowerRefs * Fix accounting for serialized IDs: - don't decrement for IDs on dependency resolution, wait until task finished - add object IDs that were inlined when building the arguments to the task spec, pin these on the task executor until task finishes * mu_ -> mutex_ * lint * fix build * clear outer_object_id * add direct call type check * Fix test for direct call IDs and return IDs for actor calls * Fix CoreWorkerClient.Addr() * Remove unneeded lock * Remove unnecessary ObjectID refs * Fix worker holding serialized refs test * Fix hex IDs * fix * fix tests * fix tests * refactor and cleanups * lint * Put inlined Ids in task args and some cleanup * Add back gc.collect() line for test case * Refactor and fixes: - store inlined IDs in RayObject - allow storing objects with inlined IDs in memory store - pin objects that were promoted to plasma * oops * make sure worker ID is set in address, pass in rpc::Address to CoreWorkerClient * todos * cleanups and test builds * Fix tests * Add feature flag * cleanups * address comments and some cleanups * cleanup * fix recursive test * Comments for tests * Turn off ref counting by default * Skip tests * Fix some bugs for test_array.py, java build * Don't include nested objects in the ref count when the feature flag is off * C++ feature flag does not work... * Remove * Turn on python tests and add a warning when plasma objects are evicted before being pinned * Fix build and remove irrelevant test * Fix for java * Revert "Fix build and remove irrelevant test" This reverts commit 056cca9b263ed05b0f9ab2250907338edcbca2d5. 
* Fix ray.internal.free * Fixes and skip some flaky tests * fix java build * fix windows build * Add IDs contained in owned objects * Update src/ray/protobuf/core_worker.proto Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Update src/ray/core_worker/reference_count.cc Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Update src/ray/protobuf/core_worker.proto Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Update src/ray/protobuf/core_worker.proto Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Update src/ray/core_worker/reference_count.h Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Update src/ray/core_worker/reference_count.h Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Update src/ray/core_worker/reference_count.cc Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * Apply suggestions from code review Co-Authored-By: Edward Oakes <ed.nmi.oakes@gmail.com> * update * Try to fix ::test_direct_call_serialized_id_eviction Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>	2020-02-18 18:21:34 -08:00
mehrdadn	4a12243336	Use Process instead of pid_t (round 2) (#6882 ) * Revert "Revert "Use Boost.Process instead of pid_t (#6510)" (#6909)" This reverts commit `bde575b8dd`. * Process wrapper, using Boost.Process on Windows - Reverts `bde575b8dd`. - Re-applies `fb8e3615d5` after some refactoring. * Remove Boost.Process dependency * Don't open /proc file on Linux * Change FATAL to ERROR and modify error message when process doesn't exist	2020-02-18 17:44:46 -08:00
Eric Liang	0aa9373d62	Revert "Removing Pyarrow dependency (#7146 )" (#7209 ) This reverts commit `2116fd3bca`.	2020-02-18 14:12:06 -08:00
Eric Liang	5df801605e	Add ray.util package and move libraries from experimental (#7100 )	2020-02-18 13:43:19 -08:00
ijrsvt	2116fd3bca	Removing Pyarrow dependency (#7146 )	2020-02-17 18:00:13 -08:00
mehrdadn	3bd82d0bcd	Fix various issues/warnings that come up on Jenkins (#7147 ) * Avoid warning about swap being unlimited Currently we get the following message on Jenkins: "Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap." Since we're not limiting swap anyway, we might as well avoid trying to. https://docs.docker.com/config/containers/resource_constraints/#--memory-swap-details * Fix escaping in re.search() * Fix escaping in _noisy_layer() * Raise a more descriptive error when dashboard data isn't found * Don't error on dashboard files not being found when webui isn't required * Change dashboard error to a warning instead	2020-02-17 16:08:55 -08:00

2234 commits