hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Philipp Moritz	869ee8e25d	Integrate plasma store list facility (#2752 )	2018-09-01 16:53:51 -07:00
Alexey Tumanov	fdc9688226	[xray] push warning to driver for infeasible tasks (#2784 ) This PR pushes a warning to the user for infeasible tasks to alert them to the fact that they can't currently be executed. Fixes #2780.	2018-09-01 13:21:27 -07:00
Philipp Moritz	4db196438b	fix 'from ray.rllib import ppo' in doc (#2794 )	2018-08-31 23:34:47 -07:00
Yucong He	5b45f0bdff	[xray] Implementing Gcs sharding (#2409 ) Basically a re-implementation of #2281, with modifications of #2298 (A fix of #2334, for rebasing issues.). [+] Implement sharding for gcs tables. [+] Keep ClientTable and ErrorTable managed by the primary_shard. TaskTable is managed by the primary_shard for now, until a good hashing for tasks is implemented. [+] Move AsyncGcsClient's initialization into Connect function. [-] Move GetRedisShard and bool sharding from RedisContext's connect into AsyncGcsClient. This may make the interface cleaner.	2018-08-31 15:54:30 -07:00
Robert Nishihara	eda6ebb87d	Convert some unittests to pytest. (#2779 ) * Convert multi_node_test.py to pytest. * Convert array_test.py to pytest. * Convert failure_test.py to pytest. * Convert microbenchmarks to pytest. * Convert component_failures_test.py to pytest and some minor quotes changes. * Convert tensorflow_test.py to pytest. * Convert actor_test.py to pytest. * Fix. * Fix	2018-08-31 11:24:15 -07:00
wangyiguang	3813ae34b3	[tune] Add AutoMLBoard: Monitoring UI (experimental) (#2574 )	2018-08-31 00:26:44 -07:00
Ryan Sepassi	b6260003cb	Some small changes (#2782 ) * Add some imports that make it easier to build with Bazel * Use "/tmp" paths for sockets in tests * Move `asio_test` into `run_gcs_tests.sh` instead of starting and stopping Redis within the test fixture with a `system` call.	2018-08-30 22:42:49 -07:00
Richard Liaw	0347e6418b	[tune] Add PyTorch MNIST Example + Misc. Tweaks (#2708 )	2018-08-30 16:18:56 -07:00
Robert Nishihara	224d38cbb2	Name Python threads. (#2767 )	2018-08-30 11:08:24 -07:00
Robert Nishihara	32f7d6fcf5	Add back some tests for xray. (#2772 )	2018-08-30 11:07:23 -07:00
Yuhong Guo	9f06c19edd	Fix glog wheel failure on MacOS (#2775 )	2018-08-30 09:06:19 -07:00
Wang Qing	514633456b	[Java] Fix out-dated signatures of JNI methods (#2756 ) 1) Renamed the native JNI methods and some parameters of JNI methods. 2) Fixed native JNI methods' signatures by `javah` tool. 3) Removed some useless native methods.	2018-08-30 17:59:29 +08:00
Robert Nishihara	ba7efafa67	Remove force_start argument from StartWorkerProcess. (#2762 ) This removes the force_start argument from StartWorkerProcess in the worker pool so that no more than maximum_startup_concurrency are ever started concurrently. In particular, when the raylet starts up, it my start fewer than num_workers workers.	2018-08-30 13:43:47 +08:00
Robert Nishihara	5021795190	Update documents to replace 0.5.0 with 0.5.2. (#2761 ) * Update documents to replace 0.5.0 with 0.5.1. * Update documentation from 0.5.1 -> 0.5.2.	2018-08-29 21:05:09 -07:00
Robert Nishihara	f4f3478b45	Bump version number to 0.5.2. (#2765 )	2018-08-29 13:39:25 -07:00
Praveen Palanisamy	357c0d6156	[tune] Adds option to checkpoint at end of trials (#2754 ) * Added checkpoint_at_end option. To fix #2740 * Added ability to checkpoint at the end of trials if the option is set to True * checkpoint_at_end option added; Consistent with Experience and Trial runner * checkpoint_at_end option mentioned in the tune usage guide * Moved the redundant checkpoint criteria check out of the if-elif * Added note that checkpoint_at_end is enabled only when checkpoint_freq is not 0 * Added test case for checkpoint_at_end * Made checkpoint_at_end have an effect regardless of checkpoint_freq * Removed comment from the test case * Fixed the indentation * Fixed pep8 E231 * Handled cases when trainable does not have _save implemented * Constrained test case to a particular exp using the MockAgent * Revert "Constrained test case to a particular exp using the MockAgent" This reverts commit e965a9358ec7859b99a3aabb681286d6ba3c3906. * Revert "Handled cases when trainable does not have _save implemented" This reverts commit 0f5382f996ff0cbf3d054742db866c33494d173a. * Simpler test case for checkpoint_at_end * Preserved bools from loosing their actual value * Revert "Moved the redundant checkpoint criteria check out of the if-elif" This reverts commit 783005122902240b0ee177e9e206e397356af9c5. * Fix linting error.	2018-08-29 13:14:17 -07:00
Robert Nishihara	6edbbf4fbf	Document the release process. (#2760 )	2018-08-29 00:06:33 -07:00
Robert Nishihara	132f133214	Limit number of concurrent workers started by hardware concurrency. (#2753 ) * Limit number of concurrent workers started by hardware concurrency. * Check if std:🧵:hardware_concurrency() returns 0. * Pass in max concurrency from Python. * Fix Java call to startRaylet. * Fix typo * Remove unnecessary cast. * Fix linting. * Cleanups on Java side. * Comment back in actor test. * Require maximum_startup_concurrency to be at least 1. * Fix linting and test. * Improve documentation. * Fix typo.	2018-08-29 14:53:40 +08:00
Mitar	3850e3ba64	Added extra logging related arguments to "ray start" (#2664 )	2018-08-28 23:00:37 -07:00
Eric Liang	69d1354016	[rllib] Document ARS & rainbow (#2744 ) * wip * rainbow doc too * e not used * fix ppo doc * clean list * use same title	2018-08-28 18:13:36 -07:00
Robert Nishihara	6e1de19cc2	Bump version to 0.5.1. (#2755 )	2018-08-28 16:52:17 -07:00
Robert Nishihara	b7722897b4	Deprecate 'driver_mode' argument. (#2758 ) * Deprecate 'driver_mode' argument. * Fix * Fix	2018-08-28 16:45:49 -07:00
Alexey Tumanov	de047daea7	[xray] raylet scheduling mechanism with a simple spillback policy (#2749 ) ## What do these changes do? * distribute load and resource information on a heartbeat * for each raylet, maintain total and available resource capacity as well as measure of current load * this PR introduces a new notion of load, defined as a sum of all resource demand induced by queued ready tasks on the local raylet. This provides a heterogeneity-aware measure of load that supersedes legacy Ray's task count as a proxy for load. * modify the scheduling policy to perform capacity-based, load-aware, optimistically concurrent resource allocation * perform task spillover to the heartbeating node in response to a heartbeat, implementing heterogeneity-aware late-binding/work-stealing.	2018-08-28 00:03:34 -07:00
adoda	90ae8f11df	The function get_node_ip_address while catch an exception and return … (#2722 ) …'127.0.0.1', when we forbid the external network. Instead of we can get ip address from hostname. The function get_node_ip_address while catch an exception and return '127.0.0.1' when we forbid the external network. Instead of we can get ip address from hostname. https://github.com/ray-project/ray/issues/2721	2018-08-27 22:24:49 -07:00
Wang Qing	b4cba9a49f	[java] Fix the logic of generating TaskID (#2747 ) ## What do these changes do? Because the logic of generating `TaskID` in java is different from python's, there are many tests fail when we change the `Ray Core` code. In this change, I rewrote the logic of generating `TaskID` in java which is the same as the python's. In java, we call the native method `_generateTaskId()` to generate a `TaskID` which is also used in python. We change `computePutId()`'s logic too. ## Related issue number [#2608](https://github.com/ray-project/ray/issues/2608)	2018-08-27 13:11:33 -07:00
Hao Chen	f37c260bdb	[multi-language part 3] support multiple languages in raylet backend (#2672 ) This PR enables multi-language support in the raylet backend. - `Worker` class now has a `language` label; - `WorkerPool`: - It now maintains one set of states for each language. - `PopWorker` function's parameter type is changed to `TaskSpecification`, and it will choose a worker to pop based on both task's language and actor id. - `Size` and `StartWorkerProcess` functions now have an extra `language` parameter. - `RegisterClientRequest` message now has an extra `language` field in raylet mode, which tells the node manager which language the worker is.	2018-08-26 22:06:25 -07:00
Yuhong Guo	0b6e08ebee	Separate python logger module-wise (#2703 ) ## What do these changes do? 1. Separate the log related code to logger.py from services.py. 2. Allow users to modify logging formatter in `ray start`. ## Related issue number https://github.com/ray-project/ray/pull/2664	2018-08-26 13:46:14 -07:00
Wang Qing	26d3c0655c	[java] Improve UniqueID code. (#2723 )	2018-08-26 12:32:57 -07:00
Hao Chen	4f4bea086a	[java] Remove multi-return API (#2724 )	2018-08-26 00:04:54 -07:00
Richard Liaw	dbba7f2a53	[autoscaler] Cleanup Logging (#2709 ) Moves Autoscaler onto Python `logging` module.	2018-08-25 17:08:45 -07:00
Jones Wong	982cde664f	[rllib] Add noisy network and distributional Q-learning to implement Rainbow (#2737 ) * add noisy network * distributional q-learning in dev * add distributional q-learning * validated rainbow module * add some comments * supply some comments * remove redundant argument to pass CI test * async replay optimizer does NOT need annealing beta * ignore rainbow specific arguments for DDPG and Apex * formatted by yapf * Update dqn_policy_graph.py * Update dqn_policy_graph.py	2018-08-25 14:17:14 -07:00
eugenevinitsky	6201a6d1c7	[rllib] add augmented random search (#2714 ) * added ars * functioning ars with regression test * added regression tests for ARs * fixed default config for ARS * ARS code runs, now time to test * ARS working and tested, changed std deviation of meanstd filter to initialize to 1 * ARS working and tested, changed std deviation of meanstd filter to initialize to 1 * pep8 fixes * removed unused linear model * address comments * more fixing comments * post yapf * fixed support failure * Update LICENSE * Update policies.py * Update test_supported_spaces.py * Update policies.py * Update LICENSE * Update test_supported_spaces.py * Update policies.py * Update policies.py * Update filter.py	2018-08-24 22:20:02 -07:00
Robert Nishihara	5fd44afb8a	Add note about huge pages using up memory. (#2733 ) * Add note about huge pages using up memory. * Update doc * Update	2018-08-24 17:02:54 -07:00
Yuhong Guo	697bfb14db	Hotfix for glog PR (#2734 )	2018-08-24 16:30:51 -07:00
Michael Tu	d16b6f6a32	[tune] Rename 'repeat' to 'num_samples' (#2698 ) Deprecates the `repeat` argument and introduces `num_samples`. Also updates docs accordingly.	2018-08-24 15:05:24 -07:00
Eric Liang	bcab5bcd02	fix it (#2735 )	2018-08-24 15:01:12 -07:00
Philipp Moritz	b4c47a5861	Upgrade arrow to include more detailed flushing message (#2706 )	2018-08-24 11:44:04 -07:00
Robert Nishihara	e467f546b5	Upgrade version of anaconda. (#2730 )	2018-08-23 19:14:39 -07:00
Eric Liang	aa014af85b	[rllib] Fix atari reward calculations, add LR annealing, explained var stat for A2C / impala (#2700 ) Changes needed to reproduce Atari plots in IMPALA / A2C: https://github.com/ray-project/rl-experiments	2018-08-23 17:49:10 -07:00
Stephanie Wang	1b3de31ff1	[xray] Fix bug where driver task ID is assumed to be nil (#2725 ) ## What do these changes do? #2362 left a bug where it assumed that the driver task ID was nil. This fixes the bug to check the `SchedulingQueue` for any driver task IDs instead.	2018-08-23 14:44:47 -07:00
Yuhong Guo	344a83f327	Fix build failure of Arrow and Parquet when the folder is empty. (#2720 )	2018-08-23 09:44:26 -07:00
Yuhong Guo	eec1a3eb89	Support pluggable backend log lib with glog (#2695 ) * [WIP] Support different backend log lib * Refine code, unify level, address comment * Address comment and change formatter * Fix linux building failure. * Fix lint * Remove log4cplus. * Add log init to raylet main and add test to travis. * Address comment and refine. * Update logging_test.cc	2018-08-23 09:43:38 -07:00
old-bear	4be324efc3	[tune] Support infinity value in report result (#2693 ) * + Compatibility fix under py2 on ray.tune * + Revert changes on master branch * + Use default JsonEncoder in ray.tune.logger * + Add UT for infinity support	2018-08-22 13:09:14 -07:00
joyyoj	38867eea4e	[tune] Cross-Framework Compatibility (#2646 ) This commit is a first pass at restructuring the Trial execution logic to support running on multiple frameworks.	2018-08-22 10:55:45 -07:00
Eric Liang	fbe6c59f72	[rllib] Misc fixes, A2C (#2679 ) A bunch of minor rllib fixes: pull in latest baselines atari wrapper changes (and use deepmind wrapper by default) move reward clipping to policy evaluator add a2c variant of a3c reduce vision network fc layer size to 256 units switch to 84x84 images doc tweaks print timesteps in tune status	2018-08-20 15:28:03 -07:00
Yucong He	880ef1bd21	doc fix (#2696 )	2018-08-20 14:11:32 -07:00
Robert Nishihara	89d4a6df93	Start Redis in protected mode when started via ray.init(). (#2697 ) This PR makes it so that when Ray is started via ray.init() (as opposed to via ray start) the Redis servers will be started in "protected mode" (which means that clients can only connect by connecting to localhost). In practice, we actually connect redis clients by passing in the node IP address (not localhost), so I need to create a redis config file on the fly to allow both localhost and the node's actual IP address (it would have been nice to find a way to do this from the Python redis client, but I couldn't find one).	2018-08-20 14:08:01 -07:00
Stephanie Wang	8fd5757aaa	[xray] Don't process any more messages from dead node managers (#2688 )	2018-08-19 21:11:40 -07:00
old-bear	230ac7aa80	[tune] Compatibility fix under py2 on str condition (#2673 ) * * Compatibility fix under py2 on ray.tune * + Fix compatibility * + Use package six to achieve str compatibility	2018-08-19 20:43:03 -07:00
Eric Liang	9473da69bd	[autoscaler] Experimental support for local / on-prem clusters (#2678 ) This adds some experimental (undocumented) support for launching Ray on existing nodes. You have to provide the head ip, and the list of worker ips. There are also a couple additional utils added for rsyncing files and port-forward.	2018-08-19 12:43:04 -07:00

1 2 3 4 5 ...

1967 commits