mirror of https://github.com/vale981/ray synced 2025-03-17 00:26:38 -04:00
Commit graph

7425 commits

Author SHA1 Message Date
Yu Kobayashi
d2d66c576e Support non ascii characters in the source code () 2019-02-16 11:45:44 +08:00
Hao Chen
de17443dc2 Propagate backend error to worker () 2019-02-16 11:39:15 +08:00
William Ma
4be3d0c5d3 Update shipped modin to 0.3.1 () 2019-02-15 15:49:38 -08:00
Robert Nishihara
2d07df7f3f Replace '__main__' with "__main__". () 2019-02-15 13:32:43 -08:00
Robert Nishihara
5f71751891 API cleanups. Remove worker argument. Remove some deprecated arguments. ()
* Remove worker argument from API methods.

* Remove deprecated arguments and deprecate redirect_output and redirect_worker_output.

* Fix
2019-02-15 10:49:16 -08:00
Hao Chen
042ad84573 Simplify Cython ID types and fix bug of ActorCheckpointID () 2019-02-15 20:15:16 +08:00
Richard Liaw
bb7c4ce9c4 [tune] Improve error message when Ray crashes () 2019-02-15 01:04:17 -08:00
Richard Liaw
7cf62a10cd [tune] Fix TF checkpointing example ()
Closes , closes .
2019-02-15 00:30:27 -08:00
Stephanie Wang
3684e5bc0d Fix memory leak in Redis by using auto memory management ()
* Table appends should always succeed

* Use Redis auto memory management

* Remove unneeded namespace
2019-02-14 19:51:18 -08:00
Eric Liang
0c0bd4d41c [rllib] Use model.value_function() in MARWIL ()
* fix marwil

* add ph

* fix
2019-02-14 19:35:21 -08:00
William Ma
8ee53297b1 Add documentation on how to use debug tools () 2019-02-14 13:50:21 -08:00
Philipp Moritz
077ffd99bf Bump version from 0.6.3 to 0.7.0.dev0 in docs and .yaml () 2019-02-14 12:08:48 -08:00
Yuhong Guo
4b0db437ee Linting Bazel scripts ()
* Use buildifier as bazel script linter

* Checkout golang version in travis

* Using golang-1.8-go in travis

* Add golang apt-repository

* Fix the bazel lint failure example.

* Address comment
2019-02-14 22:16:19 +08:00
Philipp Moritz
810cc17062 Fix LRU eviction of client notification datastructure ()
* convert notification_key map to C++ datastructure

* fix crash and add debug string

* clean notification map up (this was a bug before)

* remove checks

* add jenkins test

* linting

* fixes

* properly erase

* clean up

* linting

* Update test_wait_hanging.py

* Update run_multi_node_tests.sh

* increase redis_max_memory

* fix dat jenkins

* update

* Update run_multi_node_tests.sh
2019-02-13 22:20:27 -08:00
Stephanie Wang
fd5b58a827 Increase timeout for object manager valgrind tests ()
* Avoid second copy of data for inlined objects

* Increase Wait timeout for valgrind tests

* Run object manager tests with and without inlined objects

* Fix test
2019-02-13 18:29:03 -08:00
Wang Qing
1fb56a4316 Remove deprecated module () 2019-02-14 10:04:09 +08:00
Si-Yuan
2de31eb489 minor fix () 2019-02-13 17:22:45 -08:00
Eric Liang
2dccf383dd [rllib] Basic infrastructure for off-policy estimation (IS, WIS) () 2019-02-13 16:25:05 -08:00
Kristian Hartikainen
729d0b2825 [autoscaler] docker run options ()
Adds support for docker options, allowing for use of nvidia-docker.

Closes .
2019-02-13 12:26:28 -08:00
Stephanie Wang
4347ab644e Use Redis lists in the GCS instead of zset ()
* Convert zset to list

* Remove object evictions map from the object directory, yay

* comments

* Fix tests
2019-02-13 10:32:57 -08:00
bjg2
0e37ac6d1d [wingman -> rllib] Remote and entangled environments ()
* added all our environment changes

* fixed merge request comments and remote env

* fixed remote check

* moved remote_worker_envs to correct config section

* lint

* auto wrap impl

* fix

* fixed the tests
2019-02-13 10:08:26 -08:00
Philipp Moritz
b3f72e8a75 Add regression tests for dataclass serialization () 2019-02-13 09:07:03 -08:00
Hao Chen
f31a79f3f7 Implement actor checkpointing ()
* Implement Actor checkpointing

* docs

* fix

* fix

* fix

* move restore-from-checkpoint to HandleActorStateTransition

* Revert "move restore-from-checkpoint to HandleActorStateTransition"

This reverts commit 9aa4447c1e3e321f42a1d895d72f17098b72de12.

* resubmit waiting tasks when actor frontier restored

* add doc about num_actor_checkpoints_to_keep=1

* add num_actor_checkpoints_to_keep to Cython

* add checkpoint_expired api

* check if actor class is abstract

* change checkpoint_ids to long string

* implement java

* Refactor to delay actor creation publish until checkpoint is resumed

* debug, lint

* Erase from checkpoints to restore if task fails

* fix lint

* update comments

* avoid duplicated actor notification log

* fix unintended change

* add actor_id to checkpoint_expired

* small java updates

* make checkpoint info per actor

* lint

* Remove logging

* Remove old actor checkpointing Python code, move new checkpointing code to FunctionActionManager

* Replace old actor checkpointing tests

* Fix test and lint

* address comments

* consolidate kill_actor

* Remove __ray_checkpoint__

* fix non-ascii char

* Loosen test checks

* fix java

* fix sphinx-build
2019-02-13 19:39:02 +08:00
Andrew Tan
57dcd3033e [tune] Trial reporter fix ()
Fixes .
2019-02-13 01:03:54 -08:00
Wang Qing
3a7fb182cc Change the num of parallel jobs when building 2019-02-13 00:33:05 -08:00
William Ma
e1a479b137 Add teardown_module to test_queue.py () 2019-02-12 22:43:09 -08:00
Si-Yuan
21472b890a Integrate "tempfile_service" into "ray.node.Node" () 2019-02-12 17:34:04 -08:00
Adi Zimmerman
dac1969647 [tune] Add Nevergrad to Tune () 2019-02-12 11:00:04 -08:00
Wang Qing
c523bc04ad Enable redis password in Java worker ()
* Support Java redis password

* Fix

* Refine

* Fix lint.
2019-02-12 13:11:25 +08:00
Adi Zimmerman
9797028a91 [tune] Add scikit-optimize to Tune () 2019-02-11 17:06:02 -08:00
Eric Liang
8df772867c [rllib] rename compute_apply to learn_on_batch 2019-02-11 15:22:15 -08:00
Eric Liang
c4182463f6 [rllib] Add helper to iterate over envs in a vectorized environment ()
* add foreach env func

* fix

* add test
2019-02-11 10:40:47 -08:00
Daniel Edgecumbe
a70ae1687b .gitignore: Add Vim swap files () 2019-02-11 10:27:10 -08:00
Ion
3c32343c63 Ray signal () 2019-02-11 10:14:48 -08:00
ebrevdo
52dfde1cbb Update flatbuffer bazel rule to work with flatbuffer master branch. () 2019-02-11 10:00:06 -08:00
Zhijun Fu
7097ba393b protect raylet against bad messages ()
* protect raylet against bad messages

* address comments

* linting and regression test
2019-02-12 00:39:38 +08:00
Wang Qing
bc438ca73b [Java] Refine Java config item ()
* Refine

* Address comment.
2019-02-11 23:55:40 +08:00
Philipp Moritz
ab809bd927 update ray version to 0.7.0dev () 2019-02-10 19:56:42 -08:00
Eric Liang
8e9f2c923f [autoscaler] Use RLock in addition to FileLock 2019-02-10 19:16:43 -08:00
Yuhong Guo
5fb1efd60d Fix CI test failures () 2019-02-11 11:01:14 +08:00
bjg2
e703b9f49d [wingman -> rllib] Improved stats changes in AsyncSamplesOptimizer ()
* added stats changes to optimizer

* changes timers

* fix python 2 compat

* improved optimizer throughput stats

* Update async_samples_optimizer.py

* fix python2 compat
2019-02-10 01:25:22 -08:00
Yuhong Guo
3a66d47a3a Remove RAY_CHECK from JNI code ()
* Remove RAY_CHECK in JNI

* Try to add mvn test to test the exception.

* Refine

* Address comments
2019-02-09 18:10:22 +08:00
bibabolynn
728031a972 [java] when put an object in plasma store, ignore "object already exists" exception ()
* distinct plasma client exception

* Update ObjectStoreProxy.java

* Update and rename PlasmaArrowTest.java to PlasmaStoreTest.java

* store put

* Use testng to replace junit to fix test failure
2019-02-09 18:03:17 +08:00
Eric Liang
29322c7389 [rllib] Replay buffer for IMPALA should default to 0 slots. ()
* disable replay

* make lq configurable

* leak test

* Update run_multi_node_tests.sh
2019-02-08 10:03:11 -08:00
Robert Nishihara
6a32b410bb Update versions from 0.6.2 -> 0.6.3 in the documentation. () 2019-02-07 20:57:37 -08:00
Robert Nishihara
ef527f84ab Stream logs to driver by default. ()
* Stream logs to driver by default.

* Fix from rebase

* Redirect raylet output independently of worker output.

* Fix.

* Create redis client with services.create_redis_client.

* Suppress Redis connection error at exit.

* Remove thread_safe_client from redis.

* Shutdown driver threads in ray.shutdown().

* Add warning for too many log messages.

* Only stop threads if worker is connected.

* Only stop threads if they exist.

* Remove unnecessary try/excepts.

* Fix

* Only add new logging handler once.

* Increase timeout.

* Fix tempfile test.

* Fix logging in cluster_utils.

* Revert "Increase timeout."

This reverts commit b3846b89040bcd8e583b2e18cb513cb040e71d95.

* Retry longer when connecting to plasma store from node manager and object manager.

* Close pubsub channels to avoid leaking file descriptors.

* Limit log monitor open files to 200.

* Increase plasma connect retries.

* Add comment.
2019-02-07 19:53:50 -08:00
Philipp Moritz
0aa74fb1fd Update cloudpickle to 0.8.0.dev0 () 2019-02-07 15:24:06 -08:00
Eric Liang
ae4bc7d6e8 [revert] [rllib] Add copy() in async samples optimizer 2019-02-07 14:14:39 -08:00
markgoodhead
5ce670cb36 [tune] Add Initial Parameter Suggestion for HyperOpt ()
Allows users of the HyperOptSearch suggestion algorithm to specify initial experiment values to run (typically already known good baseline parameters within the domain specified)
2019-02-07 10:57:51 -08:00
Ion
f987572795 Inline objects ()
* added store_client_ to object_manager and node_manager

* half through...

* all code in, and compiling! Nothing tested though...

* something is working ;-)

* added a few more comments

* now, add only one entry to the in GCS for inlined objects

* more comments

* remove a spurious todo

* some comment updates

* add test

* added support for meta data for inline objects

* avoid some copies

* Initialize plasma client in tests

* Better comments. Enable configuring inline_object_max_size_bytes.

* Update src/ray/object_manager/object_manager.cc

Co-Authored-By: istoica <istoica@cs.berkeley.edu>

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: istoica <istoica@cs.berkeley.edu>

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: istoica <istoica@cs.berkeley.edu>

* fixed comments

* fixed various typos in comments

* updated comments in object_manager.h and object_manager.cc

* addressed all comments...hopefully ;-)

* Only add eviction entries for objects that are not inlined

* fixed a bunch of comments

* Fix test

* Fix object transfer dump test

* lint

* Comments

* Fix test?

* Fix test?

* lint

* fix build

* Fix build

* lint

* Use const ref

* Fixes, don't let object manager hang

* Increase object transfer retry time for travis?

* Fix test

* Fix test?

* Add internal config to java, fix PlasmaFreeTest
2019-02-07 10:32:39 -08:00