mirror of https://github.com/vale981/ray synced 2025-03-16 08:06:38 -04:00
Commit graph

12577 commits

Author SHA1 Message Date
Yuhong Guo
1b98fb8238 Fix Jenkins test failures and function descriptor bug. ()
## What do these changes do?
1. Fix the Jenkins test failure by adding the driver ID to the actor GCS key.
2. Move `object_manager_test.py` from Jenkins to Travis.
2018-12-25 23:31:44 -08:00
Wang Qing
a971b73bbe [Java] Fix the issue when waiting an empty list or a null pointer () 2018-12-26 11:29:29 +08:00
Hao Chen
f4011754d6 Fix: ServerConnection should be closed before being removed ()
Otherwise, in the event of a remote raylet crashing, the connection might be held by boost asio forever, and the pending callbacks will never get invoked. See also .
2018-12-25 11:01:53 -08:00
Robert Nishihara
5426234cd8 Update documentation to reflect 0.6.1 release. () 2018-12-24 11:10:04 -08:00
Robert Nishihara
1e8cdb5421 Update release documentation. ()
* Update release instructions.

* Add note about wheels.

* Fix

* Update

* update example

* Update RELEASE_PROCESS.rst
2018-12-24 11:09:09 -08:00
nam-cern
3d8f56409b Ensure numpy is at least 1.10.4 in setup.py ()
In the build script, numpy is pinned to 1.10.4. `setup.py` should enforce the same minimum version.
2018-12-24 11:01:25 -08:00
Eric Liang
9f63119a83 [rllib] Allow development without needing to compile Ray ()
* wip

* lint

* wip

* wip

* rename

* wip

* Cleaner handling of cli prompt
2018-12-24 18:08:23 +09:00
Devin Petersohn
c13b2685f5 [modin] Append to path to avoid namespace collision on development branches () 2018-12-23 23:58:56 -08:00
Si-Yuan
a1995ff3b0 Resize logo in README. () 2018-12-23 22:59:23 -08:00
Alexey Tumanov
9b8d7573fe bump version from 0.6.0 to 0.6.1 () 2018-12-23 17:03:42 -08:00
Robert Nishihara
bb7ca3bae7 Upgrade flatbuffers version to 1.10.0. ()
* Upgrade flatbuffers version to 1.10.0.

* Temporarily change ray.utils.decode for backwards compatibility.
2018-12-23 14:56:34 -08:00
Robert Nishihara
ddd4c842f1 Initialize some variables in constructor instead of header file. ()
* Initialize some variables in constructor instead of header file
2018-12-23 02:44:23 -08:00
Alexey Tumanov
bada42c334 object store notification mgr: fix using uninitialized variables ()
Initialize private class variables to avoid valgrind errors. They are used before initialization.
2018-12-22 19:51:22 -08:00
Philipp Moritz
e578a38116 Fix TensorFlow and PyTorch compatibility ()
* remove tensorflow workaround
* update docker
* add boost threads
* add date_time, too
* change link order
* cosmetics
2018-12-22 13:25:48 -08:00
Tianming Xu
deb26b954e [rllib] Export tensorflow model of policy graph ()
* Export tensorflow model of policy graph

* Add tests,examples,pydocs and infer extra signatures from existing methods

* Add example usage in export_policy_model comment

* Fix lint error

* Fix lint error

* Fix lint error
2018-12-22 17:35:25 +09:00
Wang Qing
8393df2516 Use BaseTest instead of TestListener. () 2018-12-21 16:29:16 -08:00
Eric Liang
ddc97864df [rllib] Add requested clarifications to test requirement of contrib docs () 2018-12-21 11:02:02 -08:00
Alexey Tumanov
6b179cb8a7 change the order of allocation for io_service and gcs client in raylet main () 2018-12-21 00:13:28 -08:00
bibabolynn
e65b8f18f4 [java] change RayLog.core to org.slf4j.Logger () 2018-12-21 15:58:32 +08:00
Richard Liaw
e046a5c767 [tune] resources_per_trial from trial_resources ()
Renaming variable due to user errors.
2018-12-20 19:00:47 -08:00
Devin Petersohn
a174a46e02 Allowing multiple users to access the /tmp/ray file at the same time ()
* Allowing multiple users to access the /tmp/ray file at the same time

Previous sequence that caused this issue:
* User A starts ray with `ray.init` when /tmp/ray does not exist
* User B starts ray with `ray.init` and /tmp/ray now exists

User B will get a permissions error
Checking the permissions, /tmp/ray is 700

I have identified a race condition in `try_to_create_directory`
* Multiple processes try to create /tmp/ray at the same time
* chmod is either silently erroring or working properly within the race condition

Resolution: Move chmod outside of the check for whether the directory exists or not.

* Adding try except for users who do not own the directory
2018-12-20 18:46:54 -08:00
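
The resolution described in a174a46e02 can be sketched as follows. This is a minimal illustration, not the code from the commit: the function body, the `0o777` mode, and the set of tolerated error codes are assumptions made for the example. The point it shows is that `chmod` runs unconditionally, so a process that loses the creation race still applies the permissions, and a user who does not own the directory is not stopped by the resulting permission error.

```python
import errno
import os
import tempfile


def try_to_create_directory(directory_path, mode=0o777):
    """Create directory_path if needed, then always attempt the chmod.

    The mode default is an assumption chosen for this sketch.
    """
    try:
        os.makedirs(directory_path)
    except OSError as e:
        # Another process may have created the directory first; that is fine.
        if e.errno != errno.EEXIST:
            raise
    # chmod is outside the existence check, so it runs whether or not
    # this process was the one that created the directory.
    try:
        os.chmod(directory_path, mode)
    except OSError as e:
        # Users who do not own the directory cannot change its mode.
        if e.errno not in (errno.EPERM, errno.EACCES):
            raise


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    path = os.path.join(base, "ray")
    try_to_create_directory(path)
    try_to_create_directory(path)  # second call is a no-op apart from chmod
    print(oct(os.stat(path).st_mode & 0o777))
```

Calling the function twice mirrors the two-user sequence above: the second caller finds the directory already present and only re-applies the mode.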
Stephanie Wang
34bab6291c Cleanup actor handle pickling code ()
* Cleanup actor handle pickling code

* remove unused

* fix

* lint
2018-12-20 16:37:21 -08:00
Eric Liang
6bb1103930 [rllib] Avoid sample wastage with bad PPO configurations ()
## What do these changes do?

Previously we logged a warning if the PPO configuration would waste many samples. However, this didn't apply in the case of long episodes in `complete_episodes` batch mode, and also the amount of waste is up to 2x in common cases.

This PR:
- Estimates the number of sampling tasks needed to avoid over-sampling.
- Collects all sample results and never discards any. In principle this can degrade performance at large scale if certain machines are slower, so a config flag is added to restore the legacy behavior.

## Related issue number

Closes: https://github.com/ray-project/ray/issues/3549
2018-12-20 10:50:44 -08:00
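
As a rough illustration of the two points in 6bb1103930, the sketch below estimates how many fixed-size sampling tasks cover one training batch and then keeps every returned sample. The function names and the parameters `train_batch_size` and `sample_batch_size` are hypothetical and chosen for the example; the commit's actual estimate may differ, and this sketch does not model variable-length batches such as those produced in `complete_episodes` mode.

```python
def estimate_num_sample_tasks(train_batch_size, sample_batch_size):
    """Smallest number of fixed-size tasks whose output covers one train batch.

    Both parameter names are hypothetical, not taken from the commit.
    """
    if train_batch_size <= 0 or sample_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (train_batch_size + sample_batch_size - 1) // sample_batch_size


def collect_all(sample_batches):
    """Concatenate every returned batch; nothing is discarded."""
    collected = []
    for batch in sample_batches:
        collected.extend(batch)
    return collected


if __name__ == "__main__":
    tasks = estimate_num_sample_tasks(4000, 300)
    batches = [[0] * 300 for _ in range(tasks)]
    print(tasks, len(collect_all(batches)))
```

With a training batch of 4000 and tasks of 300 samples each, 14 tasks are launched and all 4200 samples are used, rather than launching extra tasks and dropping the surplus.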
Richard Liaw
ac48a58e4e [tune] Reduce scope of variant generator ()
This PR provides a better error message when the generate_variants code
breaks. Also removes a comment about nesting dependencies.

This comes mainly as a hotfix solution for . We should leave that issue open for future contribution 🙂
2018-12-20 10:48:28 -08:00
Eric Liang
303883a3b6 [rllib] [rfc] add contrib module and guideline for merging ()
This adds guidelines for merging code into `rllib/contrib` vs `rllib/agents`. Also, clean up the agent import code to make registration easier.
2018-12-20 10:44:34 -08:00
adoda
cf0c4745f4 [rllib] Support running older versions of TensorFlow (version < 1.5.0) () 2018-12-19 20:27:24 -08:00
Robert Nishihara
a5309bec7c Make README render properly on PyPI. ()
* Make README render properly in pypi.

* Add small logo

* temporary fix

* smaller image

* Remove image size.

* Add author and email to setup.py.
2018-12-19 18:41:09 -08:00
Hao Chen
132a23354e Fix pending callback not called when ServerConnection destructs () 2018-12-19 17:29:36 -08:00
Eric Liang
ffa6ee3ec8 [rllib] streaming minibatching for IMPALA ()
* mb impala

* fix

* paropt

* update

* cpu warn

* on cpu

* fix mb

* doc

* docs

* comment

* larger num

* early release

* remove grad clip

* only check loader count in multi gpu mode

* revert bad multigpu changes

* num sgd iter

* comment

* reuse optimizer

* add test

* par load test

* loosen test

* Update run_multi_node_tests.sh

* fix local mode

* Update agent.py
2018-12-19 02:23:29 -08:00
Alexey Tumanov
c4cba98c75 Remove deprecation warnings when running actor tests ()
* remove deprecation warnings when running actor tests

* replacing logger.warn with logger.warning

* Update worker.py

* Update policy_client.py

* Update compression.py
2018-12-18 17:04:51 -08:00
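
The `logger.warn` to `logger.warning` replacement in c4cba98c75 reflects that, in Python's standard `logging` module, `Logger.warn` is a deprecated alias of `Logger.warning`. A minimal sketch of the preferred call, using a hypothetical logger name and a small in-memory handler added only so the result can be inspected:

```python
import logging


class ListHandler(logging.Handler):
    """Keeps emitted records in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


# "example" is a hypothetical logger name used only for this sketch.
logger = logging.getLogger("example")
logger.setLevel(logging.WARNING)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# Preferred spelling; logger.warn(...) is the deprecated alias.
logger.warning("worker %s is slow", 3)

if __name__ == "__main__":
    record = handler.records[0]
    print(record.levelname, record.getMessage())
```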
Yuhong Guo
fb33fa9097 Enable function_descriptor in backend to replace the function_id () 2018-12-18 18:53:59 -05:00
Alexey Tumanov
3822b20319 [doc] update testing and dev instructions ()
* [doc] update python testing command

* update installation/dev instructions
2018-12-18 14:45:24 -08:00
Stephanie Wang
26ca40817e Convert UniqueID::nil() to a constructor ()
* Initialize UniqueID to nil

* Return reference to static const variable
2018-12-18 11:59:02 -08:00
Yuhong Guo
75ddf7cca4 Fix 2 small bugs () 2018-12-18 14:52:21 -05:00
Eric Liang
db0dee573e [rllib] Q-Mix implementation (Q-Mix, VDN, IQN, and Ape-X variants) () 2018-12-18 10:40:01 -08:00
YifengHuang
bc4aa85ea3 fix link in doc () 2018-12-18 00:10:55 -08:00
opherlieber
854b06854f remove auto-concat of rollouts in AsyncSampler ()
* remove auto-concat of rollouts in AsyncSampler

* remove auto-concat test

* remove unused reference
2018-12-17 13:54:52 -08:00
Devin Petersohn
3833ba4e4b Bump modin version to 0.2.5 () 2018-12-17 14:36:47 -05:00
Tianming Xu
7767aba637 Note requirement cython==0.29.0 in installation instructions () 2018-12-17 20:43:47 +08:00
Robert Nishihara
417c7f2d6f Update arrow and remove plasma_manager references. () 2018-12-15 23:36:02 -08:00
Philipp Moritz
b3bf608608 Update arrow to reduce plasma IPCs. () 2018-12-14 23:49:37 -05:00
Stephanie Wang
fcc37021b2 Throw exception for ray.get of an evicted actor object ()
* Add a flag for whether an object has been created before

* Add regression test

* doc

* Share object directory between object and node managers

* Treat evicted actor tasks as failed

* minor

* Check return value

* Fix bug where object locations weren't getting updated on client death

* Fix mac build

* Use RayTaskError
2018-12-14 11:41:27 -08:00
bibabolynn
7fd24e384b [java] Pass large args by reference () 2018-12-14 23:32:35 +08:00
Richard Liaw
de3fdeb5b5 [autoscaler] Fix Error Handling for botocore ()
Unfortunately Boto generates error classes dynamically, so this catches
the expected error and raises the error if it is the wrong class.

Closes .
2018-12-14 00:20:49 -08:00
Yuhong Guo
2a4685a08b Add a script to collect built thirdparty libs to avoid downloading and building them again. () 2018-12-13 23:56:40 -08:00
Yuhong Guo
a4abe6c0fe Add test to test raylet client connection when raylet crashes. () 2018-12-13 23:40:50 -08:00
Hao Chen
e7b51cbd1b [xray] Implement Actor Reconstruction ()
* Implement Actor Reconstruction

* fix

* fix actor handle __del__

* fix lint

* add comment

* Remove actorCreationDummyObjectId

* address comments

* fix

* address comments

* avoid copy

* change log to debug

* fix error name
2018-12-13 21:28:58 -08:00
Alexey Tumanov
2455de78ce save initial config instead of initial resource config () 2018-12-13 20:39:42 -08:00
Si-Yuan
84fae57ab5 Convert the raylet client (the code in local_scheduler_client.cc) to proper C++. ()
* refactoring

* fix bugs

* create client class

* create client class for java; bug fix

* remove legacy code

* improve code by using std::string and std::unique_ptr, renaming private fields, and removing legacy code

* rename class

* improve naming

* fix

* rename files

* fix names

* change name

* change return types

* make a mutex private field

* fix comments

* fix bugs

* lint

* bug fix

* bug fix

* move very short functions into the header file

* Loosen crash conditions for some APIs.

* Apply suggestions from code review

Co-Authored-By: suquark <suquark@gmail.com>

* format

* update

* rename python APIs

* fix java

* more fixes

* change types of cpython interface

* more fixes

* improve error processing

* improve error processing for java wrapper

* lint

* fix java

* make fields const

* use pointers for [out] parameters

* fix java & error msg

* fix resource leak, etc.
2018-12-13 13:39:10 -08:00
Chunyang Wen
5dcc333199 [sgd] Modify: add interface for model ()
* Modify: add interface for model

* Modify: remove single quote and build; add metrics

* Modify: flatten into list of dict

* Update distributed_sgd.rst

* Modify: update format with scripts/format.sh

* Update sgd_worker.py
2018-12-12 21:23:25 -08:00