Commit graph

2290 commits

Author SHA1 Message Date
bibabolynn
9a5c273db7 [java] fix check exception type (#3093)
<!--
Thank you for your contribution!

Please review https://github.com/ray-project/ray/blob/master/CONTRIBUTING.rst before opening a pull request.
-->

## What do these changes do?
remove TaskExecutionException, use RayException instead
<!-- Please give a short brief about these changes. -->

## Related issue number

<!-- Are there any issues opened that will be resolved by merging this change? -->
2018-10-19 06:43:42 -07:00
Wang Qing
b410ee0d29 [Java] Support dynamically defining resources when submitting task. (#3070)
## What do these changes do?
Before this PR, if we want to specify some resources, we must do as following codes:
```java
@RayRemote(Resources={ResourceItem("CPU", 10)})
public static void f1() {
// do sth
}

@RayRemote(Resources={ResourceItem("CPU", 10)})
class Demo {
// sth
}
```
Unfortunately, it's no way for us to create another actor or task with different resources required.

After this PR, the thing will be:
```java
ActorCreationOptions option = new ActorCreationOptions(); 
option.resources.put("CPU", 4.0);
RayActor<Echo> echo1 = Ray.createActor(Echo::new, option);
option.resources.put("Res-A", 4.0);
RayActor<Echo> echo2 = Ray.createActor(Echo::new, option);


//if we don't specify resource,  the resources will be `{"cpu":0.0}` by default.
Ray.call(Echo::echo, echo2, 100);
```


## Related issue number
N/A
2018-10-19 06:22:32 -07:00
Eric Liang
9d23fa03c9 [xray] All messages on main asio event loop should be written asynchronously (#3023)
* copy over ref code

* wip async writes

* compiles

* fix error handling

* add test

* amend

* fix test

* clang fmgt

* clang format

* wip

* yapf

* rename format script

* test error

* clangfmt

* add test to list

* warn

* ref test

* fix test

* comment

* add capture

* Update client_connection.cc

* wip

* fix compile
2018-10-18 21:56:22 -07:00
Peter Schafhalter
fa469783d8 Fix bug when connecting to password-secured cluster (#3083) 2018-10-18 21:43:03 -07:00
Devin Petersohn
8fcdafc6ea Adding Python3.7 wheels support (#2546)
* Adding Python3.7 wheels support

* Adding Mac wheels update

* fix

* numpy version

* choose different numpy versions depending on python version

* fix
2018-10-18 17:58:39 -07:00
Yuhong Guo
653c5b114a [c++] Refine Log Code (#2816)
* Support setting logging level from env variable

* Remove Env Variable related code

* lint
2018-10-18 10:51:36 -07:00
Peter Schafhalter
b82fd157a7 Remove Redis protected mode (#3073)
Follow-up to #2925 and #2952. Removes the Redis protected mode implementation from Ray which was replaced by Redis port authentication.
2018-10-17 22:48:14 -07:00
Philipp Moritz
2c52d9dfa0 Fix actor handle id creation when actor handle was pickled (#3074) 2018-10-17 18:00:52 -07:00
Richard Liu
3c0803e7e9 [rllib] use ray.wait to get next worker result in async sample optimizer (#2993) 2018-10-17 17:44:51 -07:00
Peter Schafhalter
a41bbc10ef Add password authentication to Redis ports (#2952)
* Implement Redis authentication

* Throw exception for legacy Ray

* Add test

* Formatting

* Fix bugs in CLI

* Fix bugs in Raylet

* Move default password to constants.h

* Use pytest.fixture

* Fix bug

* Authenticate using formatted strings

* Add missing passwords

* Add test

* Improve authentication of async contexts

* Disable Redis authentication for credis

* Update test for credis

* Fix rebase artifacts

* Fix formatting

* Add workaround for issue #3045

* Increase timeout for test

* Improve C++ readability

* Fixes for CLI

* Add security docs

* Address comments

* Address comments

* Adress comments

* Use ray.get

* Fix lint
2018-10-16 22:48:30 -07:00
Eric Liang
a9e454f6fd
[rllib] Include config dicts in the sphinx docs (#3064) 2018-10-16 15:55:11 -07:00
Wang Qing
64e5eb305e [Java] Add jvm-parameters in Config. (#3065) 2018-10-16 15:03:18 -07:00
Praveen Palanisamy
4d8cfc0bf5 [tune] Fix (some more) misleading comments in tune/results.py (#3068)
## What do these changes do?

Fix the misleading comments in code for:
 - `EPISODES_THIS_ITER`
 - `EPISODES_TOTAL`

Had noted it before and planned to fix it along with some other changes but seemed very relevant to stay next to #3058 so sending this now.
2018-10-16 11:07:53 -07:00
Eric Liang
6240ccbc6e
[rllib] Add more warnings when multi-agent envs might not be set up right (#3061) 2018-10-15 13:42:56 -07:00
Eric Liang
3c891c6ece
[rllib] Parallel-data loading and multi-gpu support for IMPALA (#2766) 2018-10-15 11:02:50 -07:00
Marlon
4dc78b735b [tune] Fix misleading comment (#3058) 2018-10-14 22:25:39 -07:00
Eric Liang
866c7a574c
[rllib] Don't crash printing out error message (#3054)
* fix er

* update
2018-10-13 19:50:23 -07:00
Eric Liang
473ee4eb3f
[rllib] Add unit test and some better error messages for custom policy states (#3032) 2018-10-13 00:03:52 -07:00
Hanwei Jin
87639b9e26 move make clean before cmake command, avoid always running mvn install plasma java lib (#3047) 2018-10-12 09:03:30 -07:00
Richard Liaw
f9b58d7b02
[tune] Tweaks to Trainable and Verbosity (#2889) 2018-10-11 23:42:13 -07:00
Wang Qing
828fe24b39 [Java] Fix loading driver resources issue. (#3046)
## What do these changes do?
Fix the issue how we load driver resources by a specified path.
Also this addressed the comments from the related PR [3044](https://github.com/ray-project/ray/pull/3044).

## Related PRs:
 [#3044](https://github.com/ray-project/ray/pull/3044) and [#3001](https://github.com/ray-project/ray/pull/3001).
2018-10-11 09:45:21 -07:00
Wang Qing
4a2ed47b6c [Java] Improve some Java code (#3040)
This PR improves some java codes,  and removes some duplicated code.
2018-10-10 17:30:23 -07:00
Hanwei Jin
060891a9c9 [cmake] avoid to re-build pyarrow (#2963)
* bugfix: env exists check error

* support to avoid re-build pyarrow in project

* bugfix: adapt gtest for centos lib64

* bugfix: check gtest lib exists in the directory

* bugfix: find gtest with checking all libs exists

* prefix RAY_ to thirdparty env variables to avoid conflicts with other module

* arrow use glog from ray

* change the glog and gtest install dir
2018-10-10 14:33:15 -07:00
Wang Qing
ef1f2fde95 Fix the uniqueId toString format. (#3035) 2018-10-08 13:12:14 -07:00
Wang Qing
84bf5fc8f3 [Java] Load driver resources from local path. (#3001)
## What do these changes do?
1. Add a configuration item `driver.resource-path`.
2. Load driver resources from the local path which is specified in the `ray.conf`.

Before this change, we should add all driver resources(like user's jar package, dependencies package and config files) into `classpath`.

After this change, we should add the driver resources into the mount path which we can configure it in `ray.conf`, and we shouldn't configure `classpath` for driver resources any more.

## Related issue number
N/A
2018-10-08 21:05:26 +01:00
Kristian Hartikainen
2d35a97a76 Bug/log syncer fails with parentheses (#2653)
* Update rsync command

* Escape rsync locations

* Fix the accidental variable move

* Update rsync to use -s flag
2018-10-06 00:34:53 -07:00
Richard Liaw
ecd8f39580 [core] Improve logging message when plasma store is started. (#3029)
Improve logging message when plasma store is started.
2018-10-05 15:24:24 -07:00
Richard Liaw
0651d3b629 [tune/core] Use Global State API for resources (#3004) 2018-10-04 17:23:17 -07:00
Robert Nishihara
faa31ae018 Introduce concept of resources required for placing a task. (#2837)
* Introduce concept of resources required for placement.
* Add placement resources to task spec
* Update java worker
* Update taskinfo.java
2018-10-04 10:35:39 -07:00
Richard Liaw
01bb073569 Suppress errors when worker or driver intentionally disconnects. (#2935) 2018-10-04 00:06:34 -07:00
Si-Yuan
f2dbd3096c Minor improvements and fixes in Python code. (#3022)
This commit fix some small defects. 
1. Remove a comment that should have been removed in #3003
2. Remove `redis_protected_mode` that is never used in `ray.init()`
3. Fix `object_id_seed` that is forgotten to be passed into `ray._init()`
4. Remove several redundant brackets.
2018-10-03 21:08:20 -07:00
Yuhong Guo
9948e8c11b Move function/actor exporting & loading code to function_manager.py (#3003)
Move function/actor exporting & loading code to function_manager.py to prepare the code change for function descriptor for python.
2018-10-03 16:21:04 -07:00
Robert Nishihara
d73ee36e60 Update links to use latest 0.5.3 wheels instead of 0.5.2. (#3018) 2018-10-03 13:43:40 -07:00
Si-Yuan
cc7e2ecdd5 Change logfile names and also allow plasma store socket to be passed in. (#2862) 2018-10-03 10:03:53 -07:00
bibabolynn
9c606ea06c fix bug: (#3000)
before fix,RAY_FUN_CACHE use only get method ,can only get null
  fix : put after create
2018-10-02 22:53:54 -07:00
Robert Nishihara
3ce8eb2d4c Test dying_worker_get and dying_worker_wait for xray. (#2997)
This tests the case in which a worker is blocked in a call to ray.get or ray.wait, and then the worker dies. Then later, the object that the worker was waiting for becomes available. We need to make sure not to try to send a message to the dead worker and then die. Related to #2790.
2018-10-02 00:08:47 -07:00
Eric Liang
2019b4122b
[rllib] Remove legacy multiagent support (#2975)
* remove legacy

* remove reshaper
2018-10-01 13:07:11 -07:00
Wang Qing
fcef4edd46 [Java] Fix the required-resources issue of actor member function in Java worker. (#3002)
This fixes a bug in which Java actor methods inherit the resource requirements of the actor creation task.
2018-10-01 12:56:36 -07:00
Eric Liang
b45bed4bce
[rllib] Propagate model options correctly in ARS / ES, to action dist of PPO (#2974)
* fix

* fix

* fix it

* propagate conf to action dist

* move carla example too

* rr

* Update policies.py

* wip

* lint
2018-10-01 12:49:39 -07:00
Eric Liang
e4bea8d10e
[rllib] Default to truncate_episodes and add some more config validators (#2967)
* update

* link it

* warn about truncation

* fix

* Update rllib-training.rst

* deprecate tests failing
2018-09-30 18:37:55 -07:00
Eric Liang
814c35b7d7
[rllib] Simplify sample batch size and num envs config, n_step adjustment (#2995)
* simplify vec batch requirements

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-models.rst
2018-09-30 18:36:22 -07:00
old-bear
8aa736572b [tune] Fix hyperband edge case for None entries (#2964) 2018-09-30 09:57:43 -07:00
Robert Nishihara
ed6289771a Convert runtest.py to use pytest. (#2966)
* Convert runtest.py to use pytest.

* Linting.

* Fix

* Fix

* Fix

* Fix
2018-09-30 07:59:44 -07:00
Eric Liang
65dcafdc3f
[rllib] Refactor save() / restore() code of agents and avoid O(n_workers) save size (#2982) 2018-09-30 01:15:13 -07:00
Eric Liang
747253e0f6
[rllib] Don't shuffle samples in PPO when using lstm 2018-09-30 01:13:56 -07:00
Eric Liang
b06c604a51
[rllib] Add some more tuned atari results to documentation (#2991)
* dqn results ++

* add scale

* hour

* fix

* small dqn table

* update

* steps

* upd

* apex

* up

* add apex results

* tip
2018-09-29 23:13:36 -07:00
Eric Liang
cf9cd5da9d
[ray] Add --new flag for ray attach (#2973)
* new flag

* yapf
2018-09-29 23:04:13 -07:00
Eric Liang
cb56f39070 [rllib] Entropy calculation for diag gaussian missing 0.5 term (#2968)
See: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Entropy
2018-09-29 22:57:47 -07:00
old-bear
b3f0dcf20b [tune] Add a raise_on_failed_trial flag in run_experiments (#2961)
Adds a flag to control raising TuneError if some trial fails in `run_experiments`.
2018-09-29 11:29:46 -07:00
Wang Qing
a879302355 Improve log message when failing to fork worker process (#2990)
## What do these changes do?
```c++
  // Try to execute the worker command.
  int rv = execvp(worker_command_args[0],
                  const_cast<char *const *>(worker_command_args.data()));
  // The worker failed to start. This is a fatal error.
  RAY_LOG(FATAL) << "Failed to start worker with return value " << rv;
```
When starting a process fails, the return value `rv` always be set to -1.
It is useless for us.
The log message should show some meaningful infos.

For example, If we did't install java. The message showed for us should be:
```shell
 Failed to start worker: No such file or directory.
```
This could help us to locate issue quickly.

## Related issue number
N/A
2018-09-29 22:10:57 +08:00