If the `ray.home` configuration item is set to `""`, the current `RayConfig` sets it to the current working directory, e.g. `/User/My/Ray`.
However, some other configuration items (such as `redisServerExecutablePath`) are then mistakenly set to `/User/My/Ray//build/src/common/thirdparty/redis/src/redis-server`.
Note: there are two `/` characters between the current working directory and `build/src/common....`
This PR fixes the issue.
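
As a rough illustration of how a doubled separator like this can arise, here is a minimal Python sketch. It assumes the path was built by plain string concatenation with a relative part that already begins with `/`; the helper names `join_naive` and `join_normalized` are hypothetical and are not part of `RayConfig`.

```python
import posixpath


def join_naive(home, relative):
    # Plain concatenation keeps both separators when `relative` starts with "/".
    return home + "/" + relative


def join_normalized(home, relative):
    # Drop the leading separator first, so exactly one "/" joins the two parts.
    return posixpath.join(home, relative.lstrip("/"))


home = "/User/My/Ray"
relative = "/build/src/common/thirdparty/redis/src/redis-server"

naive = join_naive(home, relative)
fixed = join_normalized(home, relative)
```

Here `naive` contains `//` after `/User/My/Ray`, while `fixed` has a single separator.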
Adds a tmux flag to support background execution of experiments. It cannot be used together with screen. This seems to be a useful feature that has come up with several users.
## What do these changes do?
Previously, Java worker configuration was complicated, because it required setting environment variables as well as command-line arguments.
This PR aims to simplify the Java worker's configuration.
1) Configuration management is migrated to [lightbend config](https://github.com/lightbend/config), so setting environment variables is no longer required.
2) Many unused config items are removed.
3) A simple `example.conf` file is provided, so users can get started quickly.
4) All possible options and their default values are declared and documented in the `ray.default.conf` file.
This PR also simplifies and refines the following code:
1) The process of `Ray.init()`.
2) `RunManager`.
3) `WorkerContext`.
### How to use this configuration?
1. Copy `example.conf` into your classpath and rename it to `ray.conf`.
2. Modify or add your configuration items. All items are declared in `ray.default.conf`.
3. You can also set the items in Java system properties.
Note: configuration is read in this priority:
System properties > `ray.conf` > `ray.default.conf`
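
The priority order above can be pictured as a layered lookup. The following Python sketch is only an analogy, not the lightbend config implementation: the three dictionaries stand in for the three sources, and `example.option` and all values are hypothetical.

```python
from collections import ChainMap


def resolve(system_properties, ray_conf, ray_default_conf):
    # ChainMap searches its maps from left to right, so earlier sources win.
    return ChainMap(system_properties, ray_conf, ray_default_conf)


# Hypothetical contents of each source, for illustration only.
ray_default_conf = {"ray.home": "", "example.option": "default"}
ray_conf = {"ray.home": "/User/My/Ray"}
system_properties = {"example.option": "from-system-property"}

config = resolve(system_properties, ray_conf, ray_default_conf)
```

In this sketch `config["ray.home"]` comes from the `ray.conf` layer and `config["example.option"]` from the system-properties layer.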
## Related issue number
N/A
Update the Maven version from 0.1 to 0.1-SNAPSHOT, because SNAPSHOT is the conventional version suffix during development. Non-snapshot versions are used only for releases.
This PR adds a `function_desc` field to the task spec. A function descriptor is a list of strings that uniquely describes a function.
- For a Python function, it should be: [module_name, class_name, function_name]
- For a Java function, it should be: [class_name, method_name, type_descriptor]
This field serves a couple of purposes:
In this PR:
- The Java worker needs to know a function's class name to load it. Previously, since the task spec had no field to hold this information, we worked around it by appending the class name to the argument list. This change removes that hack and significantly simplifies function management in Java.
Will be done in subsequent PRs:
- Support cross-language invocation (#2576): currently the Python worker manages functions by saving them in GCS and passing the function id in the task spec. However, to call a Python function from Java, we cannot save it in GCS and get the function id. Instead, we can pass the function descriptor (module name, class name, function name) in the task spec and use it to load the function.
- Support deployment: one major problem of the Python worker's current function management mechanism is #2327. In a production environment, we should have a mechanism to deploy code and dependencies to the cluster. Once code is deployed, we no longer need to save functions to GCS and can use `function_desc` to manage functions.
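
To make the idea of loading a function from its descriptor concrete, here is a minimal Python sketch. It is not the worker's actual loading code: `load_function` is a hypothetical helper, and treating an empty class name as "module-level function" is an assumption made for this example.

```python
import importlib


def load_function(function_desc):
    # The descriptor is [module_name, class_name, function_name].
    module_name, class_name, function_name = function_desc
    module = importlib.import_module(module_name)
    # Assumption: an empty class name means the function lives on the module.
    owner = getattr(module, class_name) if class_name else module
    return getattr(owner, function_name)


# Standard-library targets, used only so the sketch runs anywhere.
dumps = load_function(["json", "", "dumps"])
from_keys = load_function(["collections", "OrderedDict", "fromkeys"])
```

Because the lookup needs only strings, the same descriptor could in principle be produced by a caller written in another language.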
Previous changes broke single-process mode in raylet. This PR makes the hello-world example work in single-process mode. Follow-up diffs will completely fix single-process mode and add tests.
* enable using thirdparty env variable to find installed dependency, to speed up the build process
* fix target dependencies in cmake; the CMakeLists files were too chaotic
* check that the directory defined by the env variable exists
When a new raylet starts, `ClientAdded` will be called with the disconnected client data. However, since the client was closed, the connection will fail.
* Update Arrow to Plasma with glog and update the building process
* Remove ParquetExternalProject.cmake
* Fix Mac building error in CI
* Use find_package(BISON) instead of hard code
* Revert BISON binary to hard code.
* Remove build_parquet.sh
* Update setup.sh
* Added agent name & env id to default logdir prefix
* Revert "Added agent name & env id to default logdir prefix"
This reverts commit 07cfdf80d2537da3c67dd4f553c5f3e43671cc7d.
* Added default logger creator with informative prefix to Agent
* Updated import order & improved str cat
* Update agent.py
Fixes an issue where the object manager sometimes crashes within the `Wait` method. The issue stems from inconsistent behavior of the boost deadline timer's `cancel` method, which is invoked within `WaitComplete` to enforce exactly one `WaitComplete` invocation for each `Wait` request. The `cancel` method sometimes fails to actually prevent the timer from invoking the provided handler with a non-zero error code.
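
One common way to enforce exactly-once completion without relying on timer cancellation is an explicit per-request flag. The Python sketch below illustrates that general pattern only; it is not claimed to be the fix applied in this PR, and `WaitState` and its members are hypothetical names.

```python
class WaitState:
    """Hypothetical per-request state; not Ray's actual object manager type."""

    def __init__(self, callback):
        self.callback = callback
        self.completed = False

    def wait_complete(self, reason):
        # Ignore every call after the first, whichever path arrives late.
        if self.completed:
            return False
        self.completed = True
        self.callback(reason)
        return True


calls = []
state = WaitState(calls.append)
first = state.wait_complete("objects_ready")
second = state.wait_complete("timer_fired")  # a late timer handler is a no-op
```

With the flag in place, a timer handler that still fires after `cancel` finds the request already completed and does nothing.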
This PR ensures that debug logging statements are evaluated only when debugging. We found that, in the current code, functions called inside debug logging statements are evaluated even in release mode, even though nothing is printed.
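
The same pitfall exists in other logging APIs. As an analogy only (the change in this PR concerns the project's own logging code, not Python), the sketch below uses Python's standard `logging` module to show an eagerly evaluated argument versus a guarded one; `expensive_summary` is a hypothetical stand-in for a costly call.

```python
import logging

logger = logging.getLogger("lazy_debug_example")
logger.setLevel(logging.INFO)  # stands in for "release mode": DEBUG is off

calls = {"count": 0}


def expensive_summary():
    # Counts how many times it is evaluated.
    calls["count"] += 1
    return "summary"


# Eager: the argument is computed before the logger checks the level.
logger.debug("state: %s", expensive_summary())
eager_calls = calls["count"]

# Guarded: the argument is computed only when debug output is enabled.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: %s", expensive_summary())
guarded_calls = calls["count"] - eager_calls
```

The eager call evaluates `expensive_summary()` once although nothing is logged, while the guarded call does not evaluate it at all.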
* Trigger reconstruction in ray.wait and mark worker as blocked.
* Add test.
* Linting.
* Don't run new test with legacy Ray.
* Only call HandleClientUnblocked if it actually blocked in ray.wait.
* Reduce time to ray.wait in the test.
* use cmake to build the ray project, no need to apply build.sh before cmake, fix some misuse of cmake, improve the build performance
* support boost external project, avoid using the system or build.sh boost
* keep compatible with build.sh, remove boost and arrow build from it.
* bugfix: parquet bison version control, plasma_java lib install problem
* bugfix: cmake, do not compile plasma java client if no need
* bugfix: the component failures test timeout mechanism has a problem in the plasma manager failure case
* bugfix: arrow uses lib64 on CentOS, travis check-git-clang-format-output.sh does not support branches other than master
* revert some fix
* set arrow python executable, fix format error in component_failures_test.py
* make clean arrow python build directory
* update cmake code style, back to support cmake minimum version 3.4
Fixes a Tune example (`python/ray/tune/examples/pbt_tune_cifar10_with_keras.py`) in which the optimizer's hyperparameters, learning rate and decay, were not being passed into the optimizer.
As a result, the optimizer currently uses default values for these hyperparameters no matter the config.
Adds a new search algorithm (genetic) along with the base framework of the searcher, which performs basic jobs such as logging, recording, and organizing in our project.
Note that this is the initial commit. In the coming days, we will add examples, unit tests, and other refinements.
When running in a screen (or any other time it is hard to scroll up), printing "Suppressing previous error message" is not helpful, since the previous error is lost far beyond the scrollback. It is better to just print the error repeatedly at the end.
Adds the ability for trainables to reset their configurations during experiments. In particular, these changes add the base functions to the trial_executor and trainable interfaces and provide a basic implementation in the PopulationBasedTraining scheduler.
Related issue number: #2741
* removed cv2
* remove opencv
* increased number of default rollouts ARS
* put cv2 back in this branch
* put cv2 back in this branch
* moved cv2 back where it belongs in preprocessors