Commit graph

14 commits

Author SHA1 Message Date
Philipp Moritz
1a926c9b7c Fix $MACOSX_DEPLOYMENT_TARGET (#3337) 2018-11-21 10:56:17 -08:00
Philipp Moritz
99bac44375 Update CMake to support Mac OS X 10.14 (#3218) 2018-11-04 16:32:58 -08:00
Hanwei Jin
060891a9c9 [cmake] avoid to re-build pyarrow (#2963)
* bugfix: env exists check error

* support to avoid re-build pyarrow in project

* bugfix: adapt gtest for centos lib64

* bugfix: check gtest lib exists in the directory

* bugfix: find gtest with checking all libs exists

* prefix RAY_ to thirdparty env variables to avoid conflicts with other module

* arrow use glog from ray

* change the glog and gtest install dir
2018-10-10 14:33:15 -07:00
Hanwei Jin
9f9e49e4a1 [cmake] enable using thirdparty env variable to find installed dependency (#2912)
* enable using thirdparty env variable to find installed dependency, to speed up the build process

* fix target dependency in cmake. :-) too chaos in each CMakeLists

* check env variable defined directory exists
2018-09-23 07:52:33 -07:00
Yuhong Guo
93ded5a3d5 Update arrow using Plasma with glog (#2913)
* Update Arrow to Plasma with glog and update the building process

* Remove ParquetExternalProject.cmake

* Fix Mac building error in CI

* Use find_package(BISON) instead of hard code

* Revert BISON binary to hard code.

* Remove build_parquet.sh

* Update setup.sh
2018-09-20 13:37:44 -07:00
Hanwei Jin
dc76e51a60 bugfix: cmake copy plasma java lib from lib64 directory in centos (#2885) 2018-09-16 22:32:09 -07:00
Hanwei Jin
fbf214e408 update ray cmake build process (#2853)
* use cmake to build ray project, no need to appply build.sh before cmake, fix some abuse of cmake, improve the build performance

* support boost external project, avoid using the system or build.sh boost

* keep compatible with build.sh, remove boost and arrow build from it.

* bugfix: parquet bison version control, plasma_java lib install problem

* bugfix: cmake, do not compile plasma java client if no need

* bugfix: component failures test timeout machenism has problem for plasma manager failed case

* bugfix: arrow use lib64 in centos, travis check-git-clang-format-output.sh does not support other branches except master

* revert some fix

* set arrow python executable, fix format error in component_failures_test.py

* make clean arrow python build directory

* update cmake code style, back to support cmake minimum version 3.4
2018-09-12 11:19:33 -07:00
Philipp Moritz
a34a7172b4 Remove gflags (#2813)
Seems like gflags is not needed. This *might* remove writing spurious files into the home directory on the RISE infrastructure.
2018-09-03 16:10:47 -07:00
Yuhong Guo
9f06c19edd Fix glog wheel failure on MacOS (#2775) 2018-08-30 09:06:19 -07:00
Yuhong Guo
eec1a3eb89 Support pluggable backend log lib with glog (#2695)
* [WIP] Support different backend log lib

* Refine code, unify level, address comment

* Address comment and change formatter

* Fix linux building failure.

* Fix lint

* Remove log4cplus.

* Add log init to raylet main and add test to travis.

* Address comment and refine.

* Update logging_test.cc
2018-08-23 09:43:38 -07:00
Philipp Moritz
4c82ac72df Upgrade arrow to include the plasma TensorFlow op (#2412) 2018-07-18 12:33:02 -07:00
Alexey Tumanov
91464a56dd [XRay] Raylet node and object manager unification/backend redesign. (#1640)
* directory for raylet

* some initial class scaffolding -- in progress

* node_manager build code and test stub files.

* class scaffolding for resources, workers, and the worker pool

* Node manager server loop

* raylet policy and queue - wip checkpoint

* fix dependencies

* add gen_nm_fbs as target.

* object manager build, stub, and test code.

* Start integrating WorkerPool into node manager

* fix build on mac

* tmp

* adding LsResources boilerplate

* add/build Task spec boilerplate

* checkpoint ActorInformation and LsQueue

* Worker pool maintains started and removed workers

* todos for e2e task assignment

* fix build on mac

* build/add lsqueue interface

* channel resource config through from NodeServer to LsResources; prep LsResources to replace/provide worker_pool

* progress on LsResources class: resource availability check implementation

* Read task submission messages from a client

* Submit tasks from the client to the local scheduler

* Assign a task to a worker from the WorkerPool

* change the way node_manager is built to prevent build issues for object_manager.

* add namespaces. fix build.

* Move ClientConnection message handling into server, remove reference to
WorkerPool

* Add raw constructors for TaskSpecification

* Define TaskArgument by reference and by value

* Flatbuffer serialization for TaskSpec

* expand resource implementation

* Start integrating TaskExecutionSpecification into Task

* Separate WorkerPool from LsResources, give ownership to NodeServer

* checkpoint queue and resource code

* resoving merge conflicts

* lspolicy::schedule ; adding lsqueue and lspolicy to the nodeserver

* Implement LsQueue RemoveTasks and QueueReadyTasks

* Fill in some LsQueue code for assigning a task

* added suport for test_asio

* Implement LsQueue queue tasks methods, queue running tasks

* calling into policy from nodeserver; adding cluster resource map

* Feedback and Testing.
Incorporate Alexey's feedback. Actually test some code. Clean up callback imp.

* end to end task assignment

* Decouple local scheduler from node server

* move TODO

* Move local scheduler to separate file

* Add scaffolding for reconstruction policy, task dependency manager, and object manager

* fix

* asio for store client notifications.
added asio for plasma store connection.
added tests for store notifications.
encapsulate store interaction under store_messenger.

* Move Worker inside of ClientConnection

* Set the assigned task ID in the worker

* Several changes toward object manager implementation.
Store client integration with asio.
Complete OM/OD scaffolding.

* simple simulator to estimate number of retry timeouts

* changing dbclientid --> clientid

* fix build (include sandbox after it's fixed).

* changes to object manager, adding lambdas to the interface

* changing void * callbacks to std::function typed callbacks

* remove use namespace std from .h files.
use ray:: for Status everywhere.

* minor

* lineage cache interfaces

* TODO for object IDs

* Interface for the GCS client table

* Revert "Set the assigned task ID in the worker"

This reverts commit a770dd31048a289ef431c56d64e491fa7f9b2737.

* Revert "Move Worker inside of ClientConnection"

This reverts commit dfaa0d662a76976c05be6d76b214b45d88482818.

* OD/OM: ray::Status

* mock gcs integration.

* gcs mock clientinfo assignment

* Allow lookup of a Worker in the WorkerPool

* Split out Worker and ClientConnection source files

* Allow assignment of a task ID to a worker, skeleton for finishing a task

* integrate mock gcs with om tests.

* added tcp connection acceptor

* integrated OM with NM.
integrated GcsClient with NM.
Added multi-node integration tests.

* OM to receive incoming tcp connections.

* implemented object manager connection protocol.

* Added todos.

* slight adjustment to add/remove handler invocation on object store client.

* Simplify Task interface for getting dependencies

* Remove unused object manager file

* TaskDependencyManager tracks missing task dependencies and processes object add notifications

* Local scheduler queues tasks according to argument availability

* Fill in TaskSpecification methods to get arguments

* Implemented push.

* Queue tasks that have been scheduled but that are waiting for a worker

* Pull + mock gcs cleanup.

* OD/OM/GCS mock code review, fixing unused-result issues, eliminating copy ctor

* Remove unique_ptr from object_store_client

* Fix object manager Push memory error

* Pull task arguments in task dependency manager

* Add a demo script for remote task dependencies

* Some comments for the TaskDependencyManager

* code cleanup; builds on mac

* Make ClientConnection a templated type based on the connection protocol

* Add gmock to build

* Add WorkerPool unit tests

* clean up.

* clean up connection code.

* instantiate a template instance in the module

* Virtual destructors

* Document public api.

* Separate read and write buffers in ClientConnection; documentation

* Remove ObjectDirectory from NodeServer constructor, make directory InitGcs call a separate constructor

* Convert NodeServer Terminate to a destructor

* NodeServer documentation

* WorkerPool documentation

* TaskDependencyManager doc

* unifying naming conventions

* unifying naming conventions

* Task cleanup and documentation

* unifying naming conventions

* unifying naming conventions

* code cleanup and naming conventions

* code cleanup

* Rename om --> object_manager

* Merge with master

* SchedulingQueue doc

* Docs and implementation skeleton for ClientTable

* Node manager documentation

* ReconstructionPolicy doc

* Replace std::bind with lambda in TaskDependencyManager

* lineage cache doc

* Use \param style for doc

* documentation for scheduling policy and resources

* minor code cleanup

* SchedulingResources class documentation + code cleanup

* referencing ray/raylet directory; doxygen documentation

* updating trivial policy

* Fix bug where event loop stops after task submission

* Define entry point for ClientManager for handling new connections

* Node manager to node manager protocol, heartbeat protocol

* Fix flatbuffer

* Fix GCS flatbuffer naming conflict

* client connection moved to common dir.

* rename based on feedback.

* Added google style and 90 char lines clang-format file under src/ray.

* const ref ClientID.

* Incorporated feedback from PR.

* raylet: includes and namespaces

* raylet/om/gcs logging/using

* doxygen style

* camel casing, comments, other style; DBClientID -> ClientID

* object_manager : naming, defines, style

* consistent caps and naming; misc style

* cleaning up client connection + other stylistic fixes

* cmath, std::nan

* more style polish: OM, Raylet, gcs tables

* removing sandbox (moved to ray-project/sandbox)

* raylet linting

* object manager linting

* gcs linting

* all other linting


Co-authored-by: Melih <elibol@gmail.com>
Co-authored-by: Stephanie <swang@cs.berkeley.edu>
2018-03-08 12:53:24 -08:00
Philipp Moritz
eabc4027c8 Hiredis asio integration (#1547) 2018-02-20 13:37:09 -08:00
Philipp Moritz
cac5f47600 First Part of Internal Ray API Refactor (#1173)
* add Ray status class

* add C++ util files

* add ID types

* more APIs

* build system integration

* add test infrastructure and implement some APIs

* add more tests

* fix bugs

* add task table tests

* update

* add toolchain file

* fix

* test

* link with pthread

* update

* fix

* more fixes

* fixes

* always vendor gtest and gflags

* linting

* fixes

* add constants file

* comments

* more fixes

* fix linting
2017-12-14 14:54:09 -08:00