33 commits

Author SHA1 Message Date
mylinyuzhi
fa0ade2bc5 [Java] Replace binary rewrite with Remote Lambda Cache (SerdeLambda) (#2245)
* <feature> : serde lambda

* <feature>:fixed CR

with issue #2245

* <feature>: fixed CR
2018-06-13 12:58:07 -07:00
eric-jj
34bc6ce6ea remove UniqueIDHasher (#1957)
* remove UniqueIDHasher

* Format the change

* remove unused line

* Fix format

* fix lint error

* fix linting whitespace
2018-04-30 06:31:23 -07:00
Alexey Tumanov
91464a56dd [XRay] Raylet node and object manager unification/backend redesign. (#1640)
* directory for raylet

* some initial class scaffolding -- in progress

* node_manager build code and test stub files.

* class scaffolding for resources, workers, and the worker pool

* Node manager server loop

* raylet policy and queue - wip checkpoint

* fix dependencies

* add gen_nm_fbs as target.

* object manager build, stub, and test code.

* Start integrating WorkerPool into node manager

* fix build on mac

* tmp

* adding LsResources boilerplate

* add/build Task spec boilerplate

* checkpoint ActorInformation and LsQueue

* Worker pool maintains started and removed workers

* todos for e2e task assignment

* fix build on mac

* build/add lsqueue interface

* channel resource config through from NodeServer to LsResources; prep LsResources to replace/provide worker_pool

* progress on LsResources class: resource availability check implementation

* Read task submission messages from a client

* Submit tasks from the client to the local scheduler

* Assign a task to a worker from the WorkerPool

* change the way node_manager is built to prevent build issues for object_manager.

* add namespaces. fix build.

* Move ClientConnection message handling into server, remove reference to
WorkerPool

* Add raw constructors for TaskSpecification

* Define TaskArgument by reference and by value

* Flatbuffer serialization for TaskSpec

* expand resource implementation

* Start integrating TaskExecutionSpecification into Task

* Separate WorkerPool from LsResources, give ownership to NodeServer

* checkpoint queue and resource code

* resolving merge conflicts

* lspolicy::schedule ; adding lsqueue and lspolicy to the nodeserver

* Implement LsQueue RemoveTasks and QueueReadyTasks

* Fill in some LsQueue code for assigning a task

* added support for test_asio

* Implement LsQueue queue tasks methods, queue running tasks

* calling into policy from nodeserver; adding cluster resource map

* Feedback and Testing.
Incorporate Alexey's feedback. Actually test some code. Clean up callback imp.

* end to end task assignment

* Decouple local scheduler from node server

* move TODO

* Move local scheduler to separate file

* Add scaffolding for reconstruction policy, task dependency manager, and object manager

* fix

* asio for store client notifications.
added asio for plasma store connection.
added tests for store notifications.
encapsulate store interaction under store_messenger.

* Move Worker inside of ClientConnection

* Set the assigned task ID in the worker

* Several changes toward object manager implementation.
Store client integration with asio.
Complete OM/OD scaffolding.

* simple simulator to estimate number of retry timeouts

* changing dbclientid --> clientid

* fix build (include sandbox after it's fixed).

* changes to object manager, adding lambdas to the interface

* changing void * callbacks to std::function typed callbacks

* remove use namespace std from .h files.
use ray:: for Status everywhere.

* minor

* lineage cache interfaces

* TODO for object IDs

* Interface for the GCS client table

* Revert "Set the assigned task ID in the worker"

This reverts commit a770dd31048a289ef431c56d64e491fa7f9b2737.

* Revert "Move Worker inside of ClientConnection"

This reverts commit dfaa0d662a76976c05be6d76b214b45d88482818.

* OD/OM: ray::Status

* mock gcs integration.

* gcs mock clientinfo assignment

* Allow lookup of a Worker in the WorkerPool

* Split out Worker and ClientConnection source files

* Allow assignment of a task ID to a worker, skeleton for finishing a task

* integrate mock gcs with om tests.

* added tcp connection acceptor

* integrated OM with NM.
integrated GcsClient with NM.
Added multi-node integration tests.

* OM to receive incoming tcp connections.

* implemented object manager connection protocol.

* Added todos.

* slight adjustment to add/remove handler invocation on object store client.

* Simplify Task interface for getting dependencies

* Remove unused object manager file

* TaskDependencyManager tracks missing task dependencies and processes object add notifications

* Local scheduler queues tasks according to argument availability

* Fill in TaskSpecification methods to get arguments

* Implemented push.

* Queue tasks that have been scheduled but that are waiting for a worker

* Pull + mock gcs cleanup.

* OD/OM/GCS mock code review, fixing unused-result issues, eliminating copy ctor

* Remove unique_ptr from object_store_client

* Fix object manager Push memory error

* Pull task arguments in task dependency manager

* Add a demo script for remote task dependencies

* Some comments for the TaskDependencyManager

* code cleanup; builds on mac

* Make ClientConnection a templated type based on the connection protocol

* Add gmock to build

* Add WorkerPool unit tests

* clean up.

* clean up connection code.

* instantiate a template instance in the module

* Virtual destructors

* Document public api.

* Separate read and write buffers in ClientConnection; documentation

* Remove ObjectDirectory from NodeServer constructor, make directory InitGcs call a separate constructor

* Convert NodeServer Terminate to a destructor

* NodeServer documentation

* WorkerPool documentation

* TaskDependencyManager doc

* unifying naming conventions

* unifying naming conventions

* Task cleanup and documentation

* unifying naming conventions

* unifying naming conventions

* code cleanup and naming conventions

* code cleanup

* Rename om --> object_manager

* Merge with master

* SchedulingQueue doc

* Docs and implementation skeleton for ClientTable

* Node manager documentation

* ReconstructionPolicy doc

* Replace std::bind with lambda in TaskDependencyManager

* lineage cache doc

* Use \param style for doc

* documentation for scheduling policy and resources

* minor code cleanup

* SchedulingResources class documentation + code cleanup

* referencing ray/raylet directory; doxygen documentation

* updating trivial policy

* Fix bug where event loop stops after task submission

* Define entry point for ClientManager for handling new connections

* Node manager to node manager protocol, heartbeat protocol

* Fix flatbuffer

* Fix GCS flatbuffer naming conflict

* client connection moved to common dir.

* rename based on feedback.

* Added google style and 90 char lines clang-format file under src/ray.

* const ref ClientID.

* Incorporated feedback from PR.

* raylet: includes and namespaces

* raylet/om/gcs logging/using

* doxygen style

* camel casing, comments, other style; DBClientID -> ClientID

* object_manager : naming, defines, style

* consistent caps and naming; misc style

* cleaning up client connection + other stylistic fixes

* cmath, std::nan

* more style polish: OM, Raylet, gcs tables

* removing sandbox (moved to ray-project/sandbox)

* raylet linting

* object manager linting

* gcs linting

* all other linting


Co-authored-by: Melih <elibol@gmail.com>
Co-authored-by: Stephanie <swang@cs.berkeley.edu>
2018-03-08 12:53:24 -08:00
Simon Mo
d78a22f94c [DataFrame] Implement IO for ray_df (#1599)
* Add parquet-cpp to gitignore

* Add read_csv and read_parquet

* Gitignore pytest_cache

* Fix flake8

* Add io to __init__

* Changing Index. Currently running tests, but so far untested.

* Removing issue of reassigning DF in from_pandas

* Fixing lint

* Fix bug

* Fix bug

* Fix bug

* Better performance

* Fixing index issue with sum

* Address comments

* Update io with index

* Updating performance and implementation. Adding tests

* Fixing off-by-1

* Fix lint

* Address Comments

* Make pop compatible with new to_pandas

* Format Code

* Cleanup some index issue

* Bug fix: assigned reset_index back

* Remove unused debug line
2018-02-26 18:26:38 -08:00
Melih Elibol
d8850eac4b Suppress object transfer requests when object is already being received. (#1430)
* added deterministic check for objects received in fetch_timeout_handler.

* use receive time, in case something goes wrong after object is received.

* increase timeout for removal.

* indentation fix.

* make log info log debug. clean up debug log.

* undo unnecessary changes.

* changed description var.

* shorten line 949.

* incorporate feedback.

* linting; make is_object_received function consts.

* change semantics of received_objects to objects being received.
added checks to both points at which objects are re-requested.
updated object receive initialization accordingly.

* eliminate erase on receive init. check call to request_transfer_from instead of request_transfer.

* updated comments.

* added todo for multiple object transfers.

* linting.
2018-02-01 22:45:31 -08:00
Eric Liang
715737cc06 [docs] Add backlinks from hyperopt / rl algorithm examples to the built-on Ray libraries (#1356) 2017-12-23 00:31:33 -08:00
Daniel Suo
4f0da6f81c Add basic functionality for Cython functions and actors (#1193)
* Add basic functionality for Cython functions and actors

* Fix up per @pcmoritz comments

* Fixes per @richardliaw comments

* Fixes per @robertnishihara comments

* Forgot double quotes when updating masked_log

* Remove import typing for Python 2 compatibility
2017-11-09 17:49:06 -08:00
Eric Liang
5a50e0e1d7 [rllib] Add the ability to run arbitrary Python scripts with ray.tune (#1132)
* fix yaml bug

* add ext agent

* gpus

* update

* tuning

* docs

* Sun Oct 15 21:09:25 PDT 2017

* lint

* update

* Sun Oct 15 22:39:55 PDT 2017

* Sun Oct 15 22:40:17 PDT 2017

* Sun Oct 15 22:43:06 PDT 2017

* Sun Oct 15 22:46:06 PDT 2017

* Sun Oct 15 22:46:21 PDT 2017

* Sun Oct 15 22:48:11 PDT 2017

* Sun Oct 15 22:48:44 PDT 2017

* Sun Oct 15 22:49:23 PDT 2017

* Sun Oct 15 22:50:21 PDT 2017

* Sun Oct 15 22:53:00 PDT 2017

* Sun Oct 15 22:53:34 PDT 2017

* Sun Oct 15 22:54:33 PDT 2017

* Sun Oct 15 22:54:50 PDT 2017

* Sun Oct 15 22:55:20 PDT 2017

* Sun Oct 15 22:56:56 PDT 2017

* Sun Oct 15 22:59:03 PDT 2017

* fix

* Update tune_mnist_ray.py

* remove script trial

* fix

* reorder

* fix ex

* py2 support

* upd

* comments

* comments

* cleanup readme

* fix trial

* annotate

* Update rllib.rst
2017-10-18 11:49:28 -07:00
Eric Liang
6ecc899cf2 [rllib] Fix DQN checkpoint/restore and enable test in jenkins (#1063)
* fix dqn restore and add test

* Update .gitignore

* Update test_checkpoint_restore.py

* add checkpoint restore
2017-10-03 23:17:54 -07:00
Eric Liang
19562f6ce5 [rllib] Fix issues with PPO model restoration (#1018)
* fix filter

* add test

* lint

* fix

* commit

* Update a3c.py
2017-09-28 13:12:06 -07:00
Philipp Moritz
7030ef366f Rebase Ray on latest arrow (remove numbuf from Ray). (#910)
* remove some stuff

* put get roundtrip working

* fixes

* more fixes

* cleanup

* fix tests

* latest arrow

* fixes

* fix tests

* fix linting

* rebase

* fixes

* fix bug

* bring back libgcc error

* fix linting

* use official arrow repo

* fixes
2017-09-04 22:58:49 -07:00
Eric Liang
1ebfe9608f [rllib] Add downscale and frameskip options for Montezumas (#908)
* up

* update

* fix

* update

* update

* update

* api break

* Update run_multi_node_tests.sh

* fix
2017-09-02 17:20:56 -07:00
Eric Liang
c81821b856 [rllib] Make Pong-v0 + EvolutionStrategies work by sharing preprocessors with PPO (#848)
* fix by sharing preprocessors

* revert param change

* Update evolution_strategies.py

* Update catalog.py
2017-08-21 18:51:49 -07:00
Robert Nishihara
e0867c8845 Switch Python indentation from 2 spaces to 4 spaces. (#726)
* 4 space indentation for actor.py.

* 4 space indentation for worker.py.

* 4 space indentation for more files.

* 4 space indentation for some test files.

* Check indentation in Travis.

* 4 space indentation for some rl files.

* Fix failure test.

* Fix multi_node_test.

* 4 space indentation for more files.

* 4 space indentation for remaining files.

* Fixes.
2017-07-13 21:53:57 +00:00
Eric Liang
06241daf61 Policy gradient example: record stats for tensorboard (#577)
* add tf metrics

* comments

* fix network scopes

* add doc

* use format string

* fix trace level

* plot intermediate and final sgd stats

* add back a global step
2017-05-21 14:51:24 -07:00
Philipp Moritz
d394a3fdf6 Website for the v0.1 release (#576)
* commit jekyll template

* Port blog post to markdown.

* Small changes.

* Improvements to layout and post.

* More improvements.

* Add computation graph figures to the blog post.

* Small changes.

* Update gitignore.
2017-05-20 18:33:36 -07:00
Eric Liang
e2e9e4ce6f Fix segmentation fault when calling ray.put on a dictionary with object keys (#548)
* fix segfault when serializing dict key

* fix style

* fix test

* Fix linting.
2017-05-15 01:09:13 -07:00
Guru Medasani
0189b09581 Fixes Mac OSX installation error (#464)
* changes to address ARROW-826 and ARROW-444

* changes to address ARROW-826 and ARROW-444

* ignoring cmake-build-debug

* additional IDEA ignore files

* additional IDEA ignore files

* remove arrow ipc and arrow io libraries

* add boost dependencies

* fix arrow origin and remove submodule
2017-04-16 15:02:15 -07:00
Robert Nishihara
964d5cac48 Expand API documentation. (#375)
* Expand API documentation and convert tutorial to rst.

* Fix formatting error in tutorial.

* Address William's comments.

* Address Stephanie's comments.
2017-03-17 16:48:25 -07:00
Robert Nishihara
6a4bde54dc Only install ray python packages. (#330)
* Only install ray python packages.

* Add some __init__.py files.

* Install Ray before building documentation.

* Fix install-ray.sh.

* Fix.
2017-03-01 23:34:44 -08:00
Philipp Moritz
a708e36225 Switch build system to use CMake completely. (#200)
* switch to CMake completely

...

* cleanup

* Run C tests, update installation instructions.
2017-01-17 16:56:40 -08:00
Philipp Moritz
0ca0864856 Use flatcc for serialization of IPC messages. (#140)
* added Philipp's updates

* Switch to using flatbuffers for IPC.

* Various changes.

* convert remaining messages and cleanups

* fix

* fix function signatures

* fix valgrind errors

* clang-format

* final commit

* Fix valgrind test.
2016-12-20 14:46:25 -08:00
Philipp Moritz
817f1e730c Implement tables with redis modules (#114)
* initial redis module

* temp commit

* temp commit

* temp commit

* Empty object table functions and broken object_table_lookup

* fix segfault and clean up code

* cleanup and tests

* try to ignore redismodule.h

* check if data_size is integer

* Minor changes to redis-module tests.

* try to exclude redismodule from clang-format

* try something different

* fix clang-format and tests

* sleep a bit

* Result table

* fix redis_module tests

* fix tests and add tests for result table

* more tests

* randomize ports

* Minor changes.

* More fixes.
2016-12-11 17:40:19 -08:00
Robert Nishihara
02e8dd245d Update gitignore. (#94) 2016-12-07 11:54:16 -08:00
mehrdadn
3714984094 Remove Redis version from Linux scripts (#56)
* Remove Redis version from Linux scripts

* Add documentation.
2016-11-21 15:02:40 -08:00
Robert Nishihara
5c50d97670 Update .gitignore file. (#7) 2016-10-28 11:40:08 -07:00
Johann Schleier-Smith
149c13913a ignoring build output and example datasets (#360) 2016-08-09 11:30:33 -07:00
Johann Schleier-Smith
3ee0fd8f34 Update cluster guide (#347)
* clarify cluster setup instructions

* update multinode documentation, update cluster script, fix minor bug in worker.py

* clarify cluster documentation and fix update_user_code
2016-08-04 09:14:20 -07:00
mehrdadn
1961deeffb Update Windows support (#317) 2016-07-28 13:11:13 -07:00
mehrdadn
84321d3f75 Ignore protobuf generated files (#200) 2016-07-02 21:07:28 -07:00
Robert Nishihara
cd2d630074 adding to gitignore 2016-06-26 10:59:16 -07:00
mehrdadn
68d752c23e Some Visual Studio gitignores (#122)
* Ignore /build directory

* Some Windows and Visual Studio .gitignores
2016-06-18 10:04:02 -07:00
Philipp Moritz
3655dfbc23 Initial commit 2016-02-07 14:18:40 -08:00