1609 commits

Author SHA1 Message Date
Richard Liaw
23954e7ce2
[tune] Tune Documentation and expose better API (#1681) 2018-03-19 12:55:10 -07:00
Philipp Moritz
7b493aa4a1 Register credis with redis (#1730) 2018-03-18 14:02:19 -07:00
Christian Barra
070e27ea7a Add external module as a node scaler. (#1703)
* WIP: add external module as a node scaler.

* Fix style.

* Add tests, fix style issues.

* Fix typos.

* Fix test error.

* Fix node provider path.

* Add function to split pkg from class.

* Add doc.

* Correct documentation.

* Debugging....

* Debugging....

* Add __init__.py to tests.

* add more output for debugging

* Add more test, fix error with import.

* Add a small detail to the documentation.

* Update autoscaler.py
2018-03-17 16:59:13 -07:00
Eric Liang
e3685fca5e
[rllib] remove redundant docs (#1728)
* wip

* more work

* fix apex

* docs

* apex doc

* pool comment

* clean up

* make wrap stack pluggable

* Mon Mar 12 21:45:50 PDT 2018

* clean up comment

* table

* Mon Mar 12 22:51:57 PDT 2018

* Mon Mar 12 22:53:05 PDT 2018

* Mon Mar 12 22:55:03 PDT 2018

* Mon Mar 12 22:56:18 PDT 2018

* Mon Mar 12 22:59:54 PDT 2018

* Update apex_optimizer.py

* Update index.rst

* Update README.rst

* Update README.rst

* comments

* Wed Mar 14 19:01:02 PDT 2018

* Fri Mar 16 15:44:27 PDT 2018
2018-03-17 14:45:04 -07:00
Richard Liaw
9b361115c3
[tune] Added Async HyperBand example (#1709) 2018-03-16 13:25:29 -07:00
Robert Nishihara
d8adfbd440 Fail build if pyarrow hasn't successfully been built. (#1721) 2018-03-16 12:02:36 -07:00
Robert Nishihara
75dc106303 Install flex and bison in autoscaler development example. (#1725) 2018-03-16 11:55:59 -07:00
Robert Nishihara
96913be939 Treat actor creation like a regular task. (#1668)
* Treat actor creation like a regular task.

* Small cleanups.

* Change semantics of actor resource handling.

* Bug fix.

* Minor linting

* Bug fix

* Fix jenkins test.

* Fix actor tests

* Some cleanups

* Bug fix

* Fix bug.

* Remove cached actor tasks when a driver is removed.

* Add more info to taskspec in global state API.

* Fix cyclic import bug in tune.

* Fix

* Fix linting.

* Fix linting.

* Don't schedule any tasks (especially actor creation tasks) on local schedulers with 0 CPUs.

* Bug fix.

* Add test for 0 CPU case

* Fix linting

* Address comments.

* Fix typos and add comment.

* Add assertion and fix test.
2018-03-16 11:18:07 -07:00
Melih Elibol
3c080f4baa
Add a callback for gcs table lookup failures. (#1702)
* Add callback to gcs client for table lookup failures.

* update plasma_manager reflecting changes to gcs callback.
2018-03-15 22:25:01 -07:00
Rohan Singh
1f027344f1 [Dataframes] Implemented .describe() (#1696)
* added describe methods

* mean updates and added truediv func

* updates

* updated truediv test

* porting stocks to ubuntu

* hacky solution for describe, mean, median, quantile by transposing df

* removed data file

* removed faulty truediv implementation

* flake8 and documentation updates

* updated mean, median, var, std to handle mixed values

* added describe methods

* mean updates and added truediv func

* updates

* updated truediv test

* porting stocks to ubuntu

* hacky solution for describe, mean, median, quantile by transposing df

* removed data file

* removed faulty truediv implementation

* flake8 and documentation updates

* fixed quantile to drop object typed columns

* syntax improvements

* fixed flatten issue

* fixing flatten issue

* minor updates

* added describe methods

* mean updates and added truediv func

* updates

* updated truediv test

* porting stocks to ubuntu

* hacky solution for describe, mean, median, quantile by transposing df

* removed data file

* removed faulty truediv implementation

* flake8 and documentation updates

* updated mean, median, var, std to handle mixed values

* added describe methods

* mean updates and added truediv func

* updates

* updated truediv test

* porting stocks to ubuntu

* hacky solution for describe, mean, median, quantile by transposing df

* removed data file

* removed faulty truediv implementation

* flake8 and documentation updates

* fixed quantile to drop object typed columns

* syntax improvements

* fixed flatten issue

* fixing flatten issue

* improved describe syntax
2018-03-15 21:16:59 -07:00
Richard Liaw
459fd5e152
[tune][minor] Move helper function (#1722) 2018-03-15 18:41:02 -07:00
Eric Liang
882a649f0c
[rllib] [docs] Cleanup RLlib API and make docs consistent with upcoming blog post (#1708)
* wip

* more work

* fix apex

* docs

* apex doc

* pool comment

* clean up

* make wrap stack pluggable

* Mon Mar 12 21:45:50 PDT 2018

* clean up comment

* table

* Mon Mar 12 22:51:57 PDT 2018

* Mon Mar 12 22:53:05 PDT 2018

* Mon Mar 12 22:55:03 PDT 2018

* Mon Mar 12 22:56:18 PDT 2018

* Mon Mar 12 22:59:54 PDT 2018

* Update apex_optimizer.py

* Update index.rst

* Update README.rst

* Update README.rst

* comments

* Wed Mar 14 19:01:02 PDT 2018
2018-03-15 15:57:31 -07:00
Devin Petersohn
c19c2a4e60 [DataFrame] readthedocs page for Pandas on Ray (#1714) 2018-03-13 22:23:50 -07:00
Robert Nishihara
adffc7bfea Pin Cython to 0.27.3 to fix travis builds. (#1718) 2018-03-13 22:18:08 -07:00
Devin Petersohn
8c1066cdba
[DataFrame] Implemented cummax, cummin, cumsum, cumprod (#1705)
* cummax, cummin, cumsum, cumprod

* added remote function

* Fix lint

* Fixing tests and linting

* Fix lint
2018-03-13 10:06:34 -07:00
Jae Min Kim
737120952e [Dataframes] Reorganization (#1676)
* moved helper functions for dataframes into df_utils

* Updating base on review comments

* fixed bug with from_pandas

* Updating formatting

* Fix lint
2018-03-12 19:13:33 -07:00
Peter Veerman
6455ec934b [DataFrame] Implements DataFrame.rename, DataFrame.rename_axis, and Index.set_names (#1573)
* Index update

* Fixed transpose bug with nan values

* Fix lint

* Add rename tests

* Implement DataFrame.rename, DataFrame.rename_axis, and Index.set_names

* Temp

* Fixing rename for new index implementation

Fix rebase merges

* Fix rename and rename_axis to work with new index.

Re-add pytest fixture

Clean up rebase artifacts

Remove index.py file

* Addressing minor points

* Addressing comments
2018-03-12 19:05:32 -07:00
Robert Nishihara
15a4392156 Add instructions for pip installing the latest wheel. (#1672) 2018-03-12 00:52:00 -07:00
Eric Liang
076936a7f5
[rllib] Switch DQN to using deepmind wrappers (#1655)
* deepmind wrap

* use 80x80

* respect custom prep

* fix replay size

* fix check

* batch idx

* Wed Mar  7 11:00:39 PST 2018

* random starts and reward clipping

* Fri Mar  9 17:27:17 PST 2018

* Fri Mar  9 17:36:15 PST 2018

* Sat Mar 10 19:47:10 PST 2018

* Sat Mar 10 19:47:37 PST 2018

* Sat Mar 10 20:05:12 PST 2018

* Sat Mar 10 20:54:21 PST 2018

* Sat Mar 10 21:03:52 PST 2018
2018-03-11 21:14:38 -07:00
Stephanie Wang
6114b6d20e
Implement the client table for the new GCS (#1674)
* Add subscription callback to CallbackData

* Implement ClientTable

* Hook up ClientTable to AsyncGCSClient

* Add client_info to GCSClient Connect interface

* client table callbacks

* Unit test for client table

* Doc

* Fix idempotency check

* Fix mac build

* Fix memory issues in gcs client test

* Fix disconnection bug

* lint
2018-03-11 19:17:18 -07:00
Robert Nishihara
cae108d019 Replace special single quote with regular single quote. (#1693) 2018-03-10 20:36:01 -08:00
Richard Liaw
40799fee37
[autoscaler] Fix Defaults (#1661) 2018-03-09 16:59:21 -08:00
Peter Veerman
2b747ba46c [DataFrame] Implement .fillna(), .ffill(), .bfill(), .eval(), and .drop() (#1544)
* Implement ray.DataFrame.drop w/ tests

* Implement ray.DataFrame.eval w/ tests

Fix flake8 issues

* Fix flake8 issues in dataframe.py

* Implement fillna

* Implement fillna

* Implement ffill and bfill

* Define helper functions outside of method invocation

* Implement ray.DataFrame.eval w/ tests

* Index update

* Fixed transpose bug with nan values

* Fix lint

* Implement fillna

* Use ray index to check if labels exist in df

* Fix ValueError catching

* Remove duplicate test methods

* Add documentation for .fillna(), .ffill(), .bfill(), .eval(), and .drop()

Fix flake8 errors

* Remove notebook files

* Change fillna, eval, drop to use new index type

* Fix documentation for fillna, eval and drop

temp

Temp

temp

temp

temp

* Update drop to work with new type of ray index

* Fix flake8 errors

* Refactor fillna fix for index
2018-03-09 07:37:27 -08:00
Philipp Moritz
7193107f32 fix build on macOS (#1687) 2018-03-08 23:23:21 -08:00
Philipp Moritz
5ef0892236 Compile boost from source to fix macOS wheels (#1688) 2018-03-08 23:22:23 -08:00
Alexey Tumanov
91464a56dd [XRay] Raylet node and object manager unification/backend redesign. (#1640)
* directory for raylet

* some initial class scaffolding -- in progress

* node_manager build code and test stub files.

* class scaffolding for resources, workers, and the worker pool

* Node manager server loop

* raylet policy and queue - wip checkpoint

* fix dependencies

* add gen_nm_fbs as target.

* object manager build, stub, and test code.

* Start integrating WorkerPool into node manager

* fix build on mac

* tmp

* adding LsResources boilerplate

* add/build Task spec boilerplate

* checkpoint ActorInformation and LsQueue

* Worker pool maintains started and removed workers

* todos for e2e task assignment

* fix build on mac

* build/add lsqueue interface

* channel resource config through from NodeServer to LsResources; prep LsResources to replace/provide worker_pool

* progress on LsResources class: resource availability check implementation

* Read task submission messages from a client

* Submit tasks from the client to the local scheduler

* Assign a task to a worker from the WorkerPool

* change the way node_manager is built to prevent build issues for object_manager.

* add namespaces. fix build.

* Move ClientConnection message handling into server, remove reference to WorkerPool

* Add raw constructors for TaskSpecification

* Define TaskArgument by reference and by value

* Flatbuffer serialization for TaskSpec

* expand resource implementation

* Start integrating TaskExecutionSpecification into Task

* Separate WorkerPool from LsResources, give ownership to NodeServer

* checkpoint queue and resource code

* resolving merge conflicts

* lspolicy::schedule ; adding lsqueue and lspolicy to the nodeserver

* Implement LsQueue RemoveTasks and QueueReadyTasks

* Fill in some LsQueue code for assigning a task

* added support for test_asio

* Implement LsQueue queue tasks methods, queue running tasks

* calling into policy from nodeserver; adding cluster resource map

* Feedback and Testing.
Incorporate Alexey's feedback. Actually test some code. Clean up callback imp.

* end to end task assignment

* Decouple local scheduler from node server

* move TODO

* Move local scheduler to separate file

* Add scaffolding for reconstruction policy, task dependency manager, and object manager

* fix

* asio for store client notifications.
added asio for plasma store connection.
added tests for store notifications.
encapsulate store interaction under store_messenger.

* Move Worker inside of ClientConnection

* Set the assigned task ID in the worker

* Several changes toward object manager implementation.
Store client integration with asio.
Complete OM/OD scaffolding.

* simple simulator to estimate number of retry timeouts

* changing dbclientid --> clientid

* fix build (include sandbox after it's fixed).

* changes to object manager, adding lambdas to the interface

* changing void * callbacks to std::function typed callbacks

* remove use namespace std from .h files.
use ray:: for Status everywhere.

* minor

* lineage cache interfaces

* TODO for object IDs

* Interface for the GCS client table

* Revert "Set the assigned task ID in the worker"

This reverts commit a770dd31048a289ef431c56d64e491fa7f9b2737.

* Revert "Move Worker inside of ClientConnection"

This reverts commit dfaa0d662a76976c05be6d76b214b45d88482818.

* OD/OM: ray::Status

* mock gcs integration.

* gcs mock clientinfo assignment

* Allow lookup of a Worker in the WorkerPool

* Split out Worker and ClientConnection source files

* Allow assignment of a task ID to a worker, skeleton for finishing a task

* integrate mock gcs with om tests.

* added tcp connection acceptor

* integrated OM with NM.
integrated GcsClient with NM.
Added multi-node integration tests.

* OM to receive incoming tcp connections.

* implemented object manager connection protocol.

* Added todos.

* slight adjustment to add/remove handler invocation on object store client.

* Simplify Task interface for getting dependencies

* Remove unused object manager file

* TaskDependencyManager tracks missing task dependencies and processes object add notifications

* Local scheduler queues tasks according to argument availability

* Fill in TaskSpecification methods to get arguments

* Implemented push.

* Queue tasks that have been scheduled but that are waiting for a worker

* Pull + mock gcs cleanup.

* OD/OM/GCS mock code review, fixing unused-result issues, eliminating copy ctor

* Remove unique_ptr from object_store_client

* Fix object manager Push memory error

* Pull task arguments in task dependency manager

* Add a demo script for remote task dependencies

* Some comments for the TaskDependencyManager

* code cleanup; builds on mac

* Make ClientConnection a templated type based on the connection protocol

* Add gmock to build

* Add WorkerPool unit tests

* clean up.

* clean up connection code.

* instantiate a template instance in the module

* Virtual destructors

* Document public api.

* Separate read and write buffers in ClientConnection; documentation

* Remove ObjectDirectory from NodeServer constructor, make directory InitGcs call a separate constructor

* Convert NodeServer Terminate to a destructor

* NodeServer documentation

* WorkerPool documentation

* TaskDependencyManager doc

* unifying naming conventions

* unifying naming conventions

* Task cleanup and documentation

* unifying naming conventions

* unifying naming conventions

* code cleanup and naming conventions

* code cleanup

* Rename om --> object_manager

* Merge with master

* SchedulingQueue doc

* Docs and implementation skeleton for ClientTable

* Node manager documentation

* ReconstructionPolicy doc

* Replace std::bind with lambda in TaskDependencyManager

* lineage cache doc

* Use \param style for doc

* documentation for scheduling policy and resources

* minor code cleanup

* SchedulingResources class documentation + code cleanup

* referencing ray/raylet directory; doxygen documentation

* updating trivial policy

* Fix bug where event loop stops after task submission

* Define entry point for ClientManager for handling new connections

* Node manager to node manager protocol, heartbeat protocol

* Fix flatbuffer

* Fix GCS flatbuffer naming conflict

* client connection moved to common dir.

* rename based on feedback.

* Added google style and 90 char lines clang-format file under src/ray.

* const ref ClientID.

* Incorporated feedback from PR.

* raylet: includes and namespaces

* raylet/om/gcs logging/using

* doxygen style

* camel casing, comments, other style; DBClientID -> ClientID

* object_manager : naming, defines, style

* consistent caps and naming; misc style

* cleaning up client connection + other stylistic fixes

* cmath, std::nan

* more style polish: OM, Raylet, gcs tables

* removing sandbox (moved to ray-project/sandbox)

* raylet linting

* object manager linting

* gcs linting

* all other linting


Co-authored-by: Melih <elibol@gmail.com>
Co-authored-by: Stephanie <swang@cs.berkeley.edu>
2018-03-08 12:53:24 -08:00
Eric Liang
d85274a12e [docs] update to expose libraries + landing page (#1642) 2018-03-08 09:18:09 -08:00
Eric Liang
75e825177f [rllib] Move Ape-X metrics behind a debug flag and remove some of them (#1656) 2018-03-08 00:48:49 -08:00
Robert Nishihara
b0510ee461 Give error when actor is created before ray.init. (#1666) 2018-03-07 10:36:49 -08:00
Philipp Moritz
a9acfab3a6 Start chain replicated GCS with Ray (#1538) 2018-03-07 10:18:58 -08:00
James Lamb
6dbf4f6318 Remove vim from base-deps container and reduce number of build layers (#1667) 2018-03-07 10:16:08 -08:00
Rohan Singh
0abebb0975 [Dataframes] Implement .__len__(), .__contains__(), .first_valid_index(), and .last_valid_index() (#1664)
* added len, contains, first_valid_index, last_valid_index

* fixed contains test cases

* test files updated for PR
2018-03-06 23:56:11 -08:00
Devin Petersohn
4af42d5bb6 [DataFrame] Adding error checking for pandas version (#1662)
* Adding error checking for pandas version

* Addressing comments
2018-03-06 09:57:49 -08:00
Stephanie Wang
0a6edb55a8 Implement the Subscribe call for the new GCS API (#1652)
* Implement the Subscribe call for the new GCS API

* Document tests

* Upper case function name

* Fix build errors

* lint
2018-03-06 09:56:12 -08:00
butchcom
936bebef99 [rllib] Upgrade to OpenAI Gym 0.10.3 (#1601) 2018-03-06 00:31:02 -08:00
Richard Liaw
162d063f0d
[autoscaler/tune] Optional YAML Fields + Fix Pretty Printing for Tune (#1541) 2018-03-04 23:35:58 -08:00
Richard Liaw
061e435411
[rllib] Fix eval.py -> rollout.py (#1650) 2018-03-04 14:59:16 -08:00
Philipp Moritz
a683cf2c70 Gcs Asio integration (#1633) 2018-03-04 14:51:04 -08:00
Richard Liaw
78716094b5
[tune] Async Hyperband (#1595) 2018-03-04 14:05:56 -08:00
Eric Liang
ecb811c26e
[rllib] Ape-X implementation and DQN refactor to handle replay in policy optimizer (#1604)
* minimal apex checkin

* cleanup dqn options

* actor utils

* Sun Feb 25 17:39:54 PST 2018

* update

* compression refactor

* fix

* add test

* fix models

* Sun Feb 25 21:46:27 PST 2018

* Wed Feb 28 10:26:34 PST 2018

* Wed Feb 28 10:28:09 PST 2018

* Wed Feb 28 10:42:59 PST 2018

* refactor

* Wed Feb 28 11:17:19 PST 2018

* Wed Feb 28 11:42:08 PST 2018

* Wed Feb 28 11:42:13 PST 2018

* Wed Feb 28 11:59:02 PST 2018

* Wed Feb 28 11:59:58 PST 2018

* Wed Feb 28 12:00:08 PST 2018

* Wed Feb 28 12:02:19 PST 2018

* Wed Feb 28 13:44:31 PST 2018

* Wed Feb 28 17:01:20 PST 2018

* Sat Mar  3 14:55:59 PST 2018

* make optimizer construction explicit

* Sat Mar  3 18:23:08 PST 2018

* Sat Mar  3 18:24:28 PST 2018

* Sat Mar  3 18:49:28 PST 2018

* Sat Mar  3 18:50:42 PST 2018

* Sat Mar  3 18:56:10 PST 2018
2018-03-04 12:25:25 -08:00
Eric Liang
9b33f3a7b7
[autoscaler] Bad error message when dict field omitted (#1632)
* Wed Feb 28 23:22:55 PST 2018

* Wed Feb 28 23:24:07 PST 2018
2018-03-03 20:25:58 -08:00
Eric Liang
75293a0ba0
[rllib] Basic regression tests on CartPole (#1608)
* Sun Feb 25 21:36:22 PST 2018

* Sun Feb 25 21:42:09 PST 2018

* Sun Feb 25 21:44:30 PST 2018

* fix lint

* Wed Feb 28 12:41:49 PST 2018
2018-03-03 16:27:56 -08:00
Eric Liang
80d7def9dc
[autoscaler] [tune] More doc fixes (#1560)
* Fri Feb 16 13:53:50 PST 2018

* Sat Feb 17 15:32:08 PST 2018

* Sat Feb 17 15:44:59 PST 2018

* fix

* Sun Feb 18 14:46:24 PST 2018

* Sun Feb 18 14:46:37 PST 2018

* Sun Feb 18 14:55:52 PST 2018

* Sun Feb 18 15:14:32 PST 2018

* Wed Feb 21 17:34:17 PST 2018

* Sun Feb 25 17:51:17 PST 2018

* Sun Feb 25 22:18:40 PST 2018

* Wed Feb 28 13:19:05 PST 2018

* Wed Feb 28 13:22:13 PST 2018

* Wed Feb 28 13:33:29 PST 2018

* Wed Feb 28 13:35:33 PST 2018

* add ex

* Fri Mar  2 12:50:17 PST 2018

* Fri Mar  2 12:54:31 PST 2018
2018-03-03 13:01:49 -08:00
Richard Liaw
96d7938fc4 [tune] Hyperband Max Iter Fix (#1620)
* nits

* cumul r

* docs

* min
2018-03-03 13:00:55 -08:00
Kunal Gosar
6685d4c446 fix tail and finish repr and str (#1628) 2018-03-02 15:26:54 -08:00
Zhenyu Guo
f1e5789c26 restructure how to organize 3rd party libs (#1630)
* restructure how to organize 3rd party libs

* Minor whitespace changes.

* Fix compilation on Linux.

* Pass around Python executable so that the correct version of Python is used.
2018-03-01 14:29:56 -08:00
Robert Nishihara
ec9dfe7748 Allow setting INCLUDE_UI=0 to disable building the UI. (#1618) 2018-03-01 02:17:15 -08:00
Robert Nishihara
1222d09224 Fix dataframe test linting and test. (#1629) 2018-02-28 15:21:49 -08:00
Robert Nishihara
0fcceef772 Update logging and check macros. (#1627)
* Update logging and check macros.

* Fix linting.

* Fix RAY_DCHECK and unused variable.

* Fix linting
2018-02-28 15:13:00 -08:00
Devin Petersohn
e7df293946 [DataFrames] Updating Error messages to encourage contribution. (#1623) 2018-02-27 21:44:33 -08:00