Commit graph

1664 commits

Author SHA1 Message Date
Robert Nishihara
3c76461b22 Remove smart_open install. (#1943) 2018-04-23 23:18:09 -07:00
Devin Petersohn
1d1df7bbec [DataFrame] Fully implement append, concat and join (#1932) 2018-04-23 17:09:57 -07:00
Kunal Gosar
29c36f2bce [DataFrame] Fix for __getitem__ string indexing (#1939)
* edge case fixes for __getitem__

* Enable None indexing
2018-04-23 13:13:14 -07:00
Kunal Gosar
7c9f39241e [DataFrame] Implementing write methods (#1918)
* Add in write methods and functionality

* infer highest available pickle version

* Fix import rebase artifact

* formatting changes to test

* fix lint
2018-04-22 21:25:33 -07:00
Roy Fox
baf97e450b [rllib] arr[end] was excluded when end is not None (#1931)
Looks good, thanks!
2018-04-22 15:12:55 -07:00
Devin Petersohn
8f59546ef2 [DataFrame] Implementing API correct groupby with aggregation methods (#1914) 2018-04-21 17:28:16 -07:00
Melih Elibol
8264e64b18 Handle interrupts correctly for ASIO synchronous reads and writes. (#1929)
* handle interrupts correctly.

* linting

* handle interrupts on read_some/write_some.
2018-04-20 22:55:40 -07:00
adgirish
3c48783a16 [DataFrame] Adding read methods and tests (#1712)
* Adding read methods and tests

* Referencing internal partition method so constructors are more canonical with Pandas

* Fixing to reference from_pandas in utils

* Cleaning up unused imports

* rerunning tests

* fixing flake8

* resolving errors

* Added sql and sas test

* updating

* Temporarily phasing out read_csv code for wrapper while diagnosing, added io tests to travis

* Adding travis

* restoring distributed read csv

* resolving rebases

* lint

* Sampling out HD test

* adding dep

* fix pathing

* Flagging out tests

* resolving read_method issues

* fix build issue

* move additional dependencies to extras

* fixing lint

* removing IO dependencies

* updated requirements doc
2018-04-20 18:33:08 -07:00
Robert Nishihara
cffda73da1 Allow task_table_update to fail when tasks are finished. (#1927)
* Allow task_table_update to fail when tasks are finished.

* Add comment.
2018-04-20 11:34:29 -07:00
Jones Wong
c9a7744e52 [rllib] Contribute DDPG to RLlib (#1877)
* ongoing ddpg

* ongoing ddpg converged

* gpu machine changes

* tuned

* tuned ddpg specification

* ddpg

* supplement missed optimizer argument clip_rewards in default DQN configuration

* ddpg supports vision env (atari) now

* revised according to code review comments

* added regression test case

* removed irrelevant files

* validate ddpg on mountain_car_continuous

* restore unnecessary slight changes

* revised according to eric's comments

* added the requested tests

* revised accordingly

* revised accordingly and re-validated

* formatted by yapf

* fix lint errors

* formatted by yapf

* fix lint errors

* formatted by yapf

* fix lint error
2018-04-19 22:36:29 -07:00
Stephanie Wang
aa07f1ce4e [xray] Workers blocked in a ray.get release their resources (#1920)
* [xray] Throttle task dispatch by required resources
* Pass in number of initial workers into raylet command
* Workers blocked in a ray.get release resources
2018-04-18 20:59:58 -07:00
Alexey Tumanov
1c965fcfeb Raylet task dispatch and throttling worker startup (#1912)
* separate task placement and task dispatch; throttle task dispatch with locally available resources

* keep track of workers being started/in flight and suppress starting extraneous workers

* cleanup comments

* remove early termination in task dispatch to support zero-resource actor tasks

* info -> debug

* add documentation

* linting

* mock the worker pool for testing

* some linting

* kill all workers in flight; clear the worker pool in dtor

* remove fixed todo

* lint
2018-04-18 10:58:11 -07:00
Omkar Salpekar
0728d4719b [DataFrame] Eval fix (#1903)
* eval now works without assignment - helper function a bit hacky

* removed df.copy() from eval_helper

* one test still failing for query

* all eval tests passing now

* added check to eval args verification

* added tests to travis

* added optimization and some comments

* added pd.eval and passes all tests

* added ray dataframe back to test file

* optimizations and code cleanup for eval

* changed position of pandas import in __init__

* fixed linting errors

* fixing eval in __init__.py

* fixed travis file - removed extra tests

* removed test directory from linting exclude for travis
2018-04-18 08:48:32 -07:00
Richard Liaw
f833e4da37 [tune] Polishing docs (#1846) 2018-04-17 09:57:35 -07:00
Eric Liang
7ab890f4a1 [tune] [rllib] Automatically determine RLlib resources and add queueing mechanism for autoscaling (#1848) 2018-04-16 16:58:15 -07:00
Stephanie Wang
2e25972d4d Preemptively push local arguments for actor tasks (#1901) 2018-04-16 16:26:59 -07:00
Eric Liang
ed8c0f1a38 [tune] Allow fetching pinned objects from trainable functions (#1895)
* updates

* lint

* Update util.py

* Update function_runner.py

* updates
2018-04-16 15:54:38 -07:00
Melih Elibol
ddfc875149 Multithreading refactor for ObjectManager. (#1911)
* removes transfer service. adds separate pool for sends and receives.

* get rid of send/receive transfer counts.

* update comment.

* remove clang formatting.

* clang formatting.
2018-04-16 15:51:53 -07:00
Devin Petersohn
3c817ad908 Add slice functionality (#1832) 2018-04-16 08:50:56 -07:00
Patrick Yang
f505f0642f [DataFrame] Pass read_csv kwargs to _infer_column (#1894)
* pass kwargs to _infer_column

* adding small test for non-comma delim

* fix lint
2018-04-16 08:47:30 -07:00
Melih Elibol
cff37765b1 Addresses missed comments from multichunk object transfer PR. (#1908)
* Move object manager parameters to ray config,
object manager config bug fix.
addresses other comments from #1827.

* linting and uint?

* typos

* remove uint.
2018-04-15 21:35:51 -07:00
Robert Nishihara
6ca2c2a609 Allow numpy arrays to be passed by value into tasks (and inlined in the task spec). (#1816)
* Allow numpy arrays and larger objects to be passed by value in task specifications.

* Fix bug.

* Fix bug. Inline all bug numpy object arrays.

* Increase size limit for inlining args in task spec.

* Give numpy init different signatures in Python 2 and Python 3.

* Simplify code.

* Fix test.

* Use import_array1 instead of import_array.
2018-04-15 20:36:01 -07:00
Stephanie Wang
6bd944ae0d [xray] Lineage cache requests notifications from the GCS about remote tasks (#1834)
* Add PubsubInterface to GCS tables

* Add task table PubsubInterface to lineage cache and tests

* Request notifications for remote tasks in the lineage cache

* Add RegisterGCS method to node manager

* Fix NodeManager member initialization order, subscribe to task table notifications

* Comments

* Use returned statuses.

* Fix double commit bug in lineage cache

* lint

* More linting.

* Fix pure virtual method declarations
2018-04-15 20:16:55 -07:00
Robert Nishihara
7792032ee3 Fix UI issue for non-json-serializable task arguments. (#1892)
* Fix UI issue for non-json-serializable task arguments.

* Simplify approach.
2018-04-15 13:54:42 -07:00
Robert Nishihara
3383553dc0 Remove unnecessary calls to .hex() for object IDs. (#1910) 2018-04-15 13:52:51 -07:00
Robert Nishihara
6f8b81d9e5 Allow multiple raylets to be started on a single machine. (#1904) 2018-04-15 13:51:19 -07:00
Stephanie Wang
4b655b0ff6 [xray] Turn on flushing to the GCS for the lineage cache (#1907) 2018-04-14 23:40:56 -07:00
Melih Elibol
fcd30444a8 Single Big Object Parallel Transfer. (#1827)
* cache all object info from object added store notification.

* Adds parallel transfer for big objects.

* documentation and clean up.

* compare objects...

* merge buffer_state with chunk vec. Make separate buffer state for get and create.

* use references for Get. Allow partial failure of Create.

* single plasma client.

* changes based on review.

* update documentation and add parameters for object manager in main.cc.

* review feedback.

* use vector constructor.

* linting

* remove profile visualizations.

* test fixes.

* linting.

* kill specific pids and use less memory.

* linting.

* simplify tests.

* Asynchronous IO for ObjectManager messages and object transfer.

* Revert "Asynchronous IO for ObjectManager messages and object transfer."

This reverts commit 4af43b159babc04daf80d1543e27c2cb46b7b19d.

* update test configuration to reflect changes in #1891

* review feedback.

* linting.
2018-04-14 17:08:19 -07:00
Melih Elibol
6a84b1f26e Remove num_threads as a parameter. (#1891)
* remove num_threads as a parameter.

* linting.

* add additional checks.

* Invoke TransferCompleted on failures.

* Fix issue with failed Gets on store.

* ray check status of writing object headers.

* fix mac issues.
2018-04-14 15:22:59 -07:00
Melih Elibol
6be73350c6 Adds Valgrind tests for multi-threaded object manager. (#1890)
* adds valgrind to new object manager.

* Add some comments.

* Update run_object_manager_valgrind.sh

typo

* Update run_object_manager_tests.sh

* update tests to reflect changes in #1891.

* reduce # tests.
2018-04-13 21:56:12 -07:00
Robert Nishihara
4379e9cea0 Pin cython version in docker base dependencies file. (#1898) 2018-04-13 20:33:20 -07:00
Robert Nishihara
24c944e499 Update arrow to efficiently serialize more types of numpy arrays. (#1889) 2018-04-13 09:41:51 -07:00
Eric Liang
e4b17e03f6 updates (#1896) 2018-04-13 00:57:00 -07:00
Peter Schafhalter
1d605e8f8a [DataFrame] Inherit documentation from Pandas (#1727)
* Added _inherit_docstrings

* DataFrame documentation inherits from Pandas

* Fix formatting

* Replace hasattr and document properties

* Fix rebase

* Override documentation for groupby

* Override documentation for series

* Don't overwrite property docstrings

* Fix property __doc__ for python2
2018-04-12 20:30:19 -07:00
Robert Nishihara
d0fffec2d0 Update arrow and parquet-cpp. (#1875)
* Update arrow.

* Fix bug.

* Cherry-pick commit for fixing parquet segfault.

* Update arrow and revert auto-releasing buffer commit.

* Remove parquet cherry-pick.
2018-04-12 16:17:12 -07:00
Alexey Tumanov
39cf6ff6e1 raylet command line resource configuration plumbing (#1882)
* raylet command line resource configuration plumbing

* Small changes.
2018-04-12 02:37:15 -07:00
Alexey Tumanov
85d3963172 use raylet for remote ray nodes (#1880) 2018-04-11 22:06:46 -07:00
Eric Liang
4dc04374f6 [rllib] Propagate dim option to deepmind wrappers (#1876)
* updates

* updates
2018-04-11 21:38:06 -07:00
alvkao58
15a668dd12 [RLLib] DDPG (#1685) 2018-04-11 15:08:39 -07:00
Philipp Moritz
74162d1492 Lint Python files with Yapf (#1872) 2018-04-11 10:11:35 -07:00
Omkar Salpekar
a3ddde398c [DataFrame] Fixed repr, info, and memory_usage (#1874)
* working with dataframes with too many rows and columns

* repr works for jupyter notebooks now

* added comments and test file

* added repr test file to .travis.yml

* added back ray.dataframe as pd to test file

* fixed pandas importing issues in test file

* getting the front and back of df more efficiently

* only keeping dataframe tests in travis

* fixing numpy array for row and col lengths issue

* doesn't add dimensions if df is small enough

* implemented memory_usage()

* completed memory_usage - still failing 2 tests

* only failing one test for memory_usage

* all repr and dataframes tests passing now

* fixing error related to python2 in info()

* fixing python2 errors

* fixed linting errors

* using _arithmetic_helper in memory_usage()

* fixed last lint error

* removed testing-specific code

* adding back travis test

* removing extra tests from travis

* re-added concat test

* fixes with new indexing scheme

* code cleanup

* fully working with new indexing scheme

* added tests for info and memory_usage

* removed test file
2018-04-11 08:07:07 -07:00
Devin Petersohn
806b2c844e Fix getattr compat (#1871) 2018-04-10 21:28:59 -07:00
alonamid
202f9683ea check if arrow build dir exists (#1863) 2018-04-10 14:52:51 -07:00
Patrick Yang
521b549e4a [DataFrame] Encapsulate index and lengths into separate class (#1849)
* baseline impl for index_df.py

* added skeleton for index_df.py

* initial impl index_df

* separate out partition and non-partition impls

* add len function

* drop returns index_df slice of dropped indices

* housecleaning

* Integrate index overhaul

* Rename index df to index metadata

* Fix flake8 issues

* Addressing issues

* fix import issue

* Added metadata passing to constructor
2018-04-10 14:30:20 -07:00
Peter Schafhalter
405b05d58a [DataFrame] Implemented __getattr__ (#1753)
* __getattr__ accesses columns

* Added test
2018-04-10 10:19:33 -07:00
Richard Liaw
e82bea40b1 Add better analytics to docs (#1854) 2018-04-10 00:51:44 -07:00
adgirish
efeaacbedc Adding support for concat (#1739)
adding tests

fixing flake8

adding init

flake 8 on test

fixing tests, imports, and flake8

handling for index

adding tests for row, index

added more robust error handling for axis

fixing test failures

cleaning up errors for 2.7

updating travis

resolving import

fixing flake8

moved import order

Fixing to refactor and delaying implementing ray-pd inner concat

resolving ray-pd concat and from_pandas mutation

Revert "resolving ray-pd concat and from_pandas mutation"

This reverts commit 5db43e4e89e328286532f3ef98a4526575c5d08d.
2018-04-09 21:36:24 -07:00
Philipp Moritz
3039cca242 add facility to link libraries to tests (#1850) 2018-04-09 18:59:24 -07:00
Philipp Moritz
834e594709 [XRay] Register object store and raylet with the GCS (#1860) 2018-04-09 18:56:33 -07:00
Robert Nishihara
7c9e291b4b In the UI, display task breakdowns by default. (#1857) 2018-04-09 13:24:38 -07:00