1829 commits

Author SHA1 Message Date
Hao Chen
5b015f9a79 Remove the check of java primitive types (#2495) 2018-07-27 11:44:19 -07:00
Shuo
29451cca82 Add test: running a driver for twice. (#2464) 2018-07-27 00:57:52 -07:00
Zhijun Fu
9ad6a973a0 [xray] lineage optimization: avoid unnecessary lineage entry allocation & free (#2463)
* merge from ray

* Revert "merge from ray"

This reverts commit 32b181ebbb1fa184026631e1a7368112c4c3118d.

* [xray] avoid unnecessary lineage entry allocation & free

* address comments

* address review comments

* address comments
2018-07-26 10:44:38 -04:00
Yuhong Guo
46351957bb Fix MAC_WHEELS=1 (#2477) 2018-07-25 14:57:28 -07:00
Yuhong Guo
b35ce5dbf1 Update Arrow Package with breaking changes (#2440)
* Merge the breaking change of Arrow Package.

* Fix typo

* Fix lint.

* put forward declarations into header

* fix

* add protocol.h

* fix linting
2018-07-25 14:28:33 -07:00
Richard Liaw
7edc677304 [rllib] Extra Changes for Usability (#2363) 2018-07-24 20:51:22 -07:00
Sergey Kolesnikov
05490b8cb9 [rllib] dqn/ddpg policy customization (#2445)
* dqn policy update - more customization

* docs for custom DQN graph

* Update rllib-training.rst

* Update rllib-models.rst

* Update rllib.rst

* Update rllib-training.rst

* Update rllib-concepts.rst

* yapf codestyle
2018-07-22 14:47:14 -07:00
Eric Liang
68660453e4 [rllib] Better support and add two-trainer example for multiagent (#2443)
This adds a simple DQN+PPO example for multi-agent. We don't do anything fancy here, just syncing weights between two separate trainers. This potentially is wasting some compute, but is very simple to set up.

It might be nice to share experience collection between the top-level trainers in the future.
2018-07-22 05:09:25 -07:00
Shuo
99d0d96aef Use different serialization context for each driver. (#2406) 2018-07-20 23:42:49 -07:00
Hao Chen
05f485e274 Allow Ray API to be used from multiple threads (#2422) 2018-07-20 15:39:01 -07:00
Robert Nishihara
4b6157ed09 Remove link to install Linux Python 3.3 wheel. (#2434) 2018-07-20 15:15:43 -07:00
nam-cern
c0b4c3b6cf Use absolute path to get to thirdparty dir (#2442)
* Use absolute path to get to thirdparty dir

In case this script is executed from a different directory than the Ray's directory, the `pushd` will fail. This commit uses absolute path to `thirdparty` directory.

* Update setup_thirdparty.sh
2018-07-20 15:12:25 -07:00
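The failure described in the commit above (a relative `pushd` that only works when the script is launched from the repository root) is a general one. A minimal Python sketch of the same idea follows, assuming a hypothetical layout in which a `thirdparty` directory sits next to the script. The helper name `thirdparty_dir` is illustrative only and is not part of Ray or of `setup_thirdparty.sh`.

```python
from pathlib import Path


def thirdparty_dir(script_path):
    """Return the absolute path of a `thirdparty` directory beside a script.

    The result is derived from the script's own location, so it does not
    change with the caller's current working directory.
    """
    return Path(script_path).resolve().parent / "thirdparty"
```

A script would typically pass its own `__file__` to such a helper, so that the lookup succeeds from any working directory.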
Eric Liang
807f309b3a [test] Fix broken rllib test (#2446)
This fixes the broken build.
2018-07-20 13:47:41 -07:00
Philipp Moritz
e821f852ef [xray] Silence some object manager logging (#2437) 2018-07-20 13:10:03 -07:00
Peter Schafhalter
2a3b02649a Add queue test to xray tests (#2433) 2018-07-19 17:18:13 -07:00
Peter Schafhalter
400a3e5705 Add queue size and __len__ methods (#2432) 2018-07-19 17:04:42 -07:00
Peter Schafhalter
4225ac5081 Add benchmark using queue (#2431) 2018-07-19 16:43:22 -07:00
Eric Liang
8e75d150f7 [rllib] Apex crash when compress_observations: False (#2426)
We shouldn't try to decompress uncompressed data.

Also, fix resource requests for ddpg + GPU.
2018-07-19 15:58:09 -07:00
Eric Liang
d01dc9e22d [rllib] format with yapf (#2427)
* initial yapf

* manual fix yapf bugs
2018-07-19 15:30:36 -07:00
Robert Nishihara
24eb140e07 Remove redundant reconstruct call. (#2421) 2018-07-19 11:22:02 -07:00
Robert Nishihara
eed39163f9 Add callback to node manager for client removed event. (#2417)
* Add callback to node manager for client removed event.

* Fix linting.
2018-07-18 16:59:04 -07:00
Robert Nishihara
991d0911d1 Move profile data flushing to background thread on workers. (#2415)
* Move profile data flushing to background thread on workers.

* Remove outdated comment.
2018-07-18 12:34:53 -07:00
Philipp Moritz
4c82ac72df Upgrade arrow to include the plasma TensorFlow op (#2412) 2018-07-18 12:33:02 -07:00
Wang Qing
344e3d2c05 Fix bug: Init RayLog before using it. (#2408) 2018-07-18 00:44:37 -07:00
Eric Liang
f31a6ca965 [rllib] Count actual sample batch size instead of configured batch size in A3C. (#2399)
This fixes a metrics accounting bug where the sample count is not reported correctly.
2018-07-18 08:59:52 +02:00
Richard Liaw
8e8c733696 [tune] Fix Categorical Space + Add Keras Example (#2401)
Previously did not properly resolve categorical variables for HyperOpt.
2018-07-17 23:52:52 +02:00
Yuhong Guo
e3badb9b09 Fix that parquet and arrow will build every time. (#2405)
* Fix the bug that parquet and arrow will build every time.

* Update build_arrow.sh

* Update build_arrow.sh
2018-07-16 22:56:14 -07:00
Eric Liang
0cecf6b79c [rllib] Cleanup RNN support and make it work with multi-GPU optimizer (#2394)
Cleanup: TFPolicyGraph now automatically adds loss input entries for state_in_*, so that graph sub-classes don't need to worry about it.

Multi-GPU support:

* Allow setting up model tower replicas with existing state input tensors

* Truncate the per-device minibatch slices so that they are always a multiple of max_seq_len.
2018-07-17 06:55:46 +02:00
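The truncation rule in the commit above (per-device minibatch slices are always a multiple of max_seq_len) comes down to integer arithmetic. A minimal sketch, assuming the slice size and `max_seq_len` are plain integers; the function name `truncate_to_seq_multiple` is hypothetical and is not taken from RLlib:

```python
def truncate_to_seq_multiple(slice_size, max_seq_len):
    """Return the largest multiple of max_seq_len that is <= slice_size."""
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    # Floor division drops the partial sequence at the end of the slice.
    return (slice_size // max_seq_len) * max_seq_len
```

For instance, a slice of 50 rows with `max_seq_len` of 20 would be cut to 40 rows, so that no sequence is split across a slice boundary.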
Robert Nishihara
1b645fcc8b Add parameter server blog post. (#2398)
* Saving work on parameter server blog post.

* Updates

* Updates to blog post.

* Add notes about tasks and actors.

* Updates

* Add RLlib paper link

* Update intro

* Address comments.

* More fixes.

* Clarify ray.get

* Change date

* Add @ray.remote clarification.

* Update site deployment instructions.

* Minor wording
2018-07-16 21:51:39 -07:00
Peter Schafhalter
f5c46c7765 Add queue data structures (#2261) 2018-07-16 16:26:20 -07:00
Yuhong Guo
404bfc5da2 Add const to to_plasma_id function to make it usable by const ObjectID (#2404)
* Add const to to_plasma_id to make it usable by const ObjectID

* Separate the building script to another PR.
2018-07-16 11:05:51 -07:00
Yuhong Guo
ded260b1b7 Add const to to_plasma_id function to make it usable by const ObjectID (#2404)
* Add const to to_plasma_id to make it usable by const ObjectID

* Separate the building script to another PR.
2018-07-16 11:05:37 -07:00
Yuhong Guo
206254bcf3 Add const to to_plasma_id function to make it usable by const ObjectID (#2404)
* Add const to to_plasma_id to make it usable by const ObjectID

* Separate the building script to another PR.
2018-07-16 11:05:29 -07:00
Hao Chen
8a3e180156 Move profiling code to a new file and fix thread safety (#2397) 2018-07-15 18:09:52 -07:00
Yuhong Guo
bbea73155a Fix parquet missing error and improve arrow commit id changing (#2319)
* Fix parquet missing error and improve arrow commit id changing

* Remove build cache for arrow.

* Update build_parquet.sh

* Update build_ui.sh

* Update build_arrow.sh
2018-07-14 16:08:13 -07:00
Eric Liang
7865dbab84 [tune] Raise error if incorrect key used in config (#2400) 2018-07-15 00:25:19 +02:00
Hao Chen
c1575e98c1 Make local scheduler client thread-safe (#2386)
* Make local scheduler client thread-safe for python

* lock write_messages

* remove allow-threads

* fix linter

* rename _write_message to do_write_message
2018-07-13 16:19:00 -07:00
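The "lock write_messages" step in the commit above reflects a common pattern: when one logical message takes several writes, a lock keeps writes from different threads from interleaving. The sketch below is a generic illustration, not the local scheduler client itself. The class `LockedWriter` and its `frames` list (standing in for a socket) are assumptions made for the example.

```python
import threading


class LockedWriter:
    """Hypothetical client that serializes multi-part writes to one channel."""

    def __init__(self):
        self._lock = threading.Lock()
        self.frames = []  # stands in for the underlying socket

    def write_message(self, payload):
        # A message is two frames (header, then body). Holding the lock for
        # both appends keeps another thread's frames from landing in between.
        with self._lock:
            self.frames.append(("header", payload))
            self.frames.append(("body", payload))
```

Holding the lock for the whole message, not for each frame separately, is the design choice that matters here.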
Eric Liang
62f84d2f07 [rllib] Restore TF soft placement config to fix multi-GPU optimizer (#2395) 2018-07-13 10:34:37 +02:00
Hao Chen
d6af50785e move import_thread to a separate file (#2349)
* move import_thread to a separate file

* sort imports

* group imports regardless of `from`

* re-organize imports based on google style

* Update import_thread.py

* fix event_type names in profile statement

* unify duplicate code
2018-07-12 21:26:24 -07:00
Crystal
ebf4070d88 Documentation- Basic Profiling for Ray Users (#2326)
* Ray documentation - created new section 'Profiling for Ray Users', opposed to current Profiling section for Ray developers. Completed three sections 'A Basic Profiling Example', 'Timing Performance Using Python's Timestamps', and 'Profiling Using An External Profiler (Line_Profiler).' Left to-do two sections on CProfile and Ray Timeline Visualization.'

* Ray documentation - Fixed rst codeblock linebreaks in 'User Profiling'

* Ray documentation - For User Profiling, added section on cProfile

* Ray documentation - For User Profiling, completed Ray Timeline Visualization section, including graphical images

* Ray documentation - made User Profiling timeline image larger, minor wording edits

* Ray documentation - minor wording edits to User Profiling

* Ray documentation - User Profiling- fixed broken link

* Minor wording changes requested by Philipp Moritz addressed. Still need to address (1) compressing the image files, (2) correcting ex 3 to not be remote, and (3) using cProfile on an actor

* Ray documentation - For user-profiling.rst, revised example 3 to show a semi-parallelized example. Compressed timeline example image to be under 50 KB, removed view timeline GUI image. Updated timeline example image to reflect revised example 3. cProfile actor example left

* Ray documentation - in user-profiling.rst, added a new example including actors in the cProfile section

* Ray documentation - For user-profiling.rst, added section header for the Ray actor cProfile example

* Update user-profiling.rst

* Update user-profiling.rst

* 4 space indentation

* Update user-profiling.rst

* Update user-profiling.rst

* Update user-profiling.rst

* corrections
2018-07-12 16:57:39 -07:00
Robert Nishihara
515da7721a Change ray.worker.cleanup -> ray.shutdown and improve API documentation. (#2374)
* Change ray.worker.cleanup -> ray.shutdown and improve API documentation.

* Deprecate ray.worker.cleanup() gracefully.

* Fix linting
2018-07-12 12:00:00 -07:00
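Deprecating an old entry point "gracefully", as the commit above puts it, usually means keeping the old name callable while warning callers to migrate. A minimal sketch of that pattern using the standard `warnings` module follows. The two functions are stand-ins written for this example and do not reproduce Ray's actual implementation of `ray.shutdown` or `ray.worker.cleanup`.

```python
import warnings


def shutdown():
    """Stand-in for the new entry point; returns a marker for illustration."""
    return "shut down"


def cleanup():
    """Deprecated alias: still works, but warns callers to use shutdown()."""
    warnings.warn(
        "cleanup() is deprecated; use shutdown() instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return shutdown()
```

Existing code that calls the old name keeps working, and the warning points at the call site that needs updating.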
Eric Liang
b316afeb43 [rllib] Add debug info back to PPO and fix optimizer compatibility (#2366) 2018-07-12 19:22:46 +02:00
Eric Liang
8ea926c266 [rllib] _init renamed to _build_layers in example 2018-07-12 19:21:58 +02:00
Richard Liaw
5188b1d080 [autoscaler] Bug for file mounts for tilde (#2382) 2018-07-12 19:18:47 +02:00
Richard Liaw
0048e77093 [rllib] RLlib CLI (#2375) 2018-07-12 19:12:04 +02:00
Robert Nishihara
54487b1d7f Pin the number of CPUs in failing actor test. (#2368)
* Pin the number of CPUs in failing actor test.

* Pin number of CPUs in multi_node_test.py.

* Fix linting.
2018-07-11 18:34:19 -07:00
Philipp Moritz
4dadc60968 Update arrow to include uninitialized memory fixes (#2371) 2018-07-11 07:52:02 -05:00
Hanwei Jin
450b11f1d6 update to slf4j, remove DynamicLog (#2384) 2018-07-09 23:33:59 -07:00
Richard Liaw
55d5e28872 [core] Better Actor Representation (#2369) 2018-07-09 11:20:21 -07:00
Zhijun Fu
fa33ea5283 [Java] Java worker cluster support (#2359) 2018-07-09 10:20:41 -07:00