Hao Chen
0ec3a16bbd
Fix Java MultithreadingTest ( #5182 )
2019-07-12 19:00:13 +08:00
Stephanie Wang
f46c555e9e
Only get actor ID if actor task ( #5180 )
2019-07-12 14:31:21 +08:00
vipulharsh
3b42d5ccb1
Track newly created actor's parent actor ( #5098 )
* Track parent actor of actor
* Update src/ray/raylet/node_manager.cc
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* Update src/ray/raylet/node_manager.cc
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* fixing a comment
* Fixing typo in a comment
* capturing task_spec instead of actor_data
* adding const for some local variables
* changing an if else to else
* Linted version
* use updated method to create task from task_data
Change-Id: I9c1a65134dc23a2d175047e96b86ab9d9cf61971
* fixing linter issues
Change-Id: I1def06218130b399d2527b999258aecf9abb98dd
2019-07-11 14:52:04 -07:00
Kristian Hartikainen
3456afdea7
[autoscaler] Fix missing body argument in GCP getIamPolicy ( #5169 )
2019-07-11 13:03:51 -07:00
Philipp Moritz
ccee77aafd
fix node_failures.py ( #5167 )
2019-07-11 11:40:13 -07:00
Zhijun Fu
1649f1370e
[direct call] changes raylet to push tasks to worker ( #5140 )
* refactor grpc server
* format
* change GetTask() to PushTask()
* change PushTask to AssignTask
* format
* add resource_ids
* move done_callback to server call
* remove SetTaskHandler and initialize it in task receiver's constructor
* format
* resolve comments
* update
* update
* Update src/ray/core_worker/core_worker.cc
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* resolve comments
* format
* Update src/ray/core_worker/transport/raylet_transport.cc
Co-Authored-By: Hao Chen <chenh1024@gmail.com>
* resolve comments
* resolve comments
* fix build
* format
* fix
* format
* noop
2019-07-11 11:01:32 -07:00
Hao Chen
fd835d107e
Move task to common module and add checks in getter methods ( #5147 )
2019-07-11 17:07:04 +08:00
Kai Yang
d8b50a5018
Fix GcsClient resource map ( #5171 )
2019-07-11 16:05:12 +08:00
Qing Wang
f2293243cc
[ID Refactor] Shorten the length of JobID to 4 bytes ( #5110 )
* WIP
* Fix
* Add jobid test
* Fix
* Add python part
* Fix
* Fix test
* Remove TODOs
* Fix C++ tests
* Lint
* Fix
* Fix exporting functions in multiple ray.init
* Fix java test
* Fix lint
* Fix linting
* Address comments.
* Fix
* Address and fix linting
* Refine and fix
* Fix
* address
* Address comments.
* Fix linting
* Fix
* Address
* Address comments.
* Address
* Address
* Fix
* Fix
* Fix
* Fix lint
* Fix
* Fix linting
* Address comments.
* Fix linting
* Address comments.
* Fix linting
* address comments.
* Fix
2019-07-11 14:25:16 +08:00
Hao Chen
88365d4112
Fix Java MultithreadingTest ( #5170 )
2019-07-11 13:40:40 +08:00
Kai Yang
43b6513d19
[GCS] Move node resource info from client table to resource table ( #5050 )
2019-07-11 13:17:19 +08:00
Richard Liaw
691c9733f9
[tune] Document trainable attributes and enable user-checkpoint… ( #4868 )
2019-07-10 18:51:11 -07:00
Philipp Moritz
e6a81d40a5
[stability] Make task result for RemoveTask optional ( #5146 )
* make task result for RemoveTask optional
* lint
* update
* update
* update
* rename
* lint
2019-07-10 13:33:41 -07:00
Hao Chen
0c34749779
Use bazel disk cache for all CI jobs ( #5144 )
2019-07-10 22:03:45 +08:00
Richard Liaw
0b540ab492
[tune] Test example checkpointing ( #4728 )
2019-07-10 01:58:26 -07:00
Joey Jiang
e55c8ca165
Fix crash because of the reference to deleted variable in grpc server call ( #5158 )
2019-07-10 14:06:21 +08:00
Edward Oakes
2b7b7c7547
Add linting pre-push hook ( #5154 )
2019-07-09 21:49:12 -07:00
Eric Liang
5ab5017c67
[rllib] Fix impala stress test ( #5101 )
* add copy
* upgrade to tf 1.14
* update
* reduce count to workaround https://github.com/ray-project/ray/issues/5125
* Update impala.py
* placeholder
* comments
* update
2019-07-09 20:22:30 -07:00
Joey Jiang
5733690aa6
Add success and fail callback of grpc sending reply ( #5141 )
2019-07-09 17:03:57 +08:00
Eric Liang
5aec750107
Add warning/error if object store memory exceeds available memory ( #4893 )
* exclude
* format
* add warning
* hatch
* reduce mem usage
* reduce object store mem
* set obj mem
2019-07-08 21:37:08 -07:00
Stefan Pantic
dfc94ce7bc
[rllib] Add entropy coeff decay ( #5043 )
2019-07-08 18:30:32 -07:00
Daniel Edgecumbe
eeb67db861
[autoscaler] Log AWS NodeProvider create_instances ( #4998 )
* autoscaler: Log on AWS NodeProvider create_instances
* logging
2019-07-08 13:22:26 -07:00
Hao Chen
8a30b93e42
Define common data structures with protobuf. ( #5121 )
2019-07-08 22:41:37 +08:00
Joey Jiang
b4e51c8aa1
Support clang-format whose version is not 7.0 ( #5139 )
2019-07-08 17:15:09 +08:00
Sam Toyer
7ad854d4c6
[tune] Use traceback.format_tb() ( fixes #5135 ) ( #5136 )
2019-07-08 01:13:06 -07:00
Joey Jiang
274233962f
Remove unused connection file in object manager ( #5123 )
2019-07-08 10:59:36 +08:00
Eric Liang
893744b3be
[rllib] Revert "use make template" which seems to break DQN/Atari ( #5134 )
* Revert "use make template"
This reverts commit 291e9e0031c6e315fe24e5b4973dea375fe73918.
* debug vars
2019-07-07 19:51:26 -07:00
Morgan Giraud
7e020e7183
[tune] tune.run keep_checkpoints_num ( #5117 )
* Add missing argument keep_checkpoints_num to tune
* expose keep checkpoints
2019-07-07 17:14:56 -07:00
Edward Oakes
8f53364097
Improve local_mode ( #5060 )
2019-07-07 17:10:50 -07:00
Eric Liang
932d6b2517
[rllib] Port IMPALA to ModelV2/build_tf_policy ( #5130 )
* port vtrace
* fix vf
* fix vs
* fix the example
* wip ddpg
* fix tests
* fix tests
* remove ddpg model
* comments
* set vf share layers True by default
* typo
* fix test
2019-07-07 15:06:41 -07:00
Richard Liaw
6a14f1a540
[autoscaler] Small fixes for local cluster usability ( #4864 )
2019-07-06 21:55:18 -07:00
Richard Liaw
1798d4f077
[autoscaler] Add hard kill and monitor commands ( #5082 )
* Add hard kill and monitor commands
* better_commands
* Update python/ray/scripts/scripts.py
Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2019-07-06 21:52:55 -07:00
Eric Liang
445bcb29b0
[hotfix] fix backward compat with older yaml libraries
2019-07-06 20:41:28 -07:00
Eric Liang
c15ed3ac55
[rllib] Shuffle RNN sequences in PPO as well ( #5129 )
* shuffle seq
* fix test
2019-07-06 20:40:49 -07:00
Brandon Bertelsen
c04b69902c
Updates for #5072 ( #5091 )
2019-07-06 16:05:50 -07:00
Eric Liang
0448847a02
Update protobuf version ( #5128 )
2019-07-06 15:59:55 -07:00
Aleksei Petrenko
09bde397c9
Multiagent experiment resume ( #5102 )
* Fixed problem with multiagent experiment resume
* Applied format script
* fix lint
2019-07-06 11:38:17 -07:00
Dušan Josipović
e9b88dcbed
[wingman -> tune] Add system performance tracking ( #4924 )
2019-07-06 00:57:35 -07:00
Richard Liaw
c3e9d94b18
[tune][minor] Reduce checkpointing frequency ( #4859 )
2019-07-06 00:54:24 -07:00
Kim Jeong Ju
4b56a5eb27
[tune] missing torch.load in mnist_pytorch_trainable.py ( #5103 )
2019-07-06 00:14:41 -07:00
Philipp Moritz
c5253cc300
Add job table to state API ( #5076 )
2019-07-06 00:05:48 -07:00
Richard Liaw
53d5a8a45f
[tune] Fix sort ( #5111 )
* fix sort
* fix tune list-experiments
* Update python/ray/tune/tests/test_commands.py
2019-07-05 16:05:10 -07:00
Joey Jiang
4183303a2f
Add bazel build options for plasma to use glog ( #5108 )
2019-07-05 19:00:19 +08:00
Robert Nishihara
9cc4cc6a52
Fail format.sh if yapf/flake8 versions are incorrect. ( #5083 )
2019-07-04 23:22:01 -07:00
Zhijun Fu
54d5969cea
[grpc] Add grpc server to worker ( #5054 )
* refactor grpc server
* format
* change GetTask() to PushTask()
* change PushTask to AssignTask
* format
* update
* fix test
* format
* Update src/ray/rpc/worker_client.h
Co-Authored-By: Hao Chen <chenh1024@gmail.com>
* Update BUILD.bazel
* Update src/ray/core_worker/task_execution.cc
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* update
* format
* address comments
* format
* Update src/ray/rpc/worker/worker_server.h
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* Update src/ray/protobuf/worker.proto
Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>
* format
* fix
* format
2019-07-04 20:16:42 +08:00
ztangent
41a16c55ef
[tune] Fixed bug with joining experiment_path twice. ( #5106 )
2019-07-03 22:48:07 -07:00
Patrick
1a543a6571
[serve] add missing __init__.py file under serve/utils ( #4609 )
* bugfix: add missing serve/utils __init__.py file
* Update __init__.py
* lint
2019-07-03 17:27:59 -07:00
Richard Liaw
0dbb6c4911
[tune] PBT perturbing after first iteration ( #5097 )
2019-07-03 17:27:26 -07:00
Eric Liang
34d054ff19
[rllib] ModelV2 API ( #4926 )
2019-07-03 15:59:47 -07:00
Kristian Hartikainen
9e0192bc0b
[tune] Change the log syncing behavior ( #4450 )
* Change the log syncing behavior
* fix up abstractions for syncer
* Finished checkpoint syncing
* Code
* Set of changes to get things running
* Fixes for log syncing
* Fix parts
* Lint and other fixes
* fix some test
* Remove extra parsing functionality
* some test fixes
* Fix up cloud syncing
* Another thing to do
* Fix up tests and local sync
Changes LogSync into a mixin, and adds tests for different
functionalities.
* Fix up tests, start on local migration
* fix distributed migrations
* comments
* formatting
* Better checkpoint directory handling
* fix tests
* fix tests
* fix click
* comments
* formatting comments
* formatting and comments
* sync function deprecations
* syncfunction
* Add documentation for Syncing and Uploading
* nit
* BaseSyncer as base for Mixin in edge case
* more docs
* clean up assertions
* validate
* nit
* Update test_cluster.py
* betterdoc
* Update tune-usage.rst
* cleanup
* nit
2019-07-02 20:46:00 -07:00