Commit graph

2721 commits

Author SHA1 Message Date
Eric Liang
60dbc771a2 Revert "[autoscaler] Fix redirects, fix submit (#4085)" (#4158)
This reverts commit acf4d53b55.
2019-02-25 17:00:59 -08:00
Eric Liang
3896b726dd Dynamically adjust redis memory usage (#4152)
* f

* Update services.py
2019-02-25 16:21:37 -08:00
Hao Chen
49dc85e54b Fix wrong ID type in prepare_checkpoint (#4124)
* Fix wrong ID type in prepare_checkpoint

* fix

* fix eq
2019-02-25 11:53:09 -08:00
Kristian Hartikainen
524e69a82d [autoscaler] Change the get behavior of node providers' _get_node (#4132)
* Change the get behavior of GCPNodeProvider._get_node

* Add lock around the GCPNodeProvider._get_node call

* rename nodes

* lint

* Update GCPNodeProvider._get_node to match aws implementation

* assert

* log

* log highest heartbeats

* rename

* bringup to connected

* prune heartbeat times

* fix bringup
2019-02-24 18:43:35 -08:00
Eric Liang
d9da183c7d [rllib] Custom supervised loss API (#4083) 2019-02-24 15:36:13 -08:00
Robert Nishihara
7b04ed059e Move TensorFlowVariables to ray.experimental.tf_utils. (#4145) 2019-02-24 14:26:46 -08:00
Philipp Moritz
615d5516d1 Compile valgrind tests with Bazel (#4144) 2019-02-24 00:00:49 -08:00
Eric Liang
05d96ce81b [rllib] Raise an error if multi-agent envs terminate without a last observation for agents (#4139)
* fix it

* lint

* Update rllib-training.rst
2019-02-23 21:23:40 -08:00
Robert Nishihara
688a0d17e6 Kill dashboard and reporter in ray stop. (#4116) 2019-02-23 12:08:39 -08:00
Philipp Moritz
ba52caff37 Make Bazel the default build system (#3898) 2019-02-23 11:58:59 -08:00
Philipp Moritz
9b3ce3e64b Revert inline objects PR (#4125)
* Revert "Inline objects (#3756)"

This reverts commit f987572795.

* fix rebase problems

* more rebase fixes

* add back debug statement
2019-02-22 18:21:01 -08:00
Eric Liang
f1239a7a63 Lint script link broken, also lint filter was broken for generated py files (#4133) 2019-02-22 17:33:08 -08:00
Eric Liang
9896df7799 [rllib] Guard against PPO value function not training with RNN models (#4037)
* better lstm settings

* 1.0

* docs

* warn on truncate

* clarify

* Update ppo_policy_graph.py

* Update ppo_policy_graph.py

* Update ppo_policy_graph.py
2019-02-22 11:18:51 -08:00
Zachary Barry
ae4dd1db76 Custom provider_config options for NodeProvider implementations (#4075)
* added a key to send custom provider_config options to NodeProvider implementations

* Update autoscaler.py

* Update autoscaler.py
2019-02-21 21:09:22 -08:00
Stefan Pantic
a54386e499 Added custom LSTM detection (#4087)
* Added autodetection of custom LSTM usage

* Reverted line separators

* Added check for LSTM

* Update vtrace_policy_graph.py

* Update appo_policy_graph.py
2019-02-21 21:07:48 -08:00
Tianming Xu
692bb336a1 Fix master branch compilation error and lint error (#4109) 2019-02-21 11:54:30 -08:00
William Ma
fedad488d8 Kills gdb processes with ray stop (#4046) 2019-02-21 11:28:26 -08:00
William Ma
c7a4c74f55 Moving tests from test/ to python/ray/tests/ (#3950) 2019-02-21 11:09:08 -08:00
Jones Wong
acbe0b4e5f Fix twin q bug (#4108) 2019-02-21 10:47:01 -08:00
Tianming Xu
94eaaed197 [rllib] convert export format to lower case while validating (#4088)
* convert export format to lower case while validating

* fix lint error
2019-02-21 10:40:28 -08:00
Daniel Edgecumbe
2e30f7ba38 Add a web dashboard for monitoring node resource usage (#4066) 2019-02-21 00:10:04 -08:00
Jones Wong
3ac8fd7ee8 Exploration with Parameter Space Noise (#4048)
*  enable parameter space noise for exploration

*  enable parameter space noise for exploration

*  yapf formatted

*  remove the usage of scipy softmax available in the latest version only

*  enable subclass that has no parameter_noise in the config

*  run user specified callbacks and test parameter space noise in multi node setting

*  formatted by yapf

* Update dqn.py

* lint
2019-02-20 22:35:18 -08:00
Philipp Moritz
bcd5af78c7 Lint Cython files (#4097) 2019-02-20 22:29:25 -08:00
Richard Liaw
acf4d53b55 [autoscaler] Fix redirects, fix submit (#4085) 2019-02-20 21:35:33 -08:00
Yuhong Guo
3549cd8195 Add the Delete function in GCS (#4081)
* Add the Delete function in GCS

* Unify BatchDelete and Delete

* Fix comment

* Lint

* Refine according to comments

* Unify test.

* Address comment

* C++ lint

* Update ray_redis_module.cc
2019-02-21 13:33:37 +08:00
Yuhong Guo
1f864a02bc Add option of load_code_from_local which is required in cross-language ray call. (#3675) 2019-02-21 12:37:17 +08:00
Eric Liang
e3066d1fa5 [autoscaler] Try making GCP node provider thread-safe 2019-02-20 16:35:20 -08:00
Hao Chen
a99676e39b [Java] lint unused imports (#4100) 2019-02-20 12:37:04 -08:00
Csordás Róbert
b2677fabc0 [tune] Fix not saving a checkpoint in certain cases (issue #4041) (#4053)
## What do these changes do?

It saves a checkpoint if needed, regardless of what the scheduler has returned. Until now, it did not save the checkpoint when the scheduler returned TrialScheduler.PAUSE, which prevented PopulationBasedTraining from saving any checkpoints in certain cases. See issue #4041 for more details.

## Related issue number
#4041
2019-02-20 11:54:28 -08:00
mika
64c95aea85 [rllib] Update README.md for qmix (#4101)
## What do these changes do?

Fixed PyMARL repository path.

## Related issue number

N/A
2019-02-20 10:21:08 -08:00
alegithub111
67fa0b5c25 Refine JNI bazel script to make it suitable for more systems (#4060)
* Refine JNI bazel script to make it suitable for more systems

* Update BUILD.bazel

the script format has changed

* Update BUILD.plasma

the script format has changed

* Lint bazel/BUILD.plasma  BUILD.bazel
2019-02-20 22:37:41 +08:00
Robert Nishihara
e7651b1117 Fix excessive buffering of worker stdout/stderr. (#4094)
* Start workers with 'python -u' to prevent buffering of prints.

* Set sys.stdout and sys.stderr.

* Add comment.
2019-02-19 20:20:47 -08:00
Robert Nishihara
5fe7b1c618 Make object_manager_test::test_object_transfer_retry less flaky. (#4057)
* Make object_manager_test::test_object_transfer_retry less flaky.

* Make the test pass.
2019-02-19 20:03:11 -08:00
Eric Liang
e9ee38ace2 More compact format for worker logs (#4092) 2019-02-19 19:53:43 -08:00
Robert Nishihara
c92a867c8b Fix log monitor CPU utilization. (#4091) 2019-02-19 12:19:21 -08:00
Wang Qing
794a093249 Add runtime_context to get some runtime fields in worker (#4065) 2019-02-19 15:57:30 +08:00
Wang Qing
7574757391 Fix crash for Java task's task.argument() in state. (#4063) 2019-02-19 12:46:07 +08:00
Philipp Moritz
cfc7e2c5a9 Fix modin test (#4069) 2019-02-18 12:17:36 -08:00
Eric Liang
6e46d75554 [tune] Remove slow gzip of checkpoints; ignore jupyter stop errors (#4076)
* fix gzip

* ignore jupyter
2019-02-18 01:30:13 -08:00
Eric Liang
f8bef004da [rllib] Improve error message for bad envs, add remote env docs (#4044)
* commit

* fix up rew
2019-02-18 01:28:19 -08:00
Robert Nishihara
b78d77257b Speed up test/component_failures_test.py::test_actor_creation_node_failure. (#4056) 2019-02-17 15:35:54 -08:00
Robert Nishihara
5a9098891f Add serialization test for more collection types. (#3982)
* Add serialization test for more collection types.

* Reorganize serialization tests a little.

* Update
2019-02-17 13:57:33 -08:00
Philipp Moritz
f51969964d Fix linting on master (#4077) 2019-02-17 13:55:40 -08:00
Megan Kawakami
346885068c [rllib] add torch pg (#3857)
* add torch pg

* add torch imports

* added torch pg

* working torch pg implementation

* add pg pytorch

* Update a3c.py

* Update a3c.py

* Update torch_policy_graph.py

* Update torch_policy_graph.py
2019-02-16 19:54:14 -08:00
Zekun Shi
a708ab66f5 Add simplex action space and dirichlet action distribution (#4070)
* add simplex action space and dirichlet action distribution

* Update and rename spaces.py to extra_spaces.py

* Update __init__.py

* Update catalog.py

* Fix python 2

* Update extra_spaces.py

* change Simplex.contains() to return False
2019-02-16 12:44:59 -08:00
Kristian Hartikainen
0cc5c88075 [tune] Add number of trials to the trial runner logger (#4068) 2019-02-16 01:12:59 -08:00
Yu Kobayashi
d2d66c576e Support non ascii characters in the source code (#4047) 2019-02-16 11:45:44 +08:00
Hao Chen
de17443dc2 Propagate backend error to worker (#4039) 2019-02-16 11:39:15 +08:00
William Ma
4be3d0c5d3 Update shipped modin to 0.3.1 (#4058) 2019-02-15 15:49:38 -08:00
Robert Nishihara
2d07df7f3f Replace '__main__' with "__main__". (#4055) 2019-02-15 13:32:43 -08:00