1328 commits

Author SHA1 Message Date
Si-Yuan
9ce3039390
Fix webui api (#4686)
* fix webui

* Apply suggestions from code review

lint

Co-Authored-By: suquark <suquark@gmail.com>

* add dependencies for this unittest

* move dependencies to the script file
2019-04-27 15:23:56 +08:00
Sam Toyer
663e92ab3f [rllib] TD3/DDPG improvements and MuJoCo benchmarks (#4694)
* [rllib] Separate optimisers for DDPG actor & crit.

* [rllib] Better names for DDPG variables & options

Config changes:

- noise_scale -> exploration_ou_noise_scale
- exploration_theta -> exploration_ou_theta
- exploration_sigma -> exploration_ou_sigma
- act_noise -> exploration_gaussian_sigma
- noise_clip -> target_noise_clip

* [rllib] Make DDPG less class-y

Used functions to replace three classes with only an __init__ method & a
handful of unrelated attributes.

* [rllib] Refactor DDPG noise

* [rllib] Unify DDPG exploration annealing

Added option "exploration_should_anneal" to enable linear annealing of
exploration noise. By default this is off, for consistency with DDPG &
TD3 papers. Also renamed "exploration_final_eps" to
"exploration_final_scale" (that name seems to have been carried over
from DQN, and doesn't really make sense here). Finally, tried to rename
"eps" to "noise_scale" wherever possible.
2019-04-26 17:49:53 -07:00
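The option renames and the annealing switch described in the commit above could be sketched roughly as follows. This is an illustration only, not RLlib's code: the helper names `migrate_ddpg_config` and `annealed_noise_scale`, and the parameters `anneal_timesteps` and `initial_scale`, are assumptions made for this example. Only the option names come from the commit message.

```python
# Old -> new option names, copied from the commit message above.
DDPG_RENAMES = {
    "noise_scale": "exploration_ou_noise_scale",
    "exploration_theta": "exploration_ou_theta",
    "exploration_sigma": "exploration_ou_sigma",
    "act_noise": "exploration_gaussian_sigma",
    "noise_clip": "target_noise_clip",
    "exploration_final_eps": "exploration_final_scale",
}


def migrate_ddpg_config(old_config):
    """Return a copy of old_config with renamed keys (hypothetical helper)."""
    return {DDPG_RENAMES.get(key, key): value for key, value in old_config.items()}


def annealed_noise_scale(timestep, anneal_timesteps, final_scale,
                         should_anneal=False, initial_scale=1.0):
    """Linear annealing of the noise scale (hypothetical helper).

    With should_anneal off (the default named in the commit message),
    the scale stays at initial_scale. Otherwise it moves linearly from
    initial_scale to final_scale over anneal_timesteps steps.
    """
    if not should_anneal:
        return initial_scale
    # Clamp progress to [0, 1] so the scale never overshoots final_scale.
    fraction = min(max(timestep / anneal_timesteps, 0.0), 1.0)
    return initial_scale + fraction * (final_scale - initial_scale)
```

For instance, `migrate_ddpg_config({"act_noise": 0.1})` yields a dict keyed by `exploration_gaussian_sigma`, and unknown keys pass through unchanged.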
Eric Liang
47cca971b5
Don't delete files in rsync up, and also shorten timeout (#4688) 2019-04-25 12:18:42 -07:00
Qing Wang
f39b6747e5 Refactor command line argument parsing with gflags (#4676) 2019-04-24 14:53:07 +08:00
William Ma
c99e3caaca Change resource bookkeeping to account for machine precision. (#4533) 2019-04-23 11:59:53 -07:00
justinwyang
8dfc833a8b Change all instances of JobID to DriverID. (#4431) 2019-04-22 16:28:09 -07:00
Andrew
06c768823c [rllib] train-eval loop implementation for rllib.Trainer class (#4647) 2019-04-21 12:08:04 -07:00
Devin Petersohn
d5df91b031 Bump version to 0.7.0dev3 (#4671) 2019-04-19 17:06:14 -07:00
Vlad Firoiu
39a09fa457 Turn replay into a circular queue. (#4667) 2019-04-19 11:42:00 -07:00
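For readers unfamiliar with the idea named in the commit title above, a circular-queue replay buffer can be sketched as below. This is a generic illustration, not the code from #4667: the class name `CircularReplayBuffer` and its methods are assumptions made for this example.

```python
import random


class CircularReplayBuffer:
    """Fixed-capacity buffer that overwrites its oldest entry when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._storage = []
        self._next_index = 0  # slot that the next add() will write to

    def add(self, item):
        if len(self._storage) < self._capacity:
            # Still filling up: grow the list.
            self._storage.append(item)
        else:
            # Full: overwrite the oldest item in place, no shifting needed.
            self._storage[self._next_index] = item
        # Wrap around to slot 0 after the last slot.
        self._next_index = (self._next_index + 1) % self._capacity

    def __len__(self):
        return len(self._storage)

    def sample(self, batch_size, rng=random):
        """Sample batch_size items uniformly, with replacement."""
        return [rng.choice(self._storage) for _ in range(batch_size)]
```

Overwriting in place keeps each insertion constant-time, whereas deleting from the front of a Python list shifts every remaining element.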
Wang Qing
9d481cc2e6 [hotfix] Missing import breaks Travis builds 2019-04-18 23:12:44 -07:00
Eric Liang
5a562bbf12
[rllib] Fix num_gpus cast and raise error on large batch (#4652) 2019-04-18 15:23:29 -07:00
Eric Liang
6848dfd179
[rllib] Replace ray.get() with ray_get_and_free() to optimize memory usage (#4586) 2019-04-17 20:30:03 -04:00
Eric Liang
3fd9dea721
[rllib] Fix tune.run(Agent class) (#4630)
* update

* Update __init__.py
2019-04-15 09:12:23 -07:00
Richard Liaw
776a7308c8
[tune] Better ASHA defaults (#4623)
## What do these changes do?
Sets ASHA defaults to paper defaults.


## Related issue number


## Linter

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
2019-04-15 01:45:43 -07:00
Vlad Firoiu
f600591468 Cast MultiCategorical num_outputs to int. (#4629) 2019-04-14 19:51:37 -07:00
Robert Nishihara
967e8aad9d Make def test_submitting_many_actors_to_one less stressful. (#4622) 2019-04-14 12:19:57 -07:00
Andrew Tan
57af1c6819 Update volume size to 100 (#4616) 2019-04-14 11:40:16 -07:00
Zachary Barry
3838548356 Custom SSH socket directories (#4299)
* ssh_control_path added as an auth option.

* revamped default ssh options to take in a control path; nodeupdater checks the auth config to see if a custom SSH sockets path was specified, and otherwise uses the original hardcoded path. The control path is now a nodeupdater instance variable

* revert socketdir in auth config and change method for determining dir

* new ssh dir method

* Lint

* ' -> " lint

* changed using USER env to getpass.getuser()
2019-04-13 23:55:41 -07:00
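The last bullet of the commit above replaces a lookup of the `USER` environment variable with `getpass.getuser()`. A minimal sketch of why that helps when building a per-user socket directory is shown below. The function name `per_user_socket_dir` and the directory layout are assumptions made for this example, not the autoscaler's actual paths.

```python
import getpass
import os


def per_user_socket_dir(base_dir, user=None):
    """Build a per-user directory path under base_dir (hypothetical helper).

    os.environ["USER"] raises KeyError when the variable is unset, which
    can happen in minimal containers or cron jobs. getpass.getuser()
    checks LOGNAME, USER, LNAME and USERNAME in turn and, where the pwd
    module is available, falls back to the password database.
    """
    if user is None:
        user = getpass.getuser()
    return os.path.join(base_dir, user)
```

Passing `user` explicitly, as in `per_user_socket_dir("/tmp/sockets", user="alice")`, skips the lookup and keeps the helper easy to test.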
Daniel Edgecumbe
3e1adafbce [autoscaler] Add an aggressive_autoscaling flag (#4285) 2019-04-13 18:44:32 -07:00
Devin Petersohn
56a78baf67 Bump version to 0.6.6 (#4621) 2019-04-13 10:37:17 -07:00
Richard Liaw
0bfb0d2c29 [tune] Fix checkpointing for Gym Types 2019-04-12 21:03:56 -07:00
Eric Liang
6e7680bf21
[rllib] Clean up concepts documentation and policy optimizer creation (#4592) 2019-04-12 21:03:26 -07:00
Romil Bhardwaj
0f42f87ebc Updating zero capacity resource semantics (#4555) 2019-04-12 16:53:57 -07:00
cfan
bb207a205b [rllib] Support torch device and distributions. (#4553) 2019-04-12 11:39:14 -07:00
Wang Qing
fe07a5b4b1 Add delete_creating_tasks option for internal.free() (#4588)
* add delete creating task objects.

* format code style

* Fix lint

* add tests and address comments.

* Refine test

* Refine java test

* Fix CI

* Refine

* Fix lint

* Fix CI
2019-04-12 13:38:31 +08:00
justinwyang
e88e706fcc Enforce quoting style in Travis. (#4589) 2019-04-11 14:24:26 -07:00
Kristian Hartikainen
ed02bf11f7 [autoscaler] Lint code that we forgot to lint in #4537 (#4584)
* Lint code that we forgot to lint in previous PR

* Revert setup command merge

* Lint

* Revert "Revert setup command merge"

This reverts commit 55e1cdb1f256ea51ef66a38730d8f7865f1f5ad1.

* Fix testReportsConfigFailures test

* Minor syntax tweaks

* Lint
2019-04-10 17:01:36 +08:00
Vlad Firoiu
74fd3d7e21 [rllib] Support prev_state/prev_action in rollout and fix multiagent (#4565)
* Cleaner and more correct treatment of agent states in rollout.py

* support lstm_use_prev_action_reward in rollout.py

* Linter.

* appease flake8

* Use _DUMMY_AGENT_ID instead of 0.

* All agents have a policy_agent_mapping.
Reset the mapping cache at the start of each episode.

* Update rollout.py

* Fix rollout.py for single-agent envs.

* Use agent_id, not policy_id.
2019-04-10 00:01:25 -07:00
Eric Liang
f8e8743347
[tune] Improve PBT example (#4575) 2019-04-09 20:59:17 -07:00
Si-Yuan
dab99d26af
Improve code related to node (#4383)
* Make full use of node

implement local node

fix bugs mentioned in comments

* Add more tests

* Use more specific exception handling

* fix, lint

* fix for py2.x
2019-04-09 17:27:54 +08:00
Eric Liang
4f46d3e9bf
[rllib] Add multi-agent examples for hand-coded policy, centralized VF (#4554) 2019-04-09 00:36:49 -07:00
Stefan Pantic
915486984a [autoscaler] Add support for separate docker containers on head and worker nodes (#4537)
* Added support for running different docker containers on clusters

* Remove node specific container names

* Keep old options and expand with node specific configuration

* Optimized imports

* Changed docker fields for autoscaler

* Auto reformat

* Updated comments

* Updated condition

* Run linter

* Updated example

* Changed condition for docker images, updated examples

* Removed duplicate line

* Fixed setup_commands

* Update autoscaler.py

* fix_better_image
2019-04-07 16:51:32 -07:00
Jones Wong
da5a471485 [rllib] validate observation in NoPreprocessor (#4546) 2019-04-07 16:11:50 -07:00
Eric Liang
f9b8e77e3b
[rllib] Don't merge unrolls from same episode when calculating seq lens (#4557) 2019-04-07 12:11:30 -07:00
Eric Liang
37208216ae
[rllib] Rename Agent to Trainer (#4556) 2019-04-07 00:36:18 -07:00
Dušan Josipović
820c71b7d0 [tune/rllib] Add checkpoint eraser (#4490) 2019-04-06 20:01:54 -07:00
ctombumila37
7746d20d30 [rllib] ExternalMultiAgentEnv (#4200) 2019-04-06 19:58:14 -07:00
Andrew Tan
991b911e1d [tune] Add --columns flag for CLI (#4564) 2019-04-05 19:49:01 -07:00
Jérémy
300ec72d15 [tune] Add compatibility to nevergrad 0.2.0+ (#4529)
## What do these changes do?

This PR prepares for future version 0.2.0 of `nevergrad`, in which each suggestion is a `Candidate` instance having fields `args` and `kwargs` instead of being a `np.ndarray`. The proposed changes are compatible with all versions of `nevergrad` (manually tested with `nevergrad_example.py` on both `master` and current version `v0.1.6`).

See `nevergrad`'s [CHANGELOG](https://github.com/facebookresearch/nevergrad/blob/master/CHANGELOG.md) for more information on the change.

## Related issue number

None

## Linter

- [x] I've run `scripts/format.sh` to lint the changes in this PR.
2019-04-05 19:44:58 -07:00
Andrew Tan
bfd0af52bc [tune] Add documentation to --output flag (#4518)
## What do these changes do?

Add documentation for the `--output` flag for ls / lsx in the Tune CLI.

## Related issue number

Closes #4511

## Linter

- [x] I've run `scripts/format.sh` to lint the changes in this PR.
2019-04-05 00:16:35 -07:00
Richard Liaw
50b2aa0740
[tune] Better handling of tune.function in global checkpoint (#4519)
Enables result keys to be queried by CLI.
2019-04-04 21:08:47 -07:00
Federico Fontana
fb88f7efe6 Fixed bug in Dirichlet (#4440) (#4560) 2019-04-04 14:33:09 -07:00
Yuhong Guo
c2349cf12d Remove local/global_scheduler from code and doc. (#4549) 2019-04-03 17:05:09 -07:00
Adi Zimmerman
51dae23d5c [tune] Search Alg delay import + CLI timing test (#4230) 2019-04-03 08:52:45 -07:00
Philipp Moritz
b0f6ddf6d1 Remove CMake files (#4493) 2019-04-02 22:17:33 -07:00
Hao Chen
23404f7bcf Fix some flaky tests (#4535) 2019-04-02 17:57:11 -07:00
Simon Mo
db4cf24636 [serve] Double Serialization Optimization (#4532) 2019-04-02 12:35:03 -07:00
Eric Liang
55a2d39409
[rllib] Add option for RNN state and value estimates to span episodes (#4429)
* wip soft horizon

* tests
2019-04-02 02:44:15 -07:00
Yuhong Guo
c2c548bdfd Fix broken pipe callback (#4513) 2019-04-02 17:42:18 +08:00
Jones Wong
fe7763e786 [rllib] replace the assertion in SyncReplayOptimizer by a warning (#4534) 2019-04-02 01:43:22 -07:00