hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-09 12:56:46 -04:00

Author	SHA1	Message	Date
Maltimore	0ec613c95a	[rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj (#6438)	2019-12-11 15:00:53 -08:00
Eric Liang	bc5e259264	[rllib] Add a doc section on computing actions (#6326) * options doc * add note * hint shr * doc update	2019-12-03 00:10:50 -08:00
Eric Liang	e4565c9cc6	Reduce RLlib log verbosity (#6154)	2019-11-13 18:50:45 -08:00
David Bignell	3f83b2daa9	[rllib] Rollout extensions (#6065) * Rollout improvements * Make info-saving optional, to avoid breaking change. * Store generating ray version in checkpoint metadata * Keep the linter happy * Add small rollout test * Terse. * Update test_io.py	2019-11-05 20:34:18 -08:00
gehring	8903bcd0c3	[rllib] Tracing for eager tensorflow policies with `tf.function` (#5705) * Added tracing of eager policies with `tf.function` * lint * add config option * add docs * wip * tracing now works with a3c * typo * none * file doc * returns * syntax error * syntax error	2019-09-17 01:44:20 -07:00
Eric Liang	74abeab057	[rllib] Improve accessing model state docs (#5656) * [rllib] better model docs * fix * s	2019-09-08 23:01:26 -07:00
Eric Liang	1455a19c85	Consolidate and clean up documentation (#5645)	2019-09-07 11:50:18 -07:00
Richard Liaw	34f6d2fc5c	[tune] Update trainable docs and support hparams (#5558)	2019-09-04 12:44:42 -07:00
Eric Liang	daf38c8723	[tune] Deprecate tune.function (#5601) * remove tune function * remove examples * Update tune-usage.rst	2019-08-31 16:00:10 -07:00
Eric Liang	550c96b965	[rllib] Add docs on policy.model (#5597)	2019-08-30 21:10:42 -07:00
Eric Liang	7d28bbbdbb	[rllib] Document on traj postprocess (#5532) * document on traj postprocess * shorten it	2019-08-24 20:37:45 -07:00
gehring	b520f6141e	[rllib] Adds eager support with a generic `TFEagerPolicy` class (#5436)	2019-08-23 14:21:11 +08:00
Eric Liang	5d7afe8092	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
Richard Liaw	1eaa57c98f	[tune] Distributed example + walkthrough (#5157)	2019-08-02 09:17:20 -07:00
Kristian Hartikainen	13fb9fe3db	[rllib] Feature/soft actor critic v2 (#5328) * Add base for Soft Actor-Critic * Pick changes from old SAC branch * Update sac.py * First implementation of sac model * Remove unnecessary SAC imports * Prune unnecessary noise and exploration code * Implement SAC model and use that in SAC policy * runs but doesn't learn * clear state * fix batch size * Add missing alpha grads and vars * -200 by 2k timesteps * doc * lazy squash * one file * ignore tfp * revert done	2019-08-01 23:37:36 -07:00
Eric Liang	20450a4e82	[rllib] Add rock paper scissors multi-agent example (#5336)	2019-08-01 13:03:59 -07:00
Eric Liang	9e328fbe6f	[rllib] Add docs on how to use TF eager execution (#4927)	2019-06-07 16:42:37 -07:00
Eric Liang	7501ee51db	[rllib] Rename PolicyEvaluator => RolloutWorker (#4820)	2019-06-03 06:49:24 +08:00
Eric Liang	4f46d3e9bf	[rllib] Add multi-agent examples for hand-coded policy, centralized VF (#4554)	2019-04-09 00:36:49 -07:00
Eric Liang	37208216ae	[rllib] Rename Agent to Trainer (#4556)	2019-04-07 00:36:18 -07:00
Eric Liang	fce0062380	[rllib] Switch to tune.run() instead of run_experiments() (#4515)	2019-03-30 14:07:50 -07:00
Eric Liang	cff08e19ff	[rllib] Print out intermediate data shapes on the first iteration (#4426)	2019-03-26 00:27:59 -07:00
Eric Liang	4b8b703561	[rllib] Some API cleanups and documentation improvements (#4409)	2019-03-21 21:34:22 -07:00
Eric Liang	05d96ce81b	[rllib] Raise an error if multi-agent envs terminate without a last observation for agents (#4139) * fix it * lint * Update rllib-training.rst	2019-02-23 21:23:40 -08:00
Eric Liang	c4182463f6	[rllib] Add helper to iterate over envs in a vectorized environment (#4001) * add foreach env func * fix * add test	2019-02-11 10:40:47 -08:00
Eric Liang	fb73cedf70	[rllib] Add examples page, add hierarchical training example, delete SC2 examples (#3815) * wip * lint * wip * up * wip * update examples * wip * remove carla * update * improve envspec * link to custom * Update rllib-env.rst * update * fix * fn * lint * ds * ssd games * desc * fix up docs * fix	2019-01-29 21:06:09 -08:00
Eric Liang	e78562b2e8	[rllib] Misc fixes: set lr for PG, better error message for LSTM/PPO, fix multi-agent/APEX (#3697) * fix * update test * better error * compute * eps fix * add get_policy() api * Update agent.py * better err msg * fix * pass in rew	2019-01-06 19:37:35 -08:00
Eric Liang	b8a9e3f106	[rllib] Remove uses of sgd_stepsize => lr (#3667) * lr * Update example-evolution-strategies.rst	2019-01-01 12:01:27 +08:00
Richard Liaw	e046a5c767	[tune] resources_per_trial from trial_resources (#3580) Renaming variable due to user errors.	2018-12-20 19:00:47 -08:00
Eric Liang	db0dee573e	[rllib] Q-Mix implementation (Q-Mix, VDN, IQN, and Ape-X variants) (#3548)	2018-12-18 10:40:01 -08:00
Eric Liang	d864f299d7	[rllib] fixes from dogfooding multi-agent (#3456) auto wrap multi-agent dict and tuple spaces by keeping a policy -> preprocessor in the sampler add some Q-learning debug stats report min, max of custom metrics better errors	2018-12-05 23:31:45 -08:00
Eric Liang	93a9d32288	[docs] Switch docs to use rllib train instead of train.py	2018-12-04 17:36:06 -08:00
Eric Liang	ce355d13d4	[rllib] Allow envs to be auto-registered; add on_train_result callback with curriculum example (#3451) * train step and docs * debug * doc * doc * fix examples * fix code * integration test * fix * ... * space * instance * Update .travis.yml * fix test	2018-12-03 23:15:43 -08:00
Eric Liang	f0df97db6f	[rllib] example and docs on how to use parametric actions with DQN / PG algorithms (#3384)	2018-11-27 23:35:19 -08:00
Eric Liang	abdc3b592e	[rllib] Update multi-gpu impala numbers (#3327)	2018-11-19 20:55:27 -08:00
Eric Liang	65c27c70cf	[rllib] Clean up agent resource configurations (#3296) Closes #3284	2018-11-13 18:00:03 -08:00
Eric Liang	bd0dbde149	[rllib] Rename ServingEnv => ExternalEnv (#3302)	2018-11-12 16:31:27 -08:00
eugenevinitsky	344b4ef0ff	[rllib] Fix filter sync for ES and ARS (#2918)	2018-11-06 19:09:34 -08:00
Eric Liang	369cb833fe	[rllib] Implement custom metrics (#3144)	2018-11-03 18:48:32 -07:00
Eric Liang	af0c1174cd	[sgd] Merge sharded param server based SGD implementation (#3033) This includes most of the TF code used for the OSDI experiment. Perf sanity check on p3.16xl instances: Overall scaling looks ok, with the multi-node results within 5% of OSDI final numbers. This seems reasonable given that hugepages are not enabled here, and the param server shards are placed randomly. $ RAY_USE_XRAY=1 ./test_sgd.py --gpu --batch-size=64 --num-workers=N --devices-per-worker=M --strategy=<simple|ps> --warmup --object-store-memory=10000000000 Images per second total. gpus total | simple | ps: 1 | 218; 2 (1 worker) | 388; 4 (1 worker) | 759; 4 (2 workers) | 176 | 623; 8 (1 worker) | 985; 8 (2 workers) | 349 | 1031; 16 (2 nodes, 2 workers) | 600 | 1661; 16 (2 nodes, 4 workers) | 468 | 1712 (OSDI perf was 1817)	2018-10-27 21:25:02 -07:00
Eric Liang	a9e454f6fd	[rllib] Include config dicts in the sphinx docs (#3064)	2018-10-16 15:55:11 -07:00
Eric Liang	814c35b7d7	[rllib] Simplify sample batch size and num envs config, n_step adjustment (#2995) * simplify vec batch requirements * Update rllib-training.rst * Update rllib-training.rst * Update rllib-training.rst * Update rllib-training.rst * Update rllib-training.rst * Update rllib-models.rst	2018-09-30 18:36:22 -07:00
Eric Liang	3cde5957b3	[rllib] Better document APIs to access policy state (#2932) * fix * doc * example * up	2018-09-24 19:08:32 -07:00
Eric Liang	995ac24a2c	[rllib] clarify train batch size for PPO (#2793) It's possible to configure PPO in a way that ends up discarding most of the samples (they are treated as "stragglers"). Add a warning when this happens, and raise an exception if the waste is particularly egregious.	2018-09-05 12:06:13 -07:00
Eric Liang	df4788e501	[rllib/tune] Add test for fractional gpu support in xray mode; add rllib support for fractional gpu (#2768) * frac gpu * doc * Update rllib-training.rst * yapf * remove xray	2018-09-03 11:12:23 -07:00
Eric Liang	69d1354016	[rllib] Document ARS & rainbow (#2744) * wip * rainbow doc too * e not used * fix ppo doc * clean list * use same title	2018-08-28 18:13:36 -07:00
Eric Liang	aa014af85b	[rllib] Fix atari reward calculations, add LR annealing, explained var stat for A2C / impala (#2700) Changes needed to reproduce Atari plots in IMPALA / A2C: https://github.com/ray-project/rl-experiments	2018-08-23 17:49:10 -07:00
Eric Liang	fbe6c59f72	[rllib] Misc fixes, A2C (#2679) A bunch of minor rllib fixes: pull in latest baselines atari wrapper changes (and use deepmind wrapper by default) move reward clipping to policy evaluator add a2c variant of a3c reduce vision network fc layer size to 256 units switch to 84x84 images doc tweaks print timesteps in tune status	2018-08-20 15:28:03 -07:00
Richard Liaw	62d0698097	[tune] Tune Facelift (#2472) This PR introduces the following changes: * Ray Tune -> Tune * [breaking] Creation of `schedulers/`, moving PBT, HyperBand into a submodule * [breaking] Search Algorithms now must take in experiment configurations via `add_configurations` rather through initialization * Support `"run": (function | class | str)` with automatic registering of trainable * Documentation Changes	2018-08-19 11:00:55 -07:00
Eric Liang	53f9755594	[rllib] Fix support for mixed discrete and continuous action spaces, add to regression test (#2655) * fix * lint * fix	2018-08-15 10:19:41 -07:00

55 commits