hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-09 12:56:46 -04:00

Author	SHA1	Message	Date
Eric Liang	5cebee68d6	[rllib] Add scaling guide to documentation, improve bandit docs (#7780 ) * update * reword * update * ms * multi node sgd * reorder * improve bandit docs * contrib * update * ref * improve refs * fix build * add pillow dep * add pil * update pil * pillow * remove false	2020-03-27 22:05:43 -07:00
Sven Mika	1138f2ebed	[RLlib] Issue 7046 cannot restore keras model from h5 file. (#7482 )	2020-03-23 12:19:30 -07:00
Eric Liang	9392cdbf74	[rllib] Add high-performance external application connector (#7641 )	2020-03-20 12:43:57 -07:00
Eric Liang	dd70720578	[rllib] Rename sample_batch_size => rollout_fragment_length (#7503 ) * bulk rename * deprecation warn * update doc * update fig * line length * rename * make pytest compatible * fix test * fi sys * rename * wip * fix more * lint * update svg * comments * lint * fix use of batch steps	2020-03-14 12:05:04 -07:00
Eric Liang	52cf77f5a9	[rllib] SAC no_done_at_end should default to False (#7594 ) * update * update doc * stochastic * cleanup	2020-03-14 11:16:54 -07:00
Sven Mika	2d97650b1e	[RLlib] Add Exploration API documentation. (#7373 ) * Add Exploration API documentation. * Add Exploration API documentation. * Add Exploration API documentation. * Update exploration docs.	2020-03-01 16:55:41 -08:00
Eric Liang	5df801605e	Add ray.util package and move libraries from experimental (#7100 )	2020-02-18 13:43:19 -08:00
Eric Liang	fbc545c03b	[rllib] Support parallel, parameterized evaluation (#6981 ) * eval api * update * sync eval filters * sync fix * docs * update * docs * update * link * nit * doc updates * format	2020-02-01 22:12:12 -08:00
Eric Liang	e659699ca9	[tune] Fix directory naming regression (#6839 )	2020-01-27 15:53:40 -08:00
Sven Mika	e6227082bd	[RLlib] Add `torch` flag to train.py (#6807 )	2020-01-17 18:48:44 -08:00
Maltimore	0ec613c95a	[rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj (#6438 )	2019-12-11 15:00:53 -08:00
Eric Liang	bc5e259264	[rllib] Add a doc section on computing actions (#6326 ) * options doc * add note * hint shr * doc update	2019-12-03 00:10:50 -08:00
Eric Liang	e4565c9cc6	Reduce RLlib log verbosity (#6154 )	2019-11-13 18:50:45 -08:00
David Bignell	3f83b2daa9	[rllib] Rollout extensions (#6065 ) * Rollout improvements * Make info-saving optional, to avoid breaking change. * Store generating ray version in checkpoint metadata * Keep the linter happy * Add small rollout test * Terse. * Update test_io.py	2019-11-05 20:34:18 -08:00
gehring	8903bcd0c3	[rllib] Tracing for eager tensorflow policies with `tf.function` (#5705 ) * Added tracing of eager policies with `tf.function` * lint * add config option * add docs * wip * tracing now works with a3c * typo * none * file doc * returns * syntax error * syntax error	2019-09-17 01:44:20 -07:00
Eric Liang	74abeab057	[rllib] Improve accessing model state docs (#5656 ) * [rllib] better model docs * fix * s	2019-09-08 23:01:26 -07:00
Eric Liang	1455a19c85	Consolidate and clean up documentation (#5645 )	2019-09-07 11:50:18 -07:00
Richard Liaw	34f6d2fc5c	[tune] Update trainable docs and support hparams (#5558 )	2019-09-04 12:44:42 -07:00
Eric Liang	daf38c8723	[tune] Deprecate tune.function (#5601 ) * remove tune function * remove examples * Update tune-usage.rst	2019-08-31 16:00:10 -07:00
Eric Liang	550c96b965	[rllib] Add docs on policy.model (#5597 )	2019-08-30 21:10:42 -07:00
Eric Liang	7d28bbbdbb	[rllib] Document on traj postprocess (#5532 ) * document on traj postprocess * shorten it	2019-08-24 20:37:45 -07:00
gehring	b520f6141e	[rllib] Adds eager support with a generic `TFEagerPolicy` class (#5436 )	2019-08-23 14:21:11 +08:00
Eric Liang	5d7afe8092	[rllib] Try moving RLlib to top level dir (#5324 )	2019-08-05 23:25:49 -07:00
Richard Liaw	1eaa57c98f	[tune] Distributed example + walkthrough (#5157 )	2019-08-02 09:17:20 -07:00
Kristian Hartikainen	13fb9fe3db	[rllib] Feature/soft actor critic v2 (#5328 ) * Add base for Soft Actor-Critic * Pick changes from old SAC branch * Update sac.py * First implementation of sac model * Remove unnecessary SAC imports * Prune unnecessary noise and exploration code * Implement SAC model and use that in SAC policy * runs but doesn't learn * clear state * fix batch size * Add missing alpha grads and vars * -200 by 2k timesteps * doc * lazy squash * one file * ignore tfp * revert done	2019-08-01 23:37:36 -07:00
Eric Liang	20450a4e82	[rllib] Add rock paper scissors multi-agent example (#5336 )	2019-08-01 13:03:59 -07:00
Eric Liang	9e328fbe6f	[rllib] Add docs on how to use TF eager execution (#4927 )	2019-06-07 16:42:37 -07:00
Eric Liang	7501ee51db	[rllib] Rename PolicyEvaluator => RolloutWorker (#4820 )	2019-06-03 06:49:24 +08:00
Eric Liang	4f46d3e9bf	[rllib] Add multi-agent examples for hand-coded policy, centralized VF (#4554 )	2019-04-09 00:36:49 -07:00
Eric Liang	37208216ae	[rllib] Rename Agent to Trainer (#4556 )	2019-04-07 00:36:18 -07:00
Eric Liang	fce0062380	[rllib] Switch to tune.run() instead of run_experiments() (#4515 )	2019-03-30 14:07:50 -07:00
Eric Liang	cff08e19ff	[rllib] Print out intermediate data shapes on the first iteration (#4426 )	2019-03-26 00:27:59 -07:00
Eric Liang	4b8b703561	[rllib] Some API cleanups and documentation improvements (#4409 )	2019-03-21 21:34:22 -07:00
Eric Liang	05d96ce81b	[rllib] Raise an error if multi-agent envs terminate without a last observation for agents (#4139 ) * fix it * lint * Update rllib-training.rst	2019-02-23 21:23:40 -08:00
Eric Liang	c4182463f6	[rllib] Add helper to iterate over envs in a vectorized environment (#4001 ) * add foreach env func * fix * add test	2019-02-11 10:40:47 -08:00
Eric Liang	fb73cedf70	[rllib] Add examples page, add hierarchical training example, delete SC2 examples (#3815 ) * wip * lint * wip * up * wip * update examples * wip * remove carla * update * improve envspec * link to custom * Update rllib-env.rst * update * fix * fn * lint * ds * ssd games * desc * fix up docs * fix	2019-01-29 21:06:09 -08:00
Eric Liang	e78562b2e8	[rllib] Misc fixes: set lr for PG, better error message for LSTM/PPO, fix multi-agent/APEX (#3697 ) * fix * update test * better error * compute * eps fix * add get_policy() api * Update agent.py * better err msg * fix * pass in rew	2019-01-06 19:37:35 -08:00
Eric Liang	b8a9e3f106	[rllib] Remove uses of sgd_stepsize => lr (#3667 ) * lr * Update example-evolution-strategies.rst	2019-01-01 12:01:27 +08:00
Richard Liaw	e046a5c767	[tune] resources_per_trial from trial_resources (#3580 ) Renaming variable due to user errors.	2018-12-20 19:00:47 -08:00
Eric Liang	db0dee573e	[rllib] Q-Mix implementation (Q-Mix, VDN, IQN, and Ape-X variants) (#3548 )	2018-12-18 10:40:01 -08:00
Eric Liang	d864f299d7	[rllib] fixes from dogfooding multi-agent (#3456 ) auto wrap multi-agent dict and tuple spaces by keeping a policy -> preprocessor in the sampler add some Q-learning debug stats report min, max of custom metrics better errors	2018-12-05 23:31:45 -08:00
Eric Liang	93a9d32288	[docs] Switch docs to use rllib train instead of train.py	2018-12-04 17:36:06 -08:00
Eric Liang	ce355d13d4	[rllib] Allow envs to be auto-registered; add on_train_result callback with curriculum example (#3451 ) * train step and docs * debug * doc * doc * fix examples * fix code * integration test * fix * ... * space * instance * Update .travis.yml * fix test	2018-12-03 23:15:43 -08:00
Eric Liang	f0df97db6f	[rllib] example and docs on how to use parametric actions with DQN / PG algorithms (#3384 )	2018-11-27 23:35:19 -08:00
Eric Liang	abdc3b592e	[rllib] Update multi-gpu impala numbers (#3327 )	2018-11-19 20:55:27 -08:00
Eric Liang	65c27c70cf	[rllib] Clean up agent resource configurations (#3296 ) Closes #3284	2018-11-13 18:00:03 -08:00
Eric Liang	bd0dbde149	[rllib] Rename ServingEnv => ExternalEnv (#3302 )	2018-11-12 16:31:27 -08:00
eugenevinitsky	344b4ef0ff	[rllib] Fix filter sync for ES and ARS (#2918 )	2018-11-06 19:09:34 -08:00
Eric Liang	369cb833fe	[rllib] Implement custom metrics (#3144 )	2018-11-03 18:48:32 -07:00
Eric Liang	af0c1174cd	[sgd] Merge sharded param server based SGD implementation (#3033 ) This includes most of the TF code used for the OSDI experiment. Perf sanity check on p3.16xl instances: Overall scaling looks ok, with the multi-node results within 5% of OSDI final numbers. This seems reasonable given that hugepages are not enabled here, and the param server shards are placed randomly. $ RAY_USE_XRAY=1 ./test_sgd.py --gpu --batch-size=64 --num-workers=N --devices-per-worker=M --strategy=<simple|ps> --warmup --object-store-memory=10000000000 Images per second total: gpus total | simple | ps ======================================== 1 | 218; 2 (1 worker) | 388; 4 (1 worker) | 759; 4 (2 workers) | 176 | 623; 8 (1 worker) | 985; 8 (2 workers) | 349 | 1031; 16 (2 nodes, 2 workers) | 600 | 1661; 16 (2 nodes, 4 workers) | 468 | 1712 <--- OSDI perf was 1817	2018-10-27 21:25:02 -07:00

65 commits