
hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Eric Liang	905258dbc1	Clean up docstyle in python modules and add LINT rule (#25272)	2022-06-01 11:27:54 -07:00
Eric Liang	4963dfaae0	[api] Add API stability annotations for all RLlib symbols and add to LINT (#25060)	2022-05-24 22:14:25 -07:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Sven Mika	853d10871c	[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376)	2022-01-05 18:22:33 +01:00
mvindiola1	8cee0c03bf	[RLlib] Update `max_seq_len` in pad_batch_to_sequences_of_same_size (#20743)	2021-11-30 18:00:07 +01:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879)	2021-09-30 16:39:05 +02:00
Sven Mika	a96dbd885b	[RLlib] Reinstate trajectory view API tests. (#18809)	2021-09-23 08:31:51 +02:00
Sven Mika	649580d735	[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046)	2021-07-15 05:51:24 -04:00
Amog Kamsetty	38b5b6d24c	Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036) This reverts commit `e4123fff27`.	2021-07-13 09:57:15 -07:00
Sven Mika	e4123fff27	[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)	2021-07-13 06:38:14 -04:00
Sven Mika	7eb1a29426	[RLlib] Fix ModelV2 custom metrics for torch. (#16734)	2021-07-01 13:01:40 +02:00
Sven Mika	2303851c3c	[RLlib] Torch multi-GPU + LSTM/RNN bug fix. (#15492)	2021-05-18 11:51:05 +02:00
Sven Mika	e973b726c2	[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273)	2021-04-30 19:26:30 +02:00
Sven Mika	8b3554e37e	[RLlib] Remove all (already soft-deprecated) `SampleBatch.data` from code. (#15335)	2021-04-15 19:19:51 +02:00
Sven Mika	732197e23a	[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393)	2021-03-08 15:41:27 +01:00
Sven Mika	8000258333	[RLlib] R2D2 Implementation. (#13933)	2021-02-25 12:18:11 +01:00
Sven Mika	eb0038612f	[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584)	2021-02-08 15:02:19 +01:00
Sven Mika	b2bcab711d	[RLlib] Attention Nets: tf (#12753)	2020-12-20 20:22:32 -05:00
Sven Mika	0df55a139c	[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447) * WIP. * Fix. * Fix. * Fix.	2020-11-27 16:25:47 -08:00
Sven Mika	805dad3bc4	[RLlib] SAC algo cleanup. (#10825)	2020-09-20 11:27:02 +02:00
Sven Mika	e968b52cb7	[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)	2020-08-21 12:35:16 +02:00
Sven Mika	c9435cad43	WIP. (#8456) Fix multi-GPU histogram metrics for > 0D tensors.	2020-05-15 21:43:27 +02:00
Sven Mika	66df8b8c35	[RLlib] Working/learning example: PPO + torch + LSTM. (#7797)	2020-03-31 22:00:28 -07:00
Eric Liang	9a590ac6a5	[rllib] Fix custom model metrics in multi-device case (#7640) * fix example * add example test * lin	2020-03-23 12:40:22 -07:00
Sven Mika	d537e9f0d8	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155)	2020-02-19 12:18:45 -08:00
Eric Liang	2fb53396ad	[rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918)	2020-01-25 22:36:43 -08:00

26 commits