hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Jun Gong	9b65d5535d	[RLlib] Introduce basic connectors library. (#25311)	2022-06-07 19:18:14 +02:00
Rohan Potdar	a9d8da0100	[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056)	2022-06-07 12:52:19 +02:00
Jun Gong	1d24d6af98	[RLlib] Fix MARWIL tf policy. (#25384)	2022-06-03 10:50:36 +02:00
Eric Liang	905258dbc1	Clean up docstyle in python modules and add LINT rule (#25272)	2022-06-01 11:27:54 -07:00
Rohan Potdar	ab81c8e9ca	[RLlib]: Rename `input_evaluation` to `off_policy_estimation_methods`. (#25107)	2022-05-27 13:14:54 +02:00
Eric Liang	4963dfaae0	[api] Add API stability annotations for all RLlib symbols and add to LINT (#25060)	2022-05-24 22:14:25 -07:00
Eric Liang	55d039af32	Annotate datasources and add API annotation check script (#24999) Why are these changes needed? Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.	2022-05-21 15:05:07 -07:00
Rohan Potdar	5a70b732e8	[RLlib] MARWIL and BC Config. (#24853)	2022-05-21 12:50:20 +02:00
Sven Mika	7cca7782f1	[RLlib] OPE (off policy estimator) API. (#24384)	2022-05-02 21:15:50 +02:00
Kai Fricke	8c2e471265	[AIR] Add RLTrainer interface, implementation, and examples (#23465) This PR adds a RLTrainer to Ray AIR. It works for both offline and online use cases. In offline training, it will leverage the datasets key of the Trainer API to specify a dataset reader input, used e.g. in Behavioral Cloning (BC). In online training, it is a wrapper around the rllib trainables making use of the parameter layering enabled by the Trainer API.	2022-04-08 17:16:42 -07:00
Kai Fricke	262d6121bb	[rllib] Fix error messages and example for dataset writer (#23419) Currently the error message and example refer to a field type that is actually format.	2022-03-28 19:53:12 +01:00
Max Pumperla	60054995e6	[docs] fix doctests and activate CI (#23418)	2022-03-24 17:04:02 -07:00
Sven Mika	0af100ffae	[RLlib] Fix tree.flatten dict ordering bug: `flatten_space([obs_space])` should produce same struct as `tree.flatten([obs])`. (#22731)	2022-03-01 21:24:24 +01:00
Jun Gong	6f5afcbce9	[RLlib] Docs enhancements: Setup-dev instructions; Ray datasets integration. (#22239)	2022-02-15 09:09:24 +01:00
Jun Gong	87fe033f7b	[RLlib] Request CPU resources in `Trainer.default_resource_request()` if using dataset input. (#21948)	2022-02-02 10:20:37 +01:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Jun Gong	099c170ab4	[RLlib] Dataset Reader/Writer for RLlib (#21808)	2022-01-26 16:00:46 +01:00
xwjiang2010	9af8f11191	Revert "[docs] Clean up doc structure (first part) (#21667)" (#21763) This reverts commit `38e46c9fb3`.	2022-01-20 15:30:56 -08:00
Max Pumperla	38e46c9fb3	[docs] Clean up doc structure (first part) (#21667)	2022-01-20 16:19:04 +01:00
Sven Mika	596c8e2772	[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918)	2021-12-11 14:57:58 +01:00
Sven Mika	f814c2af89	[RLlib; Docs] Docs API reference pages: `rllib/execution`, `rllib/evaluation`, `rllib/models`, `rllib/offline`. (#20538)	2021-12-10 09:41:29 +01:00
Sungho Joo	dc51af798c	[RLlib] Minor fix on json encoding during worker sampling (#20134) * import custom json encoder from util and improve encoder default function * linting	2021-11-09 16:46:41 -08:00
Sven Mika	ea2bea7e30	[RLlib; Docs overhaul] Docstring cleanup: Offline. (#19808)	2021-11-01 10:59:53 +01:00
Sven Mika	cabaa3b3c6	[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. (#18381)	2021-09-07 11:48:41 +02:00
Sven Mika	18d173b172	[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031)	2021-07-19 13:16:03 -04:00
Sven Mika	1fd0eb805e	[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014)	2021-07-13 14:01:30 -04:00
Amog Kamsetty	bc33dc7e96	Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`." (#17002) This reverts commit `7862dd64ea`.	2021-07-12 11:09:14 -07:00
Julius Frost	a88b217d3f	[rllib] Enhancements to Input API for customizing offline datasets (#16957) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-07-10 15:05:25 -07:00
Sven Mika	7862dd64ea	[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`. (#16774)	2021-07-08 17:31:34 +02:00
Sven Mika	53206dd440	[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531)	2021-06-30 12:32:11 +02:00
Sven Mika	e2be41b407	[RLlib] MARWIL + BC: Various fixes and enhancements. (#16218)	2021-06-03 22:29:00 +02:00
Sven Mika	c4a3e1589b	[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. (#15761)	2021-05-13 09:17:23 +02:00
Michael Luo	4cbe13cdfd	[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603) Co-authored-by: Sven Mika <sven@anyscale.io> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-05-04 19:06:19 +02:00
Sven Mika	8b3554e37e	[RLlib] Remove all (already soft-deprecated) `SampleBatch.data` from code. (#15335)	2021-04-15 19:19:51 +02:00
Sven Mika	6708211b59	[RLlib] JSONReader: Mix files if > 1 at beginning (each worker should start with different file). (#14865)	2021-03-24 16:07:40 +01:00
Michael Luo	587f207c2f	[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550)	2021-01-21 16:43:55 +01:00
Sven Mika	391cdfae8c	[RLlib] Trajectory view API docs. (#12718)	2020-12-30 17:32:21 -08:00
Sven Mika	c524f86785	[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064)	2020-12-27 09:46:03 -05:00
Felipe Antunes	4c0f0ce3a9	[RLlib] In OffPolicyEstimators (Offline RL): Include last step of trajectory (#12619)	2020-12-08 12:39:40 +01:00
Sven Mika	f6b84cb2f7	[RLlib] Fix offline logp vs prob bug in OffPolicyEstimator class. (#12158)	2020-11-20 08:59:43 +01:00
Eric Liang	6b7a4dfaa0	[rllib] Forgot to pass ioctx to child json readers (#11839) * fix ioctx * fix	2020-11-05 22:07:57 -08:00
Sven Mika	805dad3bc4	[RLlib] SAC algo cleanup. (#10825)	2020-09-20 11:27:02 +02:00
Sven Mika	4b278c36fc	[RLlib] Behavioral Cloning (from MARWIL). (#10619)	2020-09-09 17:33:21 +02:00
Julius Frost	dc659ae89a	make action probabilities a numpy array (#10122)	2020-08-16 11:25:12 -07:00
Sven Mika	2256047876	[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114)	2020-08-15 13:24:22 +02:00
Julius Frost	6d9d2b320a	[RLlib] Support windows drives other than C drive for the offline json API (#9909)	2020-08-13 11:57:54 +02:00
Sven Mika	b0b0463161	[RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678)	2020-07-29 21:15:09 +02:00
Michael Luo	b51ab2af66	[RLlib] Offline Type Annotations (#9676) * Offline Annotations * Modifications * Fixed circular dependencies * Linter fix	2020-07-27 14:01:17 -07:00
Sven Mika	43043ee4d5	[RLlib] Tf2x preparation; part 2 (upgrading `try_import_tf()`). (#9136) * WIP. * Fixes. * LINT. * WIP. * WIP. * Fixes. * Fixes. * Fixes. * Fixes. * WIP. * Fixes. * Test * Fix. * Fixes and LINT. * Fixes and LINT. * LINT.	2020-06-30 10:13:20 +02:00
Sven Mika	af1203b9df	[RLlib] Issue 8507 (PyTorch does not support custom loss). (#9142)	2020-06-26 09:52:22 +02:00

61 commits