hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	c4a3e1589b	[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. (#15761)	2021-05-13 09:17:23 +02:00
Sven Mika	16ddab49f5	[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. (#15591)	2021-05-12 12:16:00 +02:00
Sven Mika	a495759f06	[RLlib] Discussion 2022: PPO should auto-adjust `rollout_fragment_length` if other settings do not align with `train_batch_size`. (#15611)	2021-05-10 16:16:02 +02:00
Sven Mika	461d73ddf1	[RLlib] `simple_optimizer` should not be used by default for tf+MA. (#15365)	2021-05-10 16:10:44 +02:00
Sven Mika	46f6fa2361	[RLlib] Example script for restoring 1 agent (out of n) from a checkpoint (multi-agent). (#15540)	2021-05-10 16:09:05 +02:00
Eric Liang	ff36ae594b	Remove flaky tag from newly unflaky tests (#15639)	2021-05-05 12:15:46 -07:00
Kai Fricke	1d52ab819f	[release] release 1.3.0 results and test updates (#15366) Convert a number of release tests and add logs for release 1.3.0	2021-05-04 22:10:04 +01:00
Sven Mika	c7563a32ed	[RLlib] DD-PPO not supported on Win (add meaningful error message). (#15631)	2021-05-04 19:26:17 +02:00
Michael Luo	4cbe13cdfd	[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603) Co-authored-by: Sven Mika <sven@anyscale.io> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-05-04 19:06:19 +02:00
Sven Mika	4b3add0066	[RLlib] Discussion 2021: PPO does not learn vf, iff use_gae=False (ignores use_critic setting). (#15610)	2021-05-04 14:17:00 +02:00
mvindiola1	170366fbf1	[RLlib] contrib/MADDPG: Make get_weights and set_weights use dictionaries rather than lists. (#14903) Co-authored-by: Manny Vindiola <manuel.m.vindiola.civ@mail.mil>	2021-05-04 13:26:39 +02:00
Sertingolix	5a45009ebc	[RLlib] Handle array custom metrics correctly in evaluate (#15190) Co-authored-by: Lucas Brunner <lucas.brunner@urb-x.ch>	2021-05-04 13:25:28 +02:00
Antoine Galataud	ce1c001b1d	[RLlib] DQN: Place LearningRateSchedule mixin at the right moment (#15558)	2021-05-04 13:21:40 +02:00
Yeachan-Heo	0552f6e886	[RLlib] Update alpha_zero_policy.py (#15042)	2021-05-04 13:20:24 +02:00
Amog Kamsetty	ebc44c3d76	[CI] Upgrade flake8 to 3.9.1 (#15527) * formatting * format util * format release * format rllib/agents * format rllib/env * format rllib/execution * format rllib/evaluation * format rllib/examples * format rllib/policy * format rllib utils and tests * format streaming * more formatting * update requirements files * fix rllib type checking * updates * update * fix circular import * Update python/ray/tests/test_runtime_env.py * noqa	2021-05-03 14:23:28 -07:00
Sven Mika	e973b726c2	[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273)	2021-04-30 19:26:30 +02:00
Sven Mika	fc3a65f9d4	[RLlib] Split test_checkpoint_restore tests into 3 and make each "large" (from "enormous"). (#15499)	2021-04-30 12:33:12 +02:00
Sven Mika	78b776942f	[RLlib] Discussion 1928: Initial lr wrong if schedule used that includes ts=0 (both tf and torch). (#15538)	2021-04-27 17:19:52 +02:00
SebastianBo1995	f5be8d8f74	[Rllib] Offline Learning Bug, different shapes (#15132)	2021-04-27 17:18:17 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684)	2021-04-27 10:44:54 +02:00
Kai Fricke	2c11a1aff1	[RLlib] Evaluation parallel to training check, key-error hotfix (#15345)	2021-04-27 08:38:10 +02:00
mvindiola1	9330403200	[RLlib] Mask out padded values for A3C loss with recurrent policy (#15525)	2021-04-27 08:36:04 +02:00
Sven Mika	354c960fff	[RLlib] Fix test_dependency_torch and fix custom logger support for RLlib. (#15120)	2021-04-24 08:13:41 +02:00
Eric Liang	af01a47d59	Add support for tune,serve,rllib tests to flaky builder (#15447)	2021-04-22 15:03:29 -07:00
Sven Mika	b9761d7081	[RLlib] Discussion 1759: SampleBatch._get_slice_indices stuck for R2D2 when using incorrect Trainer. (#15451) Thanks @Manuscrit for raising this issue!	2021-04-22 19:21:03 +02:00
Sven Mika	7e1a191f17	[RLlib] Remove all remaining tf- and MuJoCo warnings from RLlib. (#15454)	2021-04-22 19:20:19 +02:00
Sven Mika	bdda73e2dd	[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421) Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!	2021-04-22 11:29:42 +02:00
Sven Mika	7318439c3d	[RLlib] DQN native_ratio (for training intensity) incorrect (discussion 1763). (#15436) Thanks @Manuscrit !	2021-04-22 11:06:29 +02:00
Fabien Couthouis	fe06642df0	[RLlib] Report mean losses instead of sum in IMPALA (discussion 1709) (#15427)	2021-04-21 10:59:06 +02:00
Sven Mika	7ff27dfe07	[RLlib] Remove atari dependency for RLlib (in favor of detailed error message). (#15292)	2021-04-20 08:46:58 +02:00
Sven Mika	41968512ca	[RLlib] Partial GPU examples (for learner and workers). (#15334)	2021-04-20 08:46:05 +02:00
Sven Mika	cecfc3b43b	[RLlib] Multi-GPU support for Torch algorithms. (#14709)	2021-04-16 09:16:24 +02:00
SangBin Cho	1d87e4447d	[Test] increase the test size of test io that consistenly times out (#15341)	2021-04-15 14:02:41 -07:00
Sven Mika	8b3554e37e	[RLlib] Remove all (already soft-deprecated) `SampleBatch.data` from code. (#15335)	2021-04-15 19:19:51 +02:00
Sven Mika	c90de315e5	[RLlib] APEX returns incorrect default resources (PleacementGroupFactory) colocated missing replay actors. (#15295)	2021-04-15 16:50:42 +01:00
Sven Mika	e961d2f4b2	[RLlib] Improve example scripts for attention nets, CartPole LSTM, and custom RNN-models. (#15329)	2021-04-15 16:11:34 +02:00
Sven Mika	45d6560759	[RLlib] Fix flakey custom_fast_model_torch/tf tests. (#15330)	2021-04-15 16:10:29 +02:00
SangBin Cho	27ab0c7633	[Test] Skip the failing rllib example test. (#15321)	2021-04-14 20:19:44 -07:00
Sven Mika	bbfa8ffec9	[RLlib] Minor release 1.3 warnings cleanups. (#15272)	2021-04-14 14:03:15 +02:00
Sven Mika	ef0f163d16	[RLlib] Discussion 1709: IMPALA (tf and torch) reports sum of entropy (over batch) in stats. Should report mean instead. (#15290)	2021-04-14 11:44:25 +02:00
Sven Mika	5254d2fb36	[RLlib] Support parallelizing evaluation and training (optional). (#15040)	2021-04-13 09:53:35 +02:00
Sven Mika	9c5a0cfd7a	[RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386)	2021-04-11 18:20:04 +02:00
Sven Mika	dfc116ea27	[RLlib] Discussion 681: Metrics prepends newest episodes instead of appending. (#15236)	2021-04-11 15:31:43 +02:00
Sven Mika	1c9701e9cb	[RLlib] Discussion 1513: `on_episode_step()` callback called after very first reset (should not). (#15218)	2021-04-11 13:16:17 +02:00
Sven Mika	b267f1f1ba	[RLlib] Add support for Int-Box action spaces. (#15012)	2021-04-11 13:16:01 +02:00
Dmitri Gekhtman	58fbb419ea	[client][rllib] Add client_mode_hook for ray.get_gpu_ids (#15185)	2021-04-08 23:36:11 -07:00
Yi Cheng	e552e3f19c	Skip test_dependency_torch (#15123)	2021-04-05 18:02:10 -07:00
Kai Fricke	d33b0e4bc3	[tune] Reconcile placement groups every N seconds to avoid bottlenecks when running many short trials (#15011) Closes a release blocking issue	2021-04-01 17:04:44 +02:00
Sven Mika	1bb70e4907	[RLlib] Issue 14523: Torch + py3.8 leads to GPU device error. (#15014)	2021-03-30 21:43:11 +02:00
Sven Mika	95686a8fdd	[RLlib] Issue 14533: Tf-eager properly use `tree.map_struct` on value of type `Repeated` (RLlib-specific space) (#15015)	2021-03-30 19:28:45 +02:00

967 commits