hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-17 16:46:39 -04:00

Author	SHA1	Message	Date
Eric Liang	e15a419028	Enable stage fusion by default for dataset pipelines (#22476) This PR enables stage fusion for dataset pipelines. This also requires: 1. Removing the num_cpus=0.5 default for the read stage, to enable fusion of the read stage. 2. Removing spread_resource_prefix (not supported for now).	2022-02-23 17:34:05 -08:00
Jiajun Yao	baa14d695a	Round robin during spread scheduling (#21303) - Separate spread scheduling and default hybrid scheduling (i.e. SpreadScheduling != HybridScheduling(threshold=0)): they are already separated in the API layer and they have different end goals, so it makes sense to separate their implementations and evolve them independently. - Simple round robin for spread scheduling: this is just a starting implementation and can be optimized later. - Prefer not to spill back tasks that are waiting for args, since the pull is already in progress.	2022-02-18 15:05:35 -08:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Jiajun Yao	cea80b1a5b	Don't advertise cpus on gpu nodes for pipelined ingestion tests (#21899)	2022-01-27 09:17:01 -08:00
Antoni Baum	7ce22b72ed	[datasets] Expand `to_torch`'s functionality (#21117) Expands the `to_torch` method for Datasets with: * An ability to choose to output a list/dict of feature tensors instead of just one (through setting `feature_columns` to be a list of lists or a dict of lists) * An ability to choose whether the label should be unsqueezed or not * An ability to pass `None` as the label (for prediction). Furthermore, this changes how the `feature_column_dtypes` argument works. Previously, it took a list of dtypes, one for each feature. However, as the tensor was concatenated in the end, only one dtype mattered (the biggest one). Now, this argument expects a single dtype which will be applied to the features tensor (or a list/dict of dtypes if `feature_columns` is a list of lists or a dict of lists). Unit tests for all cases are included. Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>	2022-01-03 09:03:50 -08:00
Jiajun Yao	9776e21842	Revert "Round robin during spread scheduling (#19968)" (#21293) This reverts commit `60388b2834`.	2021-12-30 10:33:06 +09:00
Jiajun Yao	60388b2834	Round robin during spread scheduling (#19968)	2021-12-22 20:27:34 -08:00
Amog Kamsetty	9796ae56d5	[Train][Data] Change usages of `iter_datasets` to `iter_epochs` (#20487)	2021-11-17 18:05:51 -08:00
Chen Shen	9dba5e0ead	[dataset][nightly-test] fix pipeline ingest test (#19437)	2021-10-18 11:31:24 +01:00
Eric Liang	86cbe3e833	[data] Add support for repeating and re-windowing a DatasetPipeline (#19091)	2021-10-06 20:13:43 -07:00
Chen Shen	7c99aae033	[dataset][nightly-test] add pipelined ingestion/training nightly test	2021-09-23 20:39:03 -07:00

11 commits
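
The "Round robin during spread scheduling" commit (baa14d695a) describes a simple round-robin policy for spread scheduling. The idea can be sketched in plain Python as below. This is only an illustration and not Ray's actual implementation: the `Node` record, the `RoundRobinSpreadScheduler` class, and the CPU-only resource model are all hypothetical names and simplifications introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical node record: an id and its free CPU count."""
    node_id: str
    free_cpus: float


class RoundRobinSpreadScheduler:
    """Toy spread policy: walk the nodes in a fixed order and resume
    the search just after the node that received the previous task."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes
        self._next = 0  # index where the next search starts

    def pick(self, cpus_needed: float) -> Optional[str]:
        """Return the id of the chosen node, or None if nothing fits."""
        n = len(self.nodes)
        for offset in range(n):
            idx = (self._next + offset) % n
            node = self.nodes[idx]
            if node.free_cpus >= cpus_needed:
                node.free_cpus -= cpus_needed   # reserve the resources
                self._next = (idx + 1) % n      # start after this node next time
                return node.node_id
        return None  # no node has enough free CPUs


scheduler = RoundRobinSpreadScheduler([Node("a", 2), Node("b", 1), Node("c", 2)])
placements = [scheduler.pick(1) for _ in range(6)]
print(placements)  # ['a', 'b', 'c', 'a', 'c', None]
```

In this sketch, tasks are spread across nodes in turn, a full node is skipped, and `None` is returned once no node can fit the request. A real scheduler would also have to handle multiple resource types, node failures, and tasks still waiting for their arguments.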