Clark Zinzow
fb0d6e6b0b
[Datasets] [Docs] Datasets library branding + positioning tweaks ( #22067 )
2022-02-05 16:59:34 -08:00
Max Pumperla
4dd221f848
[Docs] Ray Data docs target state ( #21931 )
...
Preview: [docs](https://ray--21931.org.readthedocs.build/en/21931/data/dataset.html)
The Ray Data project's docs now have a clearer structure and have been partly rewritten. In particular, we have:
- [x] A Getting Started Guide
- [x] An explicit User / How-To Guide
- [x] A dedicated Key Concepts page
- [x] A consistent naming convention: `Ray Data` is used whenever the project is referred to.
This makes it clear that, apart from the "Getting Started" sections, we have only one real example. Once we have more, we can create an "Example" section like many other sub-projects have. This will be addressed in https://github.com/ray-project/ray/issues/21838.
2022-01-27 13:14:36 -08:00
Clark Zinzow
411bb308dc
[Datasets] [Docs] Add API docs links to I/O compatibility matrix ( #21889 )
2022-01-26 12:05:27 -08:00
xwjiang2010
9af8f11191
Revert "[docs] Clean up doc structure (first part) ( #21667 )" ( #21763 )
...
This reverts commit 38e46c9fb3.
2022-01-20 15:30:56 -08:00
Max Pumperla
38e46c9fb3
[docs] Clean up doc structure (first part) ( #21667 )
2022-01-20 16:19:04 +01:00
Eric Liang
a69ae1d886
Add blogs to dataset materials ( #21546 )
2022-01-11 22:09:57 -08:00
Clark Zinzow
b872fdaaac
[Datasets] Last-mile preprocessing docs. ( #20712 )
...
Datasets docs for last-mile preprocessing, particularly geared towards ML ingest. This gives groupby, aggregations, and random shuffling examples in the overview page (not present previously), adds some concreteness to our last-mile preprocessing positioning, and provides some preprocessing recipes for a few common transformations.
2021-11-29 23:23:27 -08:00
Richard Liaw
cf357f6bce
[docs] Add a talks section for ray.data ( #20444 )
2021-11-16 14:30:08 -08:00
Eric Liang
6102912494
Dataset doc updates ( #19815 )
2021-11-04 18:13:40 -07:00
Philipp Moritz
0a5942d8b0
[Documentation] Fix quotes for windows installations ( #19859 )
...
* [Documentation] Fix quotes for windows installations
* update
* formatting
2021-10-29 10:54:38 -07:00
Eric Liang
27a5b546ad
Make ArrowRow less scary ( #19686 )
2021-10-25 12:18:42 -07:00
Eric Liang
875d19f838
[data] Fix inconsistent naming of to_refs() methods, remove to_arrow() ( #19620 )
2021-10-23 12:20:23 -07:00
matthewdeng
4674c78050
[Train] Rename Ray SGD v2 to Ray Train ( #19436 )
2021-10-18 22:27:46 -07:00
Eric Liang
430a5f4a21
[doc] Bump dataset to beta for 1.8 and add backlink to SGD ( #19332 )
2021-10-12 18:32:29 -07:00
Clark Zinzow
d22f838795
[Datasets] Delineate between ref and raw APIs for the Pandas/Arrow integrations. ( #18992 )
2021-10-01 13:08:25 -07:00
Alex Wu
5709c6501b
[dataset][usability] Dataset dependencies ( #18346 )
2021-09-29 17:29:31 -07:00
Eric Liang
caf34a452c
Unify ArrowTensorType tables and Tensor blocks ( #18867 )
2021-09-27 16:24:09 -07:00
Eric Liang
4d2065352b
Increase dataset read parallelism by default ( #18420 )
2021-09-09 15:07:49 -07:00
Clark Zinzow
b30c41759d
[Datasets] Adds tensor column support (tensors-in-tables) via Pandas/Arrow extension types/arrays. ( #18301 )
2021-09-08 10:09:01 -07:00
Eric Liang
cbdafa0b63
[doc] Fix various workflow doc bugs ( #18357 )
2021-09-06 01:39:08 -07:00
Eric Liang
7dcae690b9
Mark datasets as still in alpha for now ( #18321 )
2021-09-02 17:07:33 -07:00
Wesley Gifford
6133a561e9
Dataset from modin ( #18122 )
2021-08-31 11:19:35 -07:00
Eric Liang
95b5ad12ba
Initial version of workflow documentation ( #18138 )
2021-08-27 16:20:48 -07:00
Clark Zinzow
aee7ba2510
[Datasets] Add from_numpy() and to_numpy() APIs ( #18146 )
2021-08-27 13:33:11 -07:00
Eric Liang
71b3183038
Add implicit init note to Ray docs & dataset version note ( #17751 )
2021-08-11 13:13:22 -07:00
Eric Liang
d4f9d3620e
Move ray.data out of experimental ( #17560 )
2021-08-04 13:31:10 -07:00
Eric Liang
748cbbb23d
[hotfix] Parquet S3 reads broken due to pyarrow.lib.ArrowInvalid: S3 subsystem not initialized ( #17492 )
2021-08-02 11:48:48 -07:00
Eric Liang
e812691909
Support top-level tensor values in dataset ( #17439 )
2021-08-01 22:45:21 -07:00
Eric Liang
7ed62ea0ad
Initial implementation of Dataset pipelining and docs ( #17309 )
2021-07-28 21:12:01 -07:00
Clark Zinzow
b5194ca9f9
Add imports to docs examples to make the code more runnable. ( #17240 )
2021-07-21 11:18:45 -07:00
Eric Liang
fabba96fad
Re-merge large function def, skipping test failing on Windows ( #17191 )
2021-07-19 18:03:26 -07:00
architkulkarni
4069686e0f
Revert "Improve error message for oversized function ( #17133 )" ( #17184 )
...
This reverts commit 3e53619d64.
2021-07-19 09:28:33 -07:00
Eric Liang
3e53619d64
Improve error message for oversized function ( #17133 )
2021-07-17 11:04:05 -07:00
Eric Liang
94f17ec099
[RFC] API stability annotations ( #17100 )
2021-07-16 17:09:20 -07:00
Eric Liang
26a286655b
Add link to datasets preview docs
2021-07-16 12:31:52 -07:00
Eric Liang
f03b43c532
[dataset] Support callable classes to simplify state initialization ( #17136 )
2021-07-15 23:06:14 -07:00
Eric Liang
3d764d7b4b
[data] Fix the ObjectRef type in the dataset docs ( #17111 )
...
* fix reft
* remove exp
* fix
2021-07-15 09:50:37 -07:00
Eric Liang
38bddc3f2b
First cut at dataset documentation ( #16956 )
2021-07-14 23:27:13 -07:00