
.. _mars-on-ray:

Using Mars on Ray
=================

.. _`issue on GitHub`: https://github.com/mars-project/mars/issues


`Mars`_ is a tensor-based unified framework for large-scale data computation that scales NumPy, pandas, and scikit-learn.
Mars on Ray makes it easy to scale your programs with a Ray cluster. Mars on Ray currently supports both Ray actors
and Ray tasks as execution backends. With the Ray actors backend, tasks are scheduled by the Mars scheduler, so this mode can reuse
all Mars scheduler optimizations. With the Ray tasks backend, all tasks are scheduled by Ray, which lets Mars reuse the failover and
pipeline capabilities provided by Ray futures.


.. _`Mars`: https://docs.pymars.org


Installation
-------------
Install Mars with pip:

.. code-block:: bash

    pip install "pymars>=0.8.3"
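
To confirm that the installed version satisfies the minimum above, a small check along these lines may help. This is a minimal sketch, not part of Mars or Ray: the helper names ``version_tuple``, ``meets_minimum``, and ``installed_pymars_version`` are hypothetical, and the comparison only looks at the leading numeric components of the version string.

.. code-block:: python

    from importlib import metadata

    # Minimum version taken from the pip command above.
    MINIMUM = "0.8.3"


    def version_tuple(version):
        """Turn '0.8.3' into (0, 8, 3), stopping at the first non-numeric part."""
        parts = []
        for piece in version.split("."):
            if not piece.isdigit():
                break
            parts.append(int(piece))
        return tuple(parts)


    def meets_minimum(installed, minimum=MINIMUM):
        """Return True if `installed` is at least `minimum` (simplified comparison)."""
        return version_tuple(installed) >= version_tuple(minimum)


    def installed_pymars_version():
        """Return the installed pymars version string, or None if it is missing."""
        try:
            return metadata.version("pymars")
        except metadata.PackageNotFoundError:
            return None


    if __name__ == "__main__":
        found = installed_pymars_version()
        if found is None:
            print("pymars is not installed")
        else:
            print(found, "ok" if meets_minimum(found) else "too old")

For pre-release or otherwise unusual version strings, a dedicated version parser such as the one in the ``packaging`` library is likely a more robust choice than this simplified tuple comparison.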


Getting started
----------------

It's easy to run Mars jobs on a Ray cluster.


Start a new Mars on Ray runtime locally:


.. code-block:: python

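    import ray
    import mars
    import mars.tensor as mt

    # Initialize Ray, then start a Mars session on top of it.
    ray.init()
    mars.new_ray_session()

    # Sum a random 10,000,000 x 5 tensor; execute() triggers the computation.
    mt.random.RandomState(0).rand(10_000_000, 5).sum().execute()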


Or connect to a Mars on Ray runtime that is already initialized:


.. code-block:: python

    import mars
    mars.new_ray_session('http://<web_ip>:<ui_port>')
    # perform computation


Convert between Mars DataFrames and Ray Datasets:


.. code-block:: python

    import ray
    import mars.tensor as mt
    import mars.dataframe as md

    df = md.DataFrame(
        mt.random.rand(10_000_000, 4),
        columns=list('abcd'))

    # Convert the Mars DataFrame to a Ray Dataset.
    # Alternative: ds = md.to_ray_dataset(df)
    ds = ray.data.from_mars(df)
    print(ds.schema(), ds.count())
    ds.filter(lambda row: row["a"] > 0.5).show(5)

    # Convert the Ray Dataset back to a Mars DataFrame.
    # Alternative: df2 = md.read_ray_dataset(ds)
    df2 = ds.to_mars()
    print(df2.head(5).execute())

Refer to `Mars on Ray <https://docs.pymars.org/en/latest/installation/ray.html>`_ for more information.