Improve release docs and add results from 0.7.7 (#6506)
* Improve docs, add logs
* add logs
* microbenchmark
* lint
parent b7d23405fe
commit 8636d67b72
7 changed files with 170 additions and 113 deletions
@@ -9,32 +9,47 @@ This document describes the process for creating new releases.
``releases/<release-version>``. Then push that branch to the ray repo:
``git push upstream releases/<release-version>``.

2. **Update the release branch version:** Push a commit that increments the Python
package version in python/ray/__init__.py and src/ray/raylet/main.cc. You can
push this directly to the release branch.
2. **Update the release branch version:** Push a commit directly to the
newly-created release branch that increments the Python package version in
python/ray/__init__.py and src/ray/raylet/main.cc. See this
`sample commit for bumping the release branch version`_.

3. **Update the master branch version:** Create a pull request to
increment the version of the master branch, see `this PR`_.
The format of the new version is as follows:
increment the dev version of the master branch. See this
`sample PR for bumping a minor release version`_. **NOTE:** Not all of
the version numbers should be replaced. For example, ``0.7.0`` appears in
this file but should not be updated.

New minor release (e.g., 0.7.0): Increment the minor version and append
``.dev0`` to the version. For example, if the version of the new release is
0.7.0, the master branch needs to be updated to 0.8.0.dev0.
This should be merged soon after cutting the release branch (step 1) to
closely track the development version.

New micro release (e.g., 0.7.1): Increment the ``dev`` number, such that the
For a new micro release (e.g., 0.7.1): Increment the ``dev`` number, such that the
number after ``dev`` equals the micro version. For example, if the version
of the new release is 0.7.1, the master branch needs to be updated to
0.8.0.dev1.

After the wheels for the new version are built, create and merge a
`PR like this`_.
For a new minor release (e.g., 0.7.0): Increment the minor version and append
``.dev0`` to the version. For example, if the new release is 0.7.0,
the master branch should be updated to 0.8.0.dev0.

These should be merged as soon as step 1 is complete to make sure the links
in the documentation keep working and the master stays on the development
version.
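The version rule in step 3 can be sketched in Python. This is only an illustration and is not part of the Ray repository: the helper name ``next_master_dev_version`` is hypothetical, and it assumes release versions are plain ``major.minor.micro`` strings. It covers both examples in the text (0.7.0 and 0.7.1).

```python
def next_master_dev_version(release_version: str) -> str:
    """Return the dev version the master branch should carry after a release.

    Hypothetical helper illustrating step 3: bump the minor version and use
    the micro version of the release as the number after ``dev``.
    """
    major, minor, micro = (int(part) for part in release_version.split("."))
    return f"{major}.{minor + 1}.0.dev{micro}"


if __name__ == "__main__":
    # The two cases named in the release process text.
    for release in ("0.7.0", "0.7.1"):
        print(release, "->", next_master_dev_version(release))
```

A minor release has micro version 0, so the same expression yields ``.dev0`` for it and ``.dev1`` for the first micro release.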
4. **Testing:** Before releasing, the following sets of tests should be run. The results
of each of these tests for previous releases are checked in under ``doc/dev/release_tests``,
and new results should be compared against them to identify any regressions.

4. **Testing:** Before a release is created, significant testing should be done.
Run the following scripts
1. Long-running tests

.. code-block:: bash

ray/ci/long_running_tests/README.rst

Follow the instructions to kick off the tests and check the status of the workloads.
These tests should run for at least 24 hours (printing new iterations and CPU load
stable in the AWS console).

The last hundred lines or so printed by each test should be checked in under
``doc/dev/release_logs/<version>``.

2. Stress tests

.. code-block:: bash

@@ -42,22 +57,25 @@ This document describes the process for creating new releases.
ray/ci/stress_tests/run_application_stress_tests.sh <release-version> <release-commit>
rllib train -f rllib/tuned_examples/compact-regression-test.yaml

and make sure they pass. For the RLlib regression tests, see the comment on the
Make sure that these pass. For the RLlib regression tests, see the comment on the
file for the pass criteria. For the rest, it will be obvious if they passed.
This will use the autoscaler to start a bunch of machines and run some tests.
**Caution!**: By default, the stress tests will require expensive GPU instances.

You'll also want to kick off the long-running tests by following the instructions
in:
The summaries printed by each test should be checked in under
``doc/dev/release_logs/<version>``.

.. code-block:: bash
3. Microbenchmarks

ray/ci/long_running_tests/README.rst
.. code-block:: bash

Follow the instructions to check the status of the workloads to verify that they
are running. Let them run for at least 24 hours, and check them again. They should
all still be running (printing new iterations), and their CPU load should be stable
when you view them in the AWS monitoring console (not increasing over time).
ray microbenchmark

Run `ray microbenchmark` on an `m4.16xl` instance running `Ubuntu 18.04` with `Python 3` to get the
latest microbenchmark numbers and update them in `profiling.rst`.

The results should be updated in ``doc/dev/profiling.rst`` and checked in under
``doc/dev/release_logs/<version>``.

5. **Resolve release-blockers:** If a release blocking issue arises, there are
two ways the issue can be resolved: 1) Fix the issue on the master branch and
@@ -67,7 +85,30 @@ This document describes the process for creating new releases.

These changes should then be pushed directly to the release branch.

6. **Download all the wheels:** Now the release is ready to begin final
6. **Create a GitHub release:** Create a `GitHub release`_. This should include
**release notes**. Copy the style and formatting used by previous releases.
Create a draft of the release notes containing information about substantial
changes/updates/bugfixes and their PR numbers. Once you have a draft, send it
out to other Ray developers (especially those who contributed heavily during
this release) for feedback. At the end of the release note, you should also
add a list of contributors.

Run ``doc/dev/get_contributors.py`` to generate the list of commits corresponding
to this release and the formatted list of contributors.
You will need to provide a GitHub personal access token
(github.com -> settings -> developer settings -> personal access tokens).

.. code-block:: bash

# Must be run from inside the Ray repository.
pip install PyGitHub tqdm
python get_contributors.py --help
python get_contributors.py \
--access-token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
--prev-release-commit="<COMMIT_SHA>" \
--curr-release-commit="<COMMIT_SHA>"

7. **Download all the wheels:** Now the release is ready to begin final
testing. The wheels are automatically uploaded to S3, even on the release
branch. To test, ``pip install`` from the following URLs:

@@ -84,20 +125,20 @@ This document describes the process for creating new releases.
pip install -U https://s3-us-west-2.amazonaws.com/ray-wheels/releases/$RAY_VERSION/$RAY_HASH/ray-$RAY_VERSION-cp36-cp36m-macosx_10_6_intel.whl
pip install -U https://s3-us-west-2.amazonaws.com/ray-wheels/releases/$RAY_VERSION/$RAY_HASH/ray-$RAY_VERSION-cp37-cp37m-macosx_10_6_intel.whl
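The wheel URLs above follow one pattern, so they can be generated rather than typed by hand. The sketch below is an illustration only: ``wheel_url`` and its parameter names are hypothetical, and the template is copied from the two URLs shown in this hunk. Other Python or platform tags would have to be supplied by the caller.

```python
# Template taken from the URLs shown above; the placeholders are illustrative.
WHEEL_URL_TEMPLATE = (
    "https://s3-us-west-2.amazonaws.com/ray-wheels/releases/"
    "{version}/{commit}/ray-{version}-{python_tag}-{abi_tag}-{platform}.whl"
)


def wheel_url(version, commit, python_tag, abi_tag, platform):
    """Build the S3 URL of one release wheel (hypothetical helper)."""
    return WHEEL_URL_TEMPLATE.format(
        version=version,
        commit=commit,
        python_tag=python_tag,
        abi_tag=abi_tag,
        platform=platform,
    )


if __name__ == "__main__":
    # "<commit_sha>" stands in for the release commit hash.
    for python_tag, abi_tag in (("cp36", "cp36m"), ("cp37", "cp37m")):
        print(wheel_url("0.7.7", "<commit_sha>", python_tag, abi_tag,
                        "macosx_10_6_intel"))
```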

7. **Upload to PyPI Test:** Upload the wheels to the PyPI test site using
``twine`` (ask Robert to add you as a maintainer to the PyPI project on both the
real and test PyPI). You'll need to run a command like
8. **Upload to PyPI Test:** Upload the wheels to the PyPI test site using
``twine``.

.. code-block:: bash

twine upload --repository-url https://test.pypi.org/legacy/ ray/.whl/*
# Downloads all of the wheels to the current directory.
RAY_VERSION=<version> COMMIT=<commit_sha> bash download_wheels.sh

assuming that you've downloaded the wheels from the ``ray-wheels`` S3 bucket
and put them in ``ray/.whl``, that you've installed ``twine`` through
``pip``, and that you've created both PyPI accounts.
# Will ask for your PyPI test credentials and require that you're a maintainer
# on PyPI test. If you are not, ask @robertnishihara to add you.
pip install twine
twine upload --repository-url https://test.pypi.org/legacy/ *.whl

Test that you can install the wheels with pip from the PyPI test repository
with:
Test that you can install the wheels with pip from the PyPI test repository:

.. code-block:: bash

@@ -112,77 +153,29 @@ This document describes the process for creating new releases.
scripts. Make sure that it is finding the version of Ray that you just
installed by checking ``ray.__version__`` and ``ray.__file__``.

Do this at least for MacOS and for Linux, as well as for Python 2 and Python
3.
Do this at least for MacOS and Linux, as well as for Python 2 and Python 3.
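The ``ray.__version__`` and ``ray.__file__`` check can be scripted so that it runs the same way on every platform. This is a minimal sketch under stated assumptions: ``check_installed`` is a hypothetical helper, not something shipped with Ray, and it relies only on the standard ``__version__`` and ``__file__`` module attributes mentioned above.

```python
import importlib


def check_installed(module_name, expected_version):
    """Import a module and report whether it is the expected version.

    Returns a tuple (matches, found_version, location) so the caller can
    see both the version and the path the module was loaded from.
    """
    module = importlib.import_module(module_name)
    found_version = getattr(module, "__version__", None)
    location = getattr(module, "__file__", None)
    return found_version == expected_version, found_version, location
```

For example, ``check_installed("ray", "0.7.7")`` would be called right after the ``pip install`` step; a mismatch in the returned location usually means a different installation is shadowing the wheel under test.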

8. **Upload to PyPI:** Now that you've tested the wheels on the PyPI test
9. **Upload to PyPI:** Now that you've tested the wheels on the PyPI test
repository, they can be uploaded to the main PyPI repository. Be careful,
**it will not be possible to modify wheels once you upload them**, so any
mistake will require a new release. You can upload the wheels with a command
like
mistake will require a new release.

.. code-block:: bash

twine upload --repository-url https://upload.pypi.org/legacy/ ray/.whl/*
# Will ask for your real PyPI credentials and require that you're a maintainer
# on real PyPI. If you are not, ask @robertnishihara to add you.
twine upload --repository-url https://upload.pypi.org/legacy/ *.whl

Verify that
Now, try installing from the real PyPI mirror. Verify that the correct version is
installed and that you can run some simple scripts.

.. code-block:: bash

pip install -U ray

finds the correct Ray version, and successfully runs some simple scripts on
both MacOS and Linux as well as Python 2 and Python 3.

9. **Create a GitHub release:** Create a GitHub release through the
`GitHub website`_. The release should be created at the commit from the
previous step. This should include **release notes**. Copy the style and
formatting used by previous releases. Create a draft of the release notes
containing information about substantial changes/updates/bugfixes and their
PR numbers. Once you have a draft, make sure you solicit feedback from other
Ray developers before publishing. Use the following to get started:

.. code-block:: bash

git pull origin master --tags
git log $(git describe --tags --abbrev=0)..HEAD --pretty=format:"%s" | sort


At the end of the release note, you can add a list of contributors who helped
create this release. Use ``doc/dev/get_contributors.py`` to generate this
list. You will need to create a GitHub personal access token first if you don't
have one (github.com -> settings -> developer settings -> personal access tokens).

.. code-block:: bash

# Must be run from inside the Ray repository.
python get_contributors.py --help
python get_contributors.py \
--access-token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
--prev-branch="ray-0.7.1" \
--curr-branch="ray-0.7.2"

Run `ray microbenchmark` on an `m4.16xl` instance running `Ubuntu 18.04` with `Python 3.7` to get the
latest microbenchmark numbers and update them in `profiling.rst`.

.. code-block:: bash

ray microbenchmark

10. **Update version numbers throughout codebase:** Suppose we just released
0.7.1. The previous release version number (in this case 0.7.0) and the
previous dev version number (in this case 0.8.0.dev0) appear in many places
throughout the code base including the installation documentation, the
example autoscaler config files, and the testing scripts. Search for all of
the occurrences of these version numbers and update them to use the new
release and dev version numbers. **NOTE:** Not all of the version numbers
should be replaced. For example, ``0.7.0`` appears in this file but should
not be updated.

11. **Improve the release process:** Find some way to improve the release
10. **Improve the release process:** Find some way to improve the release
process so that whoever manages the release next will have an easier time.

.. _`this example`: https://github.com/ray-project/ray/pull/4226
.. _`this PR`: https://github.com/ray-project/ray/pull/5523
.. _`PR like this`: https://github.com/ray-project/ray/pull/5585
.. _`GitHub website`: https://github.com/ray-project/ray/releases
.. _`sample PR for bumping a minor release version`: https://github.com/ray-project/ray/pull/6303
.. _`sample commit for bumping the release branch version`: https://github.com/ray-project/ray/commit/a39325d818339970e51677708d5596f4b8f790ce
.. _`GitHub release`: https://github.com/ray-project/ray/releases
@@ -16,18 +16,24 @@ Create them at https://github.com/settings/tokens/new
""",
)
@click.option(
"--prev-branch",
"--prev-release-commit",
required=True,
help="Previous version branch like ray-0.7.1")
help="Last commit SHA of the previous release.")
@click.option(
"--curr-branch",
"--curr-release-commit",
required=True,
help="Current version branch like ray-0.7.2")
def run(access_token, prev_branch, curr_branch):
help="Last commit SHA of the current release.")
def run(access_token, prev_release_commit, curr_release_commit):
print("Writing commit descriptions to 'commits.txt'...")
check_output(
(f"git log {prev_release_commit}..{curr_release_commit} "
f"--pretty=format:'%s' > commits.txt"),
shell=True)
# Generate command
cmd = []
cmd.append(f'git log {prev_branch}..{curr_branch} --pretty=format:"%s" '
' | grep -Eo "#(\d+)"')
cmd.append((f"git log {prev_release_commit}..{curr_release_commit} "
f"--pretty=format:'%s' "
f" | grep -Eo '#(\d+)'"))
joined = " && ".join(cmd)
cmd = f"bash -c '{joined}'"
cmd = shlex.split(cmd)
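The script above pipes ``git log`` output through ``grep`` to pull out pull request references such as ``(#6506)``. The same matching step can be sketched in pure Python. This is an illustration, not the code in ``get_contributors.py``: ``extract_pr_numbers`` is a hypothetical name, and it assumes the commit subjects have already been read into a list of strings.

```python
import re

# Matches references like "#6506" in a commit subject line.
PR_NUMBER = re.compile(r"#(\d+)")


def extract_pr_numbers(commit_subjects):
    """Return the PR numbers mentioned in commit subjects, in order."""
    numbers = []
    for subject in commit_subjects:
        numbers.extend(int(number) for number in PR_NUMBER.findall(subject))
    return numbers


if __name__ == "__main__":
    subjects = ["Improve release docs and add results from 0.7.7 (#6506)"]
    print(extract_pr_numbers(subjects))
```

Doing the match in Python avoids the nested shell quoting that the ``bash -c`` approach needs, at the cost of reading the log output into memory first.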
14
doc/dev/release_logs/0.7.7/microbenchmark.txt
Normal file
@@ -0,0 +1,14 @@
single client get calls per second 28595.02 +- 580.33
single client put calls per second 6313.62 +- 66.88
single client put gigabytes per second 11.6 +- 6.86
multi client put calls per second 16800.89 +- 381.69
multi client put gigabytes per second 23.33 +- 0.96
single client tasks sync per second 1963.72 +- 48.48
single client tasks async per second 5181.29 +- 30.0
multi client tasks async per second 5566.7 +- 280.72
1:1 actor calls sync per second 1595.47 +- 38.32
1:1 actor calls async per second 2496.26 +- 37.62
1:1 direct actor calls async per second 7233.63 +- 205.75
n:n actor calls async per second 5357.63 +- 116.9
n:n direct actor calls async per second 90703.32 +- 805.56
n:n direct actor calls with arg async per second 13300.47 +- 532.66
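Since these logs are checked in so that later releases can be compared against them, a small parser helps with the comparison. The sketch below is an illustration only: the function names are hypothetical, and the rule used to flag a regression (the new mean falls below the old mean by more than the sum of both reported deviations) is an assumption made here, not a criterion defined by the release process.

```python
import re

# One result line looks like "<name> <mean> +- <deviation>".
RESULT_LINE = re.compile(
    r"^(?P<name>.+?)\s+(?P<mean>\d+(?:\.\d+)?) \+- (?P<std>\d+(?:\.\d+)?)$"
)


def parse_microbenchmark(text):
    """Map each benchmark name to a (mean, deviation) tuple."""
    results = {}
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            results[match["name"]] = (float(match["mean"]), float(match["std"]))
    return results


def possible_regressions(previous, current):
    """List benchmarks whose mean dropped by more than both deviations.

    The threshold is an illustrative assumption; benchmarks that exist in
    only one of the two result sets are skipped.
    """
    flagged = []
    for name, (old_mean, old_std) in previous.items():
        if name not in current:
            continue
        new_mean, new_std = current[name]
        if new_mean < old_mean - (old_std + new_std):
            flagged.append(name)
    return flagged
```

Typical use would be to call ``parse_microbenchmark`` on the text of two ``microbenchmark.txt`` files and pass both dictionaries to ``possible_regressions``.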
24
doc/dev/release_logs/0.7.7/rllib_regression.txt
Normal file
@@ -0,0 +1,24 @@
+----------------------------------------+------------+-------+--------+------------------+-------------+----------+
| Trial name | status | loc | iter | total time (s) | timesteps | reward |
|----------------------------------------+------------+-------+--------+------------------+-------------+----------|
| IMPALA_BreakoutNoFrameskip-v4_4445a400 | TERMINATED | | 294 | 3600.28 | 6183500 | 352.15 |
| IMPALA_BreakoutNoFrameskip-v4_44461b92 | TERMINATED | | 292 | 3610.1 | 6178500 | 147.14 |
| IMPALA_BreakoutNoFrameskip-v4_44467b28 | TERMINATED | | 293 | 3611.05 | 6200500 | 275.1 |
| IMPALA_BreakoutNoFrameskip-v4_4446e1bc | TERMINATED | | 294 | 3611.16 | 6161500 | 322.95 |
| PPO_BreakoutNoFrameskip-v4_44474a4e | TERMINATED | | 549 | 3603.09 | 2745000 | 36.41 |
| PPO_BreakoutNoFrameskip-v4_4447b740 | TERMINATED | | 545 | 3604.43 | 2725000 | 16.83 |
| PPO_BreakoutNoFrameskip-v4_44481ea6 | TERMINATED | | 549 | 3605.4 | 2745000 | 52.36 |
| PPO_BreakoutNoFrameskip-v4_444885a8 | TERMINATED | | 546 | 3605.27 | 2730000 | 34.94 |
| APEX_BreakoutNoFrameskip-v4_4448feac | TERMINATED | | 113 | 3629.93 | 3552960 | 13.47 |
| APEX_BreakoutNoFrameskip-v4_44497684 | TERMINATED | | 112 | 3615.5 | 3539360 | 28.65 |
| APEX_BreakoutNoFrameskip-v4_4449f01e | TERMINATED | | 113 | 3621.85 | 3524640 | 28.17 |
| APEX_BreakoutNoFrameskip-v4_444a5dd8 | TERMINATED | | 112 | 3616.41 | 3447520 | 20.56 |
| A2C_BreakoutNoFrameskip-v4_444acb7e | TERMINATED | | 349 | 3604.38 | 2981000 | 108.8 |
| A2C_BreakoutNoFrameskip-v4_444b3a50 | TERMINATED | | 349 | 3602.3 | 2967500 | 145.82 |
| A2C_BreakoutNoFrameskip-v4_444b9ebe | TERMINATED | | 349 | 3605.71 | 2990500 | 127.3 |
| A2C_BreakoutNoFrameskip-v4_444c0016 | TERMINATED | | 349 | 3602.58 | 2970000 | 112.92 |
| DQN_BreakoutNoFrameskip-v4_444c6830 | TERMINATED | | 30 | 3662.81 | 300000 | 18.71 |
| DQN_BreakoutNoFrameskip-v4_444cdeb4 | TERMINATED | | 30 | 3608.95 | 300000 | 17.45 |
| DQN_BreakoutNoFrameskip-v4_444d5038 | TERMINATED | | 31 | 3658.65 | 310000 | 17.43 |
| DQN_BreakoutNoFrameskip-v4_444dbfb4 | TERMINATED | | 31 | 3696.8 | 310000 | 18.63 |
+----------------------------------------+------------+-------+--------+------------------+-------------+----------+
@@ -0,0 +1,4 @@
Finished in: 361.19257402420044s
Average iteration time: 3.6119208979606627s
Max iteration time: 6.086421489715576s
Min iteration time: 0.23141980171203613s
13
doc/dev/release_logs/0.7.7/stress_tests/test_many_tasks.txt
Normal file
@@ -0,0 +1,13 @@
Stage 1 results:
Total time: 562.7756896018982
Average iteration time: 56.277525544166565
Max iteration time: 59.43245506286621
Min iteration time: 46.18883228302002
Stage 2 results:
Total time: 650.9956197738647
Average iteration time: 130.19871163368225
Max iteration time: 134.73312878608704
Min iteration time: 126.5669572353363
Stage 3 results:
Actor creation time: 0.12409853935241699
Total time: 3345.6912751197815
@@ -66,17 +66,20 @@ Ubuntu 18.04 and Python 3.6:

.. code-block:: text

single core get calls per second 13387.15 +- 9.53
single core put calls per second 4569.31 +- 53.59
single core put gigabytes per second 12.64 +- 6.07
multi core put calls per second 15667.53 +- 110.85
multi core put gigabytes per second 22.85 +- 1.15
single core tasks sync per second 1822.1 +- 51.61
single core tasks async per second 6603.71 +- 39.5
multi core tasks async per second 8161.46 +- 456.28
single core actor calls sync per second 1374.22 +- 81.32
single core actor calls async per second 1786.57 +- 138.77
multi core actor calls async per second 6418.93 +- 128.0
single client get calls per second 28595.02 +- 580.33
single client put calls per second 6313.62 +- 66.88
single client put gigabytes per second 11.6 +- 6.86
multi client put calls per second 16800.89 +- 381.69
multi client put gigabytes per second 23.33 +- 0.96
single client tasks sync per second 1963.72 +- 48.48
single client tasks async per second 5181.29 +- 30.0
multi client tasks async per second 5566.7 +- 280.72
1:1 actor calls sync per second 1595.47 +- 38.32
1:1 actor calls async per second 2496.26 +- 37.62
1:1 direct actor calls async per second 7233.63 +- 205.75
n:n actor calls async per second 5357.63 +- 116.9
n:n direct actor calls async per second 90703.32 +- 805.56
n:n direct actor calls with arg async per second 13300.47 +- 532.66

References
----------