ray/doc/source/preprocess_github_markdown.py
Antoni Baum 756d08cd31
[docs] Add support for external markdown (#23505)
This PR fixes the divergence between the Ray Docs and the READMEs of ecosystem libraries that live in separate repos (e.g. xgboost_ray). It adds a step before the docs build starts that downloads the READMEs of the specified ecosystem libraries from their GitHub repositories. The files are then preprocessed by a very simple parser to account for the differences between GitHub Markdown and the Markdown used in the Docs.

In summary, this makes the markdown files in ecosystem library repositories single sources of truth and removes the need to manually keep the doc pages up to date, all the while allowing for differences between what's rendered on GitHub and in the Docs.

See ray-project/xgboost_ray#204 & https://ray--23505.org.readthedocs.build/en/23505/ray-more-libs/xgboost-ray.html for an example.

Needs ray-project/xgboost_ray#204 and ray-project/lightgbm_ray#30 to be merged first.
2022-03-31 08:38:14 -07:00

40 lines
1.2 KiB
Python

import re
import argparse
import pathlib


def preprocess_github_markdown_file(path: str):
    """
    Preprocesses GitHub Markdown files by:
        - Uncommenting all ``<!-- -->`` comments in which the opening tag is
          immediately followed by ``$UNCOMMENT``
          (e.g. ``<!--$UNCOMMENTthis will be uncommented-->``)
        - Removing text between ``<!--$REMOVE-->`` and ``<!--$END_REMOVE-->``

    This is to enable translation between GitHub Markdown and MyST Markdown used
    in docs. For more details, see ``doc/README.md``.
    """
    with open(path, "r") as f:
        text = f.read()
    # $UNCOMMENT: keep the comment body, drop the surrounding comment tags.
    text = re.sub(r"<!--\s*\$UNCOMMENT(.*?)(-->)", r"\1", text, flags=re.DOTALL)
    # $REMOVE: drop both markers and everything between them.
    text = re.sub(
        r"(<!--\s*\$REMOVE\s*-->)(.*?)(<!--\s*\$END_REMOVE\s*-->)",
        r"",
        text,
        flags=re.DOTALL,
    )
    # The file is overwritten in place with the preprocessed text.
    with open(path, "w") as f:
        f.write(text)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Preprocess GitHub Markdown file to Ray Docs MyST Markdown"
    )
    parser.add_argument(
        "path", type=pathlib.Path, help="Path to GitHub Markdown file to preprocess"
    )
    args, _ = parser.parse_known_args()
    preprocess_github_markdown_file(args.path.expanduser())
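
To see what the two markers do without touching a file, here is a minimal sketch that applies the same two regular expressions to an in-memory string. The helper name `preprocess_text`, the label `(my-label)=`, and the sample README content are assumptions made for illustration; they are not part of the script.

```python
import re

# Same two patterns as the script, compiled once for reuse.
UNCOMMENT_RE = re.compile(r"<!--\s*\$UNCOMMENT(.*?)(-->)", flags=re.DOTALL)
REMOVE_RE = re.compile(
    r"(<!--\s*\$REMOVE\s*-->)(.*?)(<!--\s*\$END_REMOVE\s*-->)", flags=re.DOTALL
)


def preprocess_text(text: str) -> str:
    """Hypothetical in-memory variant; the script itself rewrites a file."""
    # Keep only the body of each $UNCOMMENT comment.
    text = UNCOMMENT_RE.sub(r"\1", text)
    # Drop each $REMOVE ... $END_REMOVE span, markers included.
    return REMOVE_RE.sub("", text)


# Made-up README fragment: a docs-only label hidden from GitHub,
# and a GitHub-only badge that should not appear in the docs.
sample = (
    "<!--$UNCOMMENT(my-label)=-->\n"
    "# Title\n"
    "<!--$REMOVE-->\n"
    "![badge](badge.svg)\n"
    "<!--$END_REMOVE-->\n"
    "Body text.\n"
)

result = preprocess_text(sample)
print(result)
```

Both patterns are non-greedy and use `re.DOTALL`, so a marker pair may span several lines and each opening marker pairs with the nearest closing one. Newlines outside the markers are left alone, which is why a blank line remains where the removed block used to be.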