docs: convert reStructuredText sources to MyST markdown#1579
Open
timsaucer wants to merge 12 commits into
Bump pydata-sphinx-theme 0.8.0 -> 0.16 to enable the modern navbar slot
API and dark/light theme switcher. Configure top navbar with logo,
nav links, GitHub icon, and theme switcher in conf.py. Drop the custom
docs-sidebar.html override and the layout.html block that silenced the
navbar — both predate the slot API and conflict with the new theme.
Strip CSS overrides that fought the old theme (--pst-header-height: 0,
navbar-brand sizing) and add a dark-mode variant for the inline code
color and table-stripe shading. Fix the stale github_repo
("arrow-datafusion-python" -> "datafusion-python") so future Edit-on-
GitHub links resolve. Bump copyright year and project name.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous structure dumped every top-level toctree entry from index.rst
into the navbar, producing eight items including external URLs ("Github
and Issue Tracker", "Rust's API Docs", ...) that wrapped to two lines
each. Introduce user-guide/index.rst and contributor-guide/index.rst as
section landing pages with nested toctrees, then point index.rst at just
those two plus autoapi/index. The navbar now reads "User Guide",
"Contributor Guide", "API Reference" — three single-line entries. Move
the external links into the index.rst body where they're discoverable
without crowding navigation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add Examples and Rust API as text links in the top navbar via the pydata-sphinx-theme external_links option. Nest the code-of-conduct link inside the Contributor Guide toctree so it appears alongside the other contributor pages. Drop the duplicate "Further reading" bullet list from the landing page now that every link has a permanent home. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the Rust API docs entry from external_links to icon_links and use the fa-brands fa-rust gear mark. Now sits next to the GitHub icon in navbar_end with matching visual weight instead of a wider text link. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
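The navbar changes in the commits above might look roughly like the following `conf.py` fragment. This is a sketch, not the actual file: the option names (`navbar_end`, `icon_links`, `html_context` keys) are standard pydata-sphinx-theme settings, but the URLs, `github_user`, `github_version`, and `doc_path` values are assumptions chosen for illustration.

```python
# Illustrative conf.py fragment; URLs and html_context values are assumptions.
html_theme = "pydata_sphinx_theme"

html_theme_options = {
    # Theme switcher and icon links sit at the right end of the top navbar.
    "navbar_end": ["theme-switcher", "navbar-icon-links"],
    "icon_links": [
        {
            "name": "GitHub",
            "url": "https://github.com/apache/datafusion-python",  # assumed URL
            "icon": "fa-brands fa-github",
            "type": "fontawesome",
        },
        {
            "name": "Rust API docs",
            "url": "https://docs.rs/datafusion/latest/datafusion/",  # assumed URL
            "icon": "fa-brands fa-rust",
            "type": "fontawesome",
        },
    ],
}

# Used by the theme to build Edit-on-GitHub links.
html_context = {
    "github_user": "apache",  # assumed
    "github_repo": "datafusion-python",
    "github_version": "main",  # assumed
    "doc_path": "docs/source",
}
```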
The default pydata-sphinx-theme sidebar-nav-bs starts at the current top-level section, so the root index, which has no parent section, ends up with an empty sidebar. The theme's layout also explicitly filters sidebar-nav-bs out of the sidebar list when suppress_sidebar_toctree() returns true (which it does for root pages), so simply overriding sidebar-nav-bs.html in templates doesn't help. Add a sidebar-globaltoc.html template that calls Sphinx's toctree() global directly to render the full document tree, and wire it through html_sidebars under a name the theme's suppress filter doesn't strip. Landing page now shows User Guide / Contributor Guide / API Reference in the sidebar with the current section expanded on inner pages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch the sidebar toctree call from toctree() to generate_toctree_html with collapse=False, so nested <ul>s render into the DOM for every branch. The pydata-sphinx-theme JS then wraps them in <details> with fa-chevron-down toggles, matching the datafusion-comet sidebar where each section with children can be expanded inline. show_nav_level=1 keeps deeper levels collapsed on first load. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bump show_nav_level 1 -> 2 so the landing-page sidebar opens with User Guide / Contributor Guide / API Reference already expanded to their immediate children. Deeper levels remain collapsed behind chevrons so the sidebar stays scannable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Restore the "Links" sidebar heading that the previous site had — GitHub and Issue Tracker, Rust API Docs, Code of Conduct, Examples. Implemented as a second hidden toctree with :caption: Links so the pydata-sphinx-theme sidebar renders the heading above the four external URLs. Drop Code of Conduct from the Contributor Guide toctree since it now lives under Links instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the second hidden toctree (which expanded each external URL into its own navbar entry) with a dedicated links.rst landing page, and add a single "links" entry to the main toctree. Top navbar now shows User Guide / Contributor Guide / API Reference / Links — four items, no wrapping. Clicking Links opens the page that lists GitHub, Rust API Docs, Code of Conduct, and Examples. Drop the external_links Examples entry from conf.py since the same URL now lives on the Links page. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop in the same favicon.svg the main datafusion.apache.org site uses (just the Apache DataFusion mark, no wordmark) and wire it through html_favicon. Browsers and bookmarks now show the project icon instead of the generic Sphinx page glyph. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small follow-ups from the Copilot reviewer on apache#1578:
- Append .html to the html_sidebars entry. Sphinx's Jinja loader
  resolves both "sidebar-globaltoc" and "sidebar-globaltoc.html" to the
  same template, but the explicit form is closer to the spelling in the
  Sphinx docs and is harder to misread.
- Update the inline comment in sidebar-globaltoc.html that still claimed
  show_nav_level=1 after we bumped it to 2 in conf.py. Now describes the
  variable wiring instead of hard-coding a number that has to be kept in
  sync with conf.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
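For orientation, the sidebar and favicon wiring described in the commits above could be expressed in `conf.py` along these lines. This is a minimal sketch: the `"**"` page pattern, the `_templates` directory, and the favicon path are assumptions, and the real file may differ.

```python
# Illustrative conf.py fragment; the "**" pattern and file paths are assumptions.
templates_path = ["_templates"]  # where sidebar-globaltoc.html would live

html_theme_options = {
    # Levels of the sidebar tree that start expanded on first load.
    "show_nav_level": 2,
}

# Render the custom global-toctree template on every page, using the
# explicit ".html" spelling.
html_sidebars = {
    "**": ["sidebar-globaltoc.html"],
}

html_favicon = "_static/favicon.svg"  # assumed location
```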
Phase 2 of the documentation-site refresh. Run `rst2myst convert` over
every human-authored .rst file under docs/source/ and remove the
originals. The result:
- 33 .rst files become 33 .md files (user guide, contributor guide,
index, links).
- Headings, paragraphs, hyperlinks, code blocks, admonitions, and
toctree directives all map cleanly to MyST syntax.
- Cross-reference anchors round-trip through MyST as `(label)=`
blocks. The converter kebab-cased the labels (e.g. `(io-csv)=`),
but every `{ref}` target in the corpus still uses the underscore
form from the original RST (`{ref}\`CSV <io_csv>\``) and so do the
Python docstrings that AutoAPI pulls in. Rewrite the anchors back
to the underscore form so the existing references resolve.
- 86 `{eval-rst}` blocks remain — they all wrap `.. ipython::`
directives, which have no first-class MyST equivalent. They render
identically and don't block the build.
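The anchor rewrite described above could be scripted roughly as follows. This is a sketch under one assumption: every label in the original RST used the underscore form, so converting all hyphens inside `(label)=` lines back to underscores is safe. The function name is hypothetical and not part of the PR.

```python
import re

# Matches a MyST target such as "(io-csv)=" that sits on a line of its own.
ANCHOR_RE = re.compile(r"^\(([A-Za-z0-9_-]+)\)=[ \t]*$", re.MULTILINE)


def restore_underscore_anchors(markdown: str) -> str:
    """Rewrite kebab-case MyST anchors, e.g. (io-csv)=, to (io_csv)=."""
    return ANCHOR_RE.sub(
        lambda m: "(" + m.group(1).replace("-", "_") + ")=",
        markdown,
    )


if __name__ == "__main__":
    sample = "(io-csv)=\n\n# CSV\n\nSee {ref}`CSV <io_csv>`.\n"
    print(restore_underscore_anchors(sample))
```

Only whole-line anchors are touched; hyphens in headings, prose, and file names are left alone.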
conf.py changes:
- Enable `colon_fence` and `deflist` MyST extensions (rst-to-myst
emits these on a few files, particularly execution-metrics.md).
- Keep `.rst` in `source_suffix` even though no human-authored RST
remains: sphinx-autoapi generates RST under autoapi/ at build time
and Sphinx needs the suffix registered to parse it.
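A minimal sketch of what these `conf.py` settings might look like. The `extensions` list is abbreviated and assumed; the setting names `myst_enable_extensions` and `source_suffix` are standard Sphinx and myst-parser options.

```python
# Illustrative conf.py fragment; the extensions list is abbreviated and assumed.
extensions = [
    "myst_parser",        # parses the .md sources
    "autoapi.extension",  # sphinx-autoapi; writes .rst under autoapi/ at build time
]

# Syntax extensions the converter output relies on.
myst_enable_extensions = [
    "colon_fence",
    "deflist",
]

# Keep .rst registered: no human-authored RST remains, but the files that
# sphinx-autoapi generates at build time are still reStructuredText.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```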
AGENTS.md: update the two .rst paths called out under "Aggregate and
Window Function Documentation" to point at the .md equivalents.
Verified by building locally — `build succeeded`, no warnings, all
internal cross-references resolve, the ipython examples on the
landing page and basics page still execute.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Which issue does this PR close?
Closes #.
Rationale for this change
Phase 2 of the documentation-site refresh started in #1578. With the
modern pydata-sphinx-theme + navigation in place, this PR moves the
content format off `.rst` and onto MyST `.md`. The motivation:
- [...] on GitHub and modern docs parse Markdown reliably; reStructuredText
  is a minority dialect that frequently confuses both humans editing
  via PR review and agents reading the source. The Apache
  `datafusion-comet` sibling project completed the same migration
  recently and reported smoother contributor onboarding.
- [...] Sphinx features we actually use (toctrees, cross-references,
  code-blocks, admonitions, eval-rst escape hatch).
- The `myst-parser` extension is already in the docs dependency
  group and was loaded by `conf.py` even before this PR, so switching
  the on-disk format is a low-risk, mechanical change.
This PR stacks on #1578 (theme + navbar refresh). It should land
after #1578.
What changes are included in this PR?
Format conversion (mechanical, via `rst-to-myst`):
- 33 `.rst` files under `docs/source/` become 33 `.md` files: the
  user guide, contributor guide, IO subsection, common-operations
  subsection, dataframe subsection, top-level `index`, and `links`.
- [...] and license headers all round-trip cleanly.
Manual fixes layered on top of the converter output:
- The converter kebab-cased each `(label)=` anchor (e.g. `(io-csv)=`),
  but every `{ref}` in the corpus, including the Python docstrings that
  `sphinx-autoapi` pulls into the API reference, still uses the
  underscore form (`` {ref}`CSV <io_csv>` ``). Rewrite the anchors back
  to underscore form (`(io_csv)=`, `(window_functions)=`,
  `(user_guide_concepts)=`, `(execution_metrics)=`, etc.) so existing
  references resolve without churning every callsite.
- Enable `colon_fence` and `deflist` in `myst_enable_extensions` (the
  converter emits these on a few files, notably
  `dataframe/execution-metrics.md`).
- `source_suffix`. Keep `.rst` registered even though no
  human-authored RST remains: `sphinx-autoapi` generates `.rst` under
  `autoapi/` at build time and Sphinx needs the suffix to parse it. The
  comment in `conf.py` flags this so a future cleanup pass doesn't
  strip it again.
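A quick pre-build sanity check for the anchor fix could look like the sketch below: it collects every `(label)=` anchor and every `{ref}` target from a set of Markdown sources and reports targets with no matching anchor. The function name is hypothetical, and the check is deliberately narrow. It ignores implicit heading labels and any labels defined in the generated `autoapi/` RST, so the Sphinx build itself remains the authoritative test.

```python
import re

# "(label)=" on a line of its own.
ANCHOR_RE = re.compile(r"^\(([^)\s]+)\)=[ \t]*$", re.MULTILINE)
# {ref}`target` or {ref}`Title <target>`; group 1 is the target.
REF_RE = re.compile(r"\{ref\}`(?:[^`<]*<)?([^`<>]+)>?`")


def unresolved_refs(sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each file name to its {ref} targets that match no anchor in any file."""
    anchors = {a for text in sources.values() for a in ANCHOR_RE.findall(text)}
    missing: dict[str, set[str]] = {}
    for name, text in sources.items():
        dangling = {t.strip() for t in REF_RE.findall(text)} - anchors
        if dangling:
            missing[name] = dangling
    return missing


if __name__ == "__main__":
    # In-memory stand-ins for two converted files.
    demo = {
        "io/csv.md": "(io-csv)=\n# CSV\n",
        "index.md": "See {ref}`CSV <io_csv>`.\n",
    }
    print(unresolved_refs(demo))
```

With the kebab-cased anchor in the demo input, `io_csv` is reported as unresolved for `index.md`; once the anchor is rewritten to `(io_csv)=` the result is empty.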
86 `{eval-rst}` blocks remain in the converted output. Every one of
them wraps a `.. ipython::` directive, which has no first-class MyST
equivalent in our extensions setup. The blocks render identically
and don't block the build. Migrating these to a native MyST exec
syntax is a follow-up that requires either `myst-nb` or a custom
parser registration, which is out of scope here.
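The claim that every remaining `{eval-rst}` block wraps an `.. ipython::` directive could be spot-checked with a small script such as this one. It is a sketch that assumes the converter emits backtick-fenced blocks (colon fences are not handled); the function name is hypothetical.

```python
import re

# A backtick-fenced {eval-rst} block: opening fence, body, and a closing
# fence made of the same run of backticks.
EVAL_RST_RE = re.compile(
    r"^(`{3,})\{eval-rst\}[ \t]*\n(.*?)^\1[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def audit_eval_rst(markdown: str) -> tuple[int, int]:
    """Return (total {eval-rst} blocks, blocks that start with '.. ipython::')."""
    total = 0
    ipython = 0
    for match in EVAL_RST_RE.finditer(markdown):
        total += 1
        if match.group(2).lstrip().startswith(".. ipython::"):
            ipython += 1
    return total, ipython


if __name__ == "__main__":
    fence = "`" * 3  # built at runtime to keep literal fences out of this listing
    sample = (
        fence + "{eval-rst}\n.. ipython:: python\n\n    df.show()\n" + fence + "\n"
    )
    print(audit_eval_rst(sample))
```

Summing the two counts over all files under `docs/source/` and checking that they are equal would confirm that no other directive is hiding behind the escape hatch.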
`AGENTS.md` is updated so the two `.rst` paths called out under
"Aggregate and Window Function Documentation" point at the new
`.md` equivalents.
Are there any user-facing changes?
No behavioral change to the `datafusion` package, only the source
format of the published documentation. Readers of the rendered site
will not notice the migration; the HTML output is unchanged. Internal
cross-references resolve, the `pokemon.csv` ipython example on the
landing page and the `yellow_tripdata_2021-01.parquet` example on
the basics page both still execute.
No `api change` label: public APIs untouched.
Follow-ups (out of scope for this PR)
- Migrate the `{eval-rst}` `.. ipython::` blocks to a MyST-native
  exec syntax. Requires either pulling in `myst-nb` or configuring a
  per-language parser.
- [...] `asf-site` publishing workflow.