[WIP] Adds Qiskit addon prose content pipeline #5046

Draft
eharvey328 wants to merge 109 commits into main from emh/addons-content

eharvey328 (Contributor) commented Apr 29, 2026

Extracts prose HTML and Jupyter notebooks for Qiskit addons and converts them to MDX to be consumed by our docs app.

The goal is to extend the existing API docs pipeline with minimal changes.

Adds a new addon command, npm run gen-addon -- -p {package}, that works like the existing npm run gen-api -- -p {package} -v {version}.

Why a separate pipeline instead of reusing the API pipeline

  • The API pipeline (runApiDocsPipeline) targets apidocs/** stubs and generates docs/api/{pkg}/.
  • Addons ship guides (HTML) + notebooks (.ipynb) under the Sphinx root — no stubs, no
    historical versioning, no release notes per-page.
  • The addon pipeline reuses the same shared stages
    (pipelineStages.ts, notebookStages.ts) but diverges in:

| Concern | API pipeline | Addon pipeline |
| --- | --- | --- |
| Input glob | apidocs/**, stubs/** | Everything except apidocs/tutorials/release-notes |
| Output path | docs/api/{pkg}/ | docs/addons/{pkg}/ |
| Images | public/docs/images/api/ | public/docs/images/addons/ |
| TOC generator | generateToc.ts | generateAddonToc.ts |
| Frontmatter | auto-generated from URL | extracted from Sphinx <h1> |
| Notebooks | not processed | processed (link rewriting, frontmatter cell injected) |
| Release notes | yes (own pipeline stage) | excluded from glob |
| Versioning | latest/historical/dev | latest only |
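
To illustrate the Frontmatter row, here is a minimal sketch of deriving a page title from the Sphinx <h1> instead of the URL. The function names (titleFromSphinxHtml, frontmatterFor) and the regex-based approach are assumptions made for illustration; the PR's actual stage may well use an HTML parser and a different frontmatter shape.

```typescript
// Hypothetical helpers, not taken from the PR.
function titleFromSphinxHtml(html: string): string | undefined {
  // Take the first <h1> in the page.
  const match = /<h1[^>]*>([\s\S]*?)<\/h1>/i.exec(html);
  if (!match) return undefined;
  const text = match[1]
    // Drop the permalink anchor (class "headerlink") that Sphinx appends to headings.
    .replace(/<a[^>]*class="headerlink"[^>]*>[\s\S]*?<\/a>/gi, "")
    // Strip any remaining inline tags such as <code> or <span>.
    .replace(/<[^>]+>/g, "")
    // Collapse runs of whitespace left behind by the removals.
    .replace(/\s+/g, " ")
    .trim();
  return text.length > 0 ? text : undefined;
}

// Builds a frontmatter block, falling back to a caller-supplied title
// (for example, one derived from the URL) when the page has no usable <h1>.
function frontmatterFor(html: string, fallbackTitle: string): string {
  const title = titleFromSphinxHtml(html) ?? fallbackTitle;
  // A JSON string literal is also a valid double-quoted YAML scalar.
  return `---\ntitle: ${JSON.stringify(title)}\n---\n`;
}
```

The sketch does not decode HTML entities, so a real implementation would need to handle headings that contain characters such as &amp;.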

New files

| File | Purpose |
| --- | --- |
| scripts/js/lib/api/addonDocsPipeline.ts | Orchestrator: HTML→MD, notebooks, images, TOC, tutorials manifest |
| scripts/js/lib/api/pipelineStages.ts | Shared HTML stages extracted from the old conversionPipeline.ts; used by both pipelines |
| scripts/js/lib/api/notebookStages.ts | Notebook-specific stages (read, process, write, collect images) |
| scripts/js/lib/api/generateAddonToc.ts | Builds _toc.json with back-link, merged Guide sections, Tutorials, API reference |
| scripts/js/commands/api/updateAddonDocs.ts | CLI entry point: --package, --version |
| scripts/js/commands/api/updateDocsShared.ts | Shared CLI helpers (yargs options, artifact download, output wipe) extracted from updateApiDocs.ts |
| scripts/js/lib/api/apiDocsPipeline.ts | Renamed/extracted from conversionPipeline.ts; API pipeline now lives here |
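
As a rough picture of what generateAddonToc.ts is described as producing, the sketch below assembles a TOC with a parentLabel, a merged "Guides" section, Tutorials, and an API reference link. Only DIR_LABELS and parentLabel are names from the description; the types, the buildAddonTocSketch function, the "Qiskit addons" label value, and the /docs/api/{pkg} URL are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real _toc.json schema in the PR may differ.
interface TocEntry {
  title: string;
  url?: string;
  children?: TocEntry[];
}

interface AddonToc {
  title: string;
  parentLabel: string; // text shown next to the parent back link
  children: TocEntry[];
}

interface GuidePage {
  dir: string; // source directory of the page, e.g. "how-tos"
  title: string;
  url: string;
}

// Both directories share one label, so their pages land in the same section.
const DIR_LABELS: Record<string, string> = {
  "how-tos": "Guides",
  explanations: "Guides",
};

function buildAddonTocSketch(
  pkg: string,
  title: string,
  pages: GuidePage[],
  tutorials: TocEntry[],
): AddonToc {
  // Group pages by label; a Map keeps sections in first-seen order.
  const sections = new Map<string, TocEntry[]>();
  for (const page of pages) {
    const label: string | undefined = DIR_LABELS[page.dir];
    if (!label) continue; // directories without a label are left out
    const entries = sections.get(label) ?? [];
    entries.push({ title: page.title, url: page.url });
    sections.set(label, entries);
  }

  const children: TocEntry[] = [...sections].map(([label, entries]) => ({
    title: label,
    children: entries,
  }));
  if (tutorials.length > 0) {
    children.push({ title: "Tutorials", children: tutorials });
  }
  // Assumed URL layout for the package's API reference.
  children.push({ title: "API reference", url: `/docs/api/${pkg}` });

  // "Qiskit addons" is a placeholder value for the back-link text.
  return { title, parentLabel: "Qiskit addons", children };
}
```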

Design decisions

  1. extractfrontMatter=true in addon pipeline: Addons don't follow the same URL-slug naming convention as API stubs, so URL-derived titles were wrong. Extracting from <h1> is more reliable.

  2. escapeMdxSpecialChars in notebookStages: Opt-mapper notebooks contain math like 0 <= x < 1 in prose, which MDX parses as a JSX tag open. The function surgically escapes only bare < outside code/math regions rather than HTML-escaping the entire source.

  3. Merged "Guides" section in generateAddonToc: how-tos/ and explanations/ both map to "Guides" via DIR_LABELS. This matches the Qiskit docs content model (guides = both types).

  4. parentLabel added to _toc: it provides the text for the parent back-arrow link in the TOC. It is a new, separate property so that existing learning content stays backwards compatible.

  5. Loading all objects.inv files for cross-package link resolution: Addon notebooks link to symbols in other packages (qiskit.github.io/{pkg}/stubs/...). The old pipeline only resolved links against the current package's inventory, which isn't enough for addons.

    • New pattern: Both pipelines now call ObjectsInv.loadPublishedApis(publicBaseFolder) first, which scans public/docs/api/ on disk and returns a Map<pkgName, ObjectsInv> for every package that has a published inventory. That map flows down through postProcess → updateLinks and processNotebooks → rewriteNotebookLinks, where resolveStubUrl(url, allInvs) uses it to look up the right package's inventory by name.
    • resolveStubUrl handles two URL forms:
      • stubs/{Symbol} — entry lookup by name; the entry's uri already carries the correct anchor.
      • apidocs/{page} — page-level lookup; the source #anchor is preserved and appended.
    • std: domain guard in shouldIncludeEntry — excludes non-API entries (guide pages, index, install) from published inventories so they can't produce false cross-package matches.
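
Design decision 2 can be sketched as a single regex pass that skips protected regions and escapes only what remains. This is not the PR's escapeMdxSpecialChars: the function name escapeBareLessThan, the choice of protected regions (fenced code, inline code, $$...$$ and $...$ math), the definition of "bare" as a < not followed by a letter, / or !, and the \< escape form are all assumptions.

```typescript
// Hypothetical stand-in for the escaping idea; not the PR's implementation.
// Alternative 1 (captured): regions to leave untouched.
// Alternative 2: a "<" that cannot start a tag, closing tag, or comment.
const PROTECTED_OR_BARE_LT =
  /(```[\s\S]*?```|`[^`\n]*`|\$\$[\s\S]*?\$\$|\$[^$\n]+\$)|<(?![A-Za-z\/!])/g;

function escapeBareLessThan(markdown: string): string {
  return markdown.replace(
    PROTECTED_OR_BARE_LT,
    (match: string, protectedRegion: string | undefined) =>
      // Code and math are returned unchanged; a bare "<" gets a backslash.
      protectedRegion !== undefined ? protectedRegion : "\\<",
  );
}

// Prose comparison operators are escaped, code and math are not.
console.log(escapeBareLessThan("If 0 <= x < 1, use `a < b` or $a < b$."));
```

Because protected regions are consumed by the same regex that finds the bare <, no separate tokenizer is needed. The trade-off is that a stray $ in prose (a price, for example) can be mistaken for the start of inline math, so a production version would need a stricter math rule.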
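
The two URL forms in design decision 5 can be illustrated with a small lookup over a map of inventories. Everything here is a sketch under assumptions: the Inventory and InvEntry shapes stand in for ObjectsInv, the qiskit.github.io URL pattern and the /docs/api/{pkg}/ output prefix are guesses based on the description, and matching an apidocs page by the path part of an entry's uri is a simplification.

```typescript
// Hypothetical shapes standing in for the PR's ObjectsInv class.
interface InvEntry {
  name: string; // fully qualified symbol name
  uri: string; // published location, possibly with a #anchor
}
interface Inventory {
  entries: InvEntry[];
}

// Assumed source URL pattern: https://qiskit.github.io/{pkg}/(stubs|apidocs)/{target}[.html][#anchor]
const ADDON_URL =
  /^https:\/\/qiskit\.github\.io\/([^/]+)\/(stubs|apidocs)\/([^#]+?)(?:\.html)?(#.*)?$/;

function resolveStubUrlSketch(
  url: string,
  allInvs: Map<string, Inventory>,
): string | undefined {
  const m = ADDON_URL.exec(url);
  if (!m) return undefined;
  const [, pkg, kind, target, anchor = ""] = m;

  // Pick the inventory of the package named in the URL, not the current one.
  const inv = allInvs.get(pkg);
  if (!inv) return undefined;

  if (kind === "stubs") {
    // Entry lookup by symbol name; the entry's uri already carries the anchor.
    const entry = inv.entries.find((e) => e.name === target);
    return entry ? `/docs/api/${pkg}/${entry.uri}` : undefined;
  }

  // apidocs: page-level lookup; keep the anchor from the source link.
  const pageEntry = inv.entries.find((e) => e.uri.split("#")[0] === target);
  if (!pageEntry) return undefined;
  return `/docs/api/${pkg}/${pageEntry.uri.split("#")[0]}${anchor}`;
}
```

Returning undefined for unknown packages or symbols lets the caller leave the original link untouched rather than rewrite it to a page that does not exist.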

eharvey328 changed the title from "[WIP] compile docs/addons content" to "[WIP] Adds Qiskit addon prose content pipeline" Apr 29, 2026
Comment thread scripts/js/lib/links/ignores.ts Outdated
@@ -587,8 +587,57 @@ function _qiskitCRegexes(): FilesToIgnores {
};
}

function _addonStaleTutorialLinks(): FilesToIgnores {
Collaborator

When this is merged, we should open an issue so we don't forget about this.

Comment thread scripts/js/lib/api/pipelineStages.ts Outdated
Comment thread scripts/js/lib/api/htmlToMd.ts Outdated

// Generates the _toc.json for an addon's content pages.
//
// Addon TOCs have a fixed three-part shape:
Collaborator

How hard would it be to process the HTML to extract the TOC, preserving its order, and ignore the entries that we are not including?

"leftarrow",
"forall",
"arxiv",
"subpace", // fix in addon-sqd
Collaborator

Can we add an issue for this?
