A Hono-based Cloudflare Worker that converts web pages to clean Markdown or styled, print-friendly HTML.
- /text: Converts HTML pages to Markdown via Cloudflare's Workers AI toMarkdown API
- /print: Cleans original HTML with HTMLRewriter, injects a print stylesheet, preserves semantic structure
- Content filtering: scope Markdown conversion to a CSS selector and strip unwanted elements by ID or class
- Domain whitelisting to restrict which sites can be proxied
- CORS-enabled responses
- Zero external runtime dependencies — Hono and all libraries are bundled at build time
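The domain whitelisting feature can be pictured as a plain hostname comparison. The following is a minimal sketch, not the project's actual middleware: the function name isAllowedPage is hypothetical, and it assumes an exact hostname match with only http and https URLs accepted.

```javascript
// Hypothetical helper (not taken from the repository): returns true when
// pageUrl is an http(s) URL whose hostname exactly matches the whitelist.
function isAllowedPage(pageUrl, whitelistedDomain) {
  let url;
  try {
    url = new URL(pageUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;
  // URL.hostname is already lowercased for http(s) URLs.
  return url.hostname === whitelistedDomain.toLowerCase();
}
```

Comparing the parsed hostname, rather than searching the raw string, means a look-alike host such as news.ucsc.edu.evil.com is rejected.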
GET /text?page=https://news.ucsc.edu/2026/03/some-article/
Returns the page content as Markdown (text/markdown).
GET /print?page=https://news.ucsc.edu/2026/03/some-article/
Returns cleaned HTML (text/html) with screen and print stylesheets. Original HTML structure is preserved — tables, figures, semantic elements all survive intact. Suitable for reading in a browser or printing to PDF.
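Because the target address travels in the page query parameter, it is safest to percent-encode it when building requests programmatically. A small sketch follows; the helper name buildProxyUrl and the origin https://worker.example.com are placeholders, not part of the project.

```javascript
// Hypothetical helper: builds a request URL for the /text or /print endpoint.
// workerOrigin is an assumption; substitute your own deployment's origin.
function buildProxyUrl(workerOrigin, endpoint, pageUrl) {
  const url = new URL(endpoint, workerOrigin);
  url.searchParams.set("page", pageUrl); // percent-encodes the target URL
  return url.toString();
}

const textUrl = buildProxyUrl(
  "https://worker.example.com",
  "/text",
  "https://news.ucsc.edu/2026/03/some-article/"
);
// Usage, in any runtime that provides fetch:
//   const markdown = await (await fetch(textUrl)).text();
```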
Environment variables are set in wrangler.toml under [vars]:
| Variable | Description | Default |
|---|---|---|
| WHITELISTED_DOMAIN | Only this hostname can be proxied | news.ucsc.edu |
| CSS_SELECTOR | Scope AI conversion to a content area, /text only (e.g. .entry-content) | .entry-content |
| REMOVE_IDS | Comma-separated element IDs to strip | "" |
| REMOVE_CLASSES | Comma-separated class names to strip | (see wrangler.toml) |
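
REMOVE_IDS and REMOVE_CLASSES are comma-separated strings, so they need to be split into lists before use. The repository's parseList() in utils.js is not shown here; the sketch below is only a guess at what such a helper might look like, assuming whitespace is trimmed and empty entries are dropped.

```javascript
// Illustrative sketch, not the project's actual parseList():
// splits a comma-separated string into trimmed, non-empty entries.
function parseList(value) {
  if (!value) return []; // handles undefined, null, and ""
  return value
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}
```

Under these assumptions, an empty REMOVE_IDS value of "" yields an empty list, so nothing is stripped by ID.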
The AI binding is configured under [ai] in wrangler.toml.
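
To illustrate how the binding might be used, here is a hedged sketch of an HTML-to-Markdown call. It assumes the binding is exposed as env.AI and passed in as ai, that toMarkdown accepts a list of { name, blob } documents, and that each result carries format and data fields; the function name htmlToMarkdown and the document name page.html are placeholders, and the project's real route may differ.

```javascript
// Sketch only: `ai` stands for the Workers AI binding (for example env.AI),
// and "page.html" is an arbitrary placeholder document name.
async function htmlToMarkdown(ai, html) {
  const results = await ai.toMarkdown([
    { name: "page.html", blob: new Blob([html], { type: "text/html" }) },
  ]);
  const first = results[0];
  if (!first || first.format === "error") {
    throw new Error("Markdown conversion failed");
  }
  return first.data; // the converted Markdown string
}
```

Passing the binding in as an argument keeps the function easy to exercise with a stub in tests.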
npm install # Install dependencies
npm run dev # Start local dev server
npm run test # Run tests (46 tests across unit and integration suites)
npm run deploy # Deploy to Cloudflare
src/
  index.js                 # Hono app - mounts routes, CORS
  routes/
    home.js                # GET / - front page
    print.js               # GET /print - cleaned HTML with print stylesheet
    text.js                # GET /text - Markdown via AI.toMarkdown
  middleware/
    validate-page.js       # Validates ?page param, fetches upstream HTML
  lib/
    tidy-html.js           # HTMLRewriter-based HTML cleaner
    preprocess-html.js     # HTML prep for Markdown conversion
    utils.js               # parseList(), stripFrontMatter()
    pretty.css             # Screen and print stylesheet
tests/
  unit/                    # Pure function tests
  integration/             # Route tests (workerd runtime)
vitest.workspace.js        # Vitest workspace config
wrangler.toml              # Worker config, env vars, AI binding
MIT