Draft PR to store changes to my blog post #17877
My version on top of #17681: it keeps the original intro and voice, adopts the section headings, links, and restructured lists, adds new content (logical-vs-physical-layout rationale, ShrinkRay, coverage-guided generation), and includes a few copy-edit passes.
Vale prose linter → found 0 errors, 19 warnings, 0 suggestions in your markdown. Full report → Copy the linter results into an LLM to batch-fix issues. Linter being weird? Update the rules!
| Line | Severity | Message | Rule |
|---|---|---|---|
| 16:51 | warning | 'clacky' is a possible misspelling. | PostHogBase.Spelling |
| 22:73 | warning | 'autoresearch' is a possible misspelling. | PostHogBase.Spelling |
| 28:70 | warning | 'transpile' is a possible misspelling. | PostHogBase.Spelling |
| 33:37 | warning | Capitalize 'Product Analytics' for PostHog's product. Use 'product analytics' for the general industry concept. | PostHogBase.ProductNames |
| 33:56 | warning | Capitalize 'Session Replay' for PostHog's product. Use 'session replay' for the general industry concept. | PostHogBase.ProductNames |
| 33:72 | warning | Capitalize 'Error Tracking' for PostHog's product. Use 'error tracking' for the general industry concept. | PostHogBase.ProductNames |
| 33:151 | warning | 'transpilation' is a possible misspelling. | PostHogBase.Spelling |
| 33:285 | warning | 'untrusted' is a possible misspelling. | PostHogBase.Spelling |
| 35:27 | warning | 'transpilation' is a possible misspelling. | PostHogBase.Spelling |
| 35:132 | warning | 'transpiled' is a possible misspelling. | PostHogBase.Spelling |
| 37:4 | warning | 'Generating our parser with ANTLR' heading should be in sentence case, and product names should be capitalized. | PostHogBase.SentenceCase |
| 41:134 | warning | 'declaratively' is a possible misspelling. | PostHogBase.Spelling |
| 45:48 | warning | 'lookahead' is a possible misspelling. | PostHogBase.Spelling |
| 53:162 | warning | 'lookahead' is a possible misspelling. | PostHogBase.Spelling |
| 67:40 | warning | 'transpiler' is a possible misspelling. | PostHogBase.Spelling |
| 71:93 | warning | 'codegen' is a possible misspelling. | PostHogBase.Spelling |
| 75:182 | warning | 'lookahead' is a possible misspelling. | PostHogBase.Spelling |
| 75:237 | warning | 'lookahead' is a possible misspelling. | PostHogBase.Spelling |
| 87:13 | warning | 'anonymized' is a possible misspelling. | PostHogBase.Spelling |
| After the success of using agents to [improve query performance through autoresearch](/blog/karpathy-autoresearch-query-engine-bug), I wanted to try something more ambitious. | ||
| I ran multiple long-running Claude Code sessions in parallel, and the result was 16K lines of "hand"-rolled parser code, 5K lines of tooling, and a few more K of tests. |
I kept this instead of the stat about the number of queries. My rationale is that this isn't really a story about doing something at scale. It's more about using AI coding to do something hard that involves some interesting computer science.
| Hypothesis will "reduce" test cases for you, turning them into a minimal reproduction, but I couldn’t use that with SQL from other sources. For those I used [ShrinkRay](https://github.com/DRMacIver/shrinkray) instead. | ||
| Later on, I added code-coverage-guided test case generation, which gives a better distribution of generated SQL. With coverage feedback, the generator can tell which constructs it hasn't exercised yet and bias towards those. This wasn't necessary to hit 100% accuracy on a production corpus, but it did help me find some very subtle test cases. |
This is new. I did this after I wrote the original.
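For readers of this thread, here is a rough sketch of what "bias towards constructs it hasn't exercised yet" could look like. It is a toy, not the generator from the post: the construct names and query templates are made up, and a plain hit counter stands in for real code-coverage feedback from the parser.

```python
import random
from collections import Counter

# Hypothetical grammar constructs; a real generator would cover far more.
CONSTRUCTS = {
    "select": lambda: "SELECT a FROM t",
    "where": lambda: "SELECT a FROM t WHERE a > 1",
    "group_by": lambda: "SELECT a, count() FROM t GROUP BY a",
    "order_by": lambda: "SELECT a FROM t ORDER BY a DESC",
    "limit": lambda: "SELECT a FROM t LIMIT 10",
}


def generate(n, rng):
    """Generate n queries, favouring constructs that have been hit least."""
    hits = Counter()  # assumption: stands in for coverage data from the parser
    queries = []
    names = list(CONSTRUCTS)
    for _ in range(n):
        # Less-exercised constructs get a larger sampling weight.
        weights = [1.0 / (1 + hits[name]) for name in names]
        choice = rng.choices(names, weights=weights, k=1)[0]
        hits[choice] += 1
        queries.append(CONSTRUCTS[choice]())
    return queries, hits


queries, hits = generate(200, random.Random(0))
print(hits)
```

The inverse-count weighting is just one possible choice. The point is only that feedback about what has already been exercised changes the sampling distribution.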
| The two parallel parser approaches shared their regression suites, so any failing test case found in one session was shared with the other. | ||
| Hypothesis will "reduce" test cases for you, turning them into a minimal reproduction, but I couldn’t use that with SQL from other sources. For those I used [ShrinkRay](https://github.com/DRMacIver/shrinkray) instead. |
This is new. I did this after I wrote the original.
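To illustrate what "reducing" a failing query to a minimal reproduction means, here is a small sketch of greedy chunk deletion over tokens. It is not how Hypothesis or ShrinkRay work internally (both are far more sophisticated), and `hypothetical_parser_bug` is an invented stand-in for "the two parsers disagree on this input".

```python
from typing import Callable, List


def reduce_tokens(tokens: List[str], still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily delete chunks of tokens while the failure still reproduces."""
    assert still_fails(tokens), "the starting input must reproduce the failure"
    chunk = max(len(tokens) // 2, 1)
    while chunk >= 1:
        i = 0
        while i < len(tokens):
            candidate = tokens[:i] + tokens[i + chunk:]
            if candidate and still_fails(candidate):
                tokens = candidate  # keep the smaller input, retry at the same position
            else:
                i += chunk  # this chunk is needed, move on
        chunk //= 2  # try finer-grained deletions
    return tokens


def hypothetical_parser_bug(tokens: List[str]) -> bool:
    # Invented failure condition: NOT immediately followed by BETWEEN.
    return any(a == "NOT" and b == "BETWEEN" for a, b in zip(tokens, tokens[1:]))


query = "SELECT a , b FROM t WHERE a NOT BETWEEN 1 AND 2 ORDER BY b".split()
minimal = reduce_tokens(query, hypothetical_parser_bug)
print(" ".join(minimal))  # prints "NOT BETWEEN"
```

A real reducer would also check that the reduced input is still interesting in the same way (for example, the same error class), which this sketch leaves entirely to `still_fails`.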
Bundle report
Total JS (gzip): 6.21 MiB (no change)
Eager graph (static-import closure per entrypoint)
Largest modules in the
| Module | Size |
|---|---|
| css ./node_modules/.pnpm/css-loader@5.2.7_webpack@5.101.3/node_modules/css-loader/dist/cjs.js??ruleSet[1].rules[8].oneOf[1].use[1]!./node_modules/.pnpm/postcss-loader@4.3.0_postcss@8.5.6_webpack@5.101.3/node_modules/postcss-loader/dist/cjs.js??ruleSet[1].rules[8].oneOf[1].use[2]!./src/styles/global.css | 710.3 KiB |
| ./src/components/Stickers/Stickers.tsx | 696.4 KiB |
| ./.cache/caches/gatsby-plugin-mdx/mdx-scopes-dir/31a094f140f119e73085d847ae81b99b.js + 2 modules | 531.3 KiB |
| ./node_modules/.pnpm/@radix-ui+react-icons@1.3.2_react@18.3.1/node_modules/@radix-ui/react-icons/dist/react-icons.esm.js | 481.4 KiB |
| ./node_modules/.pnpm/@codemirror+view@6.38.2/node_modules/@codemirror/view/dist/index.js | 458.1 KiB |
| ./node_modules/.pnpm/rehype-raw@7.0.0/node_modules/rehype-raw/lib/index.js + 29 modules | 395.1 KiB |
| ./node_modules/.pnpm/@posthog+icons@0.36.6_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.cjs.js | 364.8 KiB |
| ./node_modules/.pnpm/@posthog+icons@0.36.6_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@posthog/icons/dist/posthog-icons.es.js | 354.8 KiB |
| ./src/hooks/useCustomers.tsx + 54 modules | 353.9 KiB |
| ./node_modules/.pnpm/react-markdown@8.0.7_@types+react@16.14.66_react@18.3.1/node_modules/react-markdown/lib/react-markdown.js + 88 modules | 351.4 KiB |
| ./node_modules/.pnpm/cloudinary-core@2.14.0_lodash@4.17.21/node_modules/cloudinary-core/cloudinary-core.js | 281.9 KiB |
| ./node_modules/.pnpm/@codesandbox+sandpack-react@2.20.0_react-dom@18.3.1_react@18.3.1__react@18.3.1/node_modules/@codesandbox/sandpack-react/dist/index.mjs | 266.6 KiB |
| ./src/components/ProductComparisonTable/index.tsx + 114 modules | 264.0 KiB |
| ./node_modules/.pnpm/d3@7.9.0/node_modules/d3/src/index.js + 208 modules | 247.4 KiB |
| ./src/components/Pricing/PricingSlider/Slider.tsx + 87 modules | 239.9 KiB |
Eager-graph budgets are report-only until a baseline is established. Sizes are gzip of public/**/*.js; eager size is webpack module source bytes.
Changes
See the original https://docs.google.com/document/d/1d0J9OUwxN7uCD9q7TMU8P5S91IOg3Aoe9TTP_OKVCFI/edit?tab=t.0
See #17681
Checklist
vercel.json