Native theme fidelity suite + Material 3 fidelity fixes by shai-almog · Pull Request #5274 · codenameone/CodenameOne

shai-almog · 2026-06-24T03:18:38Z

What

Adds a data-driven fidelity test suite (scripts/fidelity-app) that, for every component with a native equivalent, renders the real native OS widget (rasterized off-screen) alongside the CN1 component under the native theme, and measures a per-component similarity score. Routine CI renders only the CN1 side and diffs against committed native goldens; a one-way ratchet (FidelityGate) fails only when a change drops a pair below its baseline.

It then drives the Android Material 3 theme from 94.9% → 96.2% overall fidelity through real framework + theme fixes — every change verified pixel-for-pixel against the native golden, no metric softening.

Framework fixes (each fixes a real Material-fidelity bug)

Fix	Effect
`FloatingActionButton` honors a `fabDiameterMM` constant (Material's fixed 56dp) instead of the legacy `icon*11/4` (~71dp) heuristic	FAB 85.7 → 98.5
`Tabs.paintAnimatedIndicator` reads `tabsAnimatedIndicatorThicknessMm` as a float (an int read silently dropped `"0.45"` → a 2×-too-thick indicator)	indicator 16px → 7px
New `Tabs.paintBottomDivider` (opt-in `tabsBottomDividerBool`) paints the full-width M3 tab divider directly — a CSS `border-bottom` does not paint on the custom tab-row `Container`; colour comes from the `TabsDivider` UIID (light/dark aware)	Tabs light 84.9 → 91.5
`DefaultLookAndFeel` disabled-unchecked checkbox/radio box reads the `*UncheckedColorUIID`'s own `.disabled` style, so the greyed box outline diverges from the (darker) disabled label text, as Material renders them	CheckBox 93.4 → 95.3, Radio 94.2 → 96.0

Plus the tuned native-themes/android-material/theme.css and recompiled shipped .res (Themes/, Ports, JS mirror).

Host tooling

ProcessScreenshots --mode fidelity, RenderFidelityReport, FidelityGate (ratchet), cn1ss.sh helpers, run-{android,ios}-fidelity-tests.sh, and the scripts-fidelity GitHub workflow.

Known limitation — iOS native references blocked

The iOS round cannot yet collect native UIKit references: rendering the native widget inside a ParparVM native method NPEs as soon as it does real UIKit work (a trivial stub delivers cleanly; reproduces identically with or without dispatch_sync, and String-arg/BOOL-return marshal fine — so it is neither a threading nor a marshaling fault). Documented in com_codenameone_fidelity_NativeWidgetFactoryImpl.m. Resolving it needs a ParparVM runtime fix, or rendering the native reference via a PeerComponent + Display.screenshot() instead of a NativeInterface method. The Android off-screen path (View.draw → Bitmap) works fully.

🤖 Generated with Claude Code

Adds a data-driven fidelity test suite (scripts/fidelity-app) that renders each component under the native theme alongside the REAL native OS widget (off-screen rasterized) and measures per-component visual fidelity, gated by a one-way ratchet vs a committed baseline. Android round raises overall Material 3 fidelity 94.9% -> 96.2% via real framework fixes (verified pixel vs the native golden, no metric softening): - FloatingActionButton: honor a fabDiameterMM theme constant for the Material 56dp fixed diameter instead of the icon*11/4 (~71dp) heuristic. FAB 85->98. - Tabs.paintAnimatedIndicator: read tabsAnimatedIndicatorThicknessMm as a float (an int read dropped "0.45" -> 2x-too-thick indicator). - Tabs.paintBottomDivider: new opt-in (tabsBottomDividerBool) full-width M3 divider painted directly (a border-bottom does not paint on the custom tab-row Container); colour from the TabsDivider UIID (light/dark aware). - DefaultLookAndFeel: disabled-unchecked checkbox/radio box reads the *UncheckedColorUIID's own .disabled style, so the greyed box outline can differ from the darker disabled label text (Material renders them distinctly). Theme (native-themes/android-material/theme.css) + recompiled shipped res. Host tooling: ProcessScreenshots --mode fidelity, RenderFidelityReport, FidelityGate (ratchet), cn1ss.sh helpers, run-*-fidelity-tests.sh, and the scripts-fidelity GitHub workflow. iOS round is blocked: rendering the native UIKit reference inside a ParparVM native method NPEs whenever it does real UIKit work (a trivial stub delivers; not a threading or marshaling fault). Documented in the iOS NativeWidgetFactory impl; needs a ParparVM fix or a PeerComponent+screenshot redesign. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

shai-almog · 2026-06-24T03:25:40Z

JavaSE simulator screenshot updates

Compared 11 screenshots: 10 matched, 1 updated.

javase-single-component-inspector — updated screenshot. Screenshot differs (2200x1400 px, bit depth 8).

Preview info: JPEG preview quality 20; JPEG preview quality 20; downscaled to 1540x980.
Full-resolution PNG saved as javase-single-component-inspector.png in workflow artifacts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-24T03:35:45Z

Cloudflare Preview

URL: https://pr-5274-website-preview.codenameone.pages.dev
Branch: pr-5274-website-preview

shai-almog · 2026-06-24T03:37:52Z

Native fidelity (Android, Material 3)

54 pairs compared -- median 93.8%, worst 75.4% (Tabs_normal_light), 25th pct 92.4%, mean 93.3%.

Distribution -- >=99%: 7 | 95-99%: 10 | 90-95%: 30 | <90%: 7

Component	State	Appearance	Fidelity	SSIM	mean delta	vs base
Tabs	normal	light	75.4%	0.904	3.41	-16.1
Tabs	normal	dark	77.3%	0.903	4.19	-22.2
Dialog	normal	dark	78.7%	0.819	8.29	-12.4
Dialog	normal	light	83.4%	0.878	5.72	-10.5
Button	pressed	dark	87.9%	0.939	5.69	-3.3
FloatingActionButton	normal	light	88.9%	0.964	0.68	-10.2
FloatingActionButton	pressed	light	88.9%	0.964	0.68	-10.2
Switch	selected	dark	90.9%	0.960	2.19	-6.5
Button	normal	dark	91.1%	0.940	3.72	-7.4
Switch	disabled	dark	92.0%	0.954	1.07	-1.6
Switch	selected	light	92.0%	0.960	1.87	-5.5
Button	pressed	light	92.3%	0.944	3.98	-2.3
FlatButton	normal	dark	92.4%	0.928	3.09	-0.5
FlatButton	pressed	dark	92.4%	0.928	3.09	-0.5
Switch	disabled	light	92.7%	0.963	0.81	+1.6
RadioButton	normal	dark	92.7%	0.957	2.58	-2.3
Switch	normal	light	92.7%	0.953	1.82	-1.4
RadioButton	normal	light	93.0%	0.958	2.19	-2.1
FlatButton	normal	light	93.1%	0.931	2.63	-1.4
FlatButton	pressed	light	93.1%	0.931	2.63	-1.4
Switch	normal	dark	93.1%	0.954	1.74	+1.7
RadioButton	selected	dark	93.3%	0.957	2.85	-2.3
Button	disabled	dark	93.6%	0.940	2.45	-0.2
Button	normal	light	93.7%	0.944	2.90	-4.9
RaisedButton	normal	light	93.7%	0.955	1.65	-4.9
RaisedButton	pressed	light	93.7%	0.955	1.65	-4.9
FloatingActionButton	normal	dark	93.8%	0.952	1.43	-5.2
FloatingActionButton	pressed	dark	93.8%	0.952	1.43	-5.2
RadioButton	selected	light	93.9%	0.958	2.18	-2.2
RadioButton	disabled	dark	94.3%	0.957	1.39	-2.8
RadioButton	disabled	light	94.4%	0.960	1.28	-2.7
CheckBox	selected	dark	94.4%	0.942	3.06	+0.3
Slider	normal	dark	94.4%	0.991	1.03	-2.7
CheckBox	normal	dark	94.7%	0.943	3.17	+0.6
CheckBox	normal	light	94.8%	0.945	2.72	+0.1
CheckBox	disabled	light	94.8%	0.949	1.58	-1.8
CheckBox	selected	light	94.8%	0.944	2.43	-0.7
CheckBox	disabled	dark	95.0%	0.946	1.73	-1.6
Button	disabled	light	95.3%	0.953	1.25	-2.7
RaisedButton	normal	dark	95.5%	0.943	2.09	-3.1
RaisedButton	pressed	dark	95.5%	0.943	2.09	-3.1
RaisedButton	disabled	dark	96.0%	0.947	1.15	-2.7
TextField	disabled	dark	96.1%	0.958	0.90	+1.0
RaisedButton	disabled	light	96.2%	0.954	0.93	-2.6
ProgressBar	normal	dark	97.1%	0.961	2.40	-2.9
ProgressBar	normal	light	97.1%	0.969	1.81	-2.9
TextField	disabled	light	98.1%	0.958	0.93	-0.1
Slider	normal	light	99.4%	0.998	0.08	-0.4
TextField	normal	dark	99.5%	0.951	2.16	+4.0
TextField	normal	light	99.5%	0.951	1.92	+3.8
Slider	disabled	dark	99.7%	0.998	0.12	-0.1
Slider	disabled	light	99.7%	0.998	0.08	-0.1
Toolbar	normal	dark	99.8%	0.903	1.82	+7.6
Toolbar	normal	light	100.0%	0.970	1.49	+3.5

Side-by-side comparisons (worst first)

Tabs_normal_light -- 75.40% fidelity (SSIM 0.9044) (-16.14 vs baseline)

Left: native widget. Right: Codename One render.
Tabs_normal_dark -- 77.28% fidelity (SSIM 0.9033) (-22.15 vs baseline)

Left: native widget. Right: Codename One render.
Dialog_normal_dark -- 78.65% fidelity (SSIM 0.8191) (-12.40 vs baseline)

Left: native widget. Right: Codename One render.
Dialog_normal_light -- 83.44% fidelity (SSIM 0.8779) (-10.49 vs baseline)

Left: native widget. Right: Codename One render.
Button_pressed_dark -- 87.90% fidelity (SSIM 0.9388) (-3.29 vs baseline)

Left: native widget. Right: Codename One render.
FloatingActionButton_normal_light -- 88.88% fidelity (SSIM 0.9636) (-10.24 vs baseline)

Left: native widget. Right: Codename One render.
FloatingActionButton_pressed_light -- 88.88% fidelity (SSIM 0.9636) (-10.24 vs baseline)

Left: native widget. Right: Codename One render.
Switch_selected_dark -- 90.86% fidelity (SSIM 0.9596) (-6.51 vs baseline)

Left: native widget. Right: Codename One render.
Button_normal_dark -- 91.13% fidelity (SSIM 0.9400) (-7.35 vs baseline)

Left: native widget. Right: Codename One render.
Switch_disabled_dark -- 91.96% fidelity (SSIM 0.9537) (-1.59 vs baseline)

Left: native widget. Right: Codename One render.
Switch_selected_light -- 91.99% fidelity (SSIM 0.9598) (-5.54 vs baseline)

Left: native widget. Right: Codename One render.
Button_pressed_light -- 92.30% fidelity (SSIM 0.9440) (-2.27 vs baseline)

Left: native widget. Right: Codename One render.
FlatButton_normal_dark -- 92.39% fidelity (SSIM 0.9278) (-0.46 vs baseline)

Left: native widget. Right: Codename One render.
FlatButton_pressed_dark -- 92.39% fidelity (SSIM 0.9278) (-0.46 vs baseline)

Left: native widget. Right: Codename One render.
Switch_disabled_light -- 92.65% fidelity (SSIM 0.9634) (+1.55 vs baseline)

Left: native widget. Right: Codename One render.
RadioButton_normal_dark -- 92.68% fidelity (SSIM 0.9565) (-2.26 vs baseline)

Left: native widget. Right: Codename One render.
Switch_normal_light -- 92.70% fidelity (SSIM 0.9532) (-1.40 vs baseline)

Left: native widget. Right: Codename One render.
RadioButton_normal_light -- 93.01% fidelity (SSIM 0.9580) (-2.14 vs baseline)

Left: native widget. Right: Codename One render.
FlatButton_normal_light -- 93.13% fidelity (SSIM 0.9313) (-1.36 vs baseline)

Left: native widget. Right: Codename One render.
FlatButton_pressed_light -- 93.13% fidelity (SSIM 0.9313) (-1.36 vs baseline)

Left: native widget. Right: Codename One render.
Switch_normal_dark -- 93.14% fidelity (SSIM 0.9537) (+1.72 vs baseline)

Left: native widget. Right: Codename One render.
RadioButton_selected_dark -- 93.31% fidelity (SSIM 0.9565) (-2.25 vs baseline)

Left: native widget. Right: Codename One render.
Button_disabled_dark -- 93.63% fidelity (SSIM 0.9401) (-0.21 vs baseline)

Left: native widget. Right: Codename One render.
Button_normal_light -- 93.67% fidelity (SSIM 0.9444) (-4.87 vs baseline)

Left: native widget. Right: Codename One render.
RaisedButton_normal_light -- 93.70% fidelity (SSIM 0.9545) (-4.90 vs baseline)

Left: native widget. Right: Codename One render.
RaisedButton_pressed_light -- 93.70% fidelity (SSIM 0.9545) (-4.90 vs baseline)

Left: native widget. Right: Codename One render.
FloatingActionButton_normal_dark -- 93.76% fidelity (SSIM 0.9522) (-5.24 vs baseline)

Left: native widget. Right: Codename One render.
FloatingActionButton_pressed_dark -- 93.76% fidelity (SSIM 0.9522) (-5.24 vs baseline)

Left: native widget. Right: Codename One render.
RadioButton_selected_light -- 93.89% fidelity (SSIM 0.9579) (-2.24 vs baseline)

Left: native widget. Right: Codename One render.
RadioButton_disabled_dark -- 94.28% fidelity (SSIM 0.9565) (-2.76 vs baseline)

Left: native widget. Right: Codename One render.
RadioButton_disabled_light -- 94.38% fidelity (SSIM 0.9597) (-2.74 vs baseline)

Left: native widget. Right: Codename One render.
CheckBox_selected_dark -- 94.41% fidelity (SSIM 0.9415) (+0.28 vs baseline)

Left: native widget. Right: Codename One render.
Slider_normal_dark -- 94.41% fidelity (SSIM 0.9914) (-2.67 vs baseline)

Left: native widget. Right: Codename One render.
CheckBox_normal_dark -- 94.72% fidelity (SSIM 0.9431) (+0.59 vs baseline)

Left: native widget. Right: Codename One render.
CheckBox_normal_light -- 94.79% fidelity (SSIM 0.9449) (+0.07 vs baseline)

Left: native widget. Right: Codename One render.
CheckBox_disabled_light -- 94.81% fidelity (SSIM 0.9486) (-1.79 vs baseline)

Left: native widget. Right: Codename One render.
CheckBox_selected_light -- 94.83% fidelity (SSIM 0.9441) (-0.65 vs baseline)

Left: native widget. Right: Codename One render.
CheckBox_disabled_dark -- 95.02% fidelity (SSIM 0.9457) (-1.61 vs baseline)

Left: native widget. Right: Codename One render.
Button_disabled_light -- 95.33% fidelity (SSIM 0.9526) (-2.67 vs baseline)

Left: native widget. Right: Codename One render.
RaisedButton_normal_dark -- 95.54% fidelity (SSIM 0.9432) (-3.13 vs baseline)

Left: native widget. Right: Codename One render.
RaisedButton_pressed_dark -- 95.54% fidelity (SSIM 0.9432) (-3.13 vs baseline)

Left: native widget. Right: Codename One render.
RaisedButton_disabled_dark -- 96.03% fidelity (SSIM 0.9473) (-2.66 vs baseline)

Left: native widget. Right: Codename One render.
TextField_disabled_dark -- 96.09% fidelity (SSIM 0.9584) (+1.01 vs baseline)

Left: native widget. Right: Codename One render.
RaisedButton_disabled_light -- 96.18% fidelity (SSIM 0.9543) (-2.57 vs baseline)

Left: native widget. Right: Codename One render.
ProgressBar_normal_dark -- 97.13% fidelity (SSIM 0.9611) (-2.87 vs baseline)

Left: native widget. Right: Codename One render.
ProgressBar_normal_light -- 97.13% fidelity (SSIM 0.9694) (-2.87 vs baseline)

Left: native widget. Right: Codename One render.
TextField_disabled_light -- 98.13% fidelity (SSIM 0.9584) (-0.07 vs baseline)

Left: native widget. Right: Codename One render.
Slider_normal_light -- 99.40% fidelity (SSIM 0.9978) (-0.40 vs baseline)

Left: native widget. Right: Codename One render.
TextField_normal_dark -- 99.45% fidelity (SSIM 0.9508) (+4.02 vs baseline)

Left: native widget. Right: Codename One render.
TextField_normal_light -- 99.48% fidelity (SSIM 0.9506) (+3.84 vs baseline)

Left: native widget. Right: Codename One render.
Slider_disabled_dark -- 99.70% fidelity (SSIM 0.9976) (-0.09 vs baseline)

Left: native widget. Right: Codename One render.
Slider_disabled_light -- 99.71% fidelity (SSIM 0.9977) (-0.09 vs baseline)

Left: native widget. Right: Codename One render.
Toolbar_normal_dark -- 99.76% fidelity (SSIM 0.9026) (+7.58 vs baseline)

Left: native widget. Right: Codename One render.
Toolbar_normal_light -- 99.98% fidelity (SSIM 0.9704) (+3.53 vs baseline)

Left: native widget. Right: Codename One render.

shai-almog · 2026-06-24T03:42:42Z

Android screenshot updates

Compared 136 screenshots: 104 matched, 32 updated.

ButtonTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ButtonTheme_dark.png in workflow artifacts.
ButtonTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ButtonTheme_light.png in workflow artifacts.
ChatInput_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatInput_dark.png in workflow artifacts.
ChatInput_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatInput_light.png in workflow artifacts.
ChatView_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatView_dark.png in workflow artifacts.
ChatView_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatView_light.png in workflow artifacts.
CheckBoxRadioTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as CheckBoxRadioTheme_dark.png in workflow artifacts.
CheckBoxRadioTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as CheckBoxRadioTheme_light.png in workflow artifacts.
DialogTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as DialogTheme_dark.png in workflow artifacts.
DialogTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as DialogTheme_light.png in workflow artifacts.
FloatingActionButtonTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as FloatingActionButtonTheme_dark.png in workflow artifacts.
FloatingActionButtonTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as FloatingActionButtonTheme_light.png in workflow artifacts.
ListTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ListTheme_dark.png in workflow artifacts.
ListTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ListTheme_light.png in workflow artifacts.
MultiButtonTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as MultiButtonTheme_dark.png in workflow artifacts.
MultiButtonTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as MultiButtonTheme_light.png in workflow artifacts.
PaletteOverrideTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PaletteOverrideTheme_dark.png in workflow artifacts.
PaletteOverrideTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PaletteOverrideTheme_light.png in workflow artifacts.
PickerTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PickerTheme_dark.png in workflow artifacts.
PickerTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PickerTheme_light.png in workflow artifacts.
ShowcaseTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ShowcaseTheme_dark.png in workflow artifacts.
ShowcaseTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ShowcaseTheme_light.png in workflow artifacts.
SpanLabelTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SpanLabelTheme_dark.png in workflow artifacts.
SpanLabelTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SpanLabelTheme_light.png in workflow artifacts.
SwitchTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SwitchTheme_dark.png in workflow artifacts.
SwitchTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SwitchTheme_light.png in workflow artifacts.
TabsTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TabsTheme_dark.png in workflow artifacts.
TabsTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TabsTheme_light.png in workflow artifacts.
TextFieldTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TextFieldTheme_dark.png in workflow artifacts.
TextFieldTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TextFieldTheme_light.png in workflow artifacts.
ToolbarTheme_dark — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ToolbarTheme_dark.png in workflow artifacts.
ToolbarTheme_light — updated screenshot. Screenshot differs (320x640 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ToolbarTheme_light.png in workflow artifacts.

Native Android coverage

📊 Line coverage: 14.46% (8850/61219 lines covered) [HTML preview] (artifact android-coverage-report, jacocoAndroidReport/html/index.html)
- Other counters: instruction 11.74% (43658/372010), branch 5.19% (1815/34977), complexity 6.20% (2077/33516), method 10.72% (1679/15664), class 17.49% (388/2218)
- Lowest covered classes
  - kotlin.collections.kotlin.collections.ArraysKt___ArraysKt – 0.00% (0/6327 lines covered)
  - kotlin.collections.unsigned.kotlin.collections.unsigned.UArraysKt___UArraysKt – 0.00% (0/2384 lines covered)
  - org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.ClassReader – 0.00% (0/1519 lines covered)
  - kotlin.collections.kotlin.collections.CollectionsKt___CollectionsKt – 0.00% (0/1148 lines covered)
  - org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.MethodWriter – 0.00% (0/923 lines covered)
  - kotlin.sequences.kotlin.sequences.SequencesKt___SequencesKt – 0.00% (0/730 lines covered)
  - kotlin.text.kotlin.text.StringsKt___StringsKt – 0.00% (0/623 lines covered)
  - org.jacoco.agent.rt.internal_b6258fc.asm.org.jacoco.agent.rt.internal_b6258fc.asm.Frame – 0.00% (0/564 lines covered)
  - kotlin.collections.kotlin.collections.ArraysKt___ArraysJvmKt – 0.00% (0/495 lines covered)
  - kotlinx.coroutines.kotlinx.coroutines.JobSupport – 0.00% (0/423 lines covered)

Benchmark Results

Detailed Performance Metrics

Metric	Duration
SIMD kernel backend	scalar fallback (no native SIMD)
SIMD int-add (64K x300)	java 205ms / native 125ms = 1.6x speedup
SIMD float-mul (64K x300)	java 86ms / native 95ms = 0.9x speedup
SIMD kernel correctness	PASS (native result == scalar reference)
Base64 payload size	8192 bytes
Base64 benchmark iterations	6000
Base64 SIMD byte path	gated to scalar (CPU autovectorizes scalar; explicit SIMD not beneficial here)
Base64 CN1 encode	310.000 ms
Base64 CN1 decode	281.000 ms
Base64 native encode	802.000 ms
Base64 encode ratio (CN1/native)	0.387x (61.3% faster)
Base64 native decode	894.000 ms
Base64 decode ratio (CN1/native)	0.314x (68.6% faster)
Image encode benchmark status	skipped (SIMD unsupported)

- Switch.java: replace a non-ASCII U+2248 with ~ (Android port javac uses US-ASCII encoding and failed on it). - scripts/javase/screenshots: refresh the 7 simulator goldens that shifted with the framework/theme changes (rendered on CI Linux to match the test env). - scripts-fidelity.yml: TEMPORARY seed -- run the Android fidelity suite with FIDELITY_UPDATE_GOLDENS=1 + FIDELITY_UPDATE_BASELINE=1 so the native goldens and baseline are regenerated on CI's emulator density (the committed ones were rendered on a different local emulator, so 50/54 pairs "could not be compared"). Reverted in a follow-up once the CI-density artifacts are committed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

shai-almog · 2026-06-24T03:57:47Z

Apple Watch (watchOS / Core Graphics)

Compared 211 screenshots: 206 matched, 5 updated.

CheckBoxRadioTheme_dark — updated screenshot. Screenshot differs (416x496 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as CheckBoxRadioTheme_dark.png in workflow artifacts.
CheckBoxRadioTheme_light — updated screenshot. Screenshot differs (416x496 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as CheckBoxRadioTheme_light.png in workflow artifacts.
ShowcaseTheme_dark — updated screenshot. Screenshot differs (416x496 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ShowcaseTheme_dark.png in workflow artifacts.
ShowcaseTheme_light — updated screenshot. Screenshot differs (416x496 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ShowcaseTheme_light.png in workflow artifacts.
SwitchTheme_dark — updated screenshot. Screenshot differs (416x496 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SwitchTheme_dark.png in workflow artifacts.

The native goldens + ratchet baseline are now the ones the seed run regenerated on CI's own emulator (e.g. Tabs 377x100 vs the local 1039x277), so the fidelity gate compares like-for-like instead of failing 50/54 pairs on size mismatch. Removes the temporary FIDELITY_UPDATE_* seed so the job is a real one-way ratchet again. CI baseline overall fidelity: 96.2%. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

shai-almog · 2026-06-24T04:00:59Z

Compared 133 screenshots: 133 matched.
✅ Native Apple TV (tvOS, Metal) screenshot tests passed.

shai-almog · 2026-06-24T04:02:03Z

Compared 131 screenshots: 131 matched.
✅ Native iOS screenshot tests passed.

Benchmark Results

VM Translation Time: 0 seconds
Compilation Time: 225 seconds

Build and Run Timing

Metric	Duration
Simulator Boot	71000 ms
Simulator Boot (Run)	0 ms
App Install	12000 ms
App Launch	0 ms
Test Execution	421000 ms

Detailed Performance Metrics

Metric	Duration
SIMD kernel backend	SSE2 (x64) / NEON (arm64) native kernels
SIMD int-add (64K x300)	java 289ms / native 3ms = 96.3x speedup
SIMD float-mul (64K x300)	java 319ms / native 10ms = 31.9x speedup
SIMD kernel correctness	PASS (native result == scalar reference)
Base64 payload size	8192 bytes
Base64 benchmark iterations	6000
Base64 SIMD byte path	active (NEON-accelerated)
Base64 CN1 encode	323.000 ms
Base64 CN1 decode	233.000 ms
Base64 native encode	1055.000 ms
Base64 encode ratio (CN1/native)	0.306x (69.4% faster)
Base64 native decode	620.000 ms
Base64 decode ratio (CN1/native)	0.376x (62.4% faster)
Base64 SIMD encode	57.000 ms
Base64 encode ratio (SIMD/CN1)	0.176x (82.4% faster)
Base64 SIMD decode	49.000 ms
Base64 decode ratio (SIMD/CN1)	0.210x (79.0% faster)
Base64 encode ratio (SIMD/native)	0.054x (94.6% faster)
Base64 decode ratio (SIMD/native)	0.079x (92.1% faster)
Image encode benchmark iterations	100
Image createMask (SIMD off)	28.000 ms
Image createMask (SIMD on)	23.000 ms
Image createMask ratio (SIMD on/off)	0.821x (17.9% faster)
Image applyMask (SIMD off)	222.000 ms
Image applyMask (SIMD on)	167.000 ms
Image applyMask ratio (SIMD on/off)	0.752x (24.8% faster)
Image modifyAlpha (SIMD off)	254.000 ms
Image modifyAlpha (SIMD on)	166.000 ms
Image modifyAlpha ratio (SIMD on/off)	0.654x (34.6% faster)
Image modifyAlpha removeColor (SIMD off)	177.000 ms
Image modifyAlpha removeColor (SIMD on)	156.000 ms
Image modifyAlpha removeColor ratio (SIMD on/off)	0.881x (11.9% faster)

shai-almog · 2026-06-24T04:28:20Z

JavaScript port screenshot updates

Compared 128 screenshots: 96 matched, 32 updated.

ButtonTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ButtonTheme_dark.png in workflow artifacts.
ButtonTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ButtonTheme_light.png in workflow artifacts.
ChatInput_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatInput_dark.png in workflow artifacts.
ChatInput_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatInput_light.png in workflow artifacts.
ChatView_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatView_dark.png in workflow artifacts.
ChatView_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ChatView_light.png in workflow artifacts.
CheckBoxRadioTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as CheckBoxRadioTheme_dark.png in workflow artifacts.
CheckBoxRadioTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as CheckBoxRadioTheme_light.png in workflow artifacts.
DialogTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as DialogTheme_dark.png in workflow artifacts.
DialogTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as DialogTheme_light.png in workflow artifacts.
FloatingActionButtonTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as FloatingActionButtonTheme_dark.png in workflow artifacts.
FloatingActionButtonTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as FloatingActionButtonTheme_light.png in workflow artifacts.
ListTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ListTheme_dark.png in workflow artifacts.
ListTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ListTheme_light.png in workflow artifacts.
MultiButtonTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as MultiButtonTheme_dark.png in workflow artifacts.
MultiButtonTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as MultiButtonTheme_light.png in workflow artifacts.
PaletteOverrideTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PaletteOverrideTheme_dark.png in workflow artifacts.
PaletteOverrideTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PaletteOverrideTheme_light.png in workflow artifacts.
PickerTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PickerTheme_dark.png in workflow artifacts.
PickerTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as PickerTheme_light.png in workflow artifacts.
ShowcaseTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ShowcaseTheme_dark.png in workflow artifacts.
ShowcaseTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ShowcaseTheme_light.png in workflow artifacts.
SpanLabelTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SpanLabelTheme_dark.png in workflow artifacts.
SpanLabelTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SpanLabelTheme_light.png in workflow artifacts.
SwitchTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SwitchTheme_dark.png in workflow artifacts.
SwitchTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as SwitchTheme_light.png in workflow artifacts.
TabsTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TabsTheme_dark.png in workflow artifacts.
TabsTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TabsTheme_light.png in workflow artifacts.
TextFieldTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TextFieldTheme_dark.png in workflow artifacts.
TextFieldTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as TextFieldTheme_light.png in workflow artifacts.
ToolbarTheme_dark — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ToolbarTheme_dark.png in workflow artifacts.
ToolbarTheme_light — updated screenshot. Screenshot differs (375x667 px, bit depth 8).

Preview info: JPEG preview quality 70; JPEG preview quality 70.
Full-resolution PNG saved as ToolbarTheme_light.png in workflow artifacts.

shai-almog · 2026-06-24T04:29:00Z

Compared 134 screenshots: 134 matched.
✅ Native Mac screenshot tests passed.

Benchmark Results

VM Translation Time: 0 seconds
Compilation Time: 217 seconds

Detailed Performance Metrics

Metric	Duration
SIMD kernel backend	SSE2 (x64) / NEON (arm64) native kernels
SIMD int-add (64K x300)	java 56ms / native 3ms = 18.6x speedup
SIMD float-mul (64K x300)	java 54ms / native 3ms = 18.0x speedup
SIMD kernel correctness	PASS (native result == scalar reference)
Base64 payload size	8192 bytes
Base64 benchmark iterations	6000
Base64 SIMD byte path	active (NEON-accelerated)
Base64 CN1 encode	377.000 ms
Base64 CN1 decode	290.000 ms
Base64 native encode	1206.000 ms
Base64 encode ratio (CN1/native)	0.313x (68.7% faster)
Base64 native decode	863.000 ms
Base64 decode ratio (CN1/native)	0.336x (66.4% faster)
Base64 SIMD encode	69.000 ms
Base64 encode ratio (SIMD/CN1)	0.183x (81.7% faster)
Base64 SIMD decode	62.000 ms
Base64 decode ratio (SIMD/CN1)	0.214x (78.6% faster)
Base64 encode ratio (SIMD/native)	0.057x (94.3% faster)
Base64 decode ratio (SIMD/native)	0.072x (92.8% faster)
Image encode benchmark iterations	100
Image createMask (SIMD off)	34.000 ms
Image createMask (SIMD on)	31.000 ms
Image createMask ratio (SIMD on/off)	0.912x (8.8% faster)
Image applyMask (SIMD off)	237.000 ms
Image applyMask (SIMD on)	248.000 ms
Image applyMask ratio (SIMD on/off)	1.046x (4.6% slower)
Image modifyAlpha (SIMD off)	230.000 ms
Image modifyAlpha (SIMD on)	181.000 ms
Image modifyAlpha ratio (SIMD on/off)	0.787x (21.3% faster)
Image modifyAlpha removeColor (SIMD off)	6906.000 ms
Image modifyAlpha removeColor (SIMD on)	194.000 ms
Image modifyAlpha removeColor ratio (SIMD on/off)	0.028x (97.2% faster)

iOS fidelity native references now render (48 delivered, was 0). The earlier "ParparVM can't render UIKit in a native method" conclusion was wrong: it was three mundane MRC (non-ARC) memory bugs in NativeWidgetFactoryImpl.m -- 1. knownKind: cached an AUTORELEASED +[NSSet setWithObjects:] in a static, which dangled once the autorelease pool drained between native calls; the 2nd call derefed freed memory. ParparVM turns that EXC_BAD_ACCESS into a bogus Java NPE (which read as "buildAndRender NPEs"). Fixed: -[alloc initWithObjects:] (+1). 2. The rendered NSData was autoreleased and built on the main queue (UIKit layout -- e.g. SF-Symbol buttons -- hangs off-main, so the build is dispatch_sync'd to main); when dispatch_sync returned, main's pool drained and freed it before the EDT's writeToFile. Fixed: -retain it across the boundary, -release after. 3. (UIKit build moved to the main thread to avoid the off-main layout hang.) Report (RenderFidelityReport): lead with median / worst-pair / 25th-percentile / distribution buckets instead of a single misleading mean; add a per-pair percentage table (Fidelity, SSIM, mean-delta, delta-vs-baseline) sorted worst first; list unscored pairs explicitly; render the side-by-side cards for every pair worst-first. Workflow: drop continue-on-error on the iOS job (no longer a blocker); reseed per-environment goldens (FIDELITY_UPDATE_GOLDENS) while the committed baseline remains the portable ratchet floor. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… app The off-screen UIKit factory render was bunk: it rasterized DETACHED widgets at scale=1.0, so a 30pt button was 30px inside a 1087px tile (tiny, wrong size), and UINavigationBar/UITabBar rendered blank without a window. Replaced it for iOS with the approach Shai asked for: - scripts/fidelity-app/ios-native-ref/NativeRef.swift: a standalone native iOS app that lays each reference UIKit widget out in a REAL UIWindow and captures it with drawHierarchy(afterScreenUpdates:) -- so nav/tab bars render correctly -- at CN1's pixel density (so the PNG overlays the CN1 render 1:1, no scaling). Built directly with swiftc (no Xcode project) by scripts/build-ios-native-ref.sh, which runs it on the simulator and copies the PNGs into the committed iOS goldens. - run-ios-fidelity-tests.sh: iOS now compares the CN1 render against these COMMITTED goldens (generated offline, not same-run) instead of the broken factory native. - ProcessScreenshots: tolerate a few px of cross-environment rounding (golden 1088 vs CN1 1087) by cropping both to their common top-left region before diffing -- a true 1:1 overlay, never a scale. Result: all 50 iOS pairs now compare against real, correctly-sized native widgets (Toolbar was 0% blank -> a real centred-vs-left-aligned title diff). Seeded the iOS ratchet baseline (mean 62.3%); the low scores are the genuine untuned-iOSModern-theme gaps to drive up next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

shai-almog · 2026-06-24T06:00:23Z

Compared 135 screenshots: 135 matched.
✅ Native iOS Metal screenshot tests passed.

Benchmark Results

VM Translation Time: 0 seconds
Compilation Time: 291 seconds

Build and Run Timing

Metric	Duration
Simulator Boot	104000 ms
Simulator Boot (Run)	0 ms
App Install	11000 ms
App Launch	4000 ms
Test Execution	300000 ms

Detailed Performance Metrics

Metric	Duration
SIMD kernel backend	SSE2 (x64) / NEON (arm64) native kernels
SIMD int-add (64K x300)	java 62ms / native 3ms = 20.6x speedup
SIMD float-mul (64K x300)	java 69ms / native 4ms = 17.2x speedup
SIMD kernel correctness	PASS (native result == scalar reference)
Base64 payload size	8192 bytes
Base64 benchmark iterations	6000
Base64 SIMD byte path	active (NEON-accelerated)
Base64 CN1 encode	726.000 ms
Base64 CN1 decode	206.000 ms
Base64 native encode	616.000 ms
Base64 encode ratio (CN1/native)	1.179x (17.9% slower)
Base64 native decode	385.000 ms
Base64 decode ratio (CN1/native)	0.535x (46.5% faster)
Base64 SIMD encode	58.000 ms
Base64 encode ratio (SIMD/CN1)	0.080x (92.0% faster)
Base64 SIMD decode	72.000 ms
Base64 decode ratio (SIMD/CN1)	0.350x (65.0% faster)
Base64 encode ratio (SIMD/native)	0.094x (90.6% faster)
Base64 decode ratio (SIMD/native)	0.187x (81.3% faster)
Image encode benchmark iterations	100
Image createMask (SIMD off)	19.000 ms
Image createMask (SIMD on)	2.000 ms
Image createMask ratio (SIMD on/off)	0.105x (89.5% faster)
Image applyMask (SIMD off)	70.000 ms
Image applyMask (SIMD on)	31.000 ms
Image applyMask ratio (SIMD on/off)	0.443x (55.7% faster)
Image modifyAlpha (SIMD off)	82.000 ms
Image modifyAlpha (SIMD on)	30.000 ms
Image modifyAlpha ratio (SIMD on/off)	0.366x (63.4% faster)
Image modifyAlpha removeColor (SIMD off)	76.000 ms
Image modifyAlpha removeColor (SIMD on)	30.000 ms
Image modifyAlpha removeColor ratio (SIMD on/off)	0.395x (60.5% faster)

The native and CN1 tiles both anchor the widget top-left, but their pixel sizes can diverge -- a few px of cross-environment rounding (iOS offline goldens), or a larger native-vs-CN1 tile-geometry gap that flakes between Android emulator runs (e.g. CN1 320 vs native 377). Failing those as "size_mismatch" broke the gate. Now both are cropped to their common top-left region and overlaid 1:1 (never a scale); the structural metric still crops to each widget's content bbox, so an honest extent difference scores lower rather than erroring. Only a degenerate overlap (<8px) is an error. TEMPORARY: FIDELITY_UPDATE_BASELINE=1 on both run steps to reseed the ratchet baselines on CI under the new comparison (reverted once the baselines are committed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The old score was the mean colour agreement over all widget-content pixels, so a large flat region that happened to match -- e.g. a dark nav-bar fill against a dark tile -- could carry the score into the high 80s even when the actual widget (the title) was centred in one render and left-aligned at a totally different font size in the other. "Mostly got points for being black." Now fidelity = min(fillSim, structSim): - fillSim = mean colour agreement over content pixels (the old term; catches wrong fill colours). - structSim = the same agreement WEIGHTED BY local-gradient salience SQUARED, so flat fills count for ~nothing and the strongest edges -- glyph strokes, crisp outlines, separators -- dominate. A mis-placed or mis-sized title lands its strokes on the other render's flat fill, collapsing this term. A widget must now agree in BOTH fill AND structure/placement. Effect on the iOS Toolbar that triggered this: 89.3% -> ~59% (dark) / 36% (light), matching the independent SSIM (~56%), while genuinely-similar widgets (an off switch, disabled buttons) stay in the mid-80s. This is stricter for Android too; the CI seed run reseeds both ratchet baselines under it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

shai-almog and others added 2 commits June 24, 2026 06:18

ci: mark iOS fidelity job non-blocking (ParparVM native-render blocker)

f108bb2

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

ci: make build-fidelity-app.sh executable (exit 126 in CI)

ebe84de

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

shai-almog and others added 2 commits June 24, 2026 07:32

shai-almog and others added 2 commits June 24, 2026 09:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Native theme fidelity suite + Material 3 fidelity fixes#5274

Native theme fidelity suite + Material 3 fidelity fixes#5274
shai-almog wants to merge 9 commits into
masterfrom
native-theme-fidelity-suite

shai-almog commented Jun 24, 2026

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 24, 2026

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

shai-almog commented Jun 24, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

shai-almog commented Jun 24, 2026

What

Framework fixes (each fixes a real Material-fidelity bug)

Host tooling

Known limitation — iOS native references blocked

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

JavaSE simulator screenshot updates

Uh oh!

github-actions Bot commented Jun 24, 2026

Cloudflare Preview

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Native fidelity (Android, Material 3)

Side-by-side comparisons (worst first)

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Android screenshot updates

Native Android coverage

Benchmark Results

Detailed Performance Metrics

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Apple Watch (watchOS / Core Graphics)

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Results

Build and Run Timing

Detailed Performance Metrics

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

JavaScript port screenshot updates

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Results

Detailed Performance Metrics

Uh oh!

shai-almog commented Jun 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmark Results

Build and Run Timing

Detailed Performance Metrics

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading

shai-almog commented Jun 24, 2026 •

edited

Loading