feat(benchmarks): add QFUERZA-recovery results for all 5 TS systems #11
Open
ericchansen wants to merge 1 commit into
Adds benchmarks/<system>/from-qfuerza/ artifacts for rh-enamide, heck-relay, pd-allyl-amination, pd-1,4-conjugate-addition, and rh-1,4-conjugate-addition.

Each directory contains:
- validation_results.json: full run record with provenance, audit, R²
- paper_metrics.json: published-paper-comparable metrics
- <system>_optimized.fld: final optimized force field

These runs start from QFUERZA Hessian-derived bond/angle values (overwriting the published OPT scalars) and run the standard SciPy L-BFGS-B + JaxLoss pipeline. See ericchansen/q2mm#290 for the loader and CLI code, and docs/benchmarks/qfuerza-recovery.md (in that PR) for the methodology and interpretation.

Summary (QFUERZA vs published-start final OF ratio):
- rh-enamide: 1.03x (same basin)
- pd-allyl: 1.00x (same basin)
- pd-conjugate: 1.14x (nearby basin)
- rh-conjugate: 3.49x (different basin)
- heck-relay: 100x (JaxLoss surrogate diverged)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
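The directory layout described in the commit message can be checked with a short script. This is a minimal sketch, not part of the PR: the helper names `expected_artifacts` and `missing_artifacts` are hypothetical, and it assumes the `<system>` placeholder in the `.fld` filename is the system's directory name.

```python
from pathlib import Path

# System directory names as listed in the commit message.
SYSTEMS = [
    "rh-enamide",
    "heck-relay",
    "pd-allyl-amination",
    "pd-1,4-conjugate-addition",
    "rh-1,4-conjugate-addition",
]


def expected_artifacts(root, system):
    """Return the three artifact paths expected for one system.

    Assumption: the <system> placeholder in the .fld filename is the
    directory name itself (hypothetical; not confirmed by the PR).
    """
    base = Path(root) / "benchmarks" / system / "from-qfuerza"
    return [
        base / "validation_results.json",
        base / "paper_metrics.json",
        base / f"{system}_optimized.fld",
    ]


def missing_artifacts(root):
    """Map each system to the expected artifact paths that do not exist."""
    report = {}
    for system in SYSTEMS:
        missing = [p for p in expected_artifacts(root, system) if not p.is_file()]
        if missing:
            report[system] = missing
    return report
```

Calling `missing_artifacts(".")` from the repository root would return an empty dict when every system has all three files, under the naming assumption above.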
Summary
Adds benchmarks/<system>/from-qfuerza/ artifacts for all 5 TS systems from the QFUERZA-recovery validation experiment. Companion to ericchansen/q2mm#290.
What's in each directory
- validation_results.json
- paper_metrics.json
- <system>_optimized.fld
Experiment
Each run starts from QFUERZA Hessian-derived bond/angle values (per-molecule Seminario with TS inversion, multi-molecule mean) overlaid on the published OPT topology (frozen partition, vdW, SB, atom-type rows from the published FF). Each run is optimized with SciPy L-BFGS-B + JaxLoss using --ratio-tol -1, on an RTX 5090 (WSL2).
This is not from-scratch FF generation: only 17–33% of active parameters per system are overwritten by QFUERZA. See docs/benchmarks/qfuerza-recovery.md (in the companion PR) for full methodology and interpretation.
Results summary
QFUERZA-start vs published-start final objective:
rh-enamide and pd-allyl converge to essentially the same basin as the published-start runs. The other three land in different (worse) basins, with heck-relay failing due to JaxLoss surrogate divergence at the poor starting FF (bond R² = −247 at the QFUERZA starting point).
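A strongly negative R² such as the one quoted for heck-relay is possible because R² is not bounded below. The sketch here assumes the conventional definition, R² = 1 − SS_res / SS_tot; whether the pipeline computes it exactly this way is an assumption, and the numbers are made up for illustration rather than taken from the benchmark.

```python
def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes the conventional definition; the project's own metric
    may differ in detail (e.g. weighting).
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    # Total variance of the reference data around its mean.
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    # Residual error of the predictions against the reference.
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    if ss_tot == 0.0:
        raise ValueError("reference values have zero variance")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not benchmark data).
reference = [1.0, 2.0, 3.0]
close_fit = [1.1, 1.9, 3.0]    # small residuals: R² close to 1
poor_fit = [10.0, -5.0, 20.0]  # residuals dwarf the data's spread: R² far below 0

print(r_squared(reference, close_fit))
print(r_squared(reference, poor_fit))
```

Whenever the residual sum of squares exceeds the total variance of the reference data, the predictions are worse than simply predicting the mean, and R² drops below zero.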
File size
~530 KB total, all JSON + small .fld files. No large binary artifacts.
Provenance
29b61f8 (loader commit on feat/qfuerza-from-scratch)
72add3d
cuda:0 (RTX 5090, WSL2)
-1 (bypassed)