[WIP] Add structural analysis to the SepTop protocol by hannahbaumann · Pull Request #1982 · OpenFreeEnergy/openfe

hannahbaumann · 2026-05-28T10:05:26Z

Checklist

All new code is appropriately documented (user-facing code must have complete docstrings).
Added a news entry, or the changes are not user-facing.
Ran pre-commit: you can run pre-commit locally or comment on this PR with pre-commit.ci autofix.

Manual Tests: these are slow so don't need to be run every commit, only before merging and when relevant changes are made (generally at reviewer-discretion).

GPU integration tests
example notebook testing
packaging tests: run this for any large feature PRs or PRs that add test data.

Developers certificate of origin

I certify that this contribution is covered by the MIT License here and the Developer Certificate of Origin at https://developercertificate.org/.

hannahbaumann · 2026-05-28T10:06:06Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

hannahbaumann · 2026-05-29T11:04:42Z

This is what it would look like for the complex right now.

codecov · 2026-05-29T13:30:30Z

Codecov Report

❌ Patch coverage is 96.85864% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.58%. Comparing base (5347082) to head (d1e3abe).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/openfe/protocols/openmm_septop/base_units.py | 93.54% | 6 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1982      +/-   ##
==========================================
- Coverage   94.94%   90.58%   -4.36%     
==========================================
  Files         216      217       +1     
  Lines       20481    20670     +189     
==========================================
- Hits        19445    18724     -721     
- Misses       1036     1946     +910

| Flag | Coverage Δ |
|---|---|
| fast-tests | `90.58% <96.85%> (?)` |
| slow-tests | `?` |

hannahbaumann · 2026-06-02T09:12:02Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

hannahbaumann · 2026-06-02T09:34:13Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

github-actions · 2026-06-02T10:16:26Z

No API break detected ✅

jthorton · 2026-06-03T16:17:19Z

+        u_top = mda.Universe(pdb_file)
+        for state_idx in range(n_lambda):
+            universe = create_universe_single_state(u_top._topology, ds, state=state_idx)
+            prot = universe.select_atoms(protein_selection)


Not sure if it's possible or faster, but could we do this selection once on `u_top` to get the atom indices, and then just index with those to get the atoms in each loop iteration?

jthorton · 2026-06-03T16:20:39Z

+        trj_file: pathlib.Path,
+        output_directory: pathlib.Path,
+        dry: bool,
+        simtype: str,


Use a `Literal` type here, with `"complex"` and `"solvent"` as the allowed values.
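A minimal sketch of what that could look like, assuming the two values named above are the only valid ones. The alias `SimType` and the helper `validate_simtype` are hypothetical names used for illustration, not code from this PR.

```python
from typing import Literal, get_args

# Hypothetical alias; the two allowed values come from the review comment.
SimType = Literal["complex", "solvent"]


def validate_simtype(simtype: str) -> SimType:
    """Return simtype unchanged if it is an allowed value, otherwise raise."""
    allowed = get_args(SimType)  # tuple of the literal values
    if simtype not in allowed:
        raise ValueError(f"simtype must be one of {allowed}, got {simtype!r}")
    return simtype  # type: ignore[return-value]
```

The annotation alone lets mypy flag a misspelled value at call sites. The runtime check is optional and only matters if the string can come from user input.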

jthorton · 2026-06-03T16:25:24Z

+                    n_frames = len(range(0, ds.dimensions["iteration"].size, ds.PositionInterval))
+                else:
+                    n_frames = ds.dimensions["iteration"].size
+                skip = max(n_frames // 500, 1)


Is there a reason why we want to limit the analysis to only 500 frames? Should this be exposed?
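If it were exposed, one possible shape is sketched below. The function name `frame_stride` and the parameter `max_frames` are hypothetical; the arithmetic mirrors the `max(n_frames // 500, 1)` line in the diff.

```python
def frame_stride(n_frames: int, max_frames: int = 500) -> int:
    """Stride between analysed frames, aiming for roughly max_frames frames."""
    if max_frames < 1:
        raise ValueError("max_frames must be at least 1")
    # Floor division, as in the diff; never go below a stride of 1.
    return max(n_frames // max_frames, 1)


# Number of frames actually analysed for a 1499-frame trajectory.
n_analysed = len(range(0, 1499, frame_stride(1499)))
```

Because of the floor division, 500 acts as a target rather than a hard cap: a 1499-frame trajectory gets a stride of 2 and so 750 frames are analysed.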

jthorton · 2026-06-03T16:33:31Z

+        if simtype == "complex":
+            np.savez_compressed(
+                npz_file,
+                ligand_A_RMSD=np.asarray(data["ligand_A_RMSD"], dtype=np.float32),
+                ligand_B_RMSD=np.asarray(data["ligand_B_RMSD"], dtype=np.float32),
+                ligand_A_COM_drift=np.asarray(data["ligand_A_COM_drift"], dtype=np.float32),
+                ligand_B_COM_drift=np.asarray(data["ligand_B_COM_drift"], dtype=np.float32),
+                protein_2D_RMSD=np.asarray(data["protein_2D_RMSD"], dtype=np.float32),
+                time_ps=np.asarray(data["time_ps"], dtype=np.float32),
+            )
+        else:
+            np.savez_compressed(
+                npz_file,
+                ligand_A_RMSD=np.asarray(data["ligand_A_RMSD"], dtype=np.float32),
+                ligand_B_RMSD=np.asarray(data["ligand_B_RMSD"], dtype=np.float32),
+                time_ps=np.asarray(data["time_ps"], dtype=np.float32),
+            )


What about building a dict of the shared data that is saved in both cases (`time_ps` and the `ligand_A/B_RMSD` data), and then adding the extra data to the dict if `simtype == "complex"`? Then you can have a single `np.savez_compressed` call.
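A rough sketch of that suggestion, using the key names from the diff. The helper `collect_arrays`, the two key-list constants, and the example values are made up for illustration, and an in-memory buffer stands in for `npz_file`.

```python
import io

import numpy as np

# Keys saved for both simulation types, and the extras for the complex leg.
SHARED_KEYS = ["ligand_A_RMSD", "ligand_B_RMSD", "time_ps"]
COMPLEX_ONLY_KEYS = ["ligand_A_COM_drift", "ligand_B_COM_drift", "protein_2D_RMSD"]


def collect_arrays(data, simtype):
    """Build the dict of float32 arrays to save for this simulation type."""
    keys = list(SHARED_KEYS)
    if simtype == "complex":
        keys += COMPLEX_ONLY_KEYS
    return {key: np.asarray(data[key], dtype=np.float32) for key in keys}


# Made-up values, only to exercise the helper.
example = {key: [0.0, 1.5, 3.0] for key in SHARED_KEYS + COMPLEX_ONLY_KEYS}
buffer = io.BytesIO()  # stands in for npz_file
np.savez_compressed(buffer, **collect_arrays(example, "solvent"))
buffer.seek(0)
saved_keys = sorted(np.load(buffer).files)
```

With this layout the `float32` conversion is written once, and adding a new observable means touching only one of the key lists.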

jthorton · 2026-06-03T16:50:42Z

+        ligand_B_indices: list[int],
+        rdmol_A: Chem.Mol,
+        rdmol_B: Chem.Mol,
+        protein_selection: str = "protein and name CA",


Do you want to expose this to users? Also, does it make sense to remove the `protein_selection` default from the private functions, so that if you do update the default you only have to do it in one place, i.e. a single source of truth on the public function?

jthorton · 2026-06-03T16:56:47Z

+        selection_indices = np.array(setup.outputs["selection_indices"])
+        ligand_A_indices = np.where(np.isin(selection_indices, setup.outputs["ligand_A_indices"]))[
+            0
+        ].tolist()
+        ligand_B_indices = np.where(np.isin(selection_indices, setup.outputs["ligand_B_indices"]))[
+            0
+        ].tolist()


I got lost trying to follow this through, but could this go wrong if the user accidentally changes the settings to only save the protein and water?
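To illustrate the concern, assuming `selection_indices` holds the system indices of the atoms written to the trajectory: the `np.isin` mapping from the diff returns an empty list, without any error, when the ligand atoms are not among them. The guard `map_to_subset` below is a hypothetical helper, and the index values are made up.

```python
import numpy as np


def map_to_subset(selection_indices, system_indices):
    """Map system atom indices to their positions within the saved subset."""
    selection_indices = np.asarray(selection_indices)
    positions = np.where(np.isin(selection_indices, system_indices))[0]
    # Fail loudly if any requested atom was not saved to the trajectory.
    if positions.size != np.unique(system_indices).size:
        raise ValueError("some requested atoms are missing from the saved selection")
    return positions.tolist()


# Made-up indices: the ligand (atoms 10, 11) is not part of the saved subset.
saved = np.array([0, 1, 2])
unguarded = np.where(np.isin(saved, [10, 11]))[0].tolist()  # silently empty
```

An empty index list would otherwise only surface later, for example as an empty atom selection in the RMSD step.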

First pass at adding structural analysis to the SepTop protocol

ea6e58a

hannahbaumann marked this pull request as draft May 28, 2026 10:05

hannahbaumann changed the title ~~Add structural analysis to the SepTop protocol~~ [WIP] Add structural analysis to the SepTop protocol May 28, 2026

pre-commit-ci Bot and others added 3 commits May 28, 2026 10:06

[pre-commit.ci] auto fixes from pre-commit.com hooks

b0538ac

for more information, see https://pre-commit.ci

Small fix

0ae11d9

Small fix

00d3066

hannahbaumann self-assigned this May 28, 2026

hannahbaumann added 3 commits May 29, 2026 09:43

Fix ligand indices mapping

14381f9

Fix complex alignment

811a1b6

Small fix

54d4839

Add ligand indices to test

e1c101e

hannahbaumann added 2 commits June 1, 2026 15:40

Some updates

14ce7fd

Add tests for structural analysis

d953848

hannahbaumann marked this pull request as ready for review June 2, 2026 09:12

pre-commit-ci Bot and others added 2 commits June 2, 2026 09:12

[pre-commit.ci] auto fixes from pre-commit.com hooks

850b1a5

for more information, see https://pre-commit.ci

fix mypy

5394785

pre-commit-ci Bot and others added 2 commits June 2, 2026 09:34

[pre-commit.ci] auto fixes from pre-commit.com hooks

62a34dc

for more information, see https://pre-commit.ci

Update env

d1e3abe

hannahbaumann requested review from IAlibay and jthorton June 2, 2026 12:41

jameseastwood assigned IAlibay and unassigned hannahbaumann Jun 3, 2026

jthorton reviewed Jun 3, 2026

hannahbaumann wants to merge 14 commits into main from structural_analysis_septop

3 participants
