[WIP] Add structural analysis to the SepTop protocol#1982
[WIP] Add structural analysis to the SepTop protocol#1982hannahbaumann wants to merge 14 commits into
Conversation
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1982 +/- ##
==========================================
- Coverage 94.94% 90.58% -4.36%
==========================================
Files 216 217 +1
Lines 20481 20670 +189
==========================================
- Hits 19445 18724 -721
- Misses 1036 1946 +910
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
|
No API break detected ✅ |
| u_top = mda.Universe(pdb_file) | ||
| for state_idx in range(n_lambda): | ||
| universe = create_universe_single_state(u_top._topology, ds, state=state_idx) | ||
| prot = universe.select_atoms(protein_selection) |
There was a problem hiding this comment.
Not sure if its possible or faster but could we do this selection once on the u_top to get the indices of the atoms and then just indexing to get the atoms in each loop?
| trj_file: pathlib.Path, | ||
| output_directory: pathlib.Path, | ||
| dry: bool, | ||
| simtype: str, |
There was a problem hiding this comment.
Use a literal type here with complex and solvent
| n_frames = len(range(0, ds.dimensions["iteration"].size, ds.PositionInterval)) | ||
| else: | ||
| n_frames = ds.dimensions["iteration"].size | ||
| skip = max(n_frames // 500, 1) |
There was a problem hiding this comment.
Is there a reason why we want to keep the analysis to only 500 frames, should this be exposed?
| if simtype == "complex": | ||
| np.savez_compressed( | ||
| npz_file, | ||
| ligand_A_RMSD=np.asarray(data["ligand_A_RMSD"], dtype=np.float32), | ||
| ligand_B_RMSD=np.asarray(data["ligand_B_RMSD"], dtype=np.float32), | ||
| ligand_A_COM_drift=np.asarray(data["ligand_A_COM_drift"], dtype=np.float32), | ||
| ligand_B_COM_drift=np.asarray(data["ligand_B_COM_drift"], dtype=np.float32), | ||
| protein_2D_RMSD=np.asarray(data["protein_2D_RMSD"], dtype=np.float32), | ||
| time_ps=np.asarray(data["time_ps"], dtype=np.float32), | ||
| ) | ||
| else: | ||
| np.savez_compressed( | ||
| npz_file, | ||
| ligand_A_RMSD=np.asarray(data["ligand_A_RMSD"], dtype=np.float32), | ||
| ligand_B_RMSD=np.asarray(data["ligand_B_RMSD"], dtype=np.float32), | ||
| time_ps=np.asarray(data["time_ps"], dtype=np.float32), | ||
| ) |
There was a problem hiding this comment.
What about building a dict of shared data that will be saved in both cases so the time_ps and ligand_A/B_RMSD data and then if the simtype=="complex" then add the extra data to the dict then you can have single np.savez_compressed call?
| ligand_B_indices: list[int], | ||
| rdmol_A: Chem.Mol, | ||
| rdmol_B: Chem.Mol, | ||
| protein_selection: str = "protein and name CA", |
There was a problem hiding this comment.
Do you want to expose this to users, and does it make sense to remove the protein_selection default from the private functions so that if you do update the default, you only have to do it in a single place, i.e single source of truth on the public function.
| selection_indices = np.array(setup.outputs["selection_indices"]) | ||
| ligand_A_indices = np.where(np.isin(selection_indices, setup.outputs["ligand_A_indices"]))[ | ||
| 0 | ||
| ].tolist() | ||
| ligand_B_indices = np.where(np.isin(selection_indices, setup.outputs["ligand_B_indices"]))[ | ||
| 0 | ||
| ].tolist() |
There was a problem hiding this comment.
I got lost trying to follow this through but could this go wrong if the user accidentally changes the settings to only save the protein and water by mistake?





Checklist
newsentry, or the changes are not user-facing.pre-commit.ci autofix.Manual Tests: these are slow so don't need to be run every commit, only before merging and when relevant changes are made (generally at reviewer-discretion).
Developers certificate of origin