[SPRINT-02-01] Add local SBOL document indexing entry point#100
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b4fb2cae2d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| def index_document(self, collection_doc: sbol2.Document): | ||
| self._merge_document(collection_doc) | ||
| self._index_current_document() |
There was a problem hiding this comment.
Avoid re-indexing previously merged local documents
Calling index_document for each local document invokes _index_current_document() over the entire accumulated self.sbol_doc every time, so objects from earlier documents are indexed again on each subsequent call. In from_local_documents([doc1, doc2, ...]) this produces duplicated plasmid/enzyme/ligase entries (e.g., the first doc’s implementations are appended multiple times), which corrupts downstream workflow inputs and counts.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
@codex good catch, please solve this issue. The package has functionalities to add objects to a SBOL Document if they are not ready there to avoid errors of adding the same item twice.
There was a problem hiding this comment.
Summary
- Fixed local document indexing so
index_documentnow indexes only the newly provided document after merge, instead of re-indexing the entire accumulated SBOL document each time (which caused duplicates). src/buildcompiler/buildcompiler.pyL102-L105 - Refactored indexing flow by introducing
_index_document_objects(source_doc)and routing_index_current_document()through it, preserving full-document behavior for SynBioHub collection indexing while enabling incremental local indexing safely. src/buildcompiler/buildcompiler.pyL146-L154 - Added
_append_implementation_once(...)and used it when attaching implementations to plasmids/backbones and enzyme implementation lists, preventing duplicate implementation entries if the same implementation is encountered again. src/buildcompiler/buildcompiler.pyL149-L203 - Added a regression test that reproduces the multi-document local indexing scenario and verifies prior documents are not re-indexed when subsequent docs are processed. tests/test_buildcompiler.pyL44-L67
Testing
- ✅
python -m unittest tests.test_buildcompiler -v
Motivation
sbol2.Documentobjects without requiring SynBioHub credentials, a realsbol2.PartShop, auth tokens, or network access.pullcalls when users opt for a local-document workflow.Description
BuildCompiler.from_local_documents(collection_docs, design_doc=None)which creates an instance withself.sbh = None, merges supplied documents, and indexes them without constructingsbol2.PartShop(file:src/buildcompiler/buildcompiler.py).index_document(),_index_current_document(),_merge_document(), and_resolve_object()so both URI/SynBioHub and local-document indexing share the same indexing/walk logic while local mode explicitly raises when a referenced object is missing instead of callingpull(file:src/buildcompiler/buildcompiler.py).get_or_pullusages used during indexing with the new_resolve_object()helper so local mode never triggers a network fetch, and merged documents are appended safely while handling duplicate-URI collisions.tests/test_buildcompiler.pythat assert constructor signature compatibility, that local indexing does not constructPartShopand indexes local enzyme/implementation entries, and that a clear error is raised when a referenced SBOL object is missing.Testing
python -m unittest tests.test_buildcompiler -vwhich passed (all new local-indexing tests OK).python -m unittest discover -s testswhich failed due to unrelated network-dependent behavior in existing tests (SynBioHub pulls and remote SBOL validator calls) and not due to the local-indexing change.pytest tests/unit tests/integration -k "local and index"which selected no tests in this repository layout (no matching pytest selection), and ranruff check .which surfaced pre-existing lint issues unrelated to this change.Codex Task