Add example and tests for fully yielded Zephyr subgraph search#45
Add example and tests for fully yielded Zephyr subgraph search#45VolodyaCO wants to merge 18 commits into
Conversation
kevinchern
left a comment
There was a problem hiding this comment.
Added minor questions and comments
|
|
||
|
|
||
| def _extract_graph_properties(source: nx.Graph, target: nx.Graph) -> tuple[int, int, int]: | ||
| """Extract and validate Zephyr graph properties, returning ``(m, tp, t)``. |
There was a problem hiding this comment.
I suggest something more descriptive rather than variable names, e.g.,
| """Extract and validate Zephyr graph properties, returning ``(m, tp, t)``. | |
| """Extract and validate Zephyr graph properties (rows, tile count, and target tile count). |
| tuple[int, int, int]: ``(m, tp, t)`` where ``m`` is rows, | ||
| ``tp`` is source tile count, and ``t`` is target tile count. |
There was a problem hiding this comment.
| tuple[int, int, int]: ``(m, tp, t)`` where ``m`` is rows, | |
| ``tp`` is source tile count, and ``t`` is target tile count. | |
| tuple[int, int, int]: source Zephyr rows, source tile count, and target tile count. |
|
|
||
| Raises: | ||
| TypeError: If metadata values are not integers. | ||
| ValueError: If graph metadata is missing or incompatible. |
There was a problem hiding this comment.
I this should detail what it means to be incompatible; incompatibility is implied by ValueError, e.g., positive int or ...
| for n in _source.nodes() | ||
| if n in working_embedding | ||
| ) | ||
| else: |
There was a problem hiding this comment.
| else: | |
| elif yield_type == "edges": | |
| # do stuff | |
| else: | |
| raise ValueError("blah blah") |
I recommend being explicit about checking yield type.
Edit: maybe this is fine since validity of yield type has been validated previously...
There was a problem hiding this comment.
Other two options are edge and rail-edge. This is fine, as you say, because the type has been validated.
| YieldType = Literal["node", "edge", "rail-edge"] | ||
| QuotientSearchType = Literal["by_quotient_rail", "by_quotient_node", "by_rail_then_node"] |
There was a problem hiding this comment.
Question!!
I think most of these are only ever used in type-hinting. Should we still define these constant type hints here? Is there a best-practice or principle that recommends this pattern?
Subjectively, I find it less readable than writing the types out explicitly in the type hints.
cc @thisac
There was a problem hiding this comment.
Yep. in vscode all it takes is to hover over the constant type hint and it will show more details, though.
There was a problem hiding this comment.
Type-aliases are good when not overused. I agree with @kevinchern that it can make code less readable. With that said, I'm fine with these, except perhaps Embedding, EmbeddingChain since they're not very clearly named.
There was a problem hiding this comment.
I've gotten rid of these two type aliases.
| __all__ = ["zephyr_quotient_search"] | ||
|
|
||
| ZephyrNode = tuple[int, int, int, int, int] # (u, w, k, j, z) coordinate tuple | ||
| Embedding = dict[ZephyrNode, ZephyrNode] # Internal single-node format |
There was a problem hiding this comment.
Note for considering a different name: dwave.embedding has an EmbeddedStructure class. Might be worth following that pattern but tailored for Zephyr. e.g., ZephyrEmbeddedStructure (ok maybe this is too long lol)
There was a problem hiding this comment.
This Embedding is just a short-hand typing tool. The EmbeddedStructure is a bit more convoluted.
|
|
||
| pruned_embedding = { | ||
| to_source(k): to_target(v) for k, v in working_embedding.items() if v in target_nodeset | ||
| } # TODO:?: why would a target node in the working_embedding not be in the target_nodeset? |
There was a problem hiding this comment.
@VolodyaCO has this question been addressed with @jackraymond ?
There was a problem hiding this comment.
to_target is an identity, so we can remove it right?
There was a problem hiding this comment.
Perhaps its required because you want to cast it to a ZephyrNode, and we can't be sure that's the format provided - the function name could be changed and type hinting improved?
SebastianGitt
left a comment
There was a problem hiding this comment.
Looks good! Just have some minor comments.
| tiles. It is designed for defective targets where a direct identity map may lose | ||
| nodes or edges. | ||
|
|
||
| The search is organised around the **quotient graph** of the Zephyr topology, formed by |
There was a problem hiding this comment.
| The search is organised around the **quotient graph** of the Zephyr topology, formed by | |
| The search is organized around the **quotient graph** of the Zephyr topology, formed by |
minor issue but US English should be used
There was a problem hiding this comment.
Oh I will have a bunch of these. thanks
| # Phase 1: uniform random deletion | ||
| n_uniform = round(proportion * uniform_proportion * N) | ||
| uniform_indices = rng.choice(N, size=n_uniform, replace=False) | ||
| deleted = {all_nodes[i] for i in uniform_indices} |
There was a problem hiding this comment.
consider renaming this to something more clear
| """Validate that source and target are Zephyr NetworkX graphs. | ||
|
|
||
| Both source and target graphs must be networkx graph instances with a 'family' metadata key | ||
| set to 'zephyr'. Each graph must also contain 'rows', 'tile' and 'labels metadata keys. |
There was a problem hiding this comment.
| set to 'zephyr'. Each graph must also contain 'rows', 'tile' and 'labels metadata keys. | |
| set to 'zephyr'. Each graph must also contain 'rows', 'tile' and 'labels' metadata keys. |
| if expand_boundary_search: | ||
| if w == 0: | ||
| ksymmetric = False | ||
| # brrow candidates from adjacent internal column |
There was a problem hiding this comment.
| # brrow candidates from adjacent internal column | |
| # borrow candidates from adjacent internal column |
| \{(u, w_t, k_t, j, z) : j \in \{0,1\},\ z \in \{0,\dots,m-1\}\}. | ||
|
|
||
| We can define its objective for ``yield_type='edge'`` as the number of edges preserved within | ||
| that rail, i.e., the numbe of edges in the target subgraph induced by the proposed rail, or |
There was a problem hiding this comment.
| that rail, i.e., the numbe of edges in the target subgraph induced by the proposed rail, or | |
| that rail, i.e., the number of edges in the target subgraph induced by the proposed rail, or |
|
|
||
| F_q = \{m \in V(S) \setminus B_q : m \in \operatorname{dom}(\phi)\}, | ||
|
|
||
| where :math:`\phi` is the current embedding`. Then |
There was a problem hiding this comment.
| where :math:`\phi` is the current embedding`. Then | |
| where :math:`\phi` is the current embedding. Then |
|
|
||
| Args: | ||
| quotient_search (str): Search mode. | ||
| yield_type (str): Optimization objective. |
There was a problem hiding this comment.
| yield_type (str): Optimization objective. | |
| yield_type (YieldType): Optimization objective. |
There was a problem hiding this comment.
This should be a str, as it validates the search parameters. If it is not a YieldType, then an error is raised.
| :math:`w` is at a boundary. Defaults to ``True``. | ||
| ksymmetric (bool): If ``True``, treat source :math:`k` order as interchangeable when scoring | ||
| rails. Defaults to ``False``. | ||
| yield_type (str): ``"node"``, ``"edge"``, or ``"rail-edge"``. Defaults to ``"edge"``. |
There was a problem hiding this comment.
| yield_type (str): ``"node"``, ``"edge"``, or ``"rail-edge"``. Defaults to ``"edge"``. | |
| yield_type (YieldType): ``"node"``, ``"edge"``, or ``"rail-edge"``. Defaults to ``"edge"``. |
| ) | ||
|
|
||
|
|
||
| def _validate_graph_inputs(source: nx.Graph, target: nx.Graph): |
There was a problem hiding this comment.
| def _validate_graph_inputs(source: nx.Graph, target: nx.Graph): | |
| def _validate_graph_inputs(source: nx.Graph, target: nx.Graph) -> None: |
| ZephyrSearchMetadata, zephyr_quotient_search) | ||
|
|
||
|
|
||
| def generate_faulty_zephyr_graph(m, t, proportion, uniform_proportion, seed=None): |
There was a problem hiding this comment.
consider adding type hinting here since this function is non-trivial and used in multiple places
kevinchern
left a comment
There was a problem hiding this comment.
Did a few passes with focus on usability and some design patterns 🚀
jackraymond
left a comment
There was a problem hiding this comment.
On top of reviews by Sebastian/Kevin:
Check imports, to_target().
A couple of other comments.
Great job
|
|
||
| pruned_embedding = { | ||
| to_source(k): to_target(v) for k, v in working_embedding.items() if v in target_nodeset | ||
| } # TODO:?: why would a target node in the working_embedding not be in the target_nodeset? |
There was a problem hiding this comment.
to_target is an identity, so we can remove it right?
|
|
||
| pruned_embedding = { | ||
| to_source(k): to_target(v) for k, v in working_embedding.items() if v in target_nodeset | ||
| } # TODO:?: why would a target node in the working_embedding not be in the target_nodeset? |
There was a problem hiding this comment.
Perhaps its required because you want to cast it to a ZephyrNode, and we can't be sure that's the format provided - the function name could be changed and type hinting improved?
| from minorminer.utils.parallel_embeddings import find_sublattice_embeddings | ||
|
|
||
| import dwave_networkx as dnx | ||
| from dwave_networkx.generators._zephyr_playground import zephyr_quotient_search |
There was a problem hiding this comment.
Requires correction, this won't run
There was a problem hiding this comment.
Corrected. Thanks for noticing!
| if target.graph["labels"] != "coordinate": | ||
| raise ValueError("target graph has unknown labelling scheme") | ||
|
|
||
| def to_target(n: ZephyrNode) -> ZephyrNode: |
There was a problem hiding this comment.
to_target is an identity. Is it really necessary? Is it necessary to caste non-ZephyrNode to ZephyrNode for example? If so the type hint is incorrect.
There was a problem hiding this comment.
This to_target is an identity because the case where the graph is not in coordinate labels has already been handled by previous logic. The case where the graph is already in coordinate labels, this function needs to be an identity.
There was a problem hiding this comment.
Is this ever something other than just an identity? Otherwise I still don't see this as doing anything.
There was a problem hiding this comment.
Yes, if target.graph["labels"] == "int" it will be different than the identity.
| working_embedding: Embedding = {n: n for n in source_nodes} | ||
| else: | ||
| # Convert chain format to internal single-node format | ||
| working_embedding = {k: v[0] for k, v in embedding.items()} |
There was a problem hiding this comment.
Do we need to check a few things here?
- Should we check there is a key for every source node? If it is absent throw an error that a key is missing in the candidate (and document requirement)
- Check the length of all values is 1.
There was a problem hiding this comment.
these are both checked in _validate_search_parameters and _ensure_coordinate_source.
There was a problem hiding this comment.
Could be good to add a few single-line comments above the various function calls above, for clarity.
| This routine starts from a source Zephyr graph with ``m`` rows and ``tp`` tiles, | ||
| and maps it into a target Zephyr graph with the same ``m`` rows and ``t >= tp`` | ||
| tiles. It is designed for defective targets where a direct identity map may lose | ||
| nodes or edges. |
There was a problem hiding this comment.
I think we should highlight limitations at the end of this first paragraph:
"Since a greedy method is used for embedding search, it is possible it fails to find a 1:1 embedding where one is viable. A complete method such as :code:minorminer.subgraph.find_subgraph may be more appropriate in a scenario such as this, especially with customization of parameters to the target families. Similarly, when defect rates are high direct use of :code: minorminer.find_embedding may be a more efficient strategy."
I'm teeing up the other module I'd like to add here which is find_subgraph enhanced by S,T pair specific node_labels.
|
I'd like to pair this module with a tool that uses source and target graph specific knowledge to accelerate find_subgraph(). This tool is not zephyr specific, but it would be good if it could be placed alongside this tool. For this reason, could we consider relabeling the subdirectory as 'embedding_methods' rather than 'zephyr_embedding_methods' (less specific), or perhaps 'processor_graph_embedding_methods' (if we wanted to be more specific) |
…en updated when an incomplete source was provided. Clarify that the zephyr_quotient_embedding_search is a greedy method and has limitations in the docstrings. Co-authored-by: Copilot <copilot@github.com>
|
@jackraymond the example has been updated and runs correctly as expected. |
Co-authored-by: Copilot <copilot@github.com>
…on during the example. Co-authored-by: Copilot <copilot@github.com>
| import networkx as nx | ||
| import numpy as np | ||
| from dwave_networkx import zephyr_graph | ||
| from dwave_networkx.generators._zephyr_playground import ( |
thisac
left a comment
There was a problem hiding this comment.
Might be good to check out dwavesystems/dwave-graphs#266 and coordinate with @mahdiehmalekian, specifically with regards to some of the zephyr utility functions and checks.
I'll have a closer look at the main zephyr_quotient_search function and tests.
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| from .zephyr_quotient_embedding_search import * |
There was a problem hiding this comment.
Recommended to use absolute imports instead of relative.
|
|
||
| __all__ = ["zephyr_quotient_search"] | ||
|
|
||
| ZephyrNode = tuple[int, int, int, int, int] # (u, w, k, j, z) coordinate tuple |
There was a problem hiding this comment.
Since this will likely be moved over to dwave-graphs at some point, and there's already a ZephyrCoord tuple-like being added there, it's probably good to coordinate a bit and make sure that these two are compatible.
| __all__ = ["zephyr_quotient_search"] | ||
|
|
||
| ZephyrNode = tuple[int, int, int, int, int] # (u, w, k, j, z) coordinate tuple | ||
| Embedding = dict[ZephyrNode, ZephyrNode] # Internal single-node format |
There was a problem hiding this comment.
Seems simple enough to just keep dict[ZephyrNode, ZephyrNode] as the type.
| YieldType = Literal["node", "edge", "rail-edge"] | ||
| QuotientSearchType = Literal["by_quotient_rail", "by_quotient_node", "by_rail_then_node"] |
There was a problem hiding this comment.
Type-aliases are good when not overused. I agree with @kevinchern that it can make code less readable. With that said, I'm fine with these, except perhaps Embedding, EmbeddingChain since they're not very clearly named.
| if embedding is not None and not isinstance(embedding, dict): | ||
| raise TypeError(f"embedding must be a dictionary when provided. Got {type(embedding)}") | ||
| if embedding is not None: |
There was a problem hiding this comment.
One less check.
| if embedding is not None and not isinstance(embedding, dict): | |
| raise TypeError(f"embedding must be a dictionary when provided. Got {type(embedding)}") | |
| if embedding is not None: | |
| if embedding is not None: | |
| if not isinstance(embedding, dict): | |
| raise TypeError(f"embedding must be a dictionary when provided. Got {type(embedding)}") |
There was a problem hiding this comment.
Adopted your suggestion. Thanks.
| Returns: | ||
| tuple[nx.Graph, set[ZephyrNode], Callable[[ZephyrNode], int | ZephyrNode]]: | ||
| ``(_source, source_nodes, to_source)`` where ``_source`` is | ||
| coordinate-labelled, ``source_nodes`` is the full canonical coordinate | ||
| node set implied by ``m`` and ``tp``, and ``to_source`` maps | ||
| coordinate nodes back to the original source labelling space. |
There was a problem hiding this comment.
Shouldn't be indented on returns. Also, could remove all types in docstrings since type-hints are used.
There was a problem hiding this comment.
Removed indentation on returns and removed types in docstrings.
| "graph, along with any other missing nodes.", UserWarning | ||
| ) | ||
| _source.add_nodes_from(source_nodes) | ||
| breakpoint() |
There was a problem hiding this comment.
Leftover from debugging?
| breakpoint() |
There was a problem hiding this comment.
Yes, wow. Thanks for noticing!
| # If the labels are coordinate. Then we just return the graph as is and the identity function | ||
| # for to_source: | ||
|
|
||
| def to_source(n: ZephyrNode) -> ZephyrNode: | ||
| return n |
| tuple[nx.Graph, Callable[[ZephyrNode], int | ZephyrNode]]: | ||
| ``(_target, to_target)`` where ``_target`` is coordinate-labelled. |
There was a problem hiding this comment.
Removed, left only 1 indentation level.
| if target.graph["labels"] != "coordinate": | ||
| raise ValueError("target graph has unknown labelling scheme") | ||
|
|
||
| def to_target(n: ZephyrNode) -> ZephyrNode: |
There was a problem hiding this comment.
Is this ever something other than just an identity? Otherwise I still don't see this as doing anything.
|
All that needs to be done is to coordinate with @mahdiehmalekian . I will do so this week. |
|
I met with @thisac and @mahdiehmalekian and we decided to simply replace |
| def _validate_search_parameters( | ||
| quotient_search: str, | ||
| yield_type: str, | ||
| embedding: dict[tuple[int, int, int, int, int], tuple[tuple[int, int, int, int, int], ...]] | None, |
There was a problem hiding this comment.
Since it's stated as optional, it should probably be None by default.
| embedding: dict[tuple[int, int, int, int, int], tuple[tuple[int, int, int, int, int], ...]] | None, | |
| embedding: dict[tuple[int, int, int, int, int], tuple[tuple[int, int, int, int, int], ...]] | None = None, |
| if not isinstance(value, tuple): | ||
| raise TypeError( | ||
| f"embedding values must be tuples representing node chains. Got value {value} " | ||
| f"of type {type(value)}" | ||
| ) | ||
| if len(value) == 0: | ||
| raise ValueError( | ||
| f"embedding chains must be non-empty. Got empty tuple for key {key}" | ||
| ) | ||
| elif len(value) != 1: | ||
| raise ValueError( | ||
| f"embedding chains must contain exactly one target node for each source node. " | ||
| f"Got chain of length {len(value)} for key {key}. Chain is: {value}" | ||
| ) |
There was a problem hiding this comment.
Maybe simplify by combining these into a single check?
if not isinstance(value, tuple) or len(value) != 1:
raise ValueError(...)There was a problem hiding this comment.
I got rid of len(value)==0 check and its respective test.
There was a problem hiding this comment.
Actually, I got rid of this and adopted your suggestion.
| if not isinstance(key, tuple): | ||
| raise TypeError( | ||
| f"embedding keys must be tuples representing nodes. Got key {key} " | ||
| f"of type {type(key)}" | ||
| ) | ||
| if len(key) != 5: | ||
| raise ValueError( | ||
| f"embedding keys must be 5-tuples representing Zephyr coordinates. " | ||
| f"Got key {key} of length {len(key)}" | ||
| ) |
There was a problem hiding this comment.
Same here. IMO these messages convey basically the same things with different details, and it's not much clearer to have it twice.
if not isinstance(key, tuple) or len(key) != 5:
raise ValueError(...)| ``(_source, source_nodes, to_source)`` where ``_source`` is | ||
| coordinate-labelled, ``source_nodes`` is the full canonical coordinate | ||
| node set implied by ``m`` and ``tp``, and ``to_source`` maps | ||
| coordinate nodes back to the original source labelling space. |
There was a problem hiding this comment.
Better leave out the variables names from the return since these are just inside the function scope and won't make sense in documentation.
Also, what is coordinate-labelled? Should it be coordinate-labelled Zephyr graph instead?
| ``(_source, source_nodes, to_source)`` where ``_source`` is | |
| coordinate-labelled, ``source_nodes`` is the full canonical coordinate | |
| node set implied by ``m`` and ``tp``, and ``to_source`` maps | |
| coordinate nodes back to the original source labelling space. | |
| coordinate-labelled Zephyr graph and the full canonical coordinate | |
| node set implied by ``m`` and ``tp``, and ``to_source`` maps | |
| coordinate nodes back to the original source labeling space. |
There was a problem hiding this comment.
This is also the case for other returns in this file.
There was a problem hiding this comment.
I have changed this and the other docstrings.
|
|
||
| if expand_boundary_search: | ||
| # Visit interior columns first so boundary expansion can reuse already-assigned assignments: | ||
| iterator = itertools.product( |
There was a problem hiding this comment.
I changed all iterators with xyz_iterator, which make it explicit which coordinates are being iterated on.
| working_embedding: Embedding = {n: n for n in source_nodes} | ||
| else: | ||
| # Convert chain format to internal single-node format | ||
| working_embedding = {k: v[0] for k, v in embedding.items()} |
There was a problem hiding this comment.
Could be good to add a few single-line comments above the various function calls above, for clarity.
| _validate_graph_inputs(source, target) | ||
| m, tp, t = _extract_graph_properties(source, target) |
There was a problem hiding this comment.
I feel like these two function should be combined. They're hidden, quite small, and only called here. And the both loop through the source and target graphs (which could be put in a single loop instead).
There was a problem hiding this comment.
There's still validation done in _extract_graph_properties, so you could just add the graph inputs validation there.
There was a problem hiding this comment.
I think I had this discussion earlier with @kevinchern . I advocated for keeping things separated for readability, one thing is to validate, another is to get these properties. I agree that they can be merged together but also don't see a clear reason why to do that.
There was a problem hiding this comment.
Sry missed this comment until now. Just chiming in to say I agree with decoupling them.
| return _source, source_nodes, to_source | ||
|
|
||
|
|
||
| def _ensure_coordinate_target( |
There was a problem hiding this comment.
Should this better be called _normalize_coordinate_target()? _ensure makes it sound like just a validation function, like the ones above.
| ) | ||
|
|
||
|
|
||
| def _ensure_coordinate_source( |
There was a problem hiding this comment.
Same here wrt naming it _normalize_coordinate_source().
| Add ``dwave.experimental.embedding_methods.zephyr_quotient_search`` for | ||
| finding fully yielded Zephyr-to-Zephyr subgraph embeddings on hardware | ||
| graphs, with configurable search strategy and yield objective. | ||
| A complete usage example and dedicated tests are also added. |
There was a problem hiding this comment.
You can skip the tests and example inclusion here, since it's not part of the package release.
| A complete usage example and dedicated tests are also added. |
…f the returned variables.
Add inline comments to clarify what certain instructions do.
thisac
left a comment
There was a problem hiding this comment.
Thanks for addressing my comments @VolodyaCO!
| @@ -0,0 +1,431 @@ | |||
| # Copyright 2026 D-Wave Systems Inc. | |||
There was a problem hiding this comment.
Just noticed while clicking approve!
| # Copyright 2026 D-Wave Systems Inc. | |
| # Copyright 2026 D-Wave |
Continuation of dwavesystems/dwave-graphs#261
Adds a routine to find embeddings of Zephyr[m, tp] into Zephyr[m, t] with tp < t.