[OpenVINO backend] add func is_splited_model()#20626

Closed
zhaixuejun1993 wants to merge 7 commits into ggml-org:master from zhaixuejun1993:xuejun/ov-bk-add-func-is-splited-model
@zhaixuejun1993
Contributor

This pull request introduces support for detecting and handling split-model computation graphs within the OpenVINO GGML decoder. The changes add a new mechanism to identify whether a computation graph is a split fragment and propagate this information through the decoder interface, where it will be used in the fallback path.

Detection and handling of split-model graphs:

  • Added the is_model_splitted function to heuristically determine if a ggml_cgraph represents a split-model fragment, with detailed checks for node usage and input sources. [1] [2]
  • Modified the logic in ov_graph_compute_dynamic to avoid naive computation for split-model graphs by integrating the new detection function.
  • Updated the construction of GgmlOvDecoder instances to pass the split-model flag, ensuring downstream components are aware of the graph's split status. [1] [2] [3]
  • Adjusted ov_graph_compute_static to correctly set the split-model and prefill flags when creating decoder instances.
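
The detection idea in the list above (checks on node usage and on input sources) can be illustrated with a minimal, self-contained sketch. This is not the PR's implementation: `MockTensor`, `MockGraph`, `recorded_use_count`, and `looks_like_split_fragment` are hypothetical stand-ins introduced only for illustration, and the real `ggml_tensor` / `ggml_cgraph` structures differ. The sketch assumes each node carries a use count recorded when the full model graph was built.

```cpp
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a graph tensor; NOT the real ggml_tensor.
struct MockTensor {
    std::vector<const MockTensor *> src;  // inputs of this node
    int recorded_use_count = 0;           // consumers counted in the full model graph (assumed)
};

// Hypothetical stand-in for a compute graph; NOT the real ggml_cgraph.
struct MockGraph {
    std::vector<const MockTensor *> nodes;  // compute nodes of this (sub)graph
    std::vector<const MockTensor *> leafs;  // weights and graph inputs
};

// Returns true when the graph looks like a fragment of a larger model.
bool looks_like_split_fragment(const MockGraph & g) {
    // Everything the graph itself owns: its leafs and its nodes.
    std::unordered_set<const MockTensor *> known(g.leafs.begin(), g.leafs.end());
    known.insert(g.nodes.begin(), g.nodes.end());

    // Input-source check: an input that is neither a node nor a leaf of this
    // graph must have been produced elsewhere. Fan-out is counted on the way.
    std::unordered_map<const MockTensor *, int> fan_out;
    for (const MockTensor * node : g.nodes) {
        for (const MockTensor * s : node->src) {
            if (s == nullptr) {
                continue;
            }
            if (known.count(s) == 0) {
                return true;
            }
            ++fan_out[s];
        }
    }

    // Usage check: a node recorded with more consumers than this graph
    // actually contains has at least one consumer in another fragment.
    for (const MockTensor * node : g.nodes) {
        auto it = fan_out.find(node);
        const int actual = (it == fan_out.end()) ? 0 : it->second;
        if (node->recorded_use_count > actual) {
            return true;
        }
    }
    return false;
}
```

Being a heuristic, a check like this can misclassify graphs whose recorded counts are stale or whose inputs are supplied in unusual ways, which is consistent with the description calling the detection heuristic.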

Decoder interface enhancements:

  • Added the is_splited_model virtual method to the decoder interface and implemented it in GgmlOvDecoder, along with the new member variable m_model_is_splitted. [1] [2] [3]
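
As a rough picture of the interface change described above, the following sketch shows a flag set at construction and exposed through a virtual accessor. Only the names `is_splited_model` and `m_model_is_splitted` come from the description; the class names `DecoderSketch` and `GgmlOvDecoderSketch`, the constructor signature, and the `const` qualifier are assumptions, and the real decoder interface has many more members.

```cpp
// Hypothetical, heavily simplified decoder interface (assumption, not the real one).
class DecoderSketch {
public:
    virtual ~DecoderSketch() = default;

    // Reports whether the decoded graph is a fragment of a split model.
    virtual bool is_splited_model() const = 0;
};

// Hypothetical stand-in for GgmlOvDecoder that stores the flag it is given.
class GgmlOvDecoderSketch : public DecoderSketch {
public:
    explicit GgmlOvDecoderSketch(bool model_is_splitted)
        : m_model_is_splitted(model_is_splitted) {}

    bool is_splited_model() const override { return m_model_is_splitted; }

private:
    bool m_model_is_splitted = false;  // set once at construction
};
```

Passing the flag through the constructor keeps the decoder itself free of detection logic: the caller decides whether the graph is split and downstream components only read the result.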

@github-actions (bot) added the ggml label (changes relating to the ggml tensor library for machine learning) on Mar 16, 2026
// Detect whether a cgraph is a split subgraph or not.
// Step 1 compares each node's recorded use_count with actual fan-out references in node->src.
// Step 2 verifies that node inputs come from model nodes/weights/leafs; external sources imply split.
bool is_model_splitted(ggml_cgraph * cgraph) {
Contributor

I was wondering whether we could detect that the graph is split in ggml-openvino.cpp, at the point where supported/unsupported ops are detected, and then write this information into the runtime context?

Member

Generally no - checking for supported ops can happen at various stages, depending on what the user code (e.g. llama.cpp) does. Basically, the backend should never be aware if a graph is split or not.

I assume you need some workaround logic to get the test suite rolling. That would be OK. In the long term, this logic should not be needed.

@zhaixuejun1993
Contributor Author

in dev repo

Labels

ggml (changes relating to the ggml tensor library for machine learning), OpenVINO

3 participants