[ET-VK][custom_ops] Standardize test case label format across binaries #19570

Merged
meta-codesync[bot] merged 1 commit into gh/SS-JIA/534/base from gh/SS-JIA/534/head
May 14, 2026
Contributor

@SS-JIA SS-JIA commented May 13, 2026

Stack from ghstack (oldest at bottom):

Previously the test case labels printed by each custom_ops test binary used wildly inconsistent formats. Examples from a recent on-device pass:

    ACCU  conv1d_dw [1,16,64] K=3 s=1 p=1 d=1  Tex(HP) f32
    ACCU  I=144  Buf(4W)
    correctness_1_64_32_Buffer_Float
    small_1d_linear_weight
    ACCU  In=1,16,4,4  r=2  same_qp  [4W4C->4W4C]
    ACCU  32->3  I=64,64  g=1  k=3  Buf(4C1W) [general]

Standardize all 21 binaries on:

    [ACCU/PERF]  [INPUT->OUTPUT dtype]  [shape data + inline op params]  [Storage(layout)]  [optional suffix]

Concrete examples after this commit:

    ACCU  f32->f32  [1,16,64] k3 s1 p1 d1                Tex(HP)
    ACCU  i8->i8    [144]+[144]                          Buf(4W)
    ACCU  f32->f32  [1,64]x[32,64]                       Buf(WP)
    ACCU  i32->f32  [1,8] lw                             Tex(WP)
    ACCU  i8->i8    [1,16,4,4] r=2                       Buf(4W4C)->Buf(4W4C)  [same_qp]
    ACCU  f32->f32  [1,3,64,64]x[32,3,3,3] s1 p1 d1 g1   Buf(4C1W)  [general]

Adds three helpers to test/custom_ops/utils.h:

  • dtype_short(vkapi::ScalarType) -> "f32" / "f16" / "i8" / "i32"
  • shape_bracket(std::vector<int64_t>) -> "[N,C,H,W]" form
  • make_test_label(prefix, in_dtype, out_dtype, shape_str, storage_str, suffix) assembles the four sections with two-space separators and an optional bracketed suffix.
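For illustration only, the three helpers might look roughly like the sketch below. It is not the code from test/custom_ops/utils.h: the `ScalarType` enum is a local stand-in for `vkapi::ScalarType`, the string-typed parameters and the default empty suffix are assumptions, and the column padding visible in the examples above is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for vkapi::ScalarType (assumption: only the four dtypes named in
// the description are covered).
enum class ScalarType { Float, Half, Char, Int };

// Short dtype token used in the "INPUT->OUTPUT dtype" section.
inline std::string dtype_short(ScalarType t) {
  switch (t) {
    case ScalarType::Float: return "f32";
    case ScalarType::Half:  return "f16";
    case ScalarType::Char:  return "i8";
    case ScalarType::Int:   return "i32";
  }
  return "?";  // not reached for the enum above; silences compiler warnings
}

// Formats {1, 16, 64} as "[1,16,64]".
inline std::string shape_bracket(const std::vector<int64_t>& shape) {
  std::ostringstream os;
  os << "[";
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (i > 0) os << ",";
    os << shape[i];
  }
  os << "]";
  return os.str();
}

// Joins prefix, dtype pair, shape string and storage string with two-space
// separators; appends "  [suffix]" only when a suffix is given.
inline std::string make_test_label(
    const std::string& prefix,
    const std::string& in_dtype,
    const std::string& out_dtype,
    const std::string& shape_str,
    const std::string& storage_str,
    const std::string& suffix = "") {
  assert(!prefix.empty());  // expected to be "ACCU" or "PERF"
  std::string label = prefix + "  " + in_dtype + "->" + out_dtype + "  " +
      shape_str + "  " + storage_str;
  if (!suffix.empty()) {
    label += "  [" + suffix + "]";
  }
  return label;
}
```

Under these assumptions, `make_test_label("ACCU", "i8", "i8", "[1,16,4,4] r=2", "Buf(4W4C)->Buf(4W4C)", "same_qp")` yields the pixel_shuffle-style label shown earlier, minus the alignment padding.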

Each binary's TestCase-builder helper now invokes make_test_label rather than rolling its own ostringstream. Op-specific shape detail (kernel/stride/padding/dilation/groups for conv, ratio for pixel_shuffle, output_padding for transposed conv, layout-transition arrows for clone/qdq) is constructed inline since it varies per op. Information that was load-bearing in the old labels (impl_selector tags, +bias/no_bias flags, same_qp/diff_qp markers, const_b for binary ops, multi-stage Buf->Buf transitions) is preserved as a bracketed suffix.
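As a rough sketch of what "constructed inline" can mean for a conv op, the snippet below builds the shape section in the `[input]x[weight] s1 p1 d1 g1` style from the examples. `ConvParams`, `bracket`, and `conv_shape_str` are hypothetical names chosen for this illustration; the actual binaries may structure this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical parameter bundle; the real test binaries may use other types.
struct ConvParams {
  int64_t stride = 1;
  int64_t padding = 1;
  int64_t dilation = 1;
  int64_t groups = 1;
};

// Local equivalent of a "[N,C,H,W]" formatter so the sketch is self-contained.
inline std::string bracket(const std::vector<int64_t>& dims) {
  std::string out = "[";
  for (std::size_t i = 0; i < dims.size(); ++i) {
    if (i > 0) out += ",";
    out += std::to_string(dims[i]);
  }
  return out + "]";
}

// Builds "[input]x[weight] s<stride> p<padding> d<dilation> g<groups>",
// i.e. the shape section plus the op-specific parameters for a conv.
inline std::string conv_shape_str(
    const std::vector<int64_t>& input,
    const std::vector<int64_t>& weight,
    const ConvParams& p) {
  assert(p.groups > 0);  // a conv needs at least one group
  return bracket(input) + "x" + bracket(weight) +
      " s" + std::to_string(p.stride) +
      " p" + std::to_string(p.padding) +
      " d" + std::to_string(p.dilation) +
      " g" + std::to_string(p.groups);
}
```

The resulting string would then be passed as the `shape_str` argument of make_test_label, with tags such as `general` going into the suffix.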

Two minor format deviations worth flagging:

  • choose_qparams_per_row produces two outputs (scale tensor + zero_point tensor). Encoded as out_dtype = "f32,i8" -- a comma-separated tuple after the arrow -- rather than forcing a single token.
  • test_embedding_q4gsw loses the free-form prose labels (small_1d_linear_weight, llama_3_2_1b_prefill_*). Replaced with structural shape info + an lw / scales-dtype suffix; readers can still derive the same context but it is less greppable as a "story".
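The multi-output encoding from the first deviation can be pictured with a small sketch like the one below. `dtype_arrow` is a hypothetical helper, not something this commit adds, and the `f32` input dtype used for choose_qparams_per_row in the usage note is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the "INPUT->OUTPUT dtype" section. Several output dtypes are joined
// with commas after the arrow, e.g. "f32->f32,i8".
inline std::string dtype_arrow(
    const std::string& in_dtype,
    const std::vector<std::string>& out_dtypes) {
  assert(!out_dtypes.empty());  // an op has at least one output
  std::string joined;
  for (std::size_t i = 0; i < out_dtypes.size(); ++i) {
    if (i > 0) joined += ",";
    joined += out_dtypes[i];
  }
  return in_dtype + "->" + joined;
}
```

With an assumed `f32` input, `dtype_arrow("f32", {"f32", "i8"})` gives `f32->f32,i8`, so the scale and zero_point outputs share one dtype section of the label.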

Differential Revision: D105059941



pytorch-bot Bot commented May 13, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19570

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 4 Pending, 2 Unrelated Failures, 1 Unclassified Failure

As of commit 0d3c455 with merge base 1992bdd (image):

UNCLASSIFIED FAILURE - DrCI could not classify the following job because the workflow did not run on the merge base. The failure may be pre-existing on trunk or introduced by this PR:

  • Check Labels (gh) (this job did not run on the merge base, so DrCI cannot tell whether the failure is pre-existing)

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label May 13, 2026
@meta-codesync meta-codesync Bot merged commit 48d488b into gh/SS-JIA/534/base May 14, 2026
172 of 185 checks passed
@meta-codesync meta-codesync Bot deleted the gh/SS-JIA/534/head branch May 14, 2026 01:56
SS-JIA pushed a commit that referenced this pull request May 14, 2026
ghstack-source-id: 381655474
Pull Request resolved: #19570
Labels

CLA Signed, fb-exported, meta-exported


2 participants