Commit ad7b73c
Add should_run() + fast-copy infrastructure with targeted_ops annotations (pytorch#18497)
Summary:
Pull Request resolved: pytorch#18497
Adds infrastructure for skipping and fast-copying unchanged nodes during
ExportPass execution, then annotates ~60 ARM backend passes to use it.
## Changes
### 1. should_run() hook on ExportPass / ArmPass
Subclasses that declare a `targeted_ops` class attribute (a set of op
overloads) can be skipped entirely when the graph contains none of their
target ops. ArmPass provides a default implementation that its subclasses inherit.
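The skip check described above can be illustrated with a minimal, self-contained sketch. It does not use the real ExportPass or ArmPass classes: `ToyNode`, `ToyPass`, and `DecomposeLayerNorm` are hypothetical stand-ins, and op overloads are represented as plain strings. It also assumes that a pass with no `targeted_ops` declaration always runs.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for a graph node: an opcode plus a target."""
    op: str      # e.g. "placeholder", "call_function", "output"
    target: str  # stand-in for an op overload


class ToyPass:
    # Assumption: None means "nothing declared", so the pass always runs.
    targeted_ops: Optional[frozenset] = None

    def should_run(self, nodes: Iterable[ToyNode]) -> bool:
        if self.targeted_ops is None:
            return True
        # Run only if some call_function node uses one of the target ops.
        return any(
            n.op == "call_function" and n.target in self.targeted_ops
            for n in nodes
        )


class DecomposeLayerNorm(ToyPass):  # hypothetical pass name
    targeted_ops = frozenset({"aten.layer_norm.default"})


graph = [
    ToyNode("placeholder", "x"),
    ToyNode("call_function", "aten.convolution.default"),
    ToyNode("output", "output"),
]

runs_layer_norm = DecomposeLayerNorm().should_run(graph)  # no layer_norm node present
runs_undeclared = ToyPass().should_run(graph)             # nothing declared
```

In this toy graph the annotated pass is skipped because no node matches its declared ops, while the unannotated pass still runs.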
### 2. Fast-copy for cold nodes
When a pass declares `targeted_ops`, nodes whose ops are NOT in the set
are copied into the new graph via `graph.node_copy()` instead of full
FakeTensor dispatch. Per-node cost drops from ~0.4 ms to ~0.02 ms (~20x).
Includes a safety guard: nodes without `val` metadata (e.g. nodes
inserted by `call()` overrides before `super().call()`) fall back to
full dispatch instead of propagating None.
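A rough sketch of the cold-node decision and its safety guard follows. It is a toy model, not the code in this commit: `ToyNode`, `process_node`, and the `stats` counters are hypothetical, a dictionary copy stands in for `graph.node_copy()`, and a string stands in for the value that full FakeTensor dispatch would recompute.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical node: a target name plus a metadata dict."""
    target: str
    meta: dict = field(default_factory=dict)


def process_node(node, targeted_ops, stats):
    """Return a node for the new graph, taking the cheap path only when safe."""
    is_cold = node.target not in targeted_ops
    has_val = node.meta.get("val") is not None
    if is_cold and has_val:
        # Cheap path: copy the node and its existing metadata unchanged.
        stats["fast_copy"] += 1
        return ToyNode(node.target, dict(node.meta))
    # Expensive path: targeted nodes, and cold nodes lacking `val`, are
    # fully re-dispatched so that `val` is always populated afterwards.
    stats["full_dispatch"] += 1
    new_meta = dict(node.meta)
    new_meta["val"] = f"recomputed({node.target})"
    return ToyNode(node.target, new_meta)


targeted = frozenset({"aten.layer_norm.default"})
nodes = [
    ToyNode("aten.convolution.default", {"val": "fake_tensor_a"}),  # cold, has val
    ToyNode("aten.layer_norm.default", {"val": "fake_tensor_b"}),   # targeted
    ToyNode("aten.add.Tensor"),                                     # cold, no val
]
stats = {"fast_copy": 0, "full_dispatch": 0}
new_nodes = [process_node(n, targeted, stats) for n in nodes]
```

Only the first node takes the fast path here; the third is cold but lacks `val`, so it falls back to full dispatch rather than carrying a missing value forward.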
### 3. FakeTensor cache extension
Context manager `_extend_faketensor_cache_builtins()` temporarily extends
the FakeTensor dispatch cache to cover ExecuTorch op namespaces
(quantized_decomposed, tosa, dim_order_ops, cortex_m). Avoids redundant
re-dispatches for non-builtin ops across 50+ passes.
### 4. __init_subclass__ auto-discovery on ArmPass
Subclasses with existing `_TARGET_OPS`, `_supported_ops`, or
`_EDGE_OPS`/`_ATEN_OPS` attributes get `targeted_ops` populated
automatically at class definition time — no manual annotation needed.
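One way such auto-discovery could work is sketched below with a toy base class. The attribute names come from the description above; everything else is an assumption made for illustration, including the `ToyArmPass` class, the example subclasses, the use of strings for ops, the choice to merge all discovered attributes, and the rule that an explicit `targeted_ops` wins.

```python
class ToyArmPass:
    """Hypothetical base class; not the real ArmPass."""
    targeted_ops = None
    _LEGACY_ATTRS = ("_TARGET_OPS", "_supported_ops", "_EDGE_OPS", "_ATEN_OPS")

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Assumption: an explicit annotation on the subclass takes precedence.
        if "targeted_ops" in cls.__dict__:
            return
        found = set()
        for name in cls._LEGACY_ATTRS:
            # Only look at attributes defined directly on this subclass.
            found.update(cls.__dict__.get(name, ()))
        if found:
            cls.targeted_ops = frozenset(found)


class FuseBatchNorm(ToyArmPass):  # hypothetical pass
    _TARGET_OPS = {"aten.batch_norm.default"}


class ConvertSplit(ToyArmPass):  # hypothetical pass
    _EDGE_OPS = ("edge.split_with_sizes",)
    _ATEN_OPS = ("aten.split_with_sizes",)


class ExplicitPass(ToyArmPass):  # hypothetical pass
    targeted_ops = frozenset({"aten.add.Tensor"})
    _TARGET_OPS = {"aten.mul.Tensor"}


class PlainPass(ToyArmPass):  # declares nothing, so nothing is discovered
    pass
```

Because `__init_subclass__` runs when each subclass body is evaluated, `targeted_ops` is already populated before any instance is created.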
### 5. targeted_ops annotations on ~60 ARM passes
Each annotation is a one-liner declaring the ops the pass checks in
`call_operator()`. Combined with should_run() and fast-copy, this
achieves the measured speedup below.
## Benchmark
Model: small CNN feature extractor (~50K params, 9 conv layers with
LayerNorm, targeting Ethos-U55 via the ARM/TOSA lowering pipeline).
Graph: ~1200 nodes, 146 ExportPass invocations.
lower() before: 186 s
lower() after: 100 s
Passes skipped: 53 of 146
Delta: -86 s (-46 %)
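The reported delta and percentages follow directly from the figures above, as this short check shows. The variable names are illustrative only.

```python
before_s, after_s = 186, 100           # lower() wall time in seconds
delta_s = after_s - before_s           # absolute change in seconds
delta_pct = 100 * delta_s / before_s   # relative change in percent

skipped, total = 53, 146               # passes skipped / pass invocations
skipped_pct = 100 * skipped / total    # share of invocations skipped

fast_copy_ratio = 0.4 / 0.02           # per-node cost ratio (ms / ms)

print(f"delta: {delta_s} s ({delta_pct:.0f} %)")
print(f"skipped: {skipped_pct:.0f} % of invocations")
print(f"fast-copy ratio: ~{fast_copy_ratio:.0f}x")
```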
Adds should_run() hook to ExportPass that subclasses can override to skip
execution when a pass has no work to do. ArmPass implements a default that
checks a targeted_ops class attribute against the graph's call_function nodes.
Also adds:
- _fast_copy_node path in ExportInterpreter.run_node that uses graph.node_copy
  instead of full FakeTensor dispatch for cold nodes in passes that declare
  targeted_ops. Per-node cost drops from ~0.4ms to ~0.02ms.
- _extend_faketensor_cache_builtins context manager that extends FakeTensor
  dispatch cache to cover ExecuTorch ops (quantized_decomposed, tosa, etc.)
- __init_subclass__ on ArmPass for auto-discovery of targeted_ops from
  existing _TARGET_OPS, _supported_ops, _EDGE_OPS/_ATEN_OPS attributes
- targeted_ops annotations on ~60 ARM pass subclasses
Measured on SleepNet featurizer (U55 lowering):
lower(): 185s -> 96s = -89s (-48%)
Differential Revision: D975281101
Parent 980c012, commit ad7b73c
62 files changed
Lines changed: 548 additions & 55 deletions
File tree
- backends/arm/_passes
- exir
- tests