Skip to content

Commit f2244fd

Browse files
Decompose byte_extract from union through widest member
When byte_extract is applied to a union-typed expression with a non-constant computed offset, the byte_extract lowering creates a massive expression because it must consider all possible byte layouts. For unions where the widest member covers the entire union, we can instead decompose the access through that member, avoiding the expensive lowering. Add union handling to the non-constant-offset overload of get_subexpression_at_offset: for symbol and member expressions of union type, recurse into the widest member. Guard this on the widest member's size equalling the union's size (no trailing padding). The constant-offset overload is left unchanged to preserve existing simplification behavior (e.g., byte_extract(byte_update(...)) patterns used during constant propagation). On the reproducer from #8813, the union version now takes 1.0s (previously 94s), matching the struct version at 1.0s. Fixes: #8813 Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
1 parent 6df204e commit f2244fd

3 files changed

Lines changed: 59 additions & 0 deletions

File tree

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
// Test that byte_extract from a union does not cause performance explosion.
2+
// See GitHub issue diffblue/cbmc#8813.
3+
#include <assert.h>
4+
#include <stdint.h>
5+
6+
typedef struct
7+
{
8+
int32_t data[8];
9+
} arr;
10+
11+
int main()
12+
{
13+
union
14+
{
15+
arr a;
16+
arr b;
17+
} u;
18+
unsigned i;
19+
__CPROVER_assume(i < 8);
20+
u.a.data[i] = 42;
21+
assert(u.b.data[i] == 42);
22+
}
Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CORE
2+
main.c
3+
4+
^VERIFICATION SUCCESSFUL$
5+
^EXIT=0$
6+
^SIGNAL=0$
7+
--
8+
^warning: ignoring
9+
--
10+
Union of same-typed members should not cause performance explosion.
11+
See GitHub issue diffblue/cbmc#8813.

src/util/pointer_offset_size.cpp

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -859,6 +859,32 @@ std::optional<exprt> get_subexpression_at_offset(
859859

860860
return {};
861861
}
862+
else if(expr.type().id() == ID_union || expr.type().id() == ID_union_tag)
863+
{
864+
// For union-typed expressions with non-constant offset, access
865+
// through the widest member to avoid expensive byte_extract lowering.
866+
// This is safe when the widest member covers the full union (no
867+
// trailing padding beyond it), since all members start at offset 0.
868+
if(expr.id() == ID_symbol || expr.id() == ID_member)
869+
{
870+
const union_typet &union_type =
871+
expr.type().id() == ID_union_tag
872+
? ns.follow_tag(to_union_tag_type(expr.type()))
873+
: to_union_type(expr.type());
874+
875+
const auto union_size = pointer_offset_bits(union_type, ns);
876+
const auto widest_member = union_type.find_widest_union_component(ns);
877+
if(
878+
widest_member.has_value() && union_size.has_value() &&
879+
widest_member->second == *union_size)
880+
{
881+
const member_exprt member(
882+
expr, widest_member->first.get_name(), widest_member->first.type());
883+
return get_subexpression_at_offset(member, offset, target_type, ns);
884+
}
885+
}
886+
return {};
887+
}
862888
else
863889
return {};
864890
}

0 commit comments

Comments
 (0)