debuginfo: slices are DW_TAG_array_type's#153238
Conversation
|
r? @davidtwco rustbot has assigned @davidtwco. Use Why was this reviewer chosen?The reviewer was selected based on:
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
another relevant thread from gdb bugtracker about rust's weird unsized type debug info and some related fixes in gdb: https://sourceware.org/bugzilla/show_bug.cgi?id=30330 . The type of change here (changing slice types to be array types with no count, similar to C's VLA/FMA) was discussed but I guess never implemented in rustc? |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@rustbot ready |
|
@bors r+ |
debuginfo: slices are DW_TAG_array_type's Rust debuginfo for unsized structs with embedded slice types is lacking, and it makes it difficult for debuggers to correctly print types like `Rc<str>` (eg rust-lang#114195). Specifically, these three structs would have essentially the same debug info (modulo some names): ```rust struct Foo { a: u32, s: str } // unsized struct Bar { a: u32, s: [u8] } // unsized struct Baz { a: u32, s: u8 } ``` This changes how the compiler generates debuginfo in this case, and I used existing compilers (gcc, clang) handling of C flexible member arrays as a reference.
Rollup of 11 pull requests Successful merges: - #148788 (Unconstrained parameter fix) - #153238 (debuginfo: slices are DW_TAG_array_type's) - #155521 (Add lint againts invalid runtime symbol definitions) - #156319 (Require EIIs to be defined when we compile a rust dylib) - #156452 (Implement pinned drop sugar) - #156600 (Make const param default test reproduce original ICE) - #156493 (actually run the temp_dir doctest) - #156556 (Require UTF-8 in `Utf8Pattern::StringPattern`) - #156565 (delegation: emit error when self type is not specified and accessed) - #156586 (Use DropCtxt::new_block and new_block_with_statements systematically.) - #156587 (Correctly handle associated items in rustdoc macro expansion)
debuginfo: slices are DW_TAG_array_type's Rust debuginfo for unsized structs with embedded slice types is lacking, and it makes it difficult for debuggers to correctly print types like `Rc<str>` (eg rust-lang#114195). Specifically, these three structs would have essentially the same debug info (modulo some names): ```rust struct Foo { a: u32, s: str } // unsized struct Bar { a: u32, s: [u8] } // unsized struct Baz { a: u32, s: u8 } ``` This changes how the compiler generates debuginfo in this case, and I used existing compilers (gcc, clang) handling of C flexible member arrays as a reference.
…uwer Rollup of 12 pull requests Successful merges: - #153238 (debuginfo: slices are DW_TAG_array_type's) - #156319 (Require EIIs to be defined when we compile a rust dylib) - #156452 (Implement pinned drop sugar) - #156554 (Allow user-provided `llvm_args` to override target spec arguments) - #156571 (Disable `main_needs_argc_argv` for Wasm) - #156600 (Make const param default test reproduce original ICE) - #156493 (actually run the temp_dir doctest) - #156556 (Require UTF-8 in `Utf8Pattern::StringPattern`) - #156565 (delegation: emit error when self type is not specified and accessed) - #156586 (Use DropCtxt::new_block and new_block_with_statements systematically.) - #156587 (Correctly handle associated items in rustdoc macro expansion) - #156604 (coverage: Reduce and clarify the context-mismatch test case)
|
This pull request was unapproved. This PR was contained in a rollup (#156613), which was unapproved. |
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
Is there a way to run the tests that failed in the rollup PR (auto - aarch64-msvc-1, try - x86_64-mingw-1, try - x86_64-msvc-1) on this PR? I think I fixed the issue, but I'm not able to check msvc compile targets on my machine. |
|
@bors try jobs=aarch64-msvc-1,x86_64-mingw-1,x86_64-msvc-1 |
This comment has been minimized.
This comment has been minimized.
debuginfo: slices are DW_TAG_array_type's try-job: aarch64-msvc-1 try-job: x86_64-mingw-1 try-job: x86_64-msvc-1
This comment has been minimized.
This comment has been minimized.
|
@bors try jobs=aarch64-msvc-1,x86_64-mingw-1,x86_64-msvc-1 |
|
@shua: 🔑 Insufficient privileges: not in try users |
|
@bors try jobs=aarch64-msvc-1,x86_64-mingw-1,x86_64-msvc-1 |
This comment has been minimized.
This comment has been minimized.
debuginfo: slices are DW_TAG_array_type's try-job: aarch64-msvc-1 try-job: x86_64-mingw-1 try-job: x86_64-msvc-1
|
The job Click to see the possible cause of the failure (guessed by this bot) |
|
💔 Test for 2468142 failed: CI. Failed job:
|
|
AFAIK, I'm in the process of reworking the way these tests function as part of GSoC and #148483 anyway, so it'll get taken care of, but imo it's not worth holding this PR up over. I'm very exciting for this change to land, as it means we can have ~1 visualizer that handles any given dynamically sized type. |
| let pointee_type_di_node = match pointee_type.kind() { | ||
| // `&[T]` will look like `{ data_ptr: *const T, length: usize }` | ||
| ty::Slice(element_type) => type_di_node(cx, *element_type), | ||
| // `&str` will look like `{ data_ptr: *const u8, length: usize }` | ||
| ty::Str => type_di_node(cx, cx.tcx.types.u8), | ||
|
|
||
| // `&dyn K` will look like `{ pointer: _, vtable: _}` | ||
| // any Adt `Foo` containing an unsized type (eg `&[_]` or `&dyn _`) | ||
| // will look like `{ data_ptr: *const Foo, length: usize }` | ||
| // and thin pointers `&Foo` will just look like `*const Foo`. | ||
| // | ||
| // in all those cases, we just use the pointee_type | ||
| _ => type_di_node(cx, pointee_type), | ||
| }; |
There was a problem hiding this comment.
Oh, I think I get it. Basically, if you have a struct:
struct T {
a: S,
b: [S]
}A pointer to that struct has the layout:
{
data_ptr: *T {
a: S,
b: [S]
},
length: usize,
}The length applies to the b field, not the whole struct. That applies recursively, so if you have some wrappers:
struct X(T);
struct W(X);
...
let ref_w = &W(X(T{...}));the data layout would be
{
data_ptr: *W {
__0: X {
__0 T {
a: S,
b: [S]
}
}
},
length: usize
}All that said, I'd still need to test this a bit to see if it's the case, but I think this match is making too many assumptions and isn't accurately representing Rust's !Sized type layout, because { data_ptr: *const T, length: usize } can mean T itself is the unsized part (i.e. [T]) or that a field of T is unsized.
It might be the case that the visualization won't work out of the box, but so long as a struct's !Sized field can be differentiated from an "identically sized" Sized field, the visualizers can fix it before it reaches the user.
i.e. if we can tell the difference between &X(T) and
struct T2 {
a: S,
b: S,
}
struct Y([T2]);from the field layout information alone, we're golden.
I have a full adversarial example here, where the align and padding mean there's no other way to tell them apart (with output from GDB failing to differentiate them for reference).
There was a problem hiding this comment.
Looked into this a bit more, I think this is due to GDB's printing behavior. The types are being output accurately:
(gdb) ptype a
type = struct *const arbitrary::A {
data_ptr: *mut arbitrary::A,
length: usize,
}
(gdb) ptype arbitrary::A
type = struct arbitrary::A {
f1: u64,
f2: u16,
f3: [arbitrary::S],
}I think the problem is GDB is hard-coded to treat structs with the shape of {data_ptr, length} as slices and prints them "shallowly" as slices (see: val_print_slice and value_print_inner).
I'm not really familiar with GDB's value or type representations though, so I could be wrong. @tromey does the above sound about right?
I don't think we'll get a disable-gdb-pretty-printers test for this to pass until GDB gets updated. On the other hand, visualizers should be able to override this behavior and print properly. It'd need a check for the {data_ptr, length} shape in rust_types.classify_rust_type, and the provider would need to search the pointee's last field recursively to find the one with a [T]/str type.
There was a problem hiding this comment.
I'm not sure I understand your concerns in the first comment, but I think we have the same understanding from your second. Specifically, if gdb (or lldb, etc) sees a "fat pointer" type of {data_ptr, length} it should recursively walk through the debuginfo definitions of data_ptr's type:
- if it is a
DW_TAG_array_typewith a member fieldDW_AT_countof-1, then it is a dynamic array, and the fat pointer'slengthis for this value, - if it is a
DW_TAG_structure_type, check the type of the last field in this struct, starting from step 1. above - if it is neither of those two cases, this is some error in debuginfo output; do whatever you think is best.
I found similar confusion around how to handle this in https://sourceware.org/bugzilla/show_bug.cgi?id=30330
View all comments
Rust debuginfo for unsized structs with embedded slice types is lacking, and it makes it difficult for debuggers to correctly print types like
Rc<str>(eg #114195). Specifically, these three structs would have essentially the same debug info (modulo some names):This changes how the compiler generates debuginfo in this case, and I used existing compilers (gcc, clang) handling of C flexible member arrays as a reference.