[cDAC] Contract implementations needed for managed stack walking #110758

Open
12 of 38 tasks
Tracked by #108553
max-charlamb opened this issue Dec 16, 2024 · 2 comments
Labels
area-Diagnostics-coreclr enhancement Product code improvement that does NOT require public API changes/additions tracking This issue is tracking the completion of other related issues.

max-charlamb commented Dec 16, 2024

To support SOS !analyze, the cDAC must implement stack walking through the following APIs:

IXCLRDataTask::CreateStackWalk();
IXCLRDataStackWalk::Request(DACSTACKPRIV_REQUEST_FRAME_DATA, ...);
IXCLRDataStackWalk::GetContext();
IXCLRDataStackWalk::Next();

To create a full managed stack walk, two kinds of frames must be read: call frames on the stack, and capital "F" Frames, which represent runtime unmanaged frames. The Frames are pushed onto and popped from a linked list on the runtime Thread object. For more information see: BOTR Stack Walking.
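As a rough illustration of the Frame chain described above, here is a minimal sketch of a per-thread linked list with push, pop, and traversal. The `Frame` and `Thread` types, the `id` field, and `WalkFrameChain` are hypothetical stand-ins, not the runtime's actual layout, and the sketch uses a null terminator for simplicity.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a runtime Frame; the real layout differs.
struct Frame {
    std::uint64_t id;  // illustrative tag only
    Frame* next;       // link to the previously pushed Frame
};

// Hypothetical stand-in for the runtime Thread object.
struct Thread {
    Frame* frameHead = nullptr;  // most recently pushed Frame

    void Push(Frame* f) {
        f->next = frameHead;
        frameHead = f;
    }

    void Pop() {
        assert(frameHead != nullptr);  // popping an empty chain is a bug
        frameHead = frameHead->next;
    }
};

// Collects Frame ids from the most recently pushed Frame to the oldest.
std::vector<std::uint64_t> WalkFrameChain(const Thread& thread) {
    std::vector<std::uint64_t> ids;
    for (const Frame* f = thread.frameHead; f != nullptr; f = f->next) {
        ids.push_back(f->id);
    }
    return ids;
}
```

A stack walker only needs the read-only traversal; push and pop are shown to make the list's ordering clear.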

To unwind managed call frames, we delegate to the existing native unwinding code. Managed frames use the Windows unwind logic and unwind codes on all platforms, whereas unwinding native call frames is platform-specific. Since we only care about managed call frames, all unwinding uses the Windows implementation under src/coreclr/unwinder/. See the following for more information:

Simplified Stack Unwinding Algorithm

  1. Read the thread context.
  2. If the current IP is in managed code, use Windows-style unwinding until the current IP is no longer in managed code.
  3. For each capital 'F' Frame from deepest to shallowest:
    1. If the Frame contains a context, read the SP/IP. Otherwise skip this Frame.
    2. If the IP is in managed code, use Windows-style unwinding until the current IP is no longer in managed code.
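The steps above can be sketched roughly as follows. Everything here is an assumption made for illustration: `Context`, `FrameRecord`, `Target`, and the `isManaged`/`unwind` callbacks are hypothetical stand-ins for the cDAC contracts and the unwinder under src/coreclr/unwinder/, and the Frame list is assumed to be supplied already in walk order.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical reduced register context: only IP and SP.
struct Context {
    std::uint64_t ip;
    std::uint64_t sp;
};

// Hypothetical view of a capital "F" Frame: it may or may not carry a context.
struct FrameRecord {
    std::optional<Context> context;
};

// Hypothetical hooks standing in for "is this IP managed?" and "unwind one frame".
struct Target {
    std::function<bool(std::uint64_t)> isManaged;
    std::function<Context(const Context&)> unwind;  // returns the caller's context
};

// Records each managed frame, unwinding until the IP leaves managed code.
static void UnwindManagedRun(const Target& target, Context ctx,
                             std::vector<Context>& out) {
    while (target.isManaged(ctx.ip)) {
        out.push_back(ctx);
        ctx = target.unwind(ctx);
    }
}

std::vector<Context> WalkManagedStack(const Target& target,
                                      const Context& threadContext,
                                      const std::vector<FrameRecord>& frames) {
    std::vector<Context> out;
    // Steps 1-2: start from the thread context.
    UnwindManagedRun(target, threadContext, out);
    // Step 3: restart from every Frame that carries a context.
    for (const FrameRecord& frame : frames) {
        if (!frame.context) {
            continue;  // Step 3.1: no context, skip this Frame
        }
        UnwindManagedRun(target, *frame.context, out);  // Step 3.2
    }
    return out;
}
```

The sketch assumes `unwind` always makes progress toward unmanaged code; a real implementation would also need to handle unwind failures and unreadable memory.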

Work Items

Initial implementation will focus on x64 stack walking before expanding to all supported platforms.

Paths Forward

  1. Independent Work
    1. Complete Frame support
    2. Cross-platform building
    3. Filtering flags
    4. ARM32/x86 support. Should this wait until x86 uses funclets?
  2. SOS Work
    1. Flip to invoking the cDAC directly, which allows testing on platforms where the cross-DAC is not available.
    2. Automated testing
    3. Testing with customer dumps - Include all required data descriptors ASAP to allow verification on customer dumps.
  3. Threat Model
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Dec 16, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Dec 16, 2024
@max-charlamb max-charlamb changed the title Contracts Required for Stack walking [cDAC] Contract implementations needed for managed stack walking Dec 16, 2024
@max-charlamb max-charlamb added enhancement Product code improvement that does NOT require public API changes/additions area-Diagnostics-coreclr tracking This issue is tracking the completion of other related issues. and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Dec 16, 2024

Tagging subscribers to this area: @tommcdon
See info in area-owners.md if you want to be subscribed.

@jkoritzinsky

I would recommend the following approach for the context:

T_CONTEXT should be a struct that is opaque to the cDAC, with its size defined in the data descriptor for the current platform. If any registers are used as part of stack walking (likely the instruction pointer, stack pointer, and frame pointer), there should be well-known field names for those concepts, which each runtime's data descriptor maps to the right registers.

If the goal is instead to map every register and offset and assign them out manually at the cDAC boundary, the data descriptor could include fields for all of the register names on each platform, and the cDAC would use an if-else chain to read the field offsets dynamically and assign them to the buffer depending on the target runtime's architecture. Even in this case, I'd recommend well-known names for the registers mentioned above if any contract implementation or logic uses them, as this makes it easier to implement the algorithms with less platform-specific goo.
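To make the well-known-names idea concrete, here is a minimal sketch of reading a register out of an opaque context buffer through descriptor-provided offsets. The `ContextDescriptor` type, the `ReadContextField` helper, and field names such as "InstructionPointer" are hypothetical and not part of any existing data descriptor. The sketch also assumes 64-bit registers and that the host and target share endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-platform description of the opaque context struct.
struct ContextDescriptor {
    std::size_t size;                                  // total size of the context
    std::map<std::string, std::size_t> fieldOffsets;   // well-known name -> byte offset
};

// Reads a 64-bit register by its well-known name from an opaque context buffer.
// Throws std::out_of_range (from map::at) if the name is not described.
std::uint64_t ReadContextField(const ContextDescriptor& descriptor,
                               const std::vector<std::uint8_t>& buffer,
                               const std::string& name) {
    assert(buffer.size() == descriptor.size);
    const std::size_t offset = descriptor.fieldOffsets.at(name);
    assert(offset + sizeof(std::uint64_t) <= buffer.size());
    std::uint64_t value = 0;
    std::memcpy(&value, buffer.data() + offset, sizeof(value));
    return value;
}
```

With this shape, the stack walking logic only asks for "InstructionPointer" or "StackPointer" and never needs to know which architecture-specific register sits behind the name.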

3 participants