Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ProcessTests.LongProcessNamesAreSupported test failure #111460

Closed
jkotas opened this issue Jan 15, 2025 · 9 comments · Fixed by #111850
Closed

ProcessTests.LongProcessNamesAreSupported test failure #111460

jkotas opened this issue Jan 15, 2025 · 9 comments · Fixed by #111850
Assignees
Labels
area-System.Diagnostics.Process blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' in-pr There is an active PR which will close this issue when it is merged Known Build Error Use this to report build issues in the .NET Helix tab
Milestone

Comments

@jkotas
Copy link
Member

jkotas commented Jan 15, 2025

Build Information

Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=917938
Build error leg or test failing: System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported
Pull request: #111446

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": ["System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported [FAIL]","Assert.Contains() Failure: Filter not matched in collection"], 
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=917938
Error message validated: [System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported [FAIL] Assert.Contains() Failure: Filter not matched in collection]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 1/15/2025 5:36:13 PM UTC

Report

Build Definition Test Pull Request
926703 dotnet/runtime System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported #111742
921261 dotnet/runtime System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported #111032
920396 dotnet/runtime System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported #111513
919939 dotnet/runtime System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported #111123
918750 dotnet/runtime System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported
917938 dotnet/runtime System.Diagnostics.Tests.ProcessTests.LongProcessNamesAreSupported #111446

Summary

24-Hour Hit Count 7-Day Hit Count 1-Month Count
0 1 6
@jkotas jkotas added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab labels Jan 15, 2025
@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jan 15, 2025
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 15, 2025
@jkotas jkotas added area-System.Diagnostics.Process untriaged New issue has not been triaged by the area owner and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jan 15, 2025
@jkotas jkotas changed the title .ProcessTests.LongProcessNamesAreSupported test failure ProcessTests.LongProcessNamesAreSupported test failure Jan 15, 2025
@jeffhandley
Copy link
Member

Assigned to @adamsitnik for triage

@danmoseley
Copy link
Member

Sees "123456789012345" for process started as "123456789012345678901234567890". The failures are apparently all Linux on 32bit arm

@danmoseley
Copy link
Member

Either it can't find /proc/(pid)/comm or that contains the truncated value for some reason or it's not passing the file correctly.

@jkotas
Copy link
Member Author

jkotas commented Jan 20, 2025

The failures are apparently all Linux on 32bit arm

It affects all Linux architectures as far as I can tell. For example, https://helixr1107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-heads-main-019d0ba545ab465c98/System.Diagnostics.Process.Tests/1/console.f5719d04.log?helixlogtype=result shows this failure on alpine-3.21-helix-amd64

@jkotas
Copy link
Member Author

jkotas commented Jan 20, 2025

Either it can't find /proc/(pid)/comm or that contains the truncated value for some reason or it's not passing the file correctly.

Note that this is intermittent failure. We run this tests thousands of times per week, but it was only hit in 5 runs.

My guess is that something is changing the process name during the process lifetime. When we read the process name too soon or too late, we see the truncated process name.

@MichalStrehovsky
Copy link
Member

I see it more often in native AOT runs. Build analysis unfortunately doesn't track runtime-nativeaot-outerloop pipeline so it's not part of the report above.

But for example the outerloop run in https://github.com/dotnet/runtime/pull/111713/checks?check_run_id=36020981722 just hit this on both linux-arm and linux-arm64 legs within a single pipeline run.

@jkotas
Copy link
Member Author

jkotas commented Jan 23, 2025

I see it more often in native AOT runs.

It suggests that we see the wrong process name when we read it too soon. (Native AOT is typically a lot faster for code that is executed once or just a few times).

@adamsitnik
Copy link
Member

Triage: it seems that reading a process name on Linux makes some of our tests flaky (#109630, #110643, #111460).

I am going to try to add a simple retry mechanism to see if it helps to verify #111460 (comment) hypothesis.

@jkotas
Copy link
Member Author

jkotas commented Jan 27, 2025

I have built a small stress test for the long process names:

using System.Diagnostics;

const string LongProcessName = "123456789012345678901234567890";
const string LongProcessPath = "./" + LongProcessName;

const UnixFileMode ExecutablePermissions = UnixFileMode.UserRead | UnixFileMode.UserExecute | UnixFileMode.UserWrite |
                                           UnixFileMode.GroupRead | UnixFileMode.GroupExecute |
                                           UnixFileMode.OtherRead | UnixFileMode.OtherExecute;

File.WriteAllText(LongProcessPath, $"#!/bin/sh\nsleep 1\n"); // sleep 10 min.
File.SetUnixFileMode(LongProcessPath, ExecutablePermissions);
for (int i = 0; i < 3; i++)
    new Thread(Work).Start();

for (;;)
{
    foreach (var p in Process.GetProcesses())
    {
        string name = p.ProcessName;
        if (name.StartsWith("123") && (name != LongProcessName))
           Console.WriteLine($"{DateTime.Now:o} {p.Id} {name}");
    }
}

void Work()
{
    for (;;)
    {
        using (Process px = Process.Start(LongProcessPath, "600"))
        {
            Console.WriteLine($"{DateTime.Now:o} {px.Id} Started");
            px.WaitForExit();
            Console.WriteLine($"{DateTime.Now:o} {px.Id} Exited");
        }
    }
}

This stress test shows that this Process.Name API is very flaky for long process names during process startup and shutdown. Here are some snippets of output:

Short process name is read shortly after process startup:

2025-01-26T22:04:27.9074627-08:00 4208 Started
2025-01-26T22:04:27.9088986-08:00 4201 Exited
2025-01-26T22:04:27.9092349-08:00 4208 123456789012345
2025-01-26T22:04:27.9092615-08:00 4209 Started
2025-01-26T22:04:28.9094267-08:00 4206 Exited
2025-01-26T22:04:28.9106330-08:00 4212 Started
2025-01-26T22:04:28.9127174-08:00 4208 Exited

Short process name is read around process shutdown:

2025-01-26T22:08:50.2811874-08:00 4466 Started
2025-01-26T22:08:50.2812914-08:00 4468 Started
2025-01-26T22:08:50.2814119-08:00 4466 123456789012345
2025-01-26T22:08:51.2816386-08:00 4465 Exited
2025-01-26T22:08:51.2817512-08:00 4465 123456789012345
2025-01-26T22:08:51.2822449-08:00 4471 Started
2025-01-26T22:08:51.2824086-08:00 4468 Exited
2025-01-26T22:08:51.2826317-08:00 4472 Started
2025-01-26T22:08:51.2837465-08:00 4466 Exited

jkotas added a commit to jkotas/runtime that referenced this issue Jan 27, 2025
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Jan 27, 2025
jkotas added a commit to jkotas/runtime that referenced this issue Jan 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Diagnostics.Process blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' in-pr There is an active PR which will close this issue when it is merged Known Build Error Use this to report build issues in the .NET Helix tab
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants