Skip to content

Conversation

karolz-ms
Copy link
Member

Description

I am often hitting a timeout associated with application startup when running these tests on my dev machine. I am rebuilding DCP often and Defender takes time to scan new bits.

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?

@karolz-ms karolz-ms requested review from radical and Copilot May 22, 2025 21:08
@karolz-ms karolz-ms requested a review from mitchdenny as a code owner May 22, 2025 21:08
@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label May 22, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Addresses intermittent test failures by extending the application startup timeout in DistributedApplicationTests to accommodate slower environments (e.g., Defender scans).

  • Replaces shorter DefaultOrchestratorTestTimeout with DefaultOrchestratorTestLongTimeout for app.StartAsync() calls across various tests.
  • Some subsequent resource retrieval and wait calls still use the shorter timeout—these should be made consistent.
  • Suggests reducing duplication in timeout invocations for maintainability.
Comments suppressed due to low confidence (4)

tests/Aspire.Hosting.Tests/DistributedApplicationTests.cs:683

  • This resource retrieval still uses the shorter DefaultOrchestratorTestTimeout after extending the startup timeout. Update to DefaultOrchestratorTestLongTimeout to keep timeouts consistent and avoid flakiness.
var aspireDashboard = await KubernetesHelper.GetResourceByNameMatchAsync<Executable>(kubernetes, $"aspire-dashboard-{ReplicaIdRegex}-{suffix}", r => r.Status?.EffectiveEnv is not null).DefaultTimeout(TestConstants.DefaultOrchestratorTestTimeout);

tests/Aspire.Hosting.Tests/DistributedApplicationTests.cs:723

  • This resource retrieval uses the shorter timeout while the startup call was extended. Switch to DefaultOrchestratorTestLongTimeout for consistency.
var aspireDashboard = await KubernetesHelper.GetResourceByNameMatchAsync<Executable>(kubernetes, $"aspire-dashboard-{ReplicaIdRegex}-{suffix}", r => r.Status?.EffectiveEnv is not null).DefaultTimeout(TestConstants.DefaultOrchestratorTestTimeout);

tests/Aspire.Hosting.Tests/DistributedApplicationTests.cs:1083

  • Waiting for text uses the shorter timeout after startup now uses the extended timeout. Consider using DefaultOrchestratorTestLongTimeout here as well to prevent premature failures.
await app.WaitForTextAsync("Application started.").DefaultTimeout(TestConstants.DefaultOrchestratorTestTimeout);

tests/Aspire.Hosting.Tests/DistributedApplicationTests.cs:1133

  • This wait call still relies on the default short timeout. Update to DefaultOrchestratorTestLongTimeout to align with the extended startup timeout.
await app.WaitForTextAsync("Application started.").DefaultTimeout(TestConstants.DefaultOrchestratorTestTimeout);

@karolz-ms
Copy link
Member Author

@RussKie @eerhardt Garnett tests failed twice for this PR. It cannot possibly affect anything related to Garnett support. Known issue?

@radical
Copy link
Member

radical commented May 22, 2025

 [+32/x2/?0] Aspire.Hosting.Garnet.Tests.dll (net8.0|x64) - Aspire.Hosting.Garnet.Tests.GarnetFunctionalTests.VerifyGarnetResource (7m 00s)
  
  Hang dump timeout of '00:07:00' expired
  The following tests were still running when dump was taken (format: [<time-elapsed-since-start>] <name>):
  [00:00:50] Aspire.Hosting.Garnet.Tests.GarnetFunctionalTests.VerifyGarnetResource
  Creating dump file '/home/runner/work/aspire/aspire/testresults/Aspire.Hosting.Garnet.Tests_3858_hang.dmp'
  [createdump] 000a7265 TickFrequency: 1000000 ticks per ms
  [createdump] 000a7265 PAGE_SIZE 4096
  [createdump] Gathering state for process 3858 Aspire.Hosting.
  [createdump] 000a7266 Thread 0f12 RIP 00007fd426898d71 RSP 00007fff27c5a520
...

We need to open an issue for this.

@karolz-ms
Copy link
Member Author

karolz-ms commented May 23, 2025

It might be related to the new DCP. Will try again when DCP 0.15.1 is merged

UPDATE nope the 0.15.1 hit the same issue

@danegsta
Copy link
Member

Garnet test failure was due to a new Garnet image being pushed that requires a specific command line argument in order to enable binding to 0.0.0.0.

@karolz-ms
Copy link
Member Author

I guess I need to rebase...

@karolz-ms karolz-ms force-pushed the dev/karolz/test-improvements branch from b28032e to df89579 Compare May 23, 2025 16:32
@karolz-ms karolz-ms merged commit 8e6c841 into main May 23, 2025
254 checks passed
@karolz-ms karolz-ms deleted the dev/karolz/test-improvements branch May 23, 2025 16:49
@github-actions github-actions bot locked and limited conversation to collaborators Jun 23, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants