-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Flaky Test]: TestStandaloneUpgradeUninstallKillWatcher / TestStandaloneDowngradeToSpecificSnapshotBuild / TestStandaloneUpgradeRetryDownload – target version has the same commit hash #4152
Comments
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
I'll investigate why it didn't work. |
Interesting, I'm looking at the {
"Time": "2024-01-26T09:36:35.485254967Z",
"Action": "output",
"Package": "github.com/elastic/elastic-agent/testing/integration(linux-amd64-ubuntu-2204-upgrade)(sudo)",
"Test": "TestStandaloneDowngradeToSpecificSnapshotBuild",
"Output": " upgrade_downgrade_test.go:105: Testing Elastic Agent upgrade from 8.13.0-SNAPSHOT to 8.13.0-SNAPSHOT+l534sdis...\n"
} So, the target version was So, the check I introduced in #4095 passed because of that, but in the upgrader itself here elastic-agent/testing/upgradetest/upgrader.go Lines 220 to 222 in b39b9af
|
Maybe this can be related to our work for releasing Agent faster as we are adding the +buildID there |
I found 2 possible problems: 1. Our fetcher discards the build ID unless
|
var uri string | |
var prevErr error | |
if !f.snapshotOnly { | |
uri, prevErr = findURI(ctx, f.doer, version) | |
} | |
preVersion := version | |
version, _ = splitBuildID(version) | |
if uri == "" { | |
if !strings.HasSuffix(version, "-SNAPSHOT") { | |
version += "-SNAPSHOT" | |
} | |
uri, err = findURI(ctx, f.doer, version) | |
if err != nil { | |
return nil, fmt.Errorf("failed to find snapshot URI for version %s: %w (previous error: %w)", preVersion, err, prevErr) | |
} | |
} | |
path := fmt.Sprintf("elastic-agent-%s-%s", version, suffix) | |
downloadSrc := fmt.Sprintf("%s%s", uri, path) |
On line 72 we remove the build suffix and further we use only the version prefix.
We use the fetcher in the fixture like this:
elastic-agent/testing/integration/upgrade_downgrade_test.go
Lines 98 to 103 in b39b9af
endFixture, err := atesting.NewFixture( | |
t, | |
endParsedVersion.String(), | |
atesting.WithFetcher(atesting.ArtifactFetcher()), | |
) | |
require.NoError(t, err) |
However, it should not have broken the test cases reported in this issue, because in them f.snapshotOnly = false
.
2. Our artefact cache inside the fixture does not account for the build ID:
For each artefact download we have a log line like this:
{
"Time": "2024-01-26T09:12:47.267365303Z",
"Action": "output",
"Package": "github.com/elastic/elastic-agent/testing/integration(linux-amd64-ubuntu-2204-fleet-airgapped-privileged)(sudo)",
"Test": "TestFleetAirGappedUpgradePrivileged",
"Output": " fetcher_artifact.go:213: Downloading artifact from https://snapshots.elastic.co/8.13.0-ous7zk9j/downloads/beats/elastic-agent/elastic-agent-8.13.0-SNAPSHOT-linux-x86_64.tar.gz\n"
}
but there is no log line for downloading 8.13.0-l534sdis
and for TestStandaloneDowngradeToSpecificSnapshotBuild
we always have
{
"Time": "2024-01-26T09:32:57.4664485Z",
"Action": "output",
"Package": "github.com/elastic/elastic-agent/testing/integration(windows-amd64-2022-upgrade)(sudo)",
"Test": "TestStandaloneDowngradeToSpecificSnapshotBuild",
"Output": " fetcher.go:91: Using existing artifact elastic-agent-8.13.0-SNAPSHOT-windows-x86_64.zip\n"
}
The test fixture has its own cache for artefacts:
elastic-agent/pkg/testing/fixture.go
Lines 799 to 807 in b39b9af
res, err := f.fetcher.Fetch(ctx, f.operatingSystem, f.architecture, f.version) | |
if err != nil { | |
return "", err | |
} | |
path, err := cache.fetch(ctx, f.t, res) | |
if err != nil { | |
return "", err | |
} | |
return path, nil |
and it's using a result of the Name()
function as a cache key:
elastic-agent/pkg/testing/fetcher.go
Lines 86 to 99 in b39b9af
func (c *fetcherCache) fetch(ctx context.Context, l Logger, res FetcherResult) (string, error) { | |
name := res.Name() | |
src := filepath.Join(c.dir, name) | |
_, err := os.Stat(src) | |
if err == nil || os.IsExist(err) { | |
l.Logf("Using existing artifact %s", name) | |
return src, nil | |
} | |
err = res.Fetch(ctx, l, c.dir) | |
if err != nil { | |
return "", err | |
} | |
return src, nil | |
} |
which is just a path of the artefact
elastic-agent/pkg/testing/fetcher_artifact.go
Lines 99 to 101 in b39b9af
func (r *artifactResult) Name() string { | |
return r.path | |
} |
which is eventually just a filename built here:
path := fmt.Sprintf("elastic-agent-%s-%s", version, suffix) |
To summarise: our fetcher and cache don't support a build suffix. This means that every test that requests a certain build version would fetch the latest version snapshot instead. I suppose the tests reported by this issue are such tests that work with build IDs. I'm working on a fix. |
Build sample https://buildkite.com/elastic/elastic-agent/builds/6766#018d7830-5a99-4bea-8f4e-1300402c505c Problematic code: elastic-agent/testing/integration/upgrade_standalone_retry_test.go Lines 44 to 49 in 2555088
elastic-agent/testing/integration/upgrade_uninstall_test.go Lines 51 to 52 in 2555088
Working on the fix. |
Failing test case
TestStandaloneUpgradeUninstallKillWatcher / TestStandaloneDowngradeToSpecificSnapshotBuild / TestStandaloneUpgradeRetryDownload
Error message
target version has the same commit hash
Build
https://buildkite.com/elastic/elastic-agent/builds/6583
OS
Linux, Windows
Stacktrace and notes
The text was updated successfully, but these errors were encountered: