Skip to content

Commit

Permalink
Update build triage script to include pipeline analysis (#3692)
Browse files Browse the repository at this point in the history
This update primarily allows the script to correctly identify
tag-driven build pipelines. JDK8 has three types of these, and
all types should be located where possible.

We can also ignore pipelines that are still running, as these will
have incomplete results.

User-launched pipelines will also be ignored, as the script cannot
yet spot the difference between a rerun using default options,
and an experimental rerun whose outcome should not be used to
indicate the overall health of the build system.

I have also repaired the bug where unstable builds are erroneously
included in the list of failed builds. Unstable builds are typically
passed builds which have a number of failed post-build unit tests.

Addendum1: Excluding build github action when modifying build triage script

Addendum2: Enabling users to launch this workflow manually

Lastly, we no longer expect arm32 builds on JDK21 and up.

Minor cosmetic improvements included.

Signed-off-by: Adam Farley <adfarley@redhat.com>
  • Loading branch information
adamfarley authored Mar 8, 2024
1 parent efc1fe2 commit 3347a4f
Show file tree
Hide file tree
Showing 3 changed files with 160 additions and 34 deletions.
7 changes: 3 additions & 4 deletions .github/workflows/build-autotriage.yml
Original file line number Diff line number Diff line change
@@ -1,9 +1,8 @@
---
# Runs a script to triage the latest timer-initiated Temurin build pipelines.

name: "Build Autotriage"
name: Build Autotriage

on:
workflow_dispatch:
schedule:
- cron: '0 0 * * MON'

Expand All @@ -17,7 +16,7 @@ jobs:
issues: write
runs-on: ubuntu-latest
name: Run Build Triage
if: github.repository == 'adoptium/temurin-build'
if: ${{ (github.repository == 'adoptium/temurin-build') || (github.event_name == 'workflow_dispatch') }}
steps:
- uses: actions/checkout@f43a0e5ff2bd294095638e18286ca9a3d1956744 # v3.6.0

Expand Down
1 change: 1 addition & 0 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ on:
- "build-farm/**"
- "sbin/**"
- "**.sh"
- "!tooling/build_autotriage/**"
- ".github/workflows/build.yml"
- "security/**"
- "cyclonedx-lib/**"
Expand Down
186 changes: 156 additions & 30 deletions tooling/build_autotriage/build_autotriage.sh
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ temurinPlatforms+=("aix-ppc64"); platformStart+=(8); platformEnd+=(9
temurinPlatforms+=("alpine-linux-aarch64"); platformStart+=(21); platformEnd+=(99)
temurinPlatforms+=("alpine-linux-x64"); platformStart+=(8); platformEnd+=(99)
temurinPlatforms+=("linux-aarch64"); platformStart+=(8); platformEnd+=(99)
temurinPlatforms+=("linux-arm"); platformStart+=(8); platformEnd+=(99)
temurinPlatforms+=("linux-arm"); platformStart+=(8); platformEnd+=(20)
temurinPlatforms+=("linux-ppc64le"); platformStart+=(8); platformEnd+=(99)
temurinPlatforms+=("linux-s390x"); platformStart+=(11); platformEnd+=(99)
temurinPlatforms+=("linux-x64"); platformStart+=(8); platformEnd+=(99)
Expand Down Expand Up @@ -101,6 +101,57 @@ argumentParser() {
done
}

# Takes a TRSS pipeline ID and jdk version, and determines if the pipeline was started by a user.
# Returns 0 (false), 1 (true), or 2 (if we couldn't get the information).
# Args: wasPipelineStartedByUser pipelineIDexample jdk[0-9]+u?
wasPipelineStartedByUser() {
pipelineID=$1
jenkinsJDK=$2
sampleBuildName="none"
sampleBuildNum="none"

# We fetch the list, but only need the first build's output.
tempListOfPipelineBuilds=$(wget -q -O - "https://trss.adoptium.net/api/getAllChildBuilds?parentId=${pipelineID}&buildNameRegex=^${jenkinsJDK}\-.*temurin$")

# Identify a single build.
for jsonEntry in $tempListOfPipelineBuilds
do
if [[ $jsonEntry =~ ^\"buildName\"\:.* ]]; then
sampleBuildName="${jsonEntry:13:-1}"
elif [[ $jsonEntry =~ .*\"buildNum\"\.* ]]; then
sampleBuildNum="${jsonEntry:11}"
fi
if [[ ! "${sampleBuildName}_${sampleBuildNum}" =~ none ]]; then
continue
fi
done

# Abort if we can't find a build in this pipeline.
if [[ "${sampleBuildName}_${sampleBuildNum}" =~ none ]]; then
return 2
fi

# Now we retrieve the build job output, limit to first dozen lines, and check if a user started it.
failedJob="https://ci.adoptium.net/job/build-scripts/job/jobs/job/${jenkinsJDK}/job/${sampleBuildName}/${sampleBuildNum}/consoleText"

wget -q -O - "${failedJob}" > ./jobOutput.txt

# Limit the scan to the first 20 lines. If the line we're looking for exists, it'll be at the very top. Time saving.
counter=0

while IFS= read -r jobOutputLine; do
counter=$((counter+1))
if [[ "$jobOutputLine" =~ Started.by.user ]]; then
return 1
fi
if [[ ${counter} -gt 20 ]]; then
return 0
fi
done < ./jobOutput.txt

return 0
}

# Iterates over the supplied JDK versions and identifies the latest timer-triggered build URLs for each version.
# This function then checks that we're building Eclipse Temurin on every platform we should be, and makes a list
# of all the failing builds.
Expand All @@ -114,7 +165,13 @@ identifyFailedBuildsInTimerPipelines() {
latestTimerPipelineRaw=$(wget -q -O - "https://trss.adoptium.net/api/getBuildHistory?buildName=openjdk${arrayOfAllJDKVersions[v]}-pipeline")
latestTimerPipelineRaw="${latestTimerPipelineRaw},HereIsTheEndOfAVeryLongFile"
latestTimerPipeline=""
latestJdk8Pipelines=("none" "none" "none")
latestTimerJenkinsJobID=""
pipelineStatus="unknown"
jdkJenkinsJobVersion="jdk${arrayOfAllJDKVersions[v]}${arrayOfUs[v]}"
if [[ ${arrayOfAllJDKVersions[v]} -eq ${headJDKVersion} ]]; then
jdkJenkinsJobVersion="jdk"
fi
oldIFS=$IFS
IFS=","

Expand All @@ -133,51 +190,121 @@ identifyFailedBuildsInTimerPipelines() {
latestTimerJenkinsJobID=${jsonEntry:11}
fi

if [[ ! $jsonEntry =~ .*user.* ]]; then
if [[ $jsonEntry =~ ^\"startBy\"\:\"timer\"[\}]?$ ]]; then
break
elif [[ $jsonEntry =~ ^\"startBy\"\:\".*build-scripts/weekly-openjdk.* ]]; then
break
elif [[ $jsonEntry =~ ^\"startBy\"\:\".*releaseTrigger_[0-9]+ea.* ]]; then
if [[ $jsonEntry =~ ^\"status\" ]]; then
pipelineStatus=${jsonEntry:10:-1}
fi

# Skip pipelines that are still running.
if [[ $jsonEntry =~ ^\"startBy\" ]]; then
if [[ ! $pipelineStatus == "Done" ]]; then
pipelineStatus="unknown"
continue
fi
fi

# Expecting 3 pipelines for jdk8, but only 1 for other versions.
if [[ ${arrayOfAllJDKVersions[v]} -eq 8 ]]; then
if [[ $jsonEntry =~ ^\"startBy\".*betaTrigger.8ea ]]; then
if [[ $jsonEntry =~ betaTrigger\_8ea.x64AlpineLinux && ${latestJdk8Pipelines[1]} == "none" ]]; then
if wasPipelineStartedByUser "$latestTimerPipeline" "${jdkJenkinsJobVersion}"; then
latestJdk8Pipelines[1]=$latestTimerPipeline
echo "Found Alpine Linux JDK8 pipeline here: https://ci.adoptium.net/job/build-scripts/job/openjdk8-pipeline/${latestTimerJenkinsJobID}/"
fi
elif [[ $jsonEntry =~ betaTrigger\_8ea\_arm32Linux && ${latestJdk8Pipelines[2]} == "none" ]]; then
if wasPipelineStartedByUser "$latestTimerPipeline" "${jdkJenkinsJobVersion}"; then
latestJdk8Pipelines[2]=$latestTimerPipeline
echo "Found Arm32 Linux JDK8 pipeline here: https://ci.adoptium.net/job/build-scripts/job/openjdk8-pipeline/${latestTimerJenkinsJobID}"
fi
elif [[ ${latestJdk8Pipelines[0]} == "none" ]]; then
if wasPipelineStartedByUser "$latestTimerPipeline" "${jdkJenkinsJobVersion}"; then
latestJdk8Pipelines[0]=$latestTimerPipeline
echo "Found core JDK8 pipeline here: https://ci.adoptium.net/job/build-scripts/job/openjdk8-pipeline/${latestTimerJenkinsJobID}"
fi
fi
if [[ ${latestJdk8Pipelines[0]} != "none" && ${latestJdk8Pipelines[1]} != "none" && ${latestJdk8Pipelines[2]} != "none" ]]; then
echo "Found all 3 pipelines for JDK8."
break
fi
fi
else
if [[ $jsonEntry =~ ^\"startBy\"\:\"timer ]]; then
if wasPipelineStartedByUser "$latestTimerPipeline" "${jdkJenkinsJobVersion}"; then
break
fi
elif [[ $jsonEntry =~ ^\"startBy\"\:\".*build-scripts/weekly-openjdk ]]; then
if wasPipelineStartedByUser "$latestTimerPipeline" "${jdkJenkinsJobVersion}"; then
break
fi
elif [[ $jsonEntry =~ ^\"startBy\"\:.*betaTrigger_${arrayOfAllJDKVersions[v]}ea ]]; then
if wasPipelineStartedByUser "$latestTimerPipeline" "${jdkJenkinsJobVersion}"; then
break
fi
fi
fi

if [[ $jsonEntry =~ ^HereIsTheEndOfAVeryLongFile$ ]]; then
errorLog "Could not find any timer/ea-tag triggered pipeline jobs for jdk${arrayOfAllJDKVersions[v]}${arrayOfUs[v]}. Skipping to the next jdk version."
if [[ ${arrayOfAllJDKVersions[v]} -eq 8 ]]; then
if [[ ${latestJdk8Pipelines[0]} != "none" || ${latestJdk8Pipelines[1]} != "none" || ${latestJdk8Pipelines[2]} != "none" ]]; then
errorLog "Could not find all three of the pipelines for jdk8. Will triage the pipelines we could find."
continue 1
fi
fi
errorLog "Could not find any non-user pipeline jobs for ${jdkJenkinsJobVersion}. Skipping to the next jdk version."
continue 2
fi
done

echo "Found TRSS pipeline id for jdk${arrayOfAllJDKVersions[v]}${arrayOfUs[v]} - ${latestTimerPipeline}"
echo "Whose URL is: https://ci.adoptium.net/job/build-scripts/job/openjdk${arrayOfAllJDKVersions[v]}-pipeline/${latestTimerJenkinsJobID}/"
if [[ ! ${arrayOfAllJDKVersions[v]} -eq 8 ]]; then
echo "Found TRSS pipeline id for ${jdkJenkinsJobVersion} - ${latestTimerPipeline}"
echo "Whose URL is: https://ci.adoptium.net/job/build-scripts/job/openjdk${arrayOfAllJDKVersions[v]}-pipeline/${latestTimerJenkinsJobID}/"
fi

# Now grab a full list of builds launched by this pipeline.
jdkJenkinsJobVersion="jdk${arrayOfAllJDKVersions[v]}${arrayOfUs[v]}"
if [[ ${arrayOfAllJDKVersions[v]} -eq headJDKVersion ]]; then
jdkJenkinsJobVersion="jdk"
listOfPipelineBuilds=""
if [[ ${arrayOfAllJDKVersions[v]} -eq 8 ]]; then
for jp in "${!latestJdk8Pipelines[@]}"
do
if [[ ${latestJdk8Pipelines[jp]} != "none" ]]; then
echo "wgetting https://trss.adoptium.net/api/getAllChildBuilds?parentId=${latestJdk8Pipelines[jp]}&buildNameRegex=^jdk8u\-.*temurin$"
listOfPipelineBuilds+=$(wget -q -O - "https://trss.adoptium.net/api/getAllChildBuilds?parentId=${latestJdk8Pipelines[jp]}&buildNameRegex=^jdk8u\-.*temurin$")
fi
done
else
echo "wgetting https://trss.adoptium.net/api/getAllChildBuilds?parentId=${latestTimerPipeline}&buildNameRegex=^${jdkJenkinsJobVersion}.*temurin$"
listOfPipelineBuilds=$(wget -q -O - "https://trss.adoptium.net/api/getAllChildBuilds?parentId=${latestTimerPipeline}&buildNameRegex=^${jdkJenkinsJobVersion}\-.*temurin$")
fi
echo "wgetting https://trss.adoptium.net/api/getAllChildBuilds?parentId=${latestTimerPipeline}&buildNameRegex=^jdk${arrayOfAllJDKVersions[v]}${arrayOfUs[v]}.*temurin$"
listOfPipelineBuilds=$(wget -q -O - "https://trss.adoptium.net/api/getAllChildBuilds?parentId=${latestTimerPipeline}&buildNameRegex=^${jdkJenkinsJobVersion}\-.*temurin$")

declare -a listOfBuildNames
declare -a listOfBuildNums
declare -a listOfBuildResults

shorterListOfBuilds=""

# Using this single-build tuple to ensure all of the build data lines up in the three arrays.
sbTuple=("none" "none" "none")

# Now we identify each build in the pipeline.
for jsonEntry in $listOfPipelineBuilds
do
if [[ $jsonEntry =~ ^\"buildName\"\:.* ]]; then
listOfBuildNames+=("${jsonEntry:13:-1}")
shorterListOfBuilds+="${jsonEntry},"
sbTuple[0]=${jsonEntry}
elif [[ $jsonEntry =~ .*\"buildNum\"\.* ]]; then
listOfBuildNums+=("${jsonEntry:11}")
sbTuple[1]=${jsonEntry}
elif [[ $jsonEntry =~ .*\"buildResult\".* ]]; then
listOfBuildResults+=("${jsonEntry:15:-1}")
continue
sbTuple[2]=${jsonEntry}
elif [[ $jsonEntry =~ \"_id\" ]]; then
sbTuple=("none" "none" "none")
fi
if [[ ! "${sbTuple[0]},${sbTuple[1]},${sbTuple[2]}" =~ none ]]; then
listOfBuildNames+=("${sbTuple[0]:13:-1}")
listOfBuildNums+=("${sbTuple[1]:11}")
listOfBuildResults+=("${sbTuple[2]:15:-1}")
shorterListOfBuilds+="${sbTuple[0]},"
sbTuple=("none" "none" "none")
fi
done

echo "That pipeline's builds have been identified. Now validating them."
echo "The builds for those pipelines have been identified. Now validating them."

IFS=$oldIFS

Expand Down Expand Up @@ -211,7 +338,7 @@ identifyFailedBuildsInTimerPipelines() {
done

if [[ ${triageThesePlatforms} = "" ]]; then
errorLog "Cannot find any valid build platforms launched by jdk ${arrayOfAllJDKVersions[v]}${arrayOfUs[v]} pipeline ${latestTimerJenkinsJobID}. Skipping to the next jdk version."
errorLog "Cannot find any valid build platforms launched by ${jdkJenkinsJobVersion} pipelines. Skipping to the next jdk version."
continue
fi
echo "Platforms validated. Identifying build numbers for these platforms: ${triageThesePlatforms:1:-1}"
Expand All @@ -221,13 +348,9 @@ identifyFailedBuildsInTimerPipelines() {
for b in "${!listOfBuildNames[@]}"
do
if [[ $triageThesePlatforms =~ .*,${listOfBuildNames[$b]},.* ]]; then
if [[ ! ${listOfBuildResults[$b]} =~ ^SUCCESS$ ]]; then
if [[ ! ${listOfBuildResults[$b]} =~ ^UNSTABLE$ ]]; then
jdkJenkinsJobVersion="jdk${arrayOfAllJDKVersions[v]}${arrayOfUs[v]}"
if [[ ${arrayOfAllJDKVersions[v]} -eq headJDKVersion ]]; then
jdkJenkinsJobVersion="jdk"
fi
failedJobLink="https://ci.adoptium.net/job/build-scripts/job/jobs/job/${jdkJenkinsJobVersion}/job/${listOfBuildNames[$b]}/${listOfBuildNums[$b]}/"
if [[ ! ${listOfBuildResults[b]} =~ ^SUCCESS$ ]]; then
if [[ ! ${listOfBuildResults[b]} =~ ^UNSTABLE$ ]]; then
failedJobLink="https://ci.adoptium.net/job/build-scripts/job/jobs/job/${jdkJenkinsJobVersion}/job/${listOfBuildNames[b]}/${listOfBuildNums[b]}/"
echo "Identified a failed build for triage: ${failedJobLink}"
arrayOfFailedJobs+=("${failedJobLink}")
fi
Expand All @@ -246,6 +369,7 @@ buildFailureTriager() {
# Iterate over the failures found and triage them against the pending array of regexes.
for failedJob in "${arrayOfFailedJobs[@]}"; do
wget -q -O - "${failedJob}/consoleText" > ./jobOutput.txt

# If the file size is beyond 50m bytes, then report script error and do not triage, for efficiency.
fileSize=$(wc -c < ./jobOutput.txt)
if [[ ${fileSize} -gt 52500000 ]]; then
Expand All @@ -254,6 +378,7 @@ buildFailureTriager() {
totalBuildFailures=$((totalBuildFailures+1))
continue
fi

while IFS= read -r jobOutputLine; do
for regexIndex in "${!arrayOfRegexes[@]}"; do
# When a regex matches, store the id of the regex we matched against, and also the line of output that matched the regex.
Expand All @@ -269,12 +394,13 @@ buildFailureTriager() {
fi
done
done < ./jobOutput.txt

# If we reach this line, then we have not matched any of the regexs
arrayOfRegexsForFailedJobs+=("Unmatched")
arrayOfErrorLinesForFailedJobs+=("No error found")
totalBuildFailures=$((totalBuildFailures+1))
done
echo "Triage has ended."
echo "Triage has ended."
}

# Stores everything we've found in a markdown-formatted file.
Expand Down

0 comments on commit 3347a4f

Please sign in to comment.