OPCT-226: cmd/report UX enhancements #76

mtulio · 2023-08-08T19:28:20Z

The opct report command improves the UX of reviewing the conformance results archive with OPCT by:

- creating an intuitive HTML report that lets users quickly explore issues and navigate to the logs for each test failure
- introducing several gates/SLOs/checks to be used as post-processors, giving better visibility into the results based on the existing knowledge base, CI data, or external systems
- providing a better CLI UI for exploring results

Changes (overview)

Improvements

Documentation updates for review guides
report: the counters now display their percentage of the total, giving a quick idea of the failure rate. In general, we expect fewer than 1% of failures in a regular installation for OpenShift Conformance.

==> Result Summary by test suite:
┌───────────────────────────────────────────┐
│ 05-openshift-cluster-upgrade: ✅          │
├───────────────────────────┬───────────────┤
│ Total tests               │ 1             │
│ Passed                    │ 0             │
│ Failed                    │ 0             │
│ Timeout                   │ 0             │
│ Skipped                   │ 1             │
│ Result Job                │ passed        │
└───────────────────────────┴───────────────┘
┌───────────────────────────────────────────┐
│ 10-openshift-kube-conformance: ✅         │
├───────────────────────────┬───────────────┤
│ Total tests               │ 399           │
│ Passed                    │ 399           │
│ Failed                    │ 0             │
│ Timeout                   │ 0             │
│ Skipped                   │ 0             │
│ Filter Failed Suite       │ 0 (0.00%)     │
│ Filter Failed KF          │ 0 (0.00%)     │
│ Filter Replay             │ 0 (0.00%)     │
│ Filter Failed Baseline    │ 0 (0.00%)     │
│ Filter Failed Priority    │ 0 (0.00%)     │
│ Filter Failed API         │ 0 (0.00%)     │
│ Failures (Priotity)       │ 0 (0.00%)     │
│ Result - Job              │ passed        │
│ Result - Processed        │ passed        │
└───────────────────────────┴───────────────┘
┌───────────────────────────────────────────┐
│ 20-openshift-conformance-validated: ❌    │
├───────────────────────────┬───────────────┤
│ Total tests               │ 3783          │
│ Passed                    │ 1574          │
│ Failed                    │ 16            │
│ Timeout                   │ 0             │
│ Skipped                   │ 2193          │
│ Filter Failed Suite       │ 14 (0.37%)    │
│ Filter Failed KF          │ 14 (0.37%)    │
│ Filter Replay             │ 13 (0.34%)    │
│ Filter Failed Baseline    │ 13 (0.34%)    │
│ Filter Failed Priority    │ 13 (0.34%)    │
│ Filter Failed API         │ 1 (0.03%)     │
│ Failures (Priotity)       │ 1 (0.03%)     │
│ Result - Job              │ failed        │
│ Result - Processed        │ failed        │
└───────────────────────────┴───────────────┘
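The "N (x.xx%)" cells above can be reproduced with a small formatter. This is a minimal sketch only: `percStr` is a hypothetical name and the zero-total handling is an assumption, not necessarily what the report code does.

```go
package main

import "fmt"

// percStr is a hypothetical helper (not taken from the OPCT code base).
// It renders a counter followed by its share of the total, e.g. "14 (0.37%)".
func percStr(count, total int64) string {
	if total == 0 {
		// Assumption: an empty suite is rendered as 0.00% instead of dividing by zero.
		return fmt.Sprintf("%d (0.00%%)", count)
	}
	return fmt.Sprintf("%d (%.2f%%)", count, float64(count)/float64(total)*100)
}

func main() {
	// Values taken from the 20-openshift-conformance-validated table above.
	fmt.Println(percStr(14, 3783))
	fmt.Println(percStr(1, 3783))
}
```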

report: a headline with grouped (by test tags) occurrences is displayed before each failed list for both conformance plugins

 => 10-openshift-kube-conformance: (36 failures, 28 flakes)
 --> Failed tests to Review (without flakes) - Immediate action:
[total=8] [sig-apps=2 (25.00%)] [sig-cli=2 (25.00%)] [sig-node=2 (25.00%)] [sig-api-machinery=1 (12.50%)] 
[sig-arch=1 (12.50%)]
[...]
--> Failed flake tests - Statistic from OpenShift CI
[total=28] [sig-api-machinery=12 (42.86%)] [sig-node=4 (14.29%)] [sig-network-edge=4 (14.29%)] [sig-trt=3 (10.71%)] 
[sig-architecture=3 (10.71%)] [bz-OLM=1 (3.57%)] [sig-arch=1 (3.57%)]
[...]
 => 20-openshift-conformance-validated: (102 failures, 43 flakes)

 --> Failed tests to Review (without flakes) - Immediate action:
[total=59] [sig-builds=15 (25.42%)] [sig-apps=10 (16.95%)] [sig-cli=7 (11.86%)] [sig-imageregistry=7 (11.86%)] 
[sig-auth=6 (10.17%)] [sig-network=6 (10.17%)] [sig-arch=2 (3.39%)] [sig-devex=1 (1.69%)] 
[sig-instrumentation=1 (1.69%)] [sig-api-machinery=1 (1.69%)] [sig-scheduling=1 (1.69%)] [sig-node=1 (1.69%)] 
[bz-Unknown=1 (1.69%)]
[...]
 --> Failed flake tests - Statistic from OpenShift CI
[total=43] [sig-api-machinery=12 (27.91%)] [sig-node=6 (13.95%)] [sig-arch=5 (11.63%)] [sig-network-edge=4 (9.30%)] 
[sig-trt=3 (6.98%)] [sig-instrumentation=3 (6.98%)] [sig-network=2 (4.65%)] [sig-architecture=2 (4.65%)] 
[sig-autoscaling=1 (2.33%)] [bz-OLM=1 (2.33%)] [bz-Routing=1 (2.33%)] [bz-Unknown=1 (2.33%)] 
[bz-kube-apiserver=1 (2.33%)] [bz-DNS=1 (2.33%)]
[...]
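A headline like the ones above could be built by counting the leading bracketed tag of each failed test name. The sketch below is illustrative: `rankByTag`, the "untagged" fallback, and the alphabetical tie-break are assumptions, and the real report may order ties differently.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// leadingTag captures the first "[tag]" at the start of a test name.
var leadingTag = regexp.MustCompile(`^\[([^\]]+)\]`)

// rankByTag is a hypothetical helper: it groups test names by their first tag
// and renders the groups sorted by occurrence count, highest first.
func rankByTag(tests []string) string {
	counts := map[string]int{}
	for _, name := range tests {
		tag := "untagged" // assumption: fallback bucket for names without a tag
		if m := leadingTag.FindStringSubmatch(name); m != nil {
			tag = m[1]
		}
		counts[tag]++
	}
	tags := make([]string, 0, len(counts))
	for tag := range counts {
		tags = append(tags, tag)
	}
	sort.Slice(tags, func(i, j int) bool {
		if counts[tags[i]] != counts[tags[j]] {
			return counts[tags[i]] > counts[tags[j]]
		}
		return tags[i] < tags[j] // assumption: ties are broken alphabetically
	})
	parts := []string{fmt.Sprintf("[total=%d]", len(tests))}
	for _, tag := range tags {
		perc := float64(counts[tag]) / float64(len(tests)) * 100
		parts = append(parts, fmt.Sprintf("[%s=%d (%.2f%%)]", tag, counts[tag], perc))
	}
	return strings.Join(parts, " ")
}

func main() {
	failed := []string{
		"[sig-apps] test a",
		"[sig-cli] test b",
		"[sig-apps] test c",
		"[sig-node] test d",
	}
	fmt.Println(rankByTag(failed))
}
```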

The final binary [pass/fail] result message (after filters) has been removed from the openshift conformance plugin. This field is temporarily removed to prevent mistakes when issues happen in the filter pipeline, and to keep the focus on the failures rather than on the binary value, considering that the goal is to bring the failed tests to zero.
run --watch --watch-interval: a custom watch interval can be set for the status output to decrease the number of log lines in CI
report --diff: created as an alias of --baseline, setting it as deprecated
report now shows the "Checks" section, which applies rules to help the reviewer decide where to focus and to prevent submitting results to the partner support case prematurely.

Error counters

report/Plugins: search for failures by pattern in the plugin logs and aggregate it as "Suite Errors"
report/Must-gather: search for failures by pattern in the pod logs (must-gather), aggregating it as "Workload Errors"

HTML report

report: extracts important information, processes it, and saves it to a local directory to be served by a local web server, allowing one to quickly navigate to the failures for each test using a browser.
report: rank by error count
report HTML: scrapes the test documentation for Kubernetes Conformance and shows the link for each item when it is available in the Kubernetes suites
report menu "Workload errors": has been added showing information about the pod logs counters
report menu "Suite errors": has been added showing information about the pod logs counters
report menu "Checks": has been added showing information about the result checklist
report tab "CAMGI": redirects to CAMGI static HTML page extracted from must-gather, when it is present, otherwise shows how to use CAMGI when it is not processed by the plugin
report tab "Filter": redirects to a static HTML page with all tests allowing the user to explore the test details
report tab "Events": redirects to a static HTML page with events created by Must-gather (extracted in the runtime)

Many other features in the WebUI.

Plugin Runtime

status: replace the message "waiting for post-processor..." with "complete" when the pod is finished.

# from
Sat, 15 Jul 2023 00:39:31 -03> Global Status: running
JOB_NAME                           | STATUS     | RESULTS    | PROGRESS                  | MESSAGE                                           
05-openshift-cluster-upgrade       | complete   |            | 0/0 (0 failures)          | waiting for post-processor...                     
10-openshift-kube-conformance      | complete   |            | 5/377 (0 failures)        | waiting for post-processor...                     
20-openshift-conformance-validated | running    |            | 0/3684 (0 failures)       | status=waiting-for=10-openshift-kube-conformance=(0/-372/0)=[3/1080]
99-openshift-artifacts-collector   | running    |            | 0/0 (0 failures)          | status=blocked-by=20-openshift-conformance-validated=(0/-3684/0)=[0/1080]

# to
Sat, 15 Jul 2023 01:17:58 -03> Global Status: running
JOB_NAME                           | STATUS     | RESULTS    | PROGRESS                  | MESSAGE                                           
05-openshift-cluster-upgrade       | complete   |            | 0/0 (0 failures)          | complete                                          
10-openshift-kube-conformance      | complete   |            | 5/377 (0 failures)        | complete                                          
20-openshift-conformance-validated | running    |            | 5/3684 (0 failures)       | status=running=T/C/P/F/S=3684/5/3/0/2             
99-openshift-artifacts-collector   | running    |            | 0/0 (0 failures)          | status=waiting-for=20-openshift-conformance-validated=(0/-3679/0)=[8/1080]

Supporting new openshift-tests plugin re-arch / refact: the main plugin/step orchestrating the conformance workflow (aka openshift-tests-plugin) has been refactored to Golang, especially targeting:
- A) unblock complex/parallel operations in the plugin runtime;
- B) delegate conformance executions to the tests image as a sidecar container, preventing plugin development when new dependencies are required by openshift-tests
- C) [drastically] decrease the amount of failures impacting the test results related to the OPCT test environment
- D) allow JUnit test processor before submitting to aggregator server.
Supporting plugin Replay

Bug fixes

report: the flake filter now queries the correct OCP version on the Sippy API
report: the Flake Filter report prevents duplicated tests in the immediate action list
report: the Flake Filter report does not show items reporting less than 5% of flake count in OCP CI (by Sippy)

Flakes	Perc		 TestName
288	23.782%		[bz-DNS][invariant] alert/KubePodNotReady should not be at or above pending in ns/openshift-dns
967	79.851%		[bz-OLM][invariant] alert/KubePodNotReady should not be at or above pending in ns/openshift-marketplace
[...]

Filter pipeline is not breaking when the suite list is empty (collected by artifact collector plugin). Example:

 Total tests by conformance suites:
 - kubernetes/conformance: 0 
 - openshift/conformance: 0

Done checklist

create documentation for report
create documentation for the report checklist section
review tests
review feature description
Create a dedicated Jira Epic/cards for known issues

Documentation checklist

Review Rules
OPCT-226: Introduce knowledge document for OPCT Report Rules #77
OPCT-226: Added check rules for etcd #80
Command documentation report: OPCT-257: docs review documents for v0.5 report/review feature #113
Command documentation adm baseline *: OPCT-257: docs review documents for v0.5 report/review feature #113
Guide to 'Deep Dive' in the report options: OPCT-257: docs review documents for v0.5 report/review feature #113

openshift-ci · 2023-08-08T19:28:26Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from mtulio. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mtulio · 2023-08-08T19:57:49Z

internal/pkg/summary/consolidated.go

+		err := ps.Documentation.Load()
+		if err != nil {
+			return err
+		}


If the load fails, we must log the error but not return it and fail the execution. This step is not required by the execution, so having tests without linked docs due to missing external connectivity must be acceptable.
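A best-effort variant of this call could look like the sketch below. The `docLoader` interface, `offlineDocs`, and `loadDocsBestEffort` are hypothetical names for illustration, and the standard `log` package stands in for whatever logger the project uses.

```go
package main

import (
	"errors"
	"log"
)

// docLoader is a hypothetical stand-in for the test documentation loader.
type docLoader interface {
	Load() error
	BuildIndex() error
}

// offlineDocs simulates a loader running without external connectivity.
type offlineDocs struct{}

func (offlineDocs) Load() error       { return errors.New("no external connectivity") }
func (offlineDocs) BuildIndex() error { return nil }

// loadDocsBestEffort logs failures instead of returning them, so the caller
// keeps processing. It reports whether the documentation index is usable.
func loadDocsBestEffort(d docLoader) bool {
	if err := d.Load(); err != nil {
		log.Printf("WARN: unable to load test documentation, continuing without doc links: %v", err)
		return false
	}
	if err := d.BuildIndex(); err != nil {
		log.Printf("WARN: unable to index test documentation, continuing without doc links: %v", err)
		return false
	}
	return true
}

func main() {
	if !loadDocsBestEffort(offlineDocs{}) {
		log.Println("report will be generated without documentation links")
	}
}
```

Returning a boolean lets the caller skip filling the doc fields of each test, which avoids showing partial data in the HTML interface.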

mtulio · 2023-08-08T19:59:01Z

internal/pkg/summary/consolidated.go

+		}
+		err = ps.Documentation.BuildIndex()
+		if err != nil {
+			return err


Same for this. What kind of errors could we have here? Is it acceptable to keep running the processing pipeline without this feature? Are we cleaning the test docs fields to prevent odd information in the test attributes when processing the HTML interface?

mtulio · 2023-08-08T20:03:16Z

internal/pkg/summary/mustgather.go

+					}
+				}
+			*/
+			break


We don't want to create a directory for each item in the tarball; we'll choose what to filter and where to save in the next case item.

Suggestion: move that case item to the bottom, keeping the example of creating nested directories for directory items in the tarball and continuing the loop in the case, or just remove the case item. :)

mtulio · 2023-08-08T20:04:41Z

internal/pkg/summary/mustgather.go

+		switch {
+		// no more files
+		case err == io.EOF:
+			return nil


Maybe we need to exit the loop here if we want to return to the regular function execution.
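In Go, a bare `break` inside a `switch` only leaves the switch, so leaving the surrounding `for` needs a labeled break. The sketch below shows the pattern on an in-memory tarball; `buildTar` and `countRegularFiles` are hypothetical helpers, not the must-gather extractor itself.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// buildTar creates a small in-memory tarball with one regular file per name.
func buildTar(names ...string) (*bytes.Buffer, error) {
	buf := &bytes.Buffer{}
	tw := tar.NewWriter(buf)
	for _, name := range names {
		body := []byte("data")
		hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0o600, Size: int64(len(body))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(body); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf, nil
}

// countRegularFiles walks the tarball and keeps executing after the loop ends.
func countRegularFiles(r io.Reader) (int, error) {
	tr := tar.NewReader(r)
	count := 0
entries:
	for {
		hdr, err := tr.Next()
		switch {
		case err == io.EOF:
			// No more files: leave the for loop, not only the switch.
			break entries
		case err != nil:
			return count, err
		case hdr.Typeflag == tar.TypeDir:
			// Directories are skipped; move on to the next entry.
			continue
		default:
			count++
		}
	}
	// Post-loop work (for example, waiting for processing routines) would go here.
	return count, nil
}

func main() {
	buf, err := buildTar("a.log", "b.log")
	if err != nil {
		panic(err)
	}
	n, err := countRegularFiles(buf)
	if err != nil {
		panic(err)
	}
	fmt.Println("regular files:", n)
}
```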

mtulio · 2023-08-08T20:05:32Z

internal/pkg/summary/mustgather.go

+	}
+	log.Debug("Waiting for processing routines")
+	waiterProcNS.Wait()
+	return nil


TODO check what to do when successfully finished the processing

mtulio · 2023-08-08T20:07:25Z

internal/pkg/summary/plugin.go

+		for kerr, errName := range test.ErrorCounters {
+			if _, ok := ps.ErrorCounters[kerr]; !ok {
+				ps.ErrorCounters[kerr] = errName
+			} else {
+				ps.ErrorCounters[kerr] += errName
+			}


TODO: check if it is possible to move this to (or use) shared functions in error.go
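A shared helper for this merge could look like the sketch below. It assumes the counters are a map from pattern name to integer count; the `ErrorCounter` type and `mergeErrorCounters` name are hypothetical. Because a missing map key reads as zero in Go, the explicit existence check is not needed.

```go
package main

import "fmt"

// ErrorCounter is an assumed shape for the counters: pattern name -> occurrences.
type ErrorCounter map[string]int

// mergeErrorCounters adds every counter from src into dst and returns dst.
// A nil dst is replaced by a new map so callers can start from a zero value.
func mergeErrorCounters(dst, src ErrorCounter) ErrorCounter {
	if dst == nil {
		dst = ErrorCounter{}
	}
	for name, count := range src {
		dst[name] += count // missing keys start at zero
	}
	return dst
}

func main() {
	var total ErrorCounter
	total = mergeErrorCounters(total, ErrorCounter{"error": 2})
	total = mergeErrorCounters(total, ErrorCounter{"error": 3, "panic": 1})
	fmt.Println(total["error"], total["panic"])
}
```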

mtulio · 2023-08-23T19:31:24Z

Makefile

@@ -4,55 +4,68 @@ export GO111MODULE=on
 # Disable CGO so that we always generate static binaries:


TODO move the changes in Makefile to a dedicated PR targeting the "Project rename"

cmd/opct/root.go

mtulio · 2023-08-23T19:35:06Z

data/templates/report/README.md

@@ -0,0 +1,14 @@
+# Report HTML app


mtulio · 2023-08-23T19:36:14Z

data/templates/report/filter.html

@@ -0,0 +1,447 @@
+<!-- README for template delimiter: This file changed the template delimiter for Golang to '[ [' and '] ]',


Is it possible to move this file to a menu in the default page?

mtulio · 2023-08-23T19:37:21Z

data/templates/report/report.html

+  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-eOJMYsd53ii+scO/bJGFsiCZc+5NDVN2yr8+0RDqr0Ql0h+rP48ckxlpbzKgwra6" crossorigin="anonymous">
+  <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.0-beta3/dist/js/bootstrap.bundle.min.js" integrity="sha384-JEW9xMcG8R+pH31jmWH6WWP0WintQrMb4s7ZOdauHnUtxwoG2vI5DkLtS3qm9Ekf" crossorigin="anonymous"></script>
+  <script src="https://cdn.jsdelivr.net/npm/vue@2/dist/vue.js"></script>
+  <script src="https://unpkg.com/axios/dist/axios.min.js"></script>
+
+  <!-- Load required Bootstrap and BootstrapVue CSS -->
+  <link type="text/css" rel="stylesheet" href="https://unpkg.com/bootstrap/dist/css/bootstrap.min.css" />
+  <link type="text/css" rel="stylesheet" href="https://unpkg.com/bootstrap-vue@latest/dist/bootstrap-vue.min.css" />
+
+  <!-- Load polyfills to support older browsers -->
+  <script src="https://polyfill.io/v3/polyfill.min.js?features=es2015%2CIntersectionObserver" crossorigin="anonymous"></script>
+
+  <!-- Load Vue followed by BootstrapVue -->
+  <script src="https://unpkg.com/vue@latest/dist/vue.min.js"></script>
+  <script src="https://unpkg.com/bootstrap-vue@latest/dist/bootstrap-vue.min.js"></script>
+
+  <!-- Load the following for BootstrapVueIcons support -->
+  <script src="https://unpkg.com/bootstrap-vue@latest/dist/bootstrap-vue-icons.min.js"></script>


Are there duplicated scripts?

mtulio · 2023-08-23T20:33:13Z

internal/pkg/summary/testdoc.go

+func (d *TestDocumentation) Load() error {
+	req, err := http.NewRequest(http.MethodGet, *d.SourceBaseURL, nil)
+	if err != nil {
+		// fmt.Printf("client: could not create request: %s\n", err)


Suggested change

// fmt.Printf("client: could not create request: %s\n", err)

mtulio · 2023-08-23T20:34:03Z

internal/pkg/summary/testdoc.go

+	return nil
+}
+
+func (d *TestDocumentation) BuildIndex() error {


TODO add test for this function

mtulio · 2023-08-23T20:35:44Z

pkg/report/cmd.go

 	}

 	if input.saveTo != "" {
+		// TODO: ConsolidatedSummary should be migrated to SaveResults


Suggested change

// TODO: ConsolidatedSummary should be migrated to SaveResults

mtulio · 2023-08-23T20:37:40Z

pkg/report/cmd.go

+	if p.Name == summary.PluginNameOpenShiftUpgrade || p.Name == summary.PluginNameArtifactsCollector {
+		return
+	}
+	// fmt.Printf("   - Failed (without filters) : %s\n", calcPercStr(int64(len(p.FailedList)), stat.Total))


Suggested change

// fmt.Printf(" - Failed (without filters) : %s\n", calcPercStr(int64(len(p.FailedList)), stat.Total))

mtulio · 2023-08-23T20:39:54Z

pkg/status/status.go

@@ -20,24 +20,40 @@ import (
 )

 const (
-	StatusInterval   = time.Second * 10
+	DefaultStatusIntervalSeconds = 10
+	// StatusInterval               = time.Second * 10


Move this change to an isolated PR?

Suggested change

// StatusInterval = time.Second * 10
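For context, a default expressed in seconds can be turned into a `time.Duration` at the point of use. This is a sketch under assumptions: `watchInterval` is a hypothetical helper, and falling back to the default for non-positive values is an illustrative choice, not confirmed behavior.

```go
package main

import (
	"fmt"
	"time"
)

// DefaultStatusIntervalSeconds mirrors the constant introduced in the diff above.
const DefaultStatusIntervalSeconds = 10

// watchInterval is a hypothetical helper converting a flag value in seconds
// into a duration, using the default when the value is not positive.
func watchInterval(seconds int) time.Duration {
	if seconds <= 0 {
		seconds = DefaultStatusIntervalSeconds
	}
	return time.Duration(seconds) * time.Second
}

func main() {
	fmt.Println(watchInterval(0))
	fmt.Println(watchInterval(30))
}
```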

This change introduces the document used in the report enhancement in progress at #76.

The Rules are a set of tests with fixed acceptance criteria that will evaluate the data collected when running the `report` command. To allow flexibility and increase the UX, we are only keeping in the code the "Rule ID", linking it to a "Knowledge Base" to explore why the rule is failing, and the recommended next steps.

The version of #76 is tested internally by developers but the document can be available as a dev preview when the report frontend is mounting the hyperlinks for the Rules.

The demo is available here (direct 4:40) [Red Hat only]: https://drive.google.com/file/d/1v4EIO1mXesDKKy0meZeiVcZ-kSjf7o_E/view?usp=sharing

The rules are implemented here: https://github.com/redhat-openshift-ecosystem/provider-certification-tool/blob/87d4f19f7454ed388d546b0737f7462a6235189f/internal/pkg/summary/checks.go

This is how it will be referenced in CLI: `./opct-dev report results-failed-registry.tar.gz --save-to /tmp/results/ --loglevel debug`

![Screenshot from 2023-08-23 16-12-42](https://github.com/redhat-openshift-ecosystem/provider-certification-tool/assets/3216894/d9f19624-42e7-40ab-bf1d-f6f7a695c416)

This is how it will be linked in the HTML report:

![Screenshot from 2023-08-23 16-15-36](https://github.com/redhat-openshift-ecosystem/provider-certification-tool/assets/3216894/4237b6af-f43f-49cf-a73f-63cc42b182b8)

![Screenshot from 2023-08-23 16-17-17](https://github.com/redhat-openshift-ecosystem/provider-certification-tool/assets/3216894/079209dd-d57a-4a87-8401-d3b7f343f2a1)

This change introduces the [camgi](https://github.com/elmiko/camgi.rs) report to the artifact collector plugin - results artifacts:

~~~
$ tree plugins/99-openshift-artifacts-collector/
plugins/99-openshift-artifacts-collector/
├── definition.json
├── results
│   └── global
(...)
│       ├── artifacts_must-gather_camgi.html
(...)
└── sonobuoy_results.yaml
~~~

CAMGI is a tool broadly used in OCP CI to help to troubleshoot many objects in must-gather with a friendly interface.

The binary is pretty small, increasing 700 MiB to the compressed image in the Quay.io, and ~1MiB local.

~~~
$ podman images |grep tools
quay.io/ocp-cert/tools  v0.1.0                          66dad94e6c9f  56 minutes ago  253 MB
quay.io/ocp-cert/tools  v0.0.0-f38min-oc4133-s05612-v0  58b81800eeca  4 weeks ago     252 MB
~~~

The report camgi.html could be bigger, depending on the size and logs in must-gather. The estimated size would be ~30-40MiB.

This image will also produce a human-friendly version (SemVer) for the tools image (base image), preventing adding features to the label.

This PR also can help in the report improvement introduced in redhat-openshift-ecosystem/opct#76 .

Status:
- [x] embed camgi binary to the base image
- [x] build a dev image and test it
- [x] Merge #43 (and rebase here)
- [x] change the tools image from this PR to v0.2.0 (leave v0.1.0 to #43)
- [ ] update OPCT documentation

This change introduces several improvements in the UX while reviewing the report by:

- creating an intuitive HTML report allowing users to quickly see issues and navigate to the logs for each test failure
- introducing several gates/SLO/checks to be used as post-processors to get better visibility in the results, based on the existing knowledge base/CI data or external systems

See the PR with details of improvements: redhat-openshift-ecosystem#76

----

fix sippy query in the flake filter create a rank of tags, adding percentage of counters create html report saving data to json improve report html add log links and rank by error count update camgi tab add waiting interval support status: improve message field with plugin phase report: generating filter.html file with failures table renameing vfs path for html templates report html - update rank and output as json using report as data source of CLI report; apply docs add timers to the runtime report update report collecting more env information report: update html templates to improve filters adding suite errors menu Update suite error menu using native js create a rank of error counters adding an option to extract errors from must-gather extracting must-gather event filters adding a tab to the report adding the structure of report checks introducing dynamic checks moving report to summary Adding checks with acceptance values from baseline adding alert pop to Checks menu parallelism processing must-gather, loweing 3 times proc time adding support to all plugins in report http file server to serve report review: documenting and linking check rules reorg mkdocs/todo report: embeding etcd error parser to json report report/must-gather: consolidating etd parser report: supporting etcd, network check and embed data remove codegen script / not used distributing report to packages parsing meta/run.log extracting runtime information introducing new parsers: opct and meta config renaming sippy to its package renaming packages and
cleanup review report html and threasholds review timers for plugins add etcd checks ; cleanup review thresholds bumping plugins image review documentation and doc strings update check platform type fix tests in tags fix dev docs dev/report: adding initial flow of report Fix Containerfile for CI when releasing Fix windows build update makefile to fix windows build isolating status watch interval from redhat-openshift-ecosystem#87 reverting unrelated docs changes improving unit tests adding metaconfig unit tests and test data adding tests for meta-run.log parser adding parsers and tests for config maps fixes in the opct metrics report gen metrics report adding sample document to generate batch reports Review checks to support attributes review report to improve checks increasing the error pattern information for etcd taking notes for report dev doc adding plugin log extraction and link to the plugin name collecting node info extracting install-config review report html and metrics with adm parse cmd adding charts with ploty remove comments when plotting fix log save opct adm parse-etcd-logs to quickly access parsed logs (#4) increasing documentation coverage bump to use new quay.io org create target directory before extractors rename title of checks on report cli fixes in the report check prevent empty data fix/report/cli: show other plugin than k8s report/check: 012 - check plugin failure fixes after rebase rename namespace Supporting PDB to opct server add new gen openshift-tests plugin based in go allow custom openshift-tests image to devel w/ kind working version with entrypoint for plugin plugin manifests fixes to support remote entrypoint to tests image update plugin manifests to new plugin version refact CLI UI tables to enhance results intro yamllint and fixing yaml assets/manifests plugins working version the default flow enhance pre-run checks for missing config create local log file for all levels updatem plugins according to the latest goplugin version 
add collector image to the manifest template review packages / moving to report package using ETL-like cleanup must-gather/extractor packages refact/must-gather: isolate and tune leaky bucket processor cmd/run: fix flag to use full image path to allow CI presubmits review the result filter pipeline OPCT-292: cmd/publish - add experimental publish used in CI Add a experimental command to allow Prow CI to publish results in the opct-storage (S3) without needing to install dependencies in the test step, preventing failures. OPCT-292: cmd/publish - add metadata option when publishing artifact doc: add review/support documentations feat: upload baseline artifact on s3 report/feat: opct adm baseline (get|list|indexer|publish) review baseline reitrieving summary/baseline from service/API feat: add replay to filter pipeline feat: introduce adm setup-ndoe to helper in tests review plugin filter order fix checks to good tresholds

mtulio · 2024-08-08T06:29:32Z

This version is accepted. Running it to parse archives recently rehearsed on CI [1][2] (which include plugin upgrades), I am seeing the results below.

This PR, and the plugin refact one, are ready to move on.

cc @rvanderp3

=> Result Summary by test suite:
┌───────────────────────────────────────────┐
│ 05-openshift-cluster-upgrade: ✅          │
├───────────────────────────┬───────────────┤
│ Total tests               │ 1             │
│ Passed                    │ 0             │
│ Failed                    │ 0             │
│ Timeout                   │ 0             │
│ Skipped                   │ 1             │
│ Result Job                │ passed        │
└───────────────────────────┴───────────────┘
┌───────────────────────────────────────────┐
│ 10-openshift-kube-conformance: ✅         │
├───────────────────────────┬───────────────┤
│ Total tests               │ 378           │
│ Passed                    │ 377           │
│ Failed                    │ 1             │
│ Timeout                   │ 0             │
│ Skipped                   │ 0             │
│ Filter Failed Suite       │ 0 (0.00%)     │
│ Filter Failed KF          │ 0 (0.00%)     │
│ Filter Replay             │ 0 (0.00%)     │
│ Filter Failed Baseline    │ 0 (0.00%)     │
│ Filter Failed Priority    │ 0 (0.00%)     │
│ Filter Failed API         │ 0 (0.00%)     │
│ Failures (Priotity)       │ 0 (0.00%)     │
│ Result - Job              │ failed        │
│ Result - Processed        │ passed        │
└───────────────────────────┴───────────────┘
┌───────────────────────────────────────────┐
│ 20-openshift-conformance-validated: ✅    │
├───────────────────────────┬───────────────┤
│ Total tests               │ 3822          │
│ Passed                    │ 1437          │
│ Failed                    │ 15            │
│ Timeout                   │ 0             │
│ Skipped                   │ 2370          │
│ Filter Failed Suite       │ 14 (0.37%)    │
│ Filter Failed KF          │ 14 (0.37%)    │
│ Filter Replay             │ 9 (0.24%)     │
│ Filter Failed Baseline    │ 9 (0.24%)     │
│ Filter Failed Priority    │ 9 (0.24%)     │
│ Filter Failed API         │ 0 (0.00%)     │
│ Failures (Priotity)       │ 0 (0.00%)     │
│ Result - Job              │ failed        │
│ Result - Processed        │ passed        │
└───────────────────────────┴───────────────┘
┌───────────────────────────────────────────┐
│ 80-openshift-tests-replay: ⚠              │
├───────────────────────────┬───────────────┤
│ Total tests               │ 19            │
│ Passed                    │ 6             │
│ Failed                    │ 13            │
│ Timeout                   │ 0             │
│ Skipped                   │ 0             │
│ Filter Failed Suite       │ 13 (68.42%)   │
│ Filter Failed KF          │ 12 (63.16%)   │
│ Filter Replay             │ 0 (0.00%)     │
│ Filter Failed Baseline    │ 0 (0.00%)     │
│ Filter Failed Priority    │ 0 (0.00%)     │
│ Filter Failed API         │ 0 (0.00%)     │
│ Failures (Priotity)       │ 13 (68.42%)   │
│ Result - Job              │ failed        │
│ Result - Processed        │ failed        │
└───────────────────────────┴───────────────┘
┌───────────────────────────────────────────┐
│ 99-openshift-artifacts-collector: ✅      │
├───────────────────────────┬───────────────┤
│ Total tests               │ 16            │
│ Passed                    │ 16            │
│ Failed                    │ 0             │
│ Timeout                   │ 0             │
│ Skipped                   │ 0             │
│ Result Job                │ passed        │
└───────────────────────────┴───────────────┘


┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Validation checks / Results                                                                                                                                      │
├───────────┬───┬────────┬────────────────────────────────────────────────────────────────────────────────────────┬──────────────────────────────┬─────────────────┤
│     ID    │ # │ RESULT │                                       CHECK NAME                                       │            TARGET            │     CURRENT     │
├───────────┼───┼────────┼────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────┼─────────────────┤
│ OPCT-010  │ ⚠ │  warn  │ The cluster logs should generate fewer error reports in the logs                       │ W:<=30k,F:>100k              │ 39249           │
├───────────┼───┼────────┼────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────┼─────────────────┤
│ OPCT-020  │ ✔ │  pass  │ All nodes must be healthy                                                              │ 100%                         │ 100.000%        │
│ OPCT-021  │ ✔ │  pass  │ Pods Healthy must report higher than 98%                                               │ >=98%                        │ 99.000          │
│ OPCT-001  │ ✔ │  pass  │ Kubernetes Conformance [10-openshift-kube-conformance] must pass 100%                  │ Priority==0|Total!=Failed    │ Priority==0     │
│ OPCT-004  │ ✔ │  pass  │ OpenShift Conformance [20-openshift-conformance-validated]: Pass ratio must be >=98.5% │ Pass>=98.5%(Fail>1.5%)       │ Fail==0.39%(15) │
│ OPCT-005  │ ✔ │  pass  │ OpenShift Conformance Validation [20]: Filter Priority Requirement >= 99.5%            │ W<=0.50%,F>0.50%             │ Fail==0.24%(9)  │
│ OPCT-005B │ ✔ │  pass  │ OpenShift Conformance Validation [20]: Required to Pass After Filtering                │ Pass==100%(W<=0.50%,F>0.50%) │ Fail==0.00%(0)  │
│ OPCT-011  │ ✔ │  pass  │ The test suite should generate fewer error reports in the logs                         │ Pass<=150(W>150,F>300)       │ 56              │
│ OPCT-003  │ ✔ │  pass  │ Plugin Collector [99-openshift-artifacts-collector] must pass                          │ passed                       │ passed          │
│ OPCT-002  │ ✔ │  pass  │ Plugin Conformance Upgrade [05-openshift-cluster-upgrade] must pass                    │ passed                       │ passed          │
│ OPCT-010A │ ✔ │  pass  │ etcd logs: slow requests: average should be under 500ms                                │ <=500.00 ms                  │ 370.103         │
│ OPCT-010B │ ✔ │  pass  │ etcd logs: slow requests: maximum should be under 1000ms                               │ <=1000.00 ms                 │ 842.130         │
│ OPCT-022  │ ✔ │  pass  │ Detected one or more plugin(s) with potential invalid result                           │ passed                       │ passed          │
│ OPCT-023A │ ✔ │  pass  │ Sanity [10-openshift-kube-conformance]: potential missing tests in suite               │ F:<300                       │ Total==378      │
│ OPCT-023B │ ✔ │  pass  │ Sanity [20-openshift-conformance-validated]: potential missing tests in suite          │ F:<3000                      │ Total==3822     │
│ --        │ ✔ │  pass  │ Platform Type must be supported by OPCT                                                │ None|External|AWS|Azure      │ None            │
│ --        │ ✔ │  pass  │ Cluster Version Operator must be Available                                             │ True                         │ True            │
│ --        │ ✔ │  pass  │ Cluster condition Failing must be False                                                │ False                        │ False           │
│ --        │ ✔ │  pass  │ Cluster upgrade must not be Progressing                                                │ False                        │ False           │
│ --        │ ✔ │  pass  │ Cluster ReleaseAccepted must be True                                                   │ True                         │ True            │
│ --        │ ✔ │  pass  │ Infrastructure status must have Topology=HighlyAvailable                               │ HighlyAvailable              │ HighlyAvailable │
│ --        │ ✔ │  pass  │ Infrastructure status must have ControlPlaneTopology=HighlyAvailable                   │ HighlyAvailable              │ HighlyAvailable │
├───────────┼───┼────────┼────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────┼─────────────────┤
│ OPCT-030  │ ✔ │  skip  │ Node Topology: ControlPlaneTopology HighlyAvailable must use multi-zone                │ W:>1,P:>2                    │ Type==None      │
├───────────┼───┼────────┼────────────────────────────────────────────────────────────────────────────────────────┼──────────────────────────────┼─────────────────┤
│           │   │        │    TOTAL: 23, FAILED: 0 (0.00%), WARN: 1 (4.35%), PASS: 21 (91.30%), SKIP: 1 (4.35%)   │                              │                 │
└───────────┴───┴────────┴────────────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────┴─────────────────┘

[1] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_release/54718/rehearse-54718-periodic-ci-redhat-openshift-ecosystem-provider-certification-tool-main-4.15-platform-none-vsphere/1821048927051321344/artifacts/platform-none-vsphere/provider-certification-tool-results/artifacts/
[2] https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_release/54718/rehearse-54718-periodic-ci-redhat-openshift-ecosystem-provider-certification-tool-main-4.15-platform-none-vsphere-upgrade/1821193104133197824/artifacts/platform-none-vsphere-upgrade/provider-certification-tool-results/artifacts/

The report feature introduces several UX improvements for reviewing the report:
- creating an intuitive HTML report that lets users quickly see issues and navigate to the logs for each test failure
- introducing several gates/SLOs/checks to be used as post-processors, giving better visibility into the results based on the existing knowledge base, CI data, or external systems
- providing a better CLI UI for exploring results

See the PR for details of the improvements: redhat-openshift-ecosystem#76

mtulio · 2024-08-08T18:06:31Z

Moving forward with this feature. ✅

The CI tests are showing good results with the rehearsal jobs:

Github Actions workflow for entire pipeline (including release alpha): https://github.com/redhat-openshift-ecosystem/provider-certification-tool/actions/runs/10306207517
Prow CI rehearsal for Platform None vSphere (default conformance workflow): https://prow.ci.openshift.org/job-history/gs/test-platform-results/pr-logs/directory/rehearse-54718-periodic-ci-redhat-openshift-ecosystem-provider-certification-tool-main-4.15-platform-none-vsphere
Prow CI rehearsal for Platform None vSphere (conformance upgrade workflow): https://prow.ci.openshift.org/job-history/gs/test-platform-results/pr-logs/directory/rehearse-54718-periodic-ci-redhat-openshift-ecosystem-provider-certification-tool-main-4.15-platform-none-vsphere-upgrade

This feature has also been exercised by internal teams and partners since Aug 2023 with the v0.5.0-alpha* releases. 🚀

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 8, 2023

mtulio commented Aug 8, 2023

mtulio force-pushed the devel-report-html branch 2 times, most recently from b2e1648 to 4426012 Compare August 23, 2023 06:23

mtulio mentioned this pull request Aug 23, 2023

OPCT-226: Introduce knowledge document for OPCT Report Rules #77

Merged

mtulio commented Aug 23, 2023

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 24, 2023

mtulio mentioned this pull request Aug 25, 2023

OPCT-226: Added camgi report to artifact collector plugin redhat-openshift-ecosystem/provider-certification-plugins#44

Merged

5 tasks

mtulio changed the title ~~OPCT report: enhancements in the review process~~ OPCT-226: report enhancements in the review process Aug 25, 2023

mtulio mentioned this pull request Aug 25, 2023

OPCT-210: added stats on etcd-parser-attl tool redhat-openshift-ecosystem/provider-certification-plugins#42

Closed

mtulio force-pushed the devel-report-html branch from 710784d to a811e31 Compare August 28, 2023 04:02

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 28, 2023

mtulio force-pushed the devel-report-html branch from a811e31 to 0b9fb41 Compare August 28, 2023 04:17

mtulio force-pushed the devel-report-html branch from 92e8279 to 91ab22d Compare September 21, 2023 21:12

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 21, 2023

mtulio force-pushed the devel-report-html branch from 91ab22d to 7e10673 Compare September 21, 2023 21:31

openshift-merge-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 21, 2023

mtulio force-pushed the devel-report-html branch from 7e10673 to 69a455b Compare September 22, 2023 18:44

This was referenced Sep 22, 2023

OPCT-213: backport plugins to support OCP 4.14 on v0.4 #79

Merged

OPCT-226: Added check rules for etcd #80

Merged

mtulio force-pushed the devel-report-html branch from dc21563 to 413e5dd Compare September 25, 2023 17:18

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 8, 2024

mtulio force-pushed the devel-report-html branch from 1cc6a1b to 50fe32e Compare August 8, 2024 05:15

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 8, 2024

mtulio force-pushed the devel-report-html branch from 50fe32e to c5d9ae5 Compare August 8, 2024 05:29

mtulio force-pushed the devel-report-html branch from c5d9ae5 to df9de61 Compare August 8, 2024 05:41

mtulio force-pushed the devel-report-html branch from df9de61 to 3f7c678 Compare August 8, 2024 05:53

mtulio force-pushed the devel-report-html branch from 3f7c678 to be60edf Compare August 8, 2024 06:14

mtulio marked this pull request as ready for review August 8, 2024 06:29

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 8, 2024

openshift-ci bot requested review from faermanj and jcpowermac August 8, 2024 06:29

mtulio force-pushed the devel-report-html branch from be60edf to c02c85c Compare August 8, 2024 16:38

mtulio merged commit f321a27 into redhat-openshift-ecosystem:main Aug 8, 2024
13 of 14 checks passed

mtulio deleted the devel-report-html branch August 8, 2024 18:18
