Commit 9cc1afc

nuclearcat, Jeny Sadadia, nfraprado, Ricardo Cañuelo, and Helen Koike committed
Improve workflows (#2)
* src/scheduler: store error message when job fails with "submit_error"
  It is helpful for debugging to catch the error message when the scheduler fails to submit a job to the runtime. Store the error message in the `data.error_msg` field.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: Set minimum kernel version for DT kselftest to 6.7
  The test was introduced upstream in version 6.7, so there is no point in trying to run it on earlier versions.
  Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* configs/: Update volteer device
  Update volteer devices according to lab availability.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary templates: detailed output for active/inactive regressions
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new presets for active regressions
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: update CHANGELOG
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* data: chmod -R 777 ./data/output to avoid permission error
  Avoid errors like:
    PermissionError: [Errno 13] Permission denied: '/home/kernelci/data/output/stable-rc-boot.html'
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: move code to _get_logs
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: use ThreadPoolExecutor to fetch logs
  Fetching logs is the bottleneck of the script. Fetch them in parallel with ThreadPoolExecutor.
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix result presets
  stable-rc-build-failures and stable-rc-boot-failures weren't querying specifically for test failures.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: rework regression detection
  Take into account "active" and "inactive" regressions when creating them and when processing new passed or failed nodes.
  When a node passes, it checks if it "inactivates" an existing "active" regression. When a node fails, it checks if it needs to create a new regression or update an existing "active" one.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src/regression_tracker: link failed nodes to active regressions
  When a failed node generates a regression, or when it's a re-run of a run that generated a still active regression, link the node to the regression id.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for date ranges for creation and update
  New command line options to let the user specify date ranges for node creation and last update: --created-from, --created-to, --last-updated-from, --last-updated-to
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: support for date ranges for creation and last update
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: support for extra query parameters in cmdline
  New command line option: --query-params to specify a set of extra query parameters to complete or override preset parameters.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: html markup in some preset titles
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: update and move to docs folder
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: move parameter loading and processing to 'setup'
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: refactor and split into two classes (single, run)
  Split the ResultSummary class into a base class and two child classes: ResultSummarySingle and ResultSummaryLoop (only a stub at this point).
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: WIP initial implementation of the "loop" command
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: huge refactoring
  Implement "summary" (single-shot) and "monitor" (loop) modes based on preset parameters instead of on the command-line main command. Split the logic into multiple files, move all monitor-specific and summary-specific code to independent files, common code in a separate file.
  Full of kludges, I don't like how this is looking so far, might consider reimplementing it without any dependencies on pipeline code.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix markup and indentation
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: new generic templates for monitor mode
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: examples for "monitor" and "summary" modes
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: summary and monitor modes
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: fix generic regression report
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: summary: fix last_updated option handling
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: embed css stylesheet in html files
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] make regression active by default
  Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
  If the "result" field is ever made non-optional in the models we can probably remove this.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* regression_tracker: [trivial] set default empty node sequence
  Fixup for commit fcb29501663d78920bcd129bd57c36b9af624bc4
  If the "node_sequence" field is ever made non-optional in the models we can probably remove this.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: add cmdline option --output-dir
  Introduce a new command-line option: --output-dir, and rename the old --output to --output-file.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary changelog: command-line options change
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: jobs-chromeos: remove meaningless Tast tests
  Several Tast tests can only fail in the context of KernelCI:
    * `video.PlatformDecoding.v4l2_state*_vp9_0_svc` do not actually exist, causing the whole test job to fail
    * `platform.DLCService*` and `platform.Memd` rely on features only present in the downstream Chrom{e,ium}OS kernel (see b/247467814 and b/244479619 for those having access to Google's issue tracker)
    * `kernel.ConfigVerify.chromeos` relies on downstream-only config options such as `CONFIG_SECURITY_CHROMIUMOS` and other similar ones, and therefore can only fail when testing upstream kernels
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: scheduler-chromeos: don't execute non-working Tast tests
  Currently, HEVC-related tests are known to either fail or be skipped as ChromeOS doesn't yet handle hardware decoding of HEVC media. This is expected to be fixed at some point though, so we're keeping the job definitions and only remove the corresponding scheduler entries in order to reinstate those jobs when relevant.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: exclude Tast tests known to always fail
  Several decoder tests always fail on all platforms where they're executed, adding only noise to otherwise useful test results.
  Disable those for improving the quality of the results.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: chromeos: add special case for pre-6.7 qcom codec tests
  On Qualcomm-based ChromeBooks (`trogdor` being the only model in Collabora's lab), we noticed systematic failures of all `vp9_*_frm_resize` and `vp9_*_sub8x8_sf` tests when using a kernel up to 6.6. With 6.7 and above, all of those tests (except one) now pass. It therefore makes sense to exclude those on pre-6.7 kernels so we don't report known failures and get rid of some noise.
  This involves "duplicating" affected test jobs (although I did my best to minimize that) and setting rules so only the working variant is executed, based on the version of the kernel being tested.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* lava_callback: Compress the log files to save storage space
  As storage space in cloud and egress have high costs, better to compress potentially large files.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* tests: Add basic yaml validation
  Add yaml load to figure out earlier issues with yaml
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in platforms anchors
  The "stoneyridge" and "pineview" naming used in the Chromebook platform anchors refers to ChromiumOS specific config fragments, but doesn't necessarily match the actual platform of all the devices listed. Use more generic names to distinguish amd and intel Chromebooks.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: rename test job anchors that use chromeos specific configs
  Rename test job anchors that use chromeos specific kernel configurations to include the 'chromeos' infix.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: add baseline tests
  Enable the baseline tests on all the supported Chromebooks with their default kernel configuration.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop stoneyridge/pineview naming in job defs
  The "stoneyridge" and "pineview" naming used in some Chromebook job definitions refers to ChromiumOS specific config fragments, but doesn't necessarily match the actual platforms targeted by the jobs. Replace all occurrences with more generic intel/amd naming.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: drop chromeos infix from baseline jobs
  Keeping different job names for tests targeting different kernel configs might cause too much duplication. Drop the 'chromeos' infix from the job name for the tests using the chromeos config fragment. Users will be able to filter the results using the data.defconfig/data.config_full fields anyway.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: post-process results for summary and monitor modes
  Split the post-processing of nodes to a common function that can be used for both summary and monitor modes. Currently, post-processing involves only the collection of logs.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: update and fix presets and templates
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* doc/result-summary-CHANGELOG: update
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config/pipeline.yaml: enable 'BayLibre' lab
  Add lab configuration for BayLibre.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add `lab-baylibre` runtime
  Add runtime argument `lab-baylibre` to `scheduler-lava` container. This will enable the pipeline to run and submit jobs to BayLibre.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86-baylibre` job
  Add job configuration `baseline-x86-baylibre` for BayLibre. Add scheduler entry as well.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-armel-baylibre` job
  Add job configuration `baseline-armel-baylibre` for BayLibre. Add scheduler entry and platform config as well.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline: enable `android` tree and build configs
  Monitor linux `android` tree. Add build configs for `android-mainline` branch.
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add kbuild definitions for android-mainline
  Add kbuild jobs to compile the kernel for android-mainline branch
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/pipeline.yaml: add entries to schedule to build android-mainline
  Add entries to `scheduler:` section to run the builds for android-mainline.
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary: fix node filter in monitor mode
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* kernelci.toml: set `checkout` node timeout to `180 min`
  The current `60 min` timeout is not enough, as some `kbuild` jobs and their sub-tests take around 2 hrs to complete after being submitted to the runtime. Here is an example from staging.
  See the information for a `checkout` and its child nodes:

  | id                       | name                | created                    | updated                    | timeout                    |
  |--------------------------|---------------------|----------------------------|----------------------------|----------------------------|
  | 661c9d59b60b785eb9fc42b0 | checkout            | 2024-04-15T03:22:01.317000 | 2024-04-15T03:51:03.870000 | 2024-04-15T04:22:01.284000 |
  | 661c9d97b60b785eb9fc42b4 | kbuild-gcc-10-arm64 | 2024-04-15T03:23:03.399000 | 2024-04-15T03:50:15.031000 | 2024-04-15T09:23:03.399000 |
  | 661ca3f7b60b785eb9fc4ead | baseline-arm64      | 2024-04-15T03:50:15.304000 | 2024-04-15T05:09:45.247000 | 2024-04-15T09:50:15.304000 |

  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary: add email report capabilities for monitor mode
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: plain text single report templates
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: chromeos: add baseline-nfs tests
  Enable the baseline-nfs tests on all the supported Chromebooks, with both the default and the chromeos kernel configurations.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src/timeout: set `checkout` result
  For `TIMEOUT` mode, set the `checkout` node result to `fail` if its state is `running`, as it means the code checkout was still going on when the node timed out. Set it to `pass` if its state is anything other than `running`.
  Set the `checkout` node result to `pass` if the mode is `DONE`, as it means `checkout` has been in the `available` or `closing` state at some point and the source code checkout completed successfully.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* regression_tracker: bugfix, failed test with no prior runs
  Handle the case of a failed test run when it's the first occurrence of that test case. Consider it "not a regression" for now, since we're defining a regression as a "breaking point" between a success and a failure.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: platforms-chromeos: fix dalboz device type
  Due to a copy/paste mishap, the device type for `asus-CM1400CXA-dalboz` had a trailing `_chromeos`, leading LAVA to fail finding the correct device type, and no job from the new system running on this platform.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromes: run Tast tests only on 5.4+
  Current ChromeOS images have `ext4` filesystems using options not present in 4.19. Therefore tests cannot run on kernels that old, and this leads to false positives in corrupt device identification, so we should only run those tests on 5.4 and later kernels.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromes: drop non-existent platform
  `hp-x360-12b-ca0500na-n4000-octopus` isn't a device type available in Collabora's LAVA lab, so let's drop its definition.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: exclude android tree from kbuild jobs
  Only Android-specific kbuild jobs should run for this tree, let's not overload our system with unneeded builds.
  Take this opportunity to limit mediatek kbuilds to 6.1+ as that's the earliest version that has upstream support for at least one of our devices.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: a bug fix in `_submit_lapsed_nodes`
  Fix a glitch in the code related to setting `checkout` node result.
  Fixes: 361fc0d ("src/timeout: set `checkout` result")
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* pipeline.yaml: Update early access FQDN
  We are moving k8s from eastus to westus3 as it is cheaper
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* src/tarball: fix `_kdir` in `update_repo`
  Fix the below error:
  ```
  kernelci-pipeline-tarball | File "/home/kernelci/./pipeline/tarball.py", line 79, in _update_repo
  kernelci-pipeline-tarball | kernelci.shell_cmd(f"rm -rf {self._kdir}")
  kernelci-pipeline-tarball | ^^^^^^^^^^
  kernelci-pipeline-tarball | AttributeError: 'Tarball' object has no attribute '_kdir'
  ```
  Fixes: 0a2fe9c ("src/patchset.py: Implement Patchset service")
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: fix method to get child nodes recursively
  `TimeoutService._get_child_nodes_recursive` is used to get pending child nodes recursively for closing and timed-out nodes. It overwrites the result while being called recursively. Fix the method to make it work properly.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: rename "armel" arch to "arm"
  `armel` has various meanings depending on the system: for ChromeOS, it is ARMv7, while in Debian it's ARMv{5T,6}. Moreover, this project is *Kernel*CI and the kernel uses `arm` for all 32-bit ARM devices.
  In order to avoid confusion (including those wondering what the heck does `armel` mean), let's rename `armel` to `arm`.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: use per-system arch property where relevant
  With the new `*arch` fields present in the platform configurations, we don't have to hardcode the architecture strings in some specific cases. Let's adapt the config files so we use `{cros,deb,k}arch` wherever it makes sense.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src/timeout: set timed-out `checkout` result
  Set timed-out `checkout` node result to `incomplete` while in `running` state.
  This denotes that the node timed out while the checkout was still going on. Also, set error related information, i.e. `error_code` and `error_msg`.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/tarball: update checkout node when update repo fails
  Tarball updates the source code repo and creates a tarball. If the update repo operation fails even on the second attempt, it means it failed to check out the source code. Hence, update the `checkout` node with state `done` and result `fail`. Also, set appropriate error information in the `data` field.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config: pipeline: enable collabora-next tree and build config
  Monitor the collabora-next tree. Add build config for the for-kernelci branch.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* config: chromeos: enable acpi kselftest on collabora-next tree
  Run the ACPI kselftest on the for-kernelci branch of the collabora-next tree.
  See: https://lore.kernel.org/linux-kselftest/20240308144933.337107-1-laura.nao@collabora.com/T/#t
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary: restore missing split_query_params function
  Restore this function that was accidentally removed during the last refactoring.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* lava_callback: Don't upload empty files to Azure
  There is no use for a lot of empty files on Azure; they only complicate cleanup.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: unify preset and output names
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: update preset for aferraris
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for laura.nao
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fixes and new presets for nfraprado
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: fix arch query parameters
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* k8s: Lot of deployment tested fixes
  Fixes in yaml files for k8s production deployment.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result-summary presets: Fix build failure and regression monitors
  Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* result_summary: added debug traces to the monitor
  Show detailed info of the node filterings in real time.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary: fix corner case bug when no logs are found
  Cover rare case where neither the node nor any of its parents up to the checkout node have any log artifacts.
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: refine stable-rc presets
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: add regression info to test reports
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary templates: escape log snippets
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* src: lava_callback: add device ID to node data
  It can be useful to know the exact device on which a job ran, without having to open the LAVA job page. This is done by querying the device ID from the callback data and appending it to the node data.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: upload raw callback data as well
  Debugging callback issues is complex due to the raw data not being saved after processing. This change ensures we save the callback data as a JSON file in order to ease development.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* DONOTMERGE lava_callback: add debug statements
  Why the heck doesn't this just work???
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary_templates: fix error 'node' is undefined
  The object is named test and not node, so s/node/test
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* config/runtime/kunit: set architecture info
  Set architecture field for `kunit` test nodes. If no `arch` argument is supplied, kunit takes `um` (User Mode Linux) as architecture to run tests.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: count running child jobs of build nodes
  Add a method to count running jobs of `kbuild` nodes, i.e. jobs being submitted after successful builds. For example, `baseline` or `tast` jobs.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle closing `checkout` node differently
  Usually, `checkout` should transition to the `done` state when all its child nodes are completed. In case of a closing `checkout`, take into account running child jobs of build nodes before transitioning its state to `done`. Otherwise, `checkout` will be assigned the `done` state even if some child jobs are still running.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* src/timeout: handle holdoff reached `checkout` node differently
  Usually, an available `checkout` for which holdoff is reached should transition to the `done` state only when all its child nodes are completed. In case of such a `checkout` node, take into account running child jobs of build nodes before transitioning its state to `done`.
  Otherwise, `checkout` will be assigned the `done` state even if some child jobs are still running.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* Revert "DONOTMERGE lava_callback: add debug statements"
  This reverts commit 5ed8218d99840373bbba5830b1976813b52bf4b1.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* Create dependabot.yml

* result_summary_templates: make generic-test-failures generic to all results
  The generic-test-failures templates can be used to show general results just by replacing the name "failures" with "results". This makes it easier for communities that want presets listing all test results to reuse them, so: s/generic-test-failures/generic-test-results
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result-summary.yaml: add preset to list android build tests
  Since we now build android, add a preset to allow result-summary.yaml to list all build results from Android tree.
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* tarball: Implement checkout for specific commit
  We often need a specific commit rather than ToT; implement this.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* jobs-chromeos.yaml: Disable module compression for every kernel version
  Commit d4bbe942098b ("kbuild: remove CONFIG_MODULE_COMPRESS"), introduced in kernel v5.13, substituted CONFIG_MODULE_COMPRESS=n for CONFIG_MODULE_COMPRESS_NONE=y as the way to disable module compression. Since module compression causes "Invalid ELF header magic: != ELF" errors during boot on the ChromeOS base config, add the missing config to disable module compression on kernels > v5.13 as well.
  Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* src: lava_callback: reduce callback data size
  The callback data is quite large, especially as it includes the full log which we already upload separately. By dropping it and compressing the whole file with `gzip` we can avoid wasting too much storage space.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: lava_callback: don't leak secret token
  The callback data contains the secret token's value, which shouldn't be leaked. Ensure we drop it from the uploaded data.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: platforms-chromeos: use new cros-flash image
  This ensures we use the new version of the `install-modules` script.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* src: regression_tracker: add the "device" field to regression data
  This can be helpful. We're not using it as a search param though, as we don't want to narrow down the search that much, using the platform only is better.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: result_summary_templates: report device used for job
  This information is now available, and it can be useful to know the affected device without having to look at the LAVA job details.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* kubernetes: Update deployment recipe
  Update list of labs and add KCI_INSTANCE variable.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* lava-callback: Limit threads of lava-callback
  Due to the inrush of lava callbacks and slow Azure Files processing, we need to make sure we don't spawn too many threads. Also add a hard memory limit of 1 GB.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: add presets for fluster test
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
  ---
  Changes:
  - Make template generic for all v4l2 tests
  - Rebase on main

* result_summary presets: make the name of fluster test generic
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: enable first fluster test for mt8195-cherry-tomato-r2
  Enable first fluster test, AV1-TEST-VECTORS for mt8195-cherry-tomato-r2. Run the test on mainline and next until more trees are added.
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
  ---
  Changes:
  - Create generic v4l2-decoder-conformance-job and use anchors from it
  - Update the rootfs address
  - Move anchor to _anchor
  - Update with nitpicks

* config: jobs-chromeos: Add kernelci tree for testing purpose
  Remove this commit before merging.
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: chromeos: Enable cpufreq kselftest
  Enable cpufreq kselftest on all the trees and branches.
  Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

* result_summary presets: fix preset for kselftest-dt failures monitor
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* result_summary presets: new presets for kselftest-cpufreq
  Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>

* config: mt8195-cherry-tomato-r2: enable all fluster tests for all branches
  Add all the trees and branches on which the tests would be run. Enable all the tests for tomato.
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
  ---
  Changes:
  - The build config cannot be added yet.
    Just list the trees, it will only use the branches configured in build_configs:
    - mainline will use master
    - next will use master
    - collabora-chromeos-kernel will use for-kernelci
    - media will use master and fixes
  - Remove kernelci tree as it was added just for testing purpose

* config: mt8183-kukui-jacuzzi-juniper-sku16: enable all supported fluster tests
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8186-corsola-steelix-sku131072: enable all supported fluster tests
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: mt8192-asurada-spherion-r0: enable all supported fluster tests
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
  ---
  Changes:
  - Don't specify the platforms manually as they are already mentioned in test-job-arm64-mediatek

* config: sc7180-trogdor-kingoftown/lazor-limozeen: enable all supported fluster tests
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
  ---
  Changes:
  - Use test-job-arm64-qualcomm instead and create separate jobs for qualcomm devices
  - Don't specify platforms manually as they are already mentioned in test-job-arm64-qualcomm

* build(deps): bump uwsgi from 2.0.21 to 2.0.22 in /docker/lava-callback
  Bumps [uwsgi](https://uwsgi-docs.readthedocs.io/en/latest/) from 2.0.21 to 2.0.22.
  ---
  updated-dependencies:
  - dependency-name: uwsgi
    dependency-type: direct:production
  ...
  Signed-off-by: dependabot[bot] <support@github.com>

* pipeline.yaml: Add stable-rc build variants
  Add more build variants for stable-rc tree to match legacy system.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary: add error classification
  Classify errors according to patterns in the logs
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* result_summary presets: add collabora-chromeos-kernel and media trees for fluster tests
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config: Use media-stage instead of media-tree
  Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>

* config/pipeline: enable android branches from legacy
  Enable all android branches from the legacy system
  Signed-off-by: Helen Koike <helen.koike@collabora.com>

* trigger: Add exclude/include tree list for trigger
  As we need to restrict the list of kernels running on staging, we need to add an option allowing that. It will also be good to exclude staging kernels from the production kernel list.
  So in case of staging we need to run kernels only from tree "kernelci" and sometimes something else, for example "mediatek". The option will look like:
    --trees kernelci,mediatek
  or
    --trees kernelci
  On production we need to exclude trees kernelci and buggytree:
    --trees !kernelci,buggytree
  or just kernelci:
    --trees !kernelci
  The purpose of this option is that our compiling capacity is limited, and right now staging and production are both compiling a very large set of kernels; we need to reduce this amount to cut costs.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* config: platforms-chromeos: use CrOS R124 files
  ChromeBooks were upgraded with a new image based on ChromiumOS R124, so we must use those files now.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* config: jobs-chromeos: drop non-existent Tast tests
  Those were removed between R120 and R124 and therefore cause test failures with the new images.
  Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>

* result_summary presets: fix acpi kselftest presets
  We're interested in catching regressions and failures in both the kselftest-acpi test suites and their test cases. Match the nodes by group in the presets accordingly. Fix template used by the failure monitor preset.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* src: update return values of `APIHelper.receive_event_node`
  `APIHelper.receive_event_node` method is used to receive node data from PubSub event. The method has been updated to return `is_hierarchy` flag as well which represents events related to node hierarchy. Update pipeline services using the method accordingly.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* result_summary presets: refine presets for v4l2-decoder-conformance
  Modify the regression preset to monitor regressions on both the v4l2-decoder-conformance test suites and their test cases, by matching the nodes by group instead of by name. Also, change the failure preset to monitor for all errors caused by runtime errors.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* result_summary presets: add summary presets for v4l2-decoder-conformance
  Add summary presets to fetch regressions and failures on v4l2-decoder-conformance tests. Two of the presets are the same used by the monitor; add one additional preset to fetch all the failures on both the test suites and their test cases.
  Signed-off-by: Laura Nao <laura.nao@collabora.com>

* lava_callback.py: Remove error_code/error_msg on lava-callback
  Sometimes, due to congestion, a node might be set to timeout, but then the result might arrive late and we need to use it properly.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* result_summary presets: fix dt kselftest presets
  Fix the dt kselftest preset, just like was done for the acpi one, as the current preset doesn't match the actual results we're interested in.
  Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

* doc/connecting-lab: refine documentation
  Refine documentation for connecting LAVA labs and submitting jobs to the lab.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* lava_callback: Sometimes we get totally invalid log file uploaded
  Most likely the problem lies in the threading of flask, and possibly callbacks are getting mixed. This commit attempts to introduce several countermeasures against that.
  Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>

* doc: add `_index.md` page
  Add index documentation page.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `pipeline-details` page
  Move `pipeline-details` documentation from the API repository to this repo to make it close to the source.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc/connecting-lab: adjust `weight` property
  Change `weight` property of existing doc page to accommodate the transition of pipeline related docs to the pipeline repo.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* doc: add `developer-documentation` page
  Add developer manual documentation.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add lab config for Qualcomm
  Add an entry to `runtimes` section for Qualcomm lab configurations.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* config/pipeline.yaml: add `baseline-x86` job for qualcomm
  Add job configuration `baseline-x86-qualcomm` for running baseline job in Qualcomm LAVA lab. Add scheduler entry as well.
  Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com>

* docker-compose.yaml: add lab-qualcomm runtime
  Add runtime argument `lab-qualcomm` to `scheduler-lava` container. This will enable the pipeline to run and submit jobs to Qualcomm LAVA lab.
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config/pipeline.yaml: add `baseline-arm64` job for qualcomm Add job configuration `baseline-arm64-qualcomm` for running baseline job for `arm64` in Qualcomm LAVA lab. Add scheduler entry as well. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * pipeline.yaml: Update RISC-V configs 1)rv32 defconfig doesn't exist, remove 2)nommu_k210_defconfig have modules disabled Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * lava_callback.py: Sanitize lava log data As we use this data in reports, lets remove all non-printable characters as they confuse grafana, browsers and others. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config/runtime/kunit.jinja2: fix result map Fix result map for skipped tests. Initially, API didn't have `skip` available node result in the schema. That's why it was mapped to `None` result. But now API has `skip` result to denote skipped tests. Fix the result mapping accordingly. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config: jobs-chromeos: Add lab-setup fragment Add the lab-setup fragment to the chromebook builds, which contains the architecture independent kernel configs needed to run tests on the platform. Notably this disables IP autoconfig by the kernel. The result of this change is that the 12 seconds boot delay and the consequent deferred probe pending warnings will no longer happen on any platform. Particularly on mt8186-corsola-steelix-sku131072 (due to a different network adapter being used) on which it was still happening. Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> * lava_callback: bump up slightly threads number Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config: chromeos: enable watchdog reset test on Chromebooks Add a basic test to verify watchdog reset functionality. Enable the test on all ARM64 and AMD x86_64 Chromebooks. 
For Intel Chromebooks, enable the test only on octopus, as ACPI PM Timer on the other devices has been disabled in coreboot. Signed-off-by: Laura Nao <laura.nao@collabora.com> * src/send_kcidb: use schema version 4.3 Test status `MISS` was added to KCIDB in schema v4.2 and supported by the latest version i.e. v4.3. Hence, use the latest version for submission as API may send a few tests with "MISS" status. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * send_kcidb: re-structure code for parsing checkout node Move code for parsing checkout node to a separate method. Add `valid` field to parsed checkout node. It denotes if source code was successfully checked out. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: print more information on invalid data Print details for invalid revision data for the sake of debugging. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: optimize `kcidb` import Remove redundant `kcidb` import and adjust kcidb Client call accordingly. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: remove keys with `None` values KCIDB doesn't allow `None` as field value. Remove all optional fields with `None` value to make it valid data for submitting to KCIDB. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config: add `kcidb_test_suite` property Every KernelCI test will be mapped to a unified test suite for KCIDB data submission. Add `kcidb_test_suite` property to test job definitions in YAML configuration files. The added property will store the mapped KCIDB test suite name. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: parse and submit node test and build data Listen to all the node events with node state `done` or `available` and submit the node to KCIDB. Parse node received from the event and create KCIDB schema compatible object based on type of the node i.e. checkout, build or test. 
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: set `log_excerpt` for builds and tests Fetch logs from compressed log file(*.log.gz) URL and send last 16*1024 characters for setting `log_excerpt` field for build and test nodes as it is the max allowed length of the KCIDB field. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config/jobs-chromes: add kcidb test suite property for watchdog test Add KCIDB test suite mapping for `watchdog_reset` test. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * lava_callback.py: disable log removal from callback data We need it for investigations if we have any critical data loss during log sanitizing. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * src/send_kcidb: add error info to build nodes Add error metadata fields such as `error_code` and `error_msg` to `misc` field for build nodes. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * result_summary presets: add watchdog-reset presets for mainline/next Add monitor and summary presets to track the results from the watchdog reset test on the mainline and next trees. Signed-off-by: Laura Nao <laura.nao@collabora.com> * pipeline.yaml: Fix fluster rootfs URL Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * src/send_kcidb: get error metadata for failed/incomplete tests Tweak condition to get error metadata for test nodes. It should get error info for incomplete nodes as well and not just failed nodes. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: send tests only if KCIDB test mapping exists All test suite definitions must have `kcidb_test_suite` property i.e. KCIDB test suite mapping. Only send tests for those the mapping is found. 
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * tests/validate_yaml: add validation for KCIDB mapping To submit KernelCI generated data to KCIDB, it is required to have a mapping for all the job definition with `kcidb_test_suite` property. Add validation to ensure all the jobs have a mapping present to avoid missing data submission. This check is to notify test authors trying to enable tests in maestro to include the required property for the mapping in their definition. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config/pipeline.yaml: add qcs6490-rb3gen2 boot test Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com> * config: chromeos: Enable kselftest-dt on Qualcomm platforms Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> * pipeline.yaml: Add one um build for android trees As per request of Android team it will be good to check for breakages UM builds as well. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config: use `kind=job` for test suites As part of re-structuring test hierarachy, `Job` model has been introduced for test suite/job nodes. It uses node kind `job`. Update test configurations in `pipeline.yaml` and `jobs-chromeos.yaml` to use `kind=job` to generate job nodes. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config/runtime/kunit.jinja2: provide `kind` value for child tests In case of submitting test hierarchy, child nodes by default inherit `kind` value from parent node. As we are re-structuring test hierarchy, test suit/job nodes will have `kind=job` where its child test nodes will have `kind=test`. Provide `kind` field explicitly to test result hierarchy to preserve different kind value than the parent node. 
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config/runtime/kunit.jinja2: fix `NameError` Fix the below error in `_submit` method: ``` Traceback (most recent call last): File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 287, in main job.submit(results) File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 138, in submit self._submit(result) File "/home/kernelci/data/output/tmp94nrvsvs/kunit-x86_64", line 265, in _submit return node NameError: name 'node' is not defined ``` Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config/runtime/kunit.jinja2: evaluate job node result Evaluate job node result from child node results if `null` result is receive from test result parser. For example nodes such as `fortify`: https://staging.kernelci.org:9000/viewer?node_id=6670ab43d0b7694b399897c4 Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: fix parsing of KUnit log file Handle both compressed(gzip) and plain text log files for getting log excerpt. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: HTTP exception handling for log excerpt Add HTTP exception handling for getting log excerpt data. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config: platforms-chromeos: Add serial delay for some Mediatek platforms Add test_character_delay to the Spherion, Tomato and Steelix platforms to workaround the fact that they're sometimes unable to process serial input fast enough, resulting in mangled commands and consequently flaky test results, as described in https://github.com/kernelci/kernelci-project/issues/366. The right place to do this change would be in the device-type template as described in LAVA's documentation [1]. This overriding in KernelCI is meant only as a temporary workaround to verify whether this fixes the issue. If it does, then we'll do it in LAVA upstream instead. 
[1] https://docs.lavasoftware.org/lava/debugging.html#differences-in-input-speeds Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> * config: chromeos: Enable error-logs kselftest for MediaTek Chromebooks Run the error-logs kselftest on MediaTek Chromebooks. This test is currently under review upstream [1] so, in the meantime, it has been added to the collabora-next tree so it can prove its value by helping to detect issues upstream. [1] https://lore.kernel.org/all/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> * config/pipeline.yaml: enable CIP lab Add configuration for LAVA CIP lab. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config/pipeline.yaml: add baseline-x86 test for CIP Add `baseline-x86-cip` test to be submitted to CIP LAVA lab. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * docker-compose.yaml: add `lab-cip` runtime Add runtime argument `lab-cip` to `scheduler-lava` container. This will enable the pipeline to run and submit jobs to CIP LAVA lab. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: enable `job` node submission to KCIDB Parse newly added job node and its child tests for KCIDB submission. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: don't submit `setup` test suite nodes `setup` test suite has been introduced to store test results for environment setup checks before running actual test suite. KCIDB doesn't require `setup` test suite result as long as main test job result is submitted. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: add a check before sending data Check if parsed data is available before sending revision data to KCIDB. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: fix logs Fix log statement about submitting node to KCIDB as we are not sending all the nodes we receive event for to KCIDB. 
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: handle skipped tests Do not retrieve artifacts or metadata from parent node for skipped tests as in pratice only kernel revision, test runtime and platform will be available for skipped tests. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * result_summary/utils: ignore failures on log retrieval Make the script continue running if there was an error fetching a test log. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> * doc/developer-documentation: add docs for enabling new tests Add developer documentation for enabling new tests. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * Fix links after docs page migration Documentation has been migrated to the "docs.*" subdomain. Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com> * pipeline.yaml: Add kcidebug fragment Add useful low-overhead debug option to kernel, and test on most x86 boards we have available, with minimal baseline tests. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * configs: update gcc-10 to gcc-12 As we upgrade compiler images, we need update gcc version Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * regression_tracker: workaround: match node paths programatically Don't use 'path' as an api search parameter. The use of lists as query parameters (path is a list) is undefined. Instead, do the filtering in code. Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> * config: remove qemu jobs from lab-qualcomm QEMU jobs use container pulled from hub.docker.com. After the lab move pulling from this registry is no longer possible at Qualcomm. This patch disables QEMU jobs from Qualcomm lab. Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com> * validate_yaml.py: Improve pipeline validation Add validation that scheduler entries have matching job entry, this is critical validation, and job entries have at least one entry in the scheduler. 
Fix one entry detected by this validation Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * pipeline.yaml: Add broonie(Mark Brown) trees to pipeline It is time to enable even more trees. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * validate_yaml.py: Add additional verification for duplicate keys We might have redefined same keys in different yaml files, this tool will ensure consistency of this entries. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * validate_yaml.py: Remove path separator Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * validate_yaml.py: Rename variable to schedules Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config/kernelci.toml: update KCIDB origin name As we agreed to refer new KernelCI API & Pipeline as "maestro", use the new name while submitting data to KCIDB. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * src/send_kcidb: update KCI result mapping with KCIDB status Update evaluation of KCIDB status from KCI result. Create 2 categories for error codes: 1. When pre-check tests completed but actual test suite coudln't run - this will have `MISS` status 2. When pre-check tests completed, actual test suite could run but somehow couldn't complete - this will have `ERROR` status Some LAVA error codes can occur at any point of execution such as `Cancelled` and `Test`. Listed such error codes to the most relevant category based on analysis of available results. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * result_summary presets: fix presets for v4l2-decoder-conformance Following recent updates to data representation on KernelCI nodes, the top-level nodes for tests now have their kind set to 'job' instead of 'test'. Update the presets for v4l2-decoder-conformance tests accordingly. 
Signed-off-by: Laura Nao <laura.nao@collabora.com> * result_summary presets: fix output file name in kselftest-acpi preset Signed-off-by: Laura Nao <laura.nao@collabora.com> * config: enable dmabuf-heaps, exec and iommu kselftest suites Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> --- Changes: - Add kcidb_test_suite * config: result-summary: add generic rule to monitor failures and regression Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: Add rt-stable builds Copy rt-stable builds from legacy KernelCI. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> --- Changes: - Major changes to move to new way of writing kbuild jobs * config: pipeline: Add v6.6-rt branch for builds Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: result-summary: add rt-stable kbuilds presets Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: chromeos: Add 'nfs' suffix to KCIDB suite name for baseline-nfs The baseline test is currently run with both ramdisk and nfs rootfs. To distinguish baseline-nfs tests in KCIDB, add an 'nfs' suffix to the KCIDB test suite name. Signed-off-by: Laura Nao <laura.nao@collabora.com> * aks: Add kubernetes kcidb deployment We need file that will manage deployment of kcidb bridge in kubernetes production deployment. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * kubernetes: Adjust trigger k8s options Ignore kernelci tree on production, as it is special "staging"-only tree, and read all /config directory, not just default pipeline.yaml. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * regression_tracker: bugfix: catch empty search condition Fix _get_last_matching_node(), after the previous change there was an unhandled scenario where nodes may be empty but the function wouldn't return None immediately. 
Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> * config: pipeline: correct the kind of kselftest suites to job Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * scheduler-chromeos.yaml: Temporarily disable non-essential tast tests As per discussion, we disable temporary tast tests which unlikely will be reviewed. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * k8s/aks: Update deployment files 1)Update memory limit, as working with linux sources might require 3Gbyte of RAM. 2)Update config file path 3)Add callback environment variable 4)Update image reference to fresh one Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config: pipeline: enable android builds with gcc-12 for all architectures Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: enable android builds with clang-17 for all architectures Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: remove build_variants from android build_configs The build_variants is legacy way to specify the different variants. We have moved to the newer way to specify the variants. Hence remove the build_variants from android build_configs. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: add android15-6.6-lts branch for build as well The android15-6.6-lts has been included recently in legacy KernelCI: https://github.com/kernelci/kernelci-core/pull/2597 Add the same in newer KernelCI. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: add blocklist for riscv older kernels for android builds Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: update KCIDB test suite mapping for baseline Use `boot` as KCIDB test suite mapping for all baseline tests. 
Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * callback_url: Update config and README As we are moving callback URL to environment variable, updating config and README accordingly. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config: pipeline: enable android baseline (boot) testing for arm and arm64 in only allmodconfig Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * scheduler.py: If event have jobfilter, inject it to the node data When someone generate artificial event with jobfilter, this is likely maintainer trying to repeat job. Treat this accordingly, and inject job filter to job node, so we will run only tests maintainer wants. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * lava_callback: migrate to fastapi It will be easier to maintain API and Pipeline, as both will be powered by FastAPI framework. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config: chromeos: Update fluster rootfs URL Signed-off-by: Laura Nao <laura.nao@collabora.com> * config: pipeline: fix defconfigs in fragments Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * kbuild.jinja2: support defconfig as list or str As required in https://github.com/kernelci/kernelci-core/pull/2608 defconfig might be two types. Support it in jinja2 accordingly. 
Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config: piepline: add kbuilds of lee-mfd with default defconfigs Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: enable baseline testing for mfd for one board of each arch Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: fix platform sections for Qualcomm and Android schedules Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com> * k8s: Update deployment to uvicorn, as we use fastapi now Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * config: pipeline: Unblock android runs on lava-collabora Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * pipeline: Enable preempt-rt cyclictest test Enable the first preempt-rt test, cyclictest in new KernelCI. Enable it on all platforms. Since these are all smoke test there is no point in running them too long. Thus reduce the runtime per test to one minute. This should keep the total preempt-rt runtime roughly in the same time frame. The changes have been ported from Daniel's PR [1]. [1] https://github.com/kernelci/kernelci-core/pull/2397 Signed-off-by: Daniel Wagner <wagi@monom.org> Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * pipeline: add all the test jobs for all rt-test Add jobs definition of all the rt-tests. Enable cyclicdeadline and rtla tests to run on all targets. The changes have been ported from Daniel's PR [1]. 
[1] https://github.com/kernelci/kernelci-core/pull/2397 Signed-off-by: Daniel Wagner <wagi@monom.org> Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: add template and test properties for preempt_rt jobs Add template, job add kcidb_test_suite properties for all preempt-rt jobs Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: rename preempt-rt to rt-tests which is correct name of tests The legacy was using preempt-rt name of tests. But the repository has rt-tests name. We must use the same name to merge with execution results coming from other CIs in KCIDB. Suggested-by: Jeny Sadadia <jeny.sadadia@collabora.com> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: add the correct nfsroot for rt-tests Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: Remove android's deprecated branches It has been confirmed with Todd that we should remove the deprecated branches. Hence remove those branches. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * config: pipeline: run baseline on non-allmodconfig The allmodconfig generates very large kernel image. It cannot be booted on the arm64 and arm targets as tftp errors out that size is too large. Reduce the kernel image size. Use the default defconfig. The same defconfigs have been booting for other trees. 
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * doc: developer-documentation: Update documentation by adding more details - Reorganize some things - Specify how to write different variants by removing old syntax - Give two separate templates for kbuild and test - Try to put more details for new contributors Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> --- Changes since v1: - Fix type - Apply suggestions from code review * doc/developer-documentation: fix a glitch in enabling new tree section Fix a minor bug in YAML block formatting. Fixes: f5f57de ("doc: developer-documentation: Update documentation by adding more details") Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * doc/developer-documentation: update a section title Rename a section from "Enabling a new Kernel tree" to "Enabling new KernelCI trees, builds, and tests" as it explains enabling tests as well. Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> * config: use the new `tree:branch` format for rules For cases where we want a single branch to be allowed for a given tree, we can now use the `tree:branch` format in rules. Convert existing rules accordingly. Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com> * config: pipeline: fix improper use of "filters" attribute The `filters` param was used in the legacy system but has been replaced by `rules`, with a different syntax. For Android RISC-V builds, this was used to deny job execution on kernels < 4.19, so let's translate this condition with the rules format, and do a similar change for the `rt-tests`-based jobs. Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com> * config/pipeline.yaml: Fix x86 typo in kcidebug job names The kcidebug jobs that run on MediaTek and Qualcomm platforms should have arm64 in the name rather than x86. Fix the typo. Signed-off-by: Nícolas F. R. A. 
Prado <nfraprado@collabora.com> * config: pipeline: remove params The parameters are only needed when they are changed or appeneded. Remvoe the parameters which aren't being modified. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> * validate_yaml.py: Jobs are required to have template parameter Add more validation to config files of mandatory parameters. Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * validate_yaml.py: Add more job validations Add basic validation, each job must have kind parameter Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> * workflows: Add label on CI check failures Automatically add label so broken PR wont go to staging Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> --------- Signed-off-by: Jeny Sadadia <jeny.sadadia@collabora.com> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com> Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Signed-off-by: Helen Koike <helen.koike@collabora.com> Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com> Signed-off-by: Laura Nao <laura.nao@collabora.com> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Milosz Wasilewski <quic_mwasilew@quicinc.com> Signed-off-by: Paweł Wieczorek <pawiecz@collabora.com> Signed-off-by: Daniel Wagner <wagi@monom.org> Co-authored-by: Jeny Sadadia <jeny.sadadia@collabora.com> Co-authored-by: Nícolas F. R. A. 
Prado <nfraprado@collabora.com> Co-authored-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com> Co-authored-by: Helen Koike <helen.koike@collabora.com> Co-authored-by: Arnaud Ferraris <arnaud.ferraris@collabora.com> Co-authored-by: Laura Nao <laura.nao@collabora.com> Co-authored-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Co-authored-by: Shreeya Patel <shreeya.patel@collabora.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Milosz Wasilewski <milosz.wasilewski@foundries.io> Co-authored-by: Paweł Wieczorek <pawiecz@collabora.com> Co-authored-by: Milosz Wasilewski <quic_mwasilew@quicinc.com> Co-authored-by: Daniel Wagner <wagi@monom.org> Signed-off-by: Denys Fedoryshchenko <denys.f@collabora.com>
1 parent a96c348 commit 9cc1afc

74 files changed, +8601 -740 lines changed

.github/dependabot.yml

+11
@@ -0,0 +1,11 @@
+# To get started with Dependabot version updates, you'll need to specify which
+# package ecosystems to update and where the package manifests are located.
+# Please see the documentation for all configuration options:
+# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
+
+version: 2
+updates:
+  - package-ecosystem: "" # See documentation for possible values
+    directory: "/" # Location of package manifests
+    schedule:
+      interval: "weekly"

.github/workflows/main.yml

+24
@@ -40,3 +40,27 @@ jobs:
     - name: Run pycodestyle
       run: |
         pycodestyle src/*.py
+
+    - name: Install python yaml package
+      run: |
+        pip install pyyaml
+
+    - name: Run basic yaml validation
+      run: |
+        python tests/validate_yaml.py
+  on-fail:
+    if: failure() && github.event_name == 'pull_request'
+    runs-on: ubuntu-latest
+    needs: check
+    steps:
+      - name: Add label to PR
+        uses: actions/github-script@v7
+        with:
+          script: |
+            const label = 'staging-skip';
+            github.rest.issues.addLabels({
+              owner: context.repo.owner,
+              repo: context.repo.repo,
+              issue_number: context.issue.number,
+              labels: [label]
+            });

.gitignore

+3
@@ -1,3 +1,6 @@
 .env
 .docker-env
 data
+*.pyc
+*.venv
+

README.md

+38-2
Original file line numberDiff line numberDiff line change
@@ -4,14 +4,50 @@ KernelCI Pipeline
44
Modular pipeline based on the new [KernelCI
55
API](https://github.com/kernelci/kernelci-api).
66

7-
Please refer to the [pipeline design documentation](https://kernelci.org/docs/api/overview/#pipeline-design) for more details.
7+
Please refer to the [pipeline design documentation](https://docs.kernelci.org/api_pipeline/api/design/#pipeline-design) for more details.
88

99
To use it, first, start the API. Then start the services in this repository on the same host.
1010

11-
Follow instructions to [add a token and start the services](https://kernelci.org/docs/api/getting-started/#setting-up-a-pipeline-instance).
11+
Follow instructions to [add a token and start the services](https://docs.kernelci.org/api_pipeline/api/local-instance/#setting-up-a-pipeline-instance).
1212

1313
> **Note** The `trigger` service was run only once as it's not currently configured to run periodically.
1414
15+
### Setting up LAVA lab
16+
17+
For scheduling jobs, the pipeline needs to be able to submit jobs to a "LAVA lab" type of runtime and receive HTTP(S) callbacks with results over "lava-callback" service.
18+
The runtime is configured in a YAML file, for example:
19+
```
20+
lava-collabora: &lava-collabora-staging
21+
lab_type: lava
22+
url: https://lava.collabora.dev/
23+
priority_min: 40
24+
priority_max: 60
25+
notify:
26+
callback:
27+
token: kernelci-api-token-staging
28+
```
29+
30+
- `url` is the endpoint of the LAVA lab API where jobs are submitted.
31+
- `notify.callback.token` is the token DESCRIPTION used in the LAVA job definition. This part is a little tricky, see https://docs.lavasoftware.org/lava/user-notifications.html#notification-callbacks
32+
If you specify a token name that does not exist in LAVA for the user submitting the job, the callback returns the description itself as the token secret. In the example above, that would be "kernelci-api-token-staging".
33+
If you specify a token name that matches an existing token in LAVA, the callback returns the token value (secret) from LAVA, which is usually a long alphanumeric string.
34+
Tokens are generated in LAVA in the "API -> Tokens" section. The token name is the "DESCRIPTION" field, and the token value (secret) can be shown by clicking the green eye icon named "View token hash".
35+
The callback URL is set in the pipeline instance environment variable `KCI_INSTANCE_CALLBACK`.
36+
37+
The `lava-callback` service receives notifications from LAVA after a job has finished. It listens on port 8000 by default and expects the token value (secret) from LAVA in the `Authorization` header. The mapping of token values to lab names is defined in a TOML file. Example:
38+
```
39+
[runtime.lava-collabora]
40+
runtime_token = "REPLACE-LAVA-TOKEN-GENERATED-BY-LAB-LAVA-COLLABORA"
41+
callback_token = "REPLACE-LAVA-TOKEN-GENERATED-BY-LAB-LAVA-COLLABORA"
42+
43+
[runtime.lava-collabora-early-access]
44+
runtime_token = "REPLACE-LAVA-TOKEN-GENERATED-BY-LAB-LAVA-COLLABORA-EARLY-ACCESS"
45+
callback_token = "REPLACE-LAVA-TOKEN-GENERATED-BY-LAB-LAVA-COLLABORA"
46+
```
47+
If a lab uses a single token both to submit jobs (by the scheduler) and to receive callbacks, specify `runtime_token` only. If it uses different tokens for submitting jobs and for receiving callbacks, specify both `runtime_token` and `callback_token`.
48+
49+
Summary: the token name (description) is used in the YAML configuration, and the token value (secret) is used in the TOML configuration.
50+
1551
### Setup KernelCI Pipeline on WSL
1652

1753
To set up `kernelci-pipeline` on WSL (Windows Subsystem for Linux), we need to enable case sensitivity for the file system.
