sqern - keep track of Linux kernel versions

Overview

The sqern suite (a PostgreSQL database and a set of client libraries / command-line tools) is used internally in the StorPool automated testing infrastructure to keep track of which kernel versions in which Linux distributions have been tested, and to what extent. It is intentionally designed to be minimalistic in what it does - keep track of the current testing state of a kernel version - leaving most of the validation and processing to other tools in the testing infrastructure.

Problem description

General

There is a need for persistent storage and periodic or occassional updates of various types of data (see the concepts section below and the data model section), at first related to testing the behavior of StorPool under various Linux kernel versions, but possibly later extending to other data stored for other purposes. There are four types of objects that data must be stored for:

semi-constant ones: Linux distributions, various state names
rarely modified ones: Linux kernel versions for the various distributions, customer cluster support state
often-modified ones: StorPool build and test runs for specific Linux kernel versions, kernel versions installed on specific customer hosts
"housekeeping" and internal implementation: change descriptions, change events

Testing StorPool on Linux kernel versions

The direct reason for designing the sqern object types and database structure was the need to automate the tracking of the state of support of various Linux kernel versions in different Linux distributions:

a Jenkins pipeline periodically invokes a tool on the builder-docker host that detects new Linux kernel versions in the repositories of the upstream Linux distributions
that same tool downloads the necessary OS packages and converts them to the file and directory format used by the StorPool build infrastructure
a Jenkins pipeline triggers builds and automated tests for StorPool on the newly-detected kernel versions
a Jenkins pipeline triggers builds and automated tests for StorPool on all the supported kernel versions
a manually-triggered Jenkins pipeline triggers builds and automated tests for a manually-specified set of kernel versions and distributions
the built (and tested) StorPool packages are manually marked as ready for deployment, first on our own infrastructure, then at customer sites

Combinations of random program failures and people not noticing or forgetting to track the progress of the builds for a specific Linux kernel version often lead to delays in making StorPool available to customers using that version. It is the purpose of the sqern suite of specifications and tools to automate most of the tracking to avoid such problems.

Concepts

Linux distribution ("distro")

StorPool supports various versions (suites) of various Linux distributions, and there is an internal nomenclature (somewhat influenced by historical reasons). Each distribution version has a "builder name", e.g. centos7 or ubuntu-22.04, which is the primary name that the sqern tool uses and reports, and a "StorPool build variant name", e.g. CENTOS7 or UBUNTU2204, that is used in other parts of the StorPool build and install infrastructure. Some distributions are also marked as not supported (either no longer supported or not fully supported yet), so that new kernel versions discovered for them do not automatically trigger test runs.

Ignored kernel versions for a Linux distribution

There are various reasons why the automated build and test runs may not need to be invoked for some Linux kernel versions: experimental versions not used by customers, severe bugs found in a specific kernel version, severely outdated versions no longer found on any customer hosts, etc. Pattern matching (whether using SQL LIKE patterns (e.g. 6.5.% or %+debug%) or regular expressions (e.g. ^6\.5\. or [+]debug) must be supported.

Linux kernel version

The main concept that the sqern tool is concerned with: a string (e.g. 6.5.0-1-amd64) that is combined with the distribution version, so that ubuntu-20.04/5.15.0-57-generic and ubuntu-22.04/5.15.0-57-generic are considered different.

Database generation

Changes to the sqern database are performed in batches (mostly equivalent to database transactions), with each batch serving some particular goal, such as "update the information about the supported Linux distribution versions", "add a new kernel version in the 'seen' state", "update the state of these three kernel versions from 'built' to 'tested'". Each batch is recorded, along with a freeform text comment, as a so-called "database generation", and most objects in the database have a "generation" field that is initially set when the object is first created and then updated when the object's attributes (e.g. the kernel version state) are later modified.

Database events

Most changes to the various objects in the sqern database (changing the support status of a Linux distribution version, adding new kernel versions, updating the status of kernel versions) are automatically recorded (using database triggers) as either "create" or "update" events. Each event keeps track of the database table that was affected, a mostly human-readable name of the object (e.g. "centos7" for a distribution or "ubuntu-22.04/5.15.0-57-generic" for a kernel version), and some mostly human-readable representation of what was changed.

Basic principles

The proposed design of the sqern suite is based on two premises:

the vast majority of requests will be "read" ones; e.g. "which Debian 12.x kernels have we built, but not tested, StorPool for?", or "which Ubuntu 22.04 kernels have we seen as available in the upstream repositories, but we have not tried to fetch the requisite files for?"
the "write" requests will only ever change some types of objects, and even for those, the "write" requests will only ever change one or two of their fields. Most often, that would be the state field of the kernel version; rarely, a new kernel version will be added; and very rarely, a new distribution will be added or a distribution will be marked as supported or not supported.
it would be convenient to have some kind of log-like record of when such changes were made to the individual objects and also some human-readable reason for making them, e.g. "Jenkins build 13343 failed" or "the configuration specifies that all 8.12.x kernels should be ignored".

See the Data Model section for an overview of the object types discussed below. Also, see the Implementation section for a discussion of the possible implementations of these ideas.

In an RDBMS-backed implementation, read-only requests will be implemented mostly trivially with a single SQL query possibly involving a join between several tables. The proposed design of write requests involves several steps:

Begin a transaction
Create a new Generation record (insert a Generation record into the corresponding database table, probably only specifying the comment field). In some RDBMS implementations that cannot use a sequence's current value as a default value for a column, fetch the id value for the just-inserted row (e.g. using PostgreSQL's INSERT INTO ... RETURNING id statement or MySQL's LAST_INSERT_ID function) to use in the insert/update step. Alternatively, it may be possible to use LAST_INSERT_ID() directly in the insert/update statement.
If needed, fetch records from one or more tables to determine whether the write request is valid; this step may be omitted for most changes
Make the actual change
- For requests to create new objects, insert new records, specifying the SQL-engine-dependent options to not error out on duplicate primary keys but do nothing instead (PostgreSQL's "INSERT ... ON CONFLICT DO NOTHING", SQLite's "INSERT OR IGNORE ...", MySQL's "INSERT IGNORE ...", etc). In some RDBMS implementations, it may be possible to omit the generation field in this INSERT statement by specifying a default value for that table column that fetches the current value of a sequence.
- For requests to update specific fields of existing objects, perform the update. As above, it may be possible to omit the generation field in this INSERT statement or, e.g. in PostgreSQL, explicitly specify it as generation = DEFAULT.
- The RDBMS-specific definition of the table will, most probably using SQL triggers, ensure that a new log-like event record is inserted with the new or updated value of the fields that are tracked for that specific table.

This sequence of steps ensures that each modification of a "real" object will both record the current generation and note the change in the log-like events table, so that it is be possible to follow the history of changes to a specific record over time.