PostgreSQL database model

Database structure

cust_cluster

The description of a single deployed StorPool cluster, whether in our internal testing environment or at a customer site.

Data model fields:

  • name
  • cltype

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Referential integrity:

Automated event journalling:

  • cust_cluster_record_trigger

cust_cluster_type

The type (support state) of a StorPool cluster, whether deployed in our testing environment or in a customer's production site.

Data model fields:

  • id
  • name

Additional fields:

None

Automated event journalling:

None

cust_host_kver

A single kernel version installed on a host in a StorPool cluster.

Data model fields:

  • cluster_name
  • host_name
  • distro
  • version
  • current

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Referential integrity:

Automated event journalling:

  • cust_host_kver_record_trigger

distro

A Linux distribution that StorPool can be installed on.

Data model fields:

  • name
  • variant
  • supported

Additional fields:

None

Automated event journalling:

  • distro_record_trigger

event

An automatically-maintained, log-like table of changes made to some types of records, e.g. distro, kver, and sp_build.

Data model fields:

  • table_name
  • pkey
  • action
  • generation
  • fields
  • tstamp

Additional fields:

  • id (integer): a unique identifier of this event, assigned by the database management system

Functions invoked by clients:

  • generation_drop_empty(generation_id)
  • generation_drop_empty_current()

generation

A single set of related changes made to the sqern database.

Data model fields:

  • id
  • comment
  • username
  • ipaddr
  • tstamp

Additional fields:

None

Functions invoked by clients:

  • generation_id()
  • generation_start(comment)

Internal functions:

  • generation_id_if_assigned()
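
Taken together, these functions bracket one set of related changes. A minimal sketch of the intended call sequence, wrapped for use with psql (the service name sqern-dev is a placeholder assumption, and the BEGIN/COMMIT pair mirrors the Python workflow described later):

```shell
#!/bin/bash
# Sketch: a change transaction that records a generation.
# The service name "sqern-dev" below is a placeholder assumption.

sql=$(cat <<'EOF'
BEGIN;
SELECT generation_start('import new kernel versions');
-- ... INSERT/UPDATE/DELETE statements go here ...
SELECT generation_drop_empty_current();
COMMIT;
EOF
)
printf '%s\n' "$sql"
# In real use: printf '%s\n' "$sql" | psql 'service=sqern-dev' -v ON_ERROR_STOP=on
```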

kver

A Linux kernel version for a specific Linux distribution. These objects carry no state at all by themselves; they exist for the purposes of referential integrity (foreign key constraints).

Informational queries should probably use the kver_not_ignored view instead.

Data model fields:

  • distro
  • version

Additional fields:

  • archived (boolean): whether this kernel version should no longer be taken into consideration

Referential integrity:

Automated event journalling:

  • kver_record_trigger

kver_fetched

A Linux kernel version that our build infrastructure can attempt to build a StorPool package for, i.e. one for which the OS packages containing the kernel header files have been successfully fetched from the Linux distribution's upstream repositories.

Data model fields:

  • distro
  • version
  • tstamp

Additional fields:

None

Referential integrity:

Automated event journalling:

  • kver_fetched_record_trigger

kver_ignore

A Linux distribution-specific pattern specifying kernel versions that should be ignored.

Informational queries should not use this table directly, but the kver_not_ignored view instead.

Data model fields:

  • distro
  • pattern
  • comment

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Automated event journalling:

  • kver_ignore_record_trigger

kver_not_ignored

A view over the kver table that skips any versions that match the patterns recorded in the kver_ignore table for the corresponding Linux distribution. This view is automatically refreshed after any changes to the kver and kver_ignore tables by triggers invoking the kver_update_not_ignored() function.

Fields inherited from the kver table:

  • distro
  • version

Additional fields:

None

Automated refreshing:

  • kver_update_not_ignored()
  • kver_update_not_ignored_kver
  • kver_update_not_ignored_kver_ignore

kver_seen

A Linux kernel version that has been seen in the upstream package repositories of a specific Linux distribution.

Data model fields:

  • distro
  • version
  • tstamp

Additional fields:

None

Referential integrity:

Automated event journalling:

  • kver_seen_record_trigger

sp_available_kver

FIXME: something about StorPool packages uploaded to the vault that may be used for testing or deployment or something.

Data model fields:

FIXME: this one is not even described in the data model...

Additional fields:

  • storpool_revision (string): the "21.0.243.620bc3729" version string of the StorPool source tree
  • release_target (string): the StorPool testing infrastructure suite that this build is initially meant for, e.g. "sp-lab"
  • distro (string): the Linux distribution the modules were built for
  • version (string): the kernel version the modules were built for
  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Referential integrity:

Automated event journalling:

  • sp_available_kver_record_trigger

sp_build

An attempt to build StorPool from a specific source tree.

Data model fields:

  • id
  • tstamp
  • result
  • success (generated)
  • jenkins_job
  • storpool_revision
  • change_id
  • official
  • release_target
  • dockernel_profile

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Automated event journalling:

  • sp_build_record_trigger

sp_build_kver

A single kernel version that a particular StorPool build attempts to build the StorPool kernel modules for.

Data model fields:

  • build_id
  • distro
  • version

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Referential integrity:

Automated event journalling:

  • sp_build_kver_record_trigger

sp_test

An attempt to run the StorPool testing suite on the packages produced by a StorPool build.

Data model fields:

  • id
  • tstamp
  • build_id
  • test_flavor
  • test_cluster

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Referential integrity:

Automated event journalling:

  • sp_test_record_trigger

sp_test_kver

A single kernel version that a particular StorPool test run attempts to validate the StorPool kernel modules for.

Data model fields:

  • test_id
  • distro
  • version

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Referential integrity:

Automated event journalling:

  • sp_test_kver_record_trigger

sp_test_result

The result of a test run.

Data model fields:

  • test_id
  • tstamp
  • result
  • success

Additional fields:

  • deleted_at (timestamp): the moment this record was marked as deleted, or the special value -infinity otherwise

Referential integrity:

Automated event journalling:

  • sp_test_result_record_trigger

General workflow recommendations

Use the psql command-line tool in shell programs, but carefully!

When passing string values to INSERT, SELECT, UPDATE, DELETE, etc. queries, the values MUST be quoted appropriately: any single-quote characters must be doubled. The following shell function may come in useful:

# Escape a string value for use inside a single-quoted SQL literal
# by doubling any embedded single-quote characters.
psql_qstring()
{
    local value="$1"

    printf -- '%s\n' "$value" | sed -e "s/'/''/g"
}

Newline characters within the string values will still pose a problem. If possible, avoid them. If not, take extra care and use PostgreSQL's string constants with C-style escapes (the E'...' syntax), but only with extreme caution.

For multiline queries, as well as queries that insert more than one row, it is better to construct the query beforehand in a temporary file and use the psql tool's --file (-f) option.
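
As a concrete illustration, the helper above can be used to embed an arbitrary string value in a query; the table columns match the kver_ignore description above, but the values and the commented-out service name are only an example:

```shell
#!/bin/bash
# Escape a string value for use inside a single-quoted SQL literal.
psql_qstring()
{
    local value="$1"

    printf -- '%s\n' "$value" | sed -e "s/'/''/g"
}

# A value that deliberately contains a single-quote character.
comment="don't build for these EOL kernels"
qcomment="$(psql_qstring "$comment")"

query="INSERT INTO kver_ignore (distro, pattern, comment) VALUES ('centos-8', '4.18.0-80%', '$qcomment');"
printf '%s\n' "$query"
# In real use: psql 'service=sqern-dev' -v ON_ERROR_STOP=on -c "$query"
```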

Run informational queries against the database (Python)

  • connect to the database using the appropriate PostgreSQL service=sqern-... record
  • do not issue a generation_start(...) or generation_id() query
  • issue SELECT ... queries as needed
  • disconnect from the database

Run informational queries against the database (shell programs)

  • use the psql command-line tool
  • specify the appropriate PostgreSQL service=sqern-... record
  • do not issue a generation_start(...) or generation_id() query
  • pass one or more -c 'SELECT ...' query parameters as needed
  • IMPORTANT: quote string values by doubling any single-quote characters!
  • use the psql tool's --no-align (-A), --field-separator (-F), --tuples-only (-t), and other output formatting options as needed
  • IMPORTANT: check the exit code of the psql tool
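
The steps above might be sketched as follows; the query is illustrative, and the real psql invocation (with its service name) is shown only as a comment since it needs a live database:

```shell
#!/bin/bash
# Sketch: an informational query against the distro table.
# The service name "sqern-dev" in the comment is a placeholder assumption.

query='SELECT name, variant FROM distro WHERE supported;'
printf '%s\n' "$query"

# Real invocation with bare, tab-separated output, checking the exit code:
#   if ! psql 'service=sqern-dev' -At -c "$query"; then
#       echo 'query failed' >&2
#       exit 1
#   fi
```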

Add, update, or delete database records (Python)

  • connect to the database using the appropriate PostgreSQL service=sqern-... record
  • start a transaction using a BEGIN query
  • optionally (recommended), issue a generation_start() query with a brief comment describing the purpose of the changes to the database
  • insert new records using INSERT INTO ... (ON CONFLICT DO NOTHING if needed) or COPY ... FROM STDIN queries
  • optionally (recommended), issue a generation_drop_empty_current() or generation_drop_empty(generation_id) query if there is the possibility that no new records were inserted at all
  • commit the transaction using a COMMIT query

Note that several of the above steps (connect, start a transaction, invoke generation_start(), invoke generation_drop_empty(), commit the transaction) are made easier by the sqern_tools.util.connect() helper function:

with util.connect(comment="update customer host kernel versions") as (conn, cur, generation):
    cur.execute(...)  # use the psycopg client cursor object cur's methods

If any of the values returned by the connect() helper function are not needed, _ placeholders may be used (e.g. with connect(...) as (_, cur, _):). FIXME: make connect() return a frozen dataclass instead.

Add, update, or delete database records (shell programs)

  • prepare the queries to run, either in shell variables or, if more than one record should be inserted, into a temporary file
  • IMPORTANT: if using INSERT, UPDATE, or DELETE queries, quote any string values by doubling any single-quote characters!
  • IMPORTANT: if using COPY ... FROM STDIN queries, make sure that the values do not contain the separator character!
  • use the psql command-line tool and pass at least the --single-transaction (-1) and -v 'ON_ERROR_STOP=on' command-line options
  • specify the appropriate PostgreSQL service=sqern-... record
  • optionally (recommended), pass a -c 'SELECT generation_start(...)' query with a brief comment describing the purpose of the changes to the database
  • pass one or more -c 'INSERT/UPDATE/DELETE ...' or -f "$copyfile" query parameters as needed
  • optionally (recommended), pass a -c 'SELECT generation_drop_empty_current()' query if there is the possibility that no new records were inserted at all
  • IMPORTANT: check the exit code of the psql tool
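
The steps above can be sketched as follows; the kernel versions are placeholder values, and the real psql invocation (with its assumed service name) is shown as a comment:

```shell
#!/bin/bash
# Sketch: prepare a multi-record insert in a temporary file, then
# run it in a single transaction. Values are placeholder assumptions.

copyfile="$(mktemp)"
trap 'rm -f "$copyfile"' EXIT

cat > "$copyfile" <<'EOF'
SELECT generation_start('record newly-seen kernel versions');
INSERT INTO kver (distro, version) VALUES
    ('centos-8', '4.18.0-513.el8'),
    ('ubuntu-22.04', '5.15.0-91-generic')
    ON CONFLICT DO NOTHING;
SELECT generation_drop_empty_current();
EOF

printf 'Prepared queries:\n'
cat "$copyfile"
# Real invocation (note --single-transaction and ON_ERROR_STOP):
#   psql 'service=sqern-dev' -1 -v ON_ERROR_STOP=on -f "$copyfile" || exit 1
```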

Sample workflow scenarios

Record the kernel versions used by customer hosts

  • see the sample load-cust-data Python tool (python/src/sqern_tools/load_cust_data.py)
  • grab data from the monitoring reports
  • insert newly-seen kernel versions into the kver table
  • insert, update, and/or mark as deleted cluster records in the cust_cluster table (one record for each customer cluster)
  • insert, update, and/or mark as deleted host records in the cust_host_kver table: cluster_name, host_name, distro, version (many records for each customer host)
  • we do not need a separate cust_host table, right? (aggregate queries exist after all)

The StorPool automated build-and-test pipelines

  • check for new kernel versions in the upstream distros (sh/check-new-distro-kver.sh):
    • insert newly-seen kernel versions into the kver table
    • insert newly-seen kernel versions into the kver_seen table; this is the input table for the next step
  • fetch the development packages for newly-seen kernel versions (sh/fetch-distro-kver.sh):
    • grab kver_seen records that are not yet present in kver_fetched
    • run kernq to fetch the files
    • insert newly-fetched kernel versions into the kver_fetched table; this is the input table for the next step
  • build StorPool for kernel versions that we have the devel files for (not yet implemented, autobuild? the Jenkins pipelines?):
    • if we're building for newly-seen kernel versions, grab kver_fetched records that are not yet present in the sp_build table
    • otherwise, make sure there are kver_fetched records for what we want to build
    • when a build is started, insert a record into the sp_build table with a null result field, and note the id of the record (e.g. PostgreSQL's INSERT INTO ... RETURNING id syntax). Make the build ID part of the build artifacts so that the next step can use it.
    • if we know which kernel versions we are building for, immediately insert records into the sp_build_kver table
    • when a build succeeds or fails, update the result field (this will automatically make the success field non-null)
    • possibly, only update the kernel versions now... figure this out
  • run the autotest suite for ready StorPool builds:
    • we should always know what we are testing for; make sure there is a sp_build record for that
    • when a test run is started, insert a record into the sp_test table with a null result field, and note the id of the record (e.g. PostgreSQL's INSERT INTO ... RETURNING id syntax). Make the test run ID part of the run artifacts so that any reports can use it.
    • we should always know which kernel versions we are testing for; immediately insert records into the sp_test_kver table
    • when a test run succeeds or fails, update the result field (this will automatically make the success field non-null)
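
The INSERT INTO ... RETURNING id step used by both pipelines above might be sketched like this; the column values and the commented-out service name are illustrative assumptions, not the real pipeline's parameters:

```shell
#!/bin/bash
# Sketch: start a build record and capture its id via RETURNING.
# All values and the service name are illustrative assumptions.

query="INSERT INTO sp_build (tstamp, result, jenkins_job, storpool_revision, official, release_target)
    VALUES (now(), NULL, 'build-sp', '21.0.243.620bc3729', false, 'sp-lab')
    RETURNING id;"
printf '%s\n' "$query"

# Real invocation, capturing the new build id for the later steps:
#   build_id="$(psql 'service=sqern-dev' -At -v ON_ERROR_STOP=on -c "$query")" || exit 1
#   printf '%s\n' "$build_id" > "$artifacts_dir/sqern-build-id"
```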