class documentation

Represents the structure and SQLite connection of one GBD database (or CSV file).

Responsible for:

  • Detecting whether a path is an SQLite database or a CSV file.
  • Creating or introspecting the features table and any 1:n feature tables.
  • Executing DDL and DML within the scope of a single database file.

SQLite database layout:

features(hash UNIQUE NOT NULL, feat1 TEXT DEFAULT 'v', feat2 TEXT DEFAULT 'w', ...)
<feat_1n>(hash TEXT NOT NULL, value TEXT NOT NULL, UNIQUE(hash, value))

For each 1:n feature, the features table carries a column of the same name that mirrors the hash value, so the separate table can be joined without an explicit FK constraint. An INSERT trigger keeps the mirror column in sync. Every 1:n table also contains a sentinel row (hash='None', value='None'); see Issues.md #7.
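The layout above can be tried out with the standard sqlite3 module. This is a minimal sketch, not the library's actual DDL: the feature name family, the trigger name family_sync, the trigger body, and the 'None' default of the mirror column are all assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Central table; 'family' is the mirror column of a 1:n feature.
    -- The 'None' default is an assumption of this sketch.
    CREATE TABLE features (hash UNIQUE NOT NULL,
                           family TEXT NOT NULL DEFAULT 'None');

    -- 1:n table with the sentinel row.
    CREATE TABLE family (hash TEXT NOT NULL, value TEXT NOT NULL,
                         UNIQUE(hash, value));
    INSERT INTO family (hash, value) VALUES ('None', 'None');

    -- Hypothetical trigger: after a value is inserted, point the mirror
    -- column of the matching features row at that hash.
    CREATE TRIGGER family_sync AFTER INSERT ON family
    BEGIN
        INSERT OR IGNORE INTO features (hash) VALUES (NEW.hash);
        UPDATE features SET family = NEW.hash WHERE hash = NEW.hash;
    END;
""")

con.execute("INSERT INTO features (hash) VALUES ('abc')")
con.execute("INSERT INTO features (hash) VALUES ('def')")
con.execute("INSERT INTO family (hash, value) VALUES ('abc', 'planning')")
con.execute("INSERT INTO family (hash, value) VALUES ('abc', 'bmc')")

# Join through the mirror column; no FOREIGN KEY constraint is declared.
rows = con.execute(
    "SELECT features.hash, family.value FROM features "
    "JOIN family ON features.family = family.hash "
    "ORDER BY features.hash, family.value"
).fetchall()
print(rows)
```

In this sketch the hash 'def' has no values, so its mirror column keeps the default 'None' and the join matches the sentinel row. That is one effect of the sentinel under these assumptions; the documented rationale is in Issues.md #7.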

Context detection

The context is inferred from the database name prefix, e.g. cnf_sc2021 -> context cnf. There is no context metadata stored inside the file itself.

Class Method context_from_csv Undocumented
Class Method context_from_database Undocumented
Class Method context_from_name Infer the GBD context from a database name.
Class Method create Auto-detect the file type at path and return the appropriate Schema.
Class Method dbname_from_path Derive a valid SQLite ATTACH identifier from a file path.
Class Method features_from_csv Import a CSV file into an in-memory SQLite table and build feature metadata.
Class Method features_from_database Introspect an SQLite database and build feature metadata.
Class Method from_csv Load a CSV file into a shared in-memory SQLite database and return a Schema.
Class Method from_database Load a Schema from an existing SQLite .db file.
Class Method is_database Return True if path points to an SQLite database file.
Class Method valid_feature_or_raise Undocumented
Method __init__ No summary
Method absorb Undocumented
Method create_feature Create a new feature column or table in this schema.
Method create_main_table_if_not_exists Ensure the central features(hash) table exists.
Method execute Undocumented
Method get_connection Undocumented
Method get_features Undocumented
Method get_tables Undocumented
Method has_feature Undocumented
Method is_in_memory Undocumented
Method set_values Persist value for feature on each hash in hashes.
Instance Variable context Undocumented
Instance Variable csv Undocumented
Instance Variable dbcon Undocumented
Instance Variable dbname Undocumented
Instance Variable features Undocumented
Instance Variable path Undocumented
@classmethod
def context_from_csv(cls, path): (source)

Undocumented

@classmethod
def context_from_database(cls, path): (source)

Undocumented

@classmethod
def context_from_name(cls, name): (source)

Infer the GBD context from a database name.

Expects the format {context}_{rest} where context is a registered GBD context string. Falls back to the default context when no prefix matches.

Parameters
  name (str): Database name (without path or extension).
Returns
  str: Context name.
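The prefix rule described above can be sketched in a few lines. The set of registered contexts and the default context below are placeholders chosen for illustration (the names "cnf" and "kis" appear elsewhere on this page as examples); the real registry is not shown here.

```python
# Hypothetical registry and default; the real values live elsewhere in GBD.
KNOWN_CONTEXTS = {"cnf", "kis"}
DEFAULT_CONTEXT = "cnf"


def context_from_name(name: str) -> str:
    """Return the prefix before the first underscore if it is a known context."""
    prefix, separator, _rest = name.partition("_")
    if separator and prefix in KNOWN_CONTEXTS:
        return prefix
    return DEFAULT_CONTEXT


print(context_from_name("cnf_sc2021"))
print(context_from_name("kis_demo"))
print(context_from_name("meta"))
```

A name without an underscore, or with an unregistered prefix, falls through to the default in this sketch.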
@classmethod
def create(cls, path): (source)

Auto-detect the file type at path and return the appropriate Schema.

Parameters
  path (str): Path to a .db file or CSV file.
Returns
  Schema: Loaded schema instance.
Raises
  SchemaException: If the file cannot be opened or parsed.
@classmethod
def dbname_from_path(cls, path): (source)

Derive a valid SQLite ATTACH identifier from a file path.

Strips directory and extension, sanitises non-alphanumeric characters to underscores, and prepends the default context if the name starts with a digit.

Parameters
  path (str): File-system path.
Returns
  str: Alphanumeric database name, e.g. "cnf_sc2021".
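The three steps (strip, sanitise, prefix) might look roughly like the following. The default context value and the underscore used to join it to the name are assumptions; the description above does not say how the prefix is attached.

```python
import os
import re

DEFAULT_CONTEXT = "cnf"  # hypothetical default context


def dbname_from_path(path: str) -> str:
    # Strip directory and extension.
    stem = os.path.splitext(os.path.basename(path))[0]
    # Replace every character that is not a letter or digit with an underscore.
    name = re.sub(r"[^A-Za-z0-9]", "_", stem)
    # An identifier must not start with a digit, so prefix the default context.
    # Joining with "_" is an assumption of this sketch.
    if name[:1].isdigit():
        name = f"{DEFAULT_CONTEXT}_{name}"
    return name


print(dbname_from_path("/data/cnf_sc2021.db"))
print(dbname_from_path("/data/2021-main.db"))
```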
@classmethod
def features_from_csv(cls, dbname, path, con) -> dict[str, FeatureInfo]: (source)

Import a CSV file into an in-memory SQLite table and build feature metadata.

The CSV must contain a hash column; all columns become 1:1 features stored in a single features table. Column names are sanitised to valid identifiers.

Parameters
  dbname (str): Logical database name.
  path (str): Path to the CSV file.
  con: In-memory sqlite3 connection.
Returns
  dict[str, FeatureInfo]: Feature registry for this schema.
Raises
  SchemaException: If the CSV lacks a hash column.
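As an illustration of the import step, the sketch below loads CSV text into an in-memory features table using only the standard library. The helper name load_csv, the sanitising regex, and the use of ValueError in place of SchemaException are assumptions; FeatureInfo construction is left out.

```python
import csv
import io
import re
import sqlite3


def load_csv(con, source):
    """Create a features table from CSV rows and return the column names."""
    reader = csv.reader(source)
    header = next(reader, [])
    # Sanitise column names to plain identifiers (assumed rule).
    columns = [re.sub(r"\W", "_", col.strip()) for col in header]
    if "hash" not in columns:
        # The real method raises SchemaException; ValueError stands in here.
        raise ValueError("CSV has no 'hash' column")
    col_defs = ", ".join(
        "hash UNIQUE NOT NULL" if col == "hash" else f'"{col}" TEXT'
        for col in columns
    )
    con.execute(f"CREATE TABLE features ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    con.executemany(f"INSERT INTO features VALUES ({placeholders})", reader)
    return columns


con = sqlite3.connect(":memory:")
data = io.StringIO("hash,solver time,verdict\nabc,1.5,sat\ndef,2.0,unsat\n")
columns = load_csv(con, data)
rows = con.execute(
    "SELECT hash, solver_time, verdict FROM features ORDER BY hash"
).fetchall()
print(columns, rows)
```

Every column ends up in the single features table, which matches the statement that all CSV columns become 1:1 features.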
@classmethod
def features_from_database(cls, dbname, path, con) -> dict[str, FeatureInfo]: (source)

Introspect an SQLite database and build feature metadata.

Iterates over all tables whose names do not start with an underscore. Two kinds of column are skipped: FK mirror columns (a features column whose name matches a table name) and the hash column of any table other than features. All remaining columns become FeatureInfo entries.

Parameters
  dbname (str): Logical database name.
  path (str): File path (informational only).
  con: Open sqlite3 connection.
Returns
  dict[str, FeatureInfo]: Feature registry for this schema.
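The skipping rules can be illustrated with sqlite_master and PRAGMA table_info. This sketch returns plain (table, column) pairs instead of FeatureInfo objects, and the helper name list_feature_columns is hypothetical.

```python
import sqlite3


def list_feature_columns(con):
    """Return (table, column) pairs that would become features."""
    names = [row[0] for row in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    # Ignore tables whose names start with an underscore.
    tables = sorted(name for name in names if not name.startswith("_"))
    found = []
    for table in tables:
        for row in con.execute(f'PRAGMA table_info("{table}")').fetchall():
            column = row[1]  # table_info rows are (cid, name, type, ...)
            if table == "features" and column in tables:
                continue  # FK mirror column of a 1:n feature
            if table != "features" and column == "hash":
                continue  # join column of a 1:n table
            found.append((table, column))
    return found


con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE features (hash UNIQUE NOT NULL,
                           filename TEXT NOT NULL DEFAULT 'empty',
                           family TEXT NOT NULL DEFAULT 'None');
    CREATE TABLE family (hash TEXT NOT NULL, value TEXT NOT NULL,
                         UNIQUE(hash, value));
    CREATE TABLE _meta (note TEXT);
""")
found = list_feature_columns(con)
print(found)
```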
@classmethod
def from_csv(cls, path): (source)

Load a CSV file into a shared in-memory SQLite database and return a Schema.

@classmethod
def from_database(cls, path): (source)

Load a Schema from an existing SQLite .db file.

@classmethod
def is_database(cls, path): (source)

Return True if path points to an SQLite database file.

An empty file is accepted as a new database. If the path does not exist, the user is prompted to confirm creation.

Raises
  SchemaException: If the path does not exist and creation is declined.
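How the file type is detected is not stated above. One common approach, shown here purely as an assumption, is to compare the first 16 bytes with the SQLite 3 header string while treating an empty file as a new database. The helper name looks_like_sqlite is hypothetical, and the interactive prompt for missing paths is omitted.

```python
import os
import sqlite3
import tempfile

SQLITE_MAGIC = b"SQLite format 3\x00"  # first 16 bytes of an SQLite 3 file


def looks_like_sqlite(path):
    # An empty file is accepted as a new database.
    if os.path.getsize(path) == 0:
        return True
    with open(path, "rb") as handle:
        return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "cnf_demo.db")
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE features (hash UNIQUE NOT NULL)")
    con.commit()
    con.close()

    csv_path = os.path.join(tmp, "demo.csv")
    with open(csv_path, "w") as handle:
        handle.write("hash,verdict\nabc,sat\n")

    empty_path = os.path.join(tmp, "new.db")
    open(empty_path, "w").close()

    results = [looks_like_sqlite(p) for p in (db_path, csv_path, empty_path)]

print(results)
```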
@classmethod
def valid_feature_or_raise(cls, name): (source)

Undocumented

def __init__(self, dbcon, dbname, path, features, context, csv=False): (source)
Parameters
  dbcon: Open sqlite3 connection for this schema.
  dbname (str): Logical database name (alphanumeric, derived from filename).
  path (str): File-system path to the .db or CSV file.
  features (dict[str, FeatureInfo]): Feature registry for this schema.
  context (str): GBD context (e.g. "cnf", "kis").
  csv (bool): True when loaded from a CSV into in-memory SQLite; affects is_in_memory.
def absorb(self, schema): (source)

Undocumented

def create_feature(self, name, default_value=None, permissive=False): (source)

Create a new feature column or table in this schema.

1:1 feature (default_value is not None)
Adds {name} TEXT NOT NULL DEFAULT {default_value} to the features table.
1:n feature (default_value is None)
Creates a separate table {name}(hash, value) with UNIQUE(hash, value), inserts the sentinel row ('None', 'None') (see Issues.md #7), and installs a trigger to keep features.{name} (the FK mirror column) in sync.
Parameters
  name (str): Feature name; validated against reserved words and SQLite keywords unless permissive is True.
  default_value (str | None): None for 1:n; any string for 1:1.
  permissive (bool): Skip validation and silently do nothing if the feature already exists (used internally by initialisers).
Returns
  list[FeatureInfo]: Newly created FeatureInfo objects (may include the hash FeatureInfo if the main table was created as a side effect).
Raises
  SchemaException: If the name is invalid or already exists (unless permissive).
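The two branches differ mainly in the DDL they issue. The standalone function below is a rough sketch of that difference only: it takes a raw connection, performs no name validation, ignores permissive, and leaves out the sync trigger. The 'None' default of the mirror column is an assumption.

```python
import sqlite3


def create_feature(con, name, default_value=None):
    """Sketch: issue the DDL for a 1:1 or 1:n feature (no validation)."""
    if default_value is not None:
        # 1:1 feature: a new column in the features table.
        default_sql = "'" + default_value.replace("'", "''") + "'"
        con.execute(
            f'ALTER TABLE features ADD COLUMN "{name}" '
            f"TEXT NOT NULL DEFAULT {default_sql}"
        )
    else:
        # 1:n feature: separate table plus the sentinel row.
        con.execute(
            f'CREATE TABLE "{name}" (hash TEXT NOT NULL, value TEXT NOT NULL, '
            f"UNIQUE(hash, value))"
        )
        con.execute(
            f'INSERT INTO "{name}" (hash, value) VALUES (?, ?)', ("None", "None")
        )
        # Mirror column in features; its default is assumed to be 'None'.
        con.execute(
            f'ALTER TABLE features ADD COLUMN "{name}" '
            f"TEXT NOT NULL DEFAULT 'None'"
        )


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE features (hash UNIQUE NOT NULL)")
con.execute("INSERT INTO features (hash) VALUES ('abc')")
create_feature(con, "author", default_value="unknown")
create_feature(con, "family")

row = con.execute("SELECT hash, author, family FROM features").fetchone()
sentinel = con.execute("SELECT hash, value FROM family").fetchall()
print(row, sentinel)
```

Because the new columns are NOT NULL with a default, rows that already exist in features pick up the default value without a separate UPDATE.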
def create_main_table_if_not_exists(self): (source)

Ensure the central features(hash) table exists.

If absent (new database), creates it, back-fills hashes from all existing 1:n tables, and installs INSERT triggers on those tables to keep features populated automatically.

Returns
  list[FeatureInfo]: A list containing the hash FeatureInfo if the table was newly created; empty list if it already existed.
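The back-fill and trigger installation can be sketched as follows. Several details are assumptions: every existing table without an underscore prefix is treated as a 1:n table, the sentinel hash 'None' is excluded from the back-fill, the trigger name is made up, and the function returns a bool rather than a list of FeatureInfo.

```python
import sqlite3


def create_main_table_if_not_exists(con):
    """Sketch: return True if the features table had to be created."""
    names = [row[0] for row in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    if "features" in names:
        return False
    con.execute("CREATE TABLE features (hash UNIQUE NOT NULL)")
    for table in names:
        if table.startswith("_"):
            continue
        # Back-fill hashes already present in the 1:n table.
        con.execute(
            f'INSERT OR IGNORE INTO features (hash) '
            f'SELECT hash FROM "{table}" WHERE hash != ?', ("None",)
        )
        # Keep features populated on future inserts (hypothetical trigger name).
        con.execute(
            f'CREATE TRIGGER "{table}_to_features" AFTER INSERT ON "{table}" '
            f"BEGIN INSERT OR IGNORE INTO features (hash) VALUES (NEW.hash); END"
        )
    return True


con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE family (hash TEXT NOT NULL, value TEXT NOT NULL, "
    "UNIQUE(hash, value))"
)
con.executemany(
    "INSERT INTO family VALUES (?, ?)",
    [("None", "None"), ("abc", "x"), ("abc", "y"), ("def", "z")],
)

created = create_main_table_if_not_exists(con)
con.execute("INSERT INTO family VALUES ('ghi', 'w')")  # fires the trigger
hashes = [r[0] for r in con.execute("SELECT hash FROM features ORDER BY hash")]
created_again = create_main_table_if_not_exists(con)
print(created, hashes, created_again)
```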
def execute(self, sql): (source)

Undocumented

def get_connection(self): (source)

Undocumented

def get_features(self): (source)

Undocumented

def get_tables(self): (source)

Undocumented

def has_feature(self, name): (source)

Undocumented

def is_in_memory(self): (source)

Undocumented

def set_values(self, feature, value, hashes): (source)

Persist value for feature on each hash in hashes.

  • 1:n feature: inserts (hash, value) rows; duplicate pairs are silently ignored (INSERT OR IGNORE); also updates features.{name} = hash to keep the FK mirror column current.
  • 1:1 feature: upserts into the features column; on hash conflict, updates the column to value.
Parameters
  feature (str): Feature name.
  value: Value to store (coerced to TEXT by SQLite).
  hashes (list[str]): Benchmark hashes to update.
Raises
  SchemaException: If the feature does not exist or hashes is empty.
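The two write paths described above map onto INSERT OR IGNORE and an upsert. The sketch below is a standalone approximation: the multivalued flag is a hypothetical stand-in for the lookup in the feature registry, ValueError replaces SchemaException, and the ON CONFLICT clause needs SQLite 3.24 or newer.

```python
import sqlite3


def set_values(con, feature, value, hashes, multivalued):
    if not hashes:
        raise ValueError("hashes must not be empty")
    if multivalued:
        # 1:n: duplicate (hash, value) pairs are ignored by the UNIQUE constraint.
        con.executemany(
            f'INSERT OR IGNORE INTO "{feature}" (hash, value) VALUES (?, ?)',
            [(h, value) for h in hashes],
        )
        # Keep the FK mirror column pointing at the hash.
        con.executemany(
            f'UPDATE features SET "{feature}" = hash WHERE hash = ?',
            [(h,) for h in hashes],
        )
    else:
        # 1:1: insert the row, or update the column if the hash already exists.
        con.executemany(
            f'INSERT INTO features (hash, "{feature}") VALUES (?, ?) '
            f'ON CONFLICT(hash) DO UPDATE SET "{feature}" = excluded."{feature}"',
            [(h, value) for h in hashes],
        )


con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE features (hash UNIQUE NOT NULL,
                           author TEXT NOT NULL DEFAULT 'unknown',
                           family TEXT NOT NULL DEFAULT 'None');
    CREATE TABLE family (hash TEXT NOT NULL, value TEXT NOT NULL,
                         UNIQUE(hash, value));
    INSERT INTO features (hash) VALUES ('abc');
""")

set_values(con, "author", "alice", ["abc", "def"], multivalued=False)
set_values(con, "family", "planning", ["abc"], multivalued=True)
set_values(con, "family", "planning", ["abc"], multivalued=True)  # no duplicate

feature_rows = con.execute(
    "SELECT hash, author, family FROM features ORDER BY hash").fetchall()
family_rows = con.execute("SELECT hash, value FROM family").fetchall()
print(feature_rows, family_rows)
```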

context = (source)

Undocumented

csv = (source)

Undocumented

dbcon = (source)

Undocumented

dbname = (source)

Undocumented

features = (source)

Undocumented

path = (source)

Undocumented