class documentation

Manages multiple ATTACHed SQLite databases as a single virtual feature namespace.

One or more .db files (and/or CSV files loaded into shared in-memory SQLite) are ATTACHed to a central in-memory connection. This allows cross-database SQL JOINs to work transparently within a single cursor.
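The ATTACH mechanism described above can be sketched with the standard-library sqlite3 module alone. This is a minimal illustration, not the class's own code: the schema names cnf and gate and the columns vars and gates are hypothetical, and in-memory databases stand in for .db files.

```python
import sqlite3

# Central in-memory connection.
conn = sqlite3.connect(":memory:")

# Attach two further databases under hypothetical names. With file-backed
# databases, the quoted string would be the path to a .db file instead.
conn.execute("ATTACH DATABASE ':memory:' AS cnf")
conn.execute("ATTACH DATABASE ':memory:' AS gate")

conn.execute("CREATE TABLE cnf.features (hash TEXT UNIQUE NOT NULL, vars TEXT DEFAULT 'empty')")
conn.execute("CREATE TABLE gate.features (hash TEXT UNIQUE NOT NULL, gates TEXT DEFAULT 'empty')")
conn.execute("INSERT INTO cnf.features VALUES ('h1', '1200'), ('h2', '30')")
conn.execute("INSERT INTO gate.features VALUES ('h1', '45')")

# One statement on one connection joins tables from both attached databases.
rows = conn.execute(
    "SELECT c.hash, c.vars, g.gates "
    "FROM cnf.features AS c JOIN gate.features AS g ON c.hash = g.hash"
).fetchall()
print(rows)
```

Only h1 appears in both databases, so the inner join returns a single row.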

Data model

Each attached database centres on a features table:

features(hash UNIQUE NOT NULL, feat_a TEXT DEFAULT 'v', feat_b TEXT DEFAULT 'w', ...)

  • 1:1 features (FeatureInfo.default is not None): stored as columns in features; each hash has exactly one value.

  • 1:n features (FeatureInfo.default is None): stored in a separate table:

    {name}(hash TEXT NOT NULL, value TEXT NOT NULL, UNIQUE(hash, value))

    A same-named column in features echoes the hash value as a foreign-key reference, so the table is reachable via a JOIN. A sentinel row (hash='None', value='None') is inserted at creation time (see Issues.md #7).
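The data model above can be reproduced in a few statements. The sketch below is an illustration under assumptions: the feature names solver_result and family are hypothetical, and the default 'None' on the foreign-key column is inferred from the description of delete rather than taken from the schema code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# features table: hash, one 1:1 column, and one FK-style column for a 1:n feature.
conn.execute(
    "CREATE TABLE features ("
    "hash TEXT UNIQUE NOT NULL, "
    "solver_result TEXT DEFAULT 'unknown', "
    "family TEXT DEFAULT 'None')"
)
# Separate table holding the values of the 1:n feature.
conn.execute(
    "CREATE TABLE family ("
    "hash TEXT NOT NULL, value TEXT NOT NULL, UNIQUE(hash, value))"
)
# Sentinel row inserted at creation time.
conn.execute("INSERT INTO family VALUES ('None', 'None')")

# h1 has two family values; h2 has none and keeps the column default.
conn.execute("INSERT INTO features (hash, family) VALUES ('h1', 'h1')")
conn.executemany("INSERT INTO family VALUES (?, ?)", [("h1", "crypto"), ("h1", "sat-race")])
conn.execute("INSERT INTO features (hash) VALUES ('h2')")

rows = conn.execute(
    "SELECT features.hash, family.value FROM features "
    "JOIN family ON features.family = family.hash "
    "ORDER BY features.hash, family.value"
).fetchall()
print(rows)
```

In this sketch the sentinel row lets an inner JOIN keep h2, which has no family values. Whether that is the motivation recorded in Issues.md #7 is not confirmed here.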

Feature precedence

When the same feature name exists in multiple databases, the database that appears first in path_list takes precedence. Queries can bypass precedence by using the database:feature or context:feature syntax.
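The precedence rule can be pictured as a lookup in a registry that lists, for each feature name, the databases defining it in path_list order. The function resolve and the registry layout below are hypothetical and only cover the bare and database:feature forms. The context:feature form is omitted.

```python
def resolve(registry, fid):
    """Return (database, feature) for a bare or database-qualified identifier.

    registry maps a feature name to the databases that define it,
    listed in path_list order, so the first entry has precedence.
    """
    if ":" in fid:
        # Qualified form: the caller names the database and bypasses precedence.
        db, name = fid.split(":", 1)
        if db not in registry.get(name, []):
            raise KeyError(f"feature {name!r} not found in database {db!r}")
        return db, name
    if fid not in registry:
        raise KeyError(f"feature {fid!r} not found")
    # Bare form: the first database in path_list order wins.
    return registry[fid][0], fid


registry = {
    "vars": ["cnf_sc2021", "gate_sc2021"],
    "gates": ["gate_sc2021"],
}
print(resolve(registry, "vars"))
print(resolve(registry, "gate_sc2021:vars"))
```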

Usage:

with Database(["cnf_sc2021.db", "gate_sc2021.db"]) as db:
    sql = GBDQuery(db, "vars > 1000").build_query(
        resolve=["local"], collapse="group_concat"
    )
    rows = db.query(sql)
Class Method sqlite3_version Undocumented
Method __enter__ Undocumented
Method __exit__ Undocumented
Method __init__ No summary
Method commit Undocumented
Method copy_feature Copy values from old_name into new_name for the given hashes.
Method create_feature Create a new feature in target_db and register it in the global registry.
Method dcontext Return the context name (e.g. "cnf", "kis") of dbname.
Method delete Delete specific (hash, value) pairs or reset values to their default.
Method delete_feature Delete feature fname and all its stored values.
Method delete_hashes_entirely Undocumented
Method dexists Return True if dbname is an attached database.
Method dmain Return True if dbname is the default (first-attached) database.
Method dpath Return the file-system path of dbname.
Method dtables Return the list of SQLite table names in dbname.
Method execute Execute a raw SQL DDL/DML statement and optionally auto-commit.
Method faddr Return the fully-qualified SQL address for fid.
Method faddr_column Return the fully-qualified column address database.table.column for feature.
Method faddr_table Return the fully-qualified table address database.table for feature.
Method find Find a feature by name or qualified identifier.
Method finfo Return the FeatureInfo for fname.
Method get_contexts Return the unique context names of the attached databases.
Method get_databases Return all attached database names, optionally filtered by context.
Method get_features Return the names of all known features, optionally filtered by database.
Method get_tables Return the unique table names across all features, optionally filtered by database.
Method init_features Build the global feature registry across all attached schemas.
Method init_schemas Load each path as a Schema and return a mapping of logical database name → Schema.
Method query Execute a raw SQL SELECT and return all rows.
Method rename_feature Rename feature fname to new_fname in its database.
Method set_auto_commit Undocumented
Method set_values Set value for feature fname on each hash in hashes.
Instance Variable autocommit Undocumented
Instance Variable connection Undocumented
Instance Variable cursor Undocumented
Instance Variable features Undocumented
Instance Variable maindb Undocumented
Instance Variable schemas Undocumented
Instance Variable verbose Undocumented
@classmethod
def sqlite3_version(cls): (source)

Undocumented

def __enter__(self): (source)

Undocumented

def __exit__(self, exception_type, exception_value, traceback): (source)

Undocumented

def __init__(self, path_list: list, verbose=False, autocommit=True): (source)
Parameters
path_list (list[str]): Ordered list of paths to .db or CSV files. The first entry becomes the default database for write operations. Ordering also determines feature precedence when names collide.
verbose (bool): Print every executed SQL statement to stderr.
autocommit (bool): Commit after every execute call. Set to False for batched writes and call commit manually.
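The effect of the autocommit flag can be illustrated with a small stand-in class. MiniDb is hypothetical and is not the Database class itself. In particular, its execute accepts bound parameters, which the documented execute(q) signature does not show.

```python
import sqlite3


class MiniDb:
    """Hypothetical stand-in that mimics the autocommit switch."""

    def __init__(self, autocommit=True):
        self.connection = sqlite3.connect(":memory:")
        self.autocommit = autocommit
        self.commits = 0  # counts explicit commits, for illustration only

    def execute(self, q, params=()):
        self.connection.execute(q, params)
        if self.autocommit:
            self.commit()

    def commit(self):
        self.connection.commit()
        self.commits += 1


# Batched writes: disable autocommit, write many rows, commit once at the end.
db = MiniDb(autocommit=False)
db.execute("CREATE TABLE features (hash TEXT UNIQUE NOT NULL)")
for i in range(1000):
    db.execute("INSERT INTO features VALUES (?)", (f"h{i}",))
db.commit()

count = db.connection.execute("SELECT COUNT(*) FROM features").fetchone()[0]
print(count, db.commits)
```

With autocommit left at True, the same loop would commit after every statement, which is typically much slower on file-backed databases.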
def commit(self): (source)

Undocumented

def copy_feature(self, old_name, new_name, target_db, hashlist=[]): (source)

Copy values from old_name into new_name for the given hashes.

new_name must already exist in target_db.

Parameters
old_name (str): Source feature name.
new_name (str): Destination feature name.
target_db (str): Database for the destination feature.
hashlist (list[str]): Restrict the copy to these hashes.
def create_feature(self, name, default_value=None, target_db=None, permissive=False): (source)

Create a new feature in target_db and register it in the global registry.

Delegates DDL to Schema.create_feature.

Parameters
name (str): Feature name; must be a valid identifier.
default_value (str | None): None creates a 1:n feature (separate {name}(hash, value) table); any string creates a 1:1 feature (column in features with that default).
target_db (str | None): Target database name; defaults to the first database.
permissive (bool): If True, silently skip creation if the feature already exists and bypass name validation (for internal use by initialisers).
def dcontext(self, dbname): (source)

Return the context name (e.g. "cnf", "kis") of dbname.

Raises
DatabaseException: If dbname is not attached.
def delete(self, fname, values=[], hashes=[], target_db=None): (source)

Delete specific (hash, value) pairs or reset values to their default.

For 1:n features: deletes matching rows from the feature table. For any hash that now has no remaining values, resets the FK column in features to 'None'. For 1:1 features: resets the column to its default value for matching hashes.

Parameters
fname (str): Feature name.
values (list[str]): Value filter; empty list means no value restriction.
hashes (list[str]): Hash filter; empty list means no hash restriction.
target_db (str | None): Restrict to this database when ambiguous.
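The two-step behaviour described for 1:n features (delete the matching rows, then reset the foreign-key column for hashes left without values) might look roughly like this in plain SQL. The helper delete_values and the feature name family are hypothetical, and the sentinel row is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (hash TEXT UNIQUE NOT NULL, family TEXT DEFAULT 'None')")
conn.execute("CREATE TABLE family (hash TEXT NOT NULL, value TEXT NOT NULL, UNIQUE(hash, value))")
conn.executemany("INSERT INTO features VALUES (?, ?)", [("h1", "h1"), ("h2", "h2")])
conn.executemany("INSERT INTO family VALUES (?, ?)", [("h1", "a"), ("h1", "b"), ("h2", "a")])


def delete_values(conn, values=(), hashes=()):
    """Delete matching (hash, value) rows; an empty filter means no restriction."""
    clauses, params = [], []
    if values:
        clauses.append("value IN (" + ",".join("?" for _ in values) + ")")
        params.extend(values)
    if hashes:
        clauses.append("hash IN (" + ",".join("?" for _ in hashes) + ")")
        params.extend(hashes)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    conn.execute("DELETE FROM family" + where, params)
    # Reset the FK column for every hash that has no remaining values.
    conn.execute(
        "UPDATE features SET family = 'None' "
        "WHERE hash NOT IN (SELECT hash FROM family)"
    )


delete_values(conn, values=["a"])
state = conn.execute("SELECT hash, family FROM features ORDER BY hash").fetchall()
remaining = conn.execute("SELECT hash, value FROM family ORDER BY hash, value").fetchall()
print(state, remaining)
```

After removing value a everywhere, h1 still has b and keeps its reference, while h2 has no values left and is reset to 'None'.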
def delete_feature(self, fname, target_db=None): (source)

Delete feature fname and all its stored values.

For 1:n features, drops the separate table. For 1:1 features, drops the column (requires SQLite >= 3.35).

Parameters
fname (str): Feature name to delete.
target_db (str | None): Restrict to this database when ambiguous.
Raises
DatabaseException: If deletion of a 1:1 feature is requested on SQLite < 3.35.
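The version requirement comes from ALTER TABLE ... DROP COLUMN, which SQLite supports from 3.35 onward. A guard of the following shape is one way to check it. The helper drop_column is hypothetical, and RuntimeError stands in for DatabaseException.

```python
import sqlite3


def drop_column(conn, table, column):
    """Drop a column, refusing on SQLite builds older than 3.35."""
    if sqlite3.sqlite_version_info < (3, 35, 0):
        raise RuntimeError(
            f"DROP COLUMN needs SQLite >= 3.35, found {sqlite3.sqlite_version}"
        )
    conn.execute(f'ALTER TABLE "{table}" DROP COLUMN "{column}"')


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (hash TEXT UNIQUE NOT NULL, result TEXT DEFAULT 'unknown')")

try:
    drop_column(conn, "features", "result")
    dropped = True
except RuntimeError:
    dropped = False

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(features)")]
print(dropped, columns)
```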
def delete_hashes_entirely(self, hashes, target_db=None): (source)

Undocumented

def dexists(self, dbname): (source)

Return True if dbname is an attached database.

def dmain(self, dbname): (source)

Return True if dbname is the default (first-attached) database.

def dpath(self, dbname): (source)

Return the file-system path of dbname.

Raises
DatabaseException: If dbname is not attached.
def dtables(self, dbname): (source)

Return the list of SQLite table names in dbname.

Raises
DatabaseException: If dbname is not attached.
def execute(self, q): (source)

Execute a raw SQL DDL/DML statement and optionally auto-commit.

Parameters
q (str): SQL statement (e.g. INSERT, UPDATE, ALTER TABLE, CREATE TABLE).
def faddr(self, fid: str, with_column=True): (source)

Return the fully-qualified SQL address for fid.

Parameters
fid (str): Feature identifier (bare name, db:name, or context:name).
with_column (bool): If True (default), return database.table.column; if False, return database.table.
Returns
str: Fully-qualified address usable in SQL expressions.
def faddr_column(self, feature): (source)

Return the fully-qualified column address database.table.column for feature.

Parameters
feature (str): Feature identifier (bare name, or db:name / context:name).
Returns
str: e.g. "cnf_sc2021.local.value"
def faddr_table(self, feature): (source)

Return the fully-qualified table address database.table for feature.

Used to build subquery references in Parser.get_sql for 1:n features.

Parameters
feature (str): Feature identifier.
Returns
str: e.g. "cnf_sc2021.local"
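The address formats returned by faddr, faddr_column and faddr_table can be mimicked with a small record type. Info is a hypothetical stand-in for FeatureInfo. Its field names are assumptions, chosen to reproduce the example strings given above.

```python
from dataclasses import dataclass


@dataclass
class Info:
    """Hypothetical stand-in for FeatureInfo; field names are assumed."""
    database: str
    table: str
    column: str


def faddr(info, with_column=True):
    """Join database, table and (optionally) column into a dotted SQL address."""
    parts = [info.database, info.table]
    if with_column:
        parts.append(info.column)
    return ".".join(parts)


# A 1:n feature lives in its own table with a 'value' column.
local = Info(database="cnf_sc2021", table="local", column="value")
print(faddr(local))
print(faddr(local, with_column=False))
```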
def find(self, fid: str, db: str = None): (source)

Find a feature by name or qualified identifier.

Parameters
fid (str): Feature identifier, one of: "feature" (bare), "database:feature", or "context:feature".
db (str | None): Restrict lookup to this database name. Raises if fid already contains a different database prefix.
Returns
FeatureInfo: Info object for the highest-precedence matching feature. Precedence follows the order of databases in path_list.
Raises
DatabaseException: If the feature is not found or database identifiers are ambiguous.
def finfo(self, fname, db=None): (source)

Return the FeatureInfo for fname.

Parameters
fname (str): Bare feature name (no db: prefix).
db (str | None): Restrict lookup to this database name.
Returns
FeatureInfo: Highest-precedence info object for the feature.
Raises
DatabaseException: If the feature does not exist (or does not exist in db).
def get_contexts(self, dbs=[]): (source)

Return the unique context names of the attached databases.

Parameters
dbs (list[str]): If non-empty, restrict to these database names.
Returns
list[str]: Unique context names (order not guaranteed).
def get_databases(self, context: str = None): (source)

Return all attached database names, optionally filtered by context.

Parameters
context (str | None): If given, only databases of this context are returned.
Returns
list[str]: Database names in attachment order.
def get_features(self, dbs=[]): (source)

Return the names of all known features, optionally filtered by database.

Parameters
dbs (list[str]): If non-empty, only features from these databases are included.
Returns
list[str]: Feature names (may contain duplicates if a feature spans databases).
def get_tables(self, dbs=[]): (source)

Return the unique table names across all features, optionally filtered by database.

Parameters
dbs (list[str]): If non-empty, restrict to these database names.
Returns
list[str]: Unique table names.
def init_features(self) -> dict[str, FeatureInfo]: (source)

Build the global feature registry across all attached schemas.

Each feature name maps to an ordered list of FeatureInfo objects; the first entry is used by default (highest precedence, determined by database order in path_list). The hash column of the features table is always placed first when present, as it serves as the primary join key.

Returns
dict[str, list[FeatureInfo]]: Feature name -> list of FeatureInfo (highest precedence first).
def init_schemas(self, path_list) -> dict[str, Schema]: (source)

Load each path as a Schema and return a mapping of logical database name → Schema.

In-memory schemas (CSV sources) sharing the same database name are merged via Schema.absorb. On-disk databases with colliding names raise DatabaseException.

Parameters
path_list (list[str]): Paths to .db or CSV files.
Returns
dict[str, Schema]: Ordered mapping of database name to Schema instance.
def query(self, q): (source)

Execute a raw SQL SELECT and return all rows.

Parameters
q (str): SQL SELECT statement.
Returns
list[tuple]: All result rows as tuples.
def rename_feature(self, fname, new_fname, target_db=None): (source)

Rename feature fname to new_fname in its database.

For 1:n features, also renames the underlying separate table. Updates the in-memory feature registry accordingly.

Parameters
fname (str): Current feature name.
new_fname (str): New feature name; must pass Schema.valid_feature_or_raise.
target_db (str | None): Restrict to this database when the feature name is ambiguous.
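At the SQL level, renaming a 1:n feature plausibly involves two statements: one for the column in features and one for the separate table. The sketch below shows that pair with hypothetical names (family renamed to track). It assumes SQLite >= 3.25 for RENAME COLUMN and does not model the in-memory registry update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (hash TEXT UNIQUE NOT NULL, family TEXT DEFAULT 'None')")
conn.execute("CREATE TABLE family (hash TEXT NOT NULL, value TEXT NOT NULL, UNIQUE(hash, value))")

# Rename the FK-style column in features, then the separate value table.
conn.execute("ALTER TABLE features RENAME COLUMN family TO track")
conn.execute("ALTER TABLE family RENAME TO track")

columns = [row[1] for row in conn.execute("PRAGMA table_info(features)")]
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(columns, tables)
```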
def set_auto_commit(self, autocommit): (source)

Undocumented

def set_values(self, fname, value, hashes, target_db=None): (source)

Set value for feature fname on each hash in hashes.

For 1:1 features this is an upsert on the features table column. For 1:n features a new (hash, value) row is inserted (or silently ignored if the pair already exists), preserving all other values for the same hash.

Parameters
fname (str): Feature name.
value: Value to assign (coerced to TEXT by SQLite).
hashes (list[str] | str): One or more benchmark hashes.
target_db (str | None): Target database; uses the feature's registered database when None.
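The difference between the two write paths can be sketched in plain SQL: a 1:1 write overwrites the single column value, while a 1:n write adds a row and relies on the UNIQUE(hash, value) constraint to ignore duplicates. The helpers and the feature names result and family are hypothetical. The actual statements issued by set_values may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (hash TEXT UNIQUE NOT NULL, "
    "result TEXT DEFAULT 'unknown', family TEXT DEFAULT 'None')"
)
conn.execute("CREATE TABLE family (hash TEXT NOT NULL, value TEXT NOT NULL, UNIQUE(hash, value))")


def set_one_to_one(conn, value, hashes):
    """Upsert: make sure each hash has a row, then overwrite the column."""
    conn.executemany("INSERT OR IGNORE INTO features (hash) VALUES (?)", [(h,) for h in hashes])
    conn.executemany("UPDATE features SET result = ? WHERE hash = ?", [(value, h) for h in hashes])


def set_one_to_many(conn, value, hashes):
    """Add a (hash, value) pair; existing pairs are ignored, other values kept."""
    conn.executemany("INSERT OR IGNORE INTO features (hash) VALUES (?)", [(h,) for h in hashes])
    conn.executemany("INSERT OR IGNORE INTO family (hash, value) VALUES (?, ?)", [(h, value) for h in hashes])
    # Point the FK-style column at the hash so the value table is reachable via JOIN.
    conn.executemany("UPDATE features SET family = hash WHERE hash = ?", [(h,) for h in hashes])


set_one_to_one(conn, "sat", ["h1"])
set_one_to_one(conn, "unsat", ["h1"])   # replaces 'sat'
set_one_to_many(conn, "a", ["h1"])
set_one_to_many(conn, "a", ["h1"])      # duplicate pair, ignored
set_one_to_many(conn, "b", ["h1"])      # second value for the same hash

result, fk = conn.execute("SELECT result, family FROM features WHERE hash = 'h1'").fetchone()
family_rows = conn.execute("SELECT hash, value FROM family ORDER BY value").fetchall()
print(result, fk, family_rows)
```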
autocommit = (source)

Undocumented

connection = (source)

Undocumented

cursor = (source)

Undocumented

features = (source)

Undocumented

maindb = (source)

Undocumented

schemas = (source)

Undocumented

verbose = (source)

Undocumented