Skip to content

[Model] MinimumCardinalityKey #444

@isPANN

Description

@isPANN

Motivation

MINIMUM CARDINALITY KEY (P174) from Garey & Johnson, A4 SR26. A classical NP-complete problem from relational database theory. Given a set of attribute names and functional dependencies, the problem asks whether there exists a candidate key of cardinality at most M. This is fundamental to database normalization: identifying the smallest key determines the most efficient way to uniquely identify rows in a relation. The problem connects graph-theoretic vertex cover to database schema design.

Associated rules:

  • R120: Vertex Cover -> Minimum Cardinality Key (this model is the target)
  • R122: Minimum Cardinality Key -> Prime Attribute Name (this model is the source)

Definition

Name: MinimumCardinalityKey
Canonical name: Minimum Cardinality Key (also: Minimum Key, Least Cardinality Candidate Key)
Reference: Garey & Johnson, Computers and Intractability, A4 SR26, p.232

Mathematical definition:

INSTANCE: A set A of "attribute names," a collection F of ordered pairs of subsets of A (called "functional dependencies" on A), and a positive integer M.
QUESTION: Is there a key of cardinality M or less for the relational system <A,F>, i.e., a minimal subset K ⊆ A with |K| <= M such that the ordered pair (K,A) belongs to the "closure" F* of F defined by (1) F ⊆ F*, (2) B ⊆ C ⊆ A implies (C,B) in F*, (3) (B,C),(C,D) in F* implies (B,D) in F*, and (4) (B,C),(B,D) in F* implies (B,C ∪ D) in F*?

Variables

  • Count: num_attributes binary variables, one per attribute.
  • Per-variable domain: binary {0, 1} — whether the attribute is included in the candidate key K.
  • dims(): vec![2; num_attributes]
  • evaluate(): Let K = {i : config[i] = 1}. Compute the closure of K under F* (repeatedly apply functional dependencies until no new attributes are added). Return true iff closure(K) = A, |K| <= bound_k, and K is minimal (no proper subset of K also has closure = A).

Schema (data type)

Type name: MinimumCardinalityKey
Variants: none (no graph or weight parameters; the problem is purely set-theoretic)

Field Type Description
num_attributes usize Number of attributes
dependencies Vec<(Vec<usize>, Vec<usize>)> Functional dependencies F; each pair (lhs, rhs) means lhs -> rhs
bound_k usize The positive integer M: key must have cardinality <= bound_k

Size fields (getter methods for overhead expressions and declare_variants!):

  • num_attributes() — returns num_attributes
  • num_dependencies() — returns dependencies.len()
  • bound_k() — returns bound_k

Notes:

  • This is a satisfaction (decision) problem: Metric = bool, implementing SatisfactionProblem.
  • No weight type is needed.
  • The closure F* is computed using Armstrong's axioms: reflexivity (B ⊆ C implies C -> B), transitivity ((B,C) and (C,D) imply (B,D)), and augmentation/union ((B,C) and (B,D) imply (B, C ∪ D)).
  • A key K must satisfy: (1) K determines all of A (i.e., (K,A) in F*), and (2) K is minimal (no proper subset of K also determines A).
  • The number of candidate keys can be exponential in |A|.

Complexity

  • Best known exact algorithm: Brute-force enumeration of all subsets of A of size at most M, checking each for the key property via attribute closure computation. Closure computation is O(|F| * |A|) per subset (linear pass through dependencies). Total: O(binom(|A|, M) * |F| * |A|). For M = |A|/2, this is O(2^|A| * |F| * |A|). Lucchesi and Osborn (1978) give an algorithm that finds all candidate keys in time polynomial in |A|, |F|, and the number of keys |K|, but since |K| can be exponential, the worst case remains exponential.
  • NP-completeness: NP-complete [Lucchesi and Osborn, 1978], via transformation from VERTEX COVER.
  • References:
    • C. L. Lucchesi and S. L. Osborn (1978). "Candidate keys for relations." J. Computer and System Sciences, 17(2), pp. 270-279.
    • W. Lipsky, Jr. (1977). "Two NP-complete problems related to information retrieval." Fundamentals of Computation Theory, Springer.

Extra Remark

Full book text:

INSTANCE: A set A of "attribute names," a collection F of ordered pairs of subsets of A (called "functional dependencies" on A), and a positive integer M.
QUESTION: Is there a key of cardinality M or less for the relational system <A,F>, i.e., a minimal subset K ⊆ A with |K| <= M such that the ordered pair (K,A) belongs to the "closure" F* of F defined by (1) F ⊆ F*, (2) B ⊆ C ⊆ A implies (C,B) ∈ F*, (3) (B,C),(C,D) ∈ F* implies (B,D) ∈ F*, and (4) (B,C),(B,D) ∈ F* implies (B,C ∪ D) ∈ F*?
Reference: [Lucchesi and Osborn, 1978] (cited as 1977 preprint in GJ), [Lipsky, 1977a]. Transformation from VERTEX COVER. See [Date, 1975] for general background on relational data bases.

Connection to Vertex Cover: The Minimum Cardinality Key problem generalizes Vertex Cover. In the reduction, vertices become attributes, edges become functional dependencies (each endpoint determines the edge attribute), and a vertex cover corresponds to a set of attributes that determines all edge attributes. The key cardinality bound M corresponds directly to the vertex cover budget k.

How to solve

  • It can be solved by (existing) bruteforce -- enumerate all subsets of A of size <= M, compute attribute closure for each, check if closure equals A.
  • It can be solved by reducing to integer programming -- binary variable x_i per attribute; constraint that the closure of selected attributes covers all attributes; sum constraint sum(x_i) <= M. The closure constraint can be linearized using auxiliary variables.
  • Other: Lucchesi-Osborn algorithm enumerates all candidate keys in output-polynomial time. Can also reduce to Set Cover / Hitting Set and use known solvers.

Example Instance

Instance 1 (YES — has small key):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:

  • {0, 1} -> {2}
  • {0, 2} -> {3}
  • {1, 3} -> {4}
  • {2, 4} -> {5}

bound_k = 2

Analysis of candidate key {0, 1}:

  • Start: {0, 1}
  • Apply {0, 1} -> {2}: closure = {0, 1, 2}
  • Apply {0, 2} -> {3}: closure = {0, 1, 2, 3}
  • Apply {1, 3} -> {4}: closure = {0, 1, 2, 3, 4}
  • Apply {2, 4} -> {5}: closure = {0, 1, 2, 3, 4, 5} = A
  • {0, 1} is a key of cardinality 2 <= bound_k = 2.
  • Check minimality: {0} alone: closure = {0} (no applicable FD). {1} alone: closure = {1}. Neither determines A.
  • So {0, 1} is a minimal key (candidate key) of size 2.

Answer: YES.

Instance 2 (NO — no small key):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:

  • {0, 1, 2} -> {3}
  • {3, 4} -> {5}

bound_k = 2

No subset of size <= 2 can determine all of A:

  • {0, 1, 2} -> {3} requires all three attributes. Any 2-element subset misses at least one, so the FD cannot fire.
  • Even {0, 1}: closure = {0, 1} (cannot fire {0,1,2}->{3} without 2). Not a key.
  • No 2-element subset determines A.

Answer: NO.

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.modelA model problem to be implemented.

    Type

    No type

    Projects

    Status

    Ready

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions