-
Notifications
You must be signed in to change notification settings - Fork 3
Description
Motivation
MINIMUM CARDINALITY KEY (P174) from Garey & Johnson, A4 SR26. A classical NP-complete problem from relational database theory. Given a set of attribute names and functional dependencies, the problem asks whether there exists a candidate key of cardinality at most M. This is fundamental to database normalization: identifying the smallest key determines the most efficient way to uniquely identify rows in a relation. The problem connects graph-theoretic vertex cover to database schema design.
Associated rules:
- R120: Vertex Cover -> Minimum Cardinality Key (this model is the target)
- R122: Minimum Cardinality Key -> Prime Attribute Name (this model is the source)
Definition
Name: MinimumCardinalityKey
Canonical name: Minimum Cardinality Key (also: Minimum Key, Least Cardinality Candidate Key)
Reference: Garey & Johnson, Computers and Intractability, A4 SR26, p.232
Mathematical definition:
INSTANCE: A set A of "attribute names," a collection F of ordered pairs of subsets of A (called "functional dependencies" on A), and a positive integer M.
QUESTION: Is there a key of cardinality M or less for the relational system <A,F>, i.e., a minimal subset K ⊆ A with |K| <= M such that the ordered pair (K,A) belongs to the "closure" F* of F defined by (1) F ⊆ F*, (2) B ⊆ C ⊆ A implies (C,B) in F*, (3) (B,C),(C,D) in F* implies (B,D) in F*, and (4) (B,C),(B,D) in F* implies (B,C ∪ D) in F*?
Variables
- Count:
num_attributesbinary variables, one per attribute. - Per-variable domain: binary {0, 1} — whether the attribute is included in the candidate key K.
dims():vec![2; num_attributes]evaluate(): Let K = {i : config[i] = 1}. Compute the closure of K under F* (repeatedly apply functional dependencies until no new attributes are added). Returntrueiff closure(K) = A, |K| <= bound_k, and K is minimal (no proper subset of K also has closure = A).
Schema (data type)
Type name: MinimumCardinalityKey
Variants: none (no graph or weight parameters; the problem is purely set-theoretic)
| Field | Type | Description |
|---|---|---|
num_attributes |
usize |
Number of attributes |
dependencies |
Vec<(Vec<usize>, Vec<usize>)> |
Functional dependencies F; each pair (lhs, rhs) means lhs -> rhs |
bound_k |
usize |
The positive integer M: key must have cardinality <= bound_k |
Size fields (getter methods for overhead expressions and declare_variants!):
num_attributes()— returnsnum_attributesnum_dependencies()— returnsdependencies.len()bound_k()— returnsbound_k
Notes:
- This is a satisfaction (decision) problem:
Metric = bool, implementingSatisfactionProblem. - No weight type is needed.
- The closure F* is computed using Armstrong's axioms: reflexivity (B ⊆ C implies C -> B), transitivity ((B,C) and (C,D) imply (B,D)), and augmentation/union ((B,C) and (B,D) imply (B, C ∪ D)).
- A key K must satisfy: (1) K determines all of A (i.e., (K,A) in F*), and (2) K is minimal (no proper subset of K also determines A).
- The number of candidate keys can be exponential in |A|.
Complexity
- Best known exact algorithm: Brute-force enumeration of all subsets of A of size at most M, checking each for the key property via attribute closure computation. Closure computation is O(|F| * |A|) per subset (linear pass through dependencies). Total: O(binom(|A|, M) * |F| * |A|). For M = |A|/2, this is O(2^|A| * |F| * |A|). Lucchesi and Osborn (1978) give an algorithm that finds all candidate keys in time polynomial in |A|, |F|, and the number of keys |K|, but since |K| can be exponential, the worst case remains exponential.
- NP-completeness: NP-complete [Lucchesi and Osborn, 1978], via transformation from VERTEX COVER.
- References:
- C. L. Lucchesi and S. L. Osborn (1978). "Candidate keys for relations." J. Computer and System Sciences, 17(2), pp. 270-279.
- W. Lipsky, Jr. (1977). "Two NP-complete problems related to information retrieval." Fundamentals of Computation Theory, Springer.
Extra Remark
Full book text:
INSTANCE: A set A of "attribute names," a collection F of ordered pairs of subsets of A (called "functional dependencies" on A), and a positive integer M.
QUESTION: Is there a key of cardinality M or less for the relational system <A,F>, i.e., a minimal subset K ⊆ A with |K| <= M such that the ordered pair (K,A) belongs to the "closure" F* of F defined by (1) F ⊆ F*, (2) B ⊆ C ⊆ A implies (C,B) ∈ F*, (3) (B,C),(C,D) ∈ F* implies (B,D) ∈ F*, and (4) (B,C),(B,D) ∈ F* implies (B,C ∪ D) ∈ F*?
Reference: [Lucchesi and Osborn, 1978] (cited as 1977 preprint in GJ), [Lipsky, 1977a]. Transformation from VERTEX COVER. See [Date, 1975] for general background on relational data bases.
Connection to Vertex Cover: The Minimum Cardinality Key problem generalizes Vertex Cover. In the reduction, vertices become attributes, edges become functional dependencies (each endpoint determines the edge attribute), and a vertex cover corresponds to a set of attributes that determines all edge attributes. The key cardinality bound M corresponds directly to the vertex cover budget k.
How to solve
- It can be solved by (existing) bruteforce -- enumerate all subsets of A of size <= M, compute attribute closure for each, check if closure equals A.
- It can be solved by reducing to integer programming -- binary variable x_i per attribute; constraint that the closure of selected attributes covers all attributes; sum constraint sum(x_i) <= M. The closure constraint can be linearized using auxiliary variables.
- Other: Lucchesi-Osborn algorithm enumerates all candidate keys in output-polynomial time. Can also reduce to Set Cover / Hitting Set and use known solvers.
Example Instance
Instance 1 (YES — has small key):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:
- {0, 1} -> {2}
- {0, 2} -> {3}
- {1, 3} -> {4}
- {2, 4} -> {5}
bound_k = 2
Analysis of candidate key {0, 1}:
- Start: {0, 1}
- Apply {0, 1} -> {2}: closure = {0, 1, 2}
- Apply {0, 2} -> {3}: closure = {0, 1, 2, 3}
- Apply {1, 3} -> {4}: closure = {0, 1, 2, 3, 4}
- Apply {2, 4} -> {5}: closure = {0, 1, 2, 3, 4, 5} = A
- {0, 1} is a key of cardinality 2 <= bound_k = 2.
- Check minimality: {0} alone: closure = {0} (no applicable FD). {1} alone: closure = {1}. Neither determines A.
- So {0, 1} is a minimal key (candidate key) of size 2.
Answer: YES.
Instance 2 (NO — no small key):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:
- {0, 1, 2} -> {3}
- {3, 4} -> {5}
bound_k = 2
No subset of size <= 2 can determine all of A:
- {0, 1, 2} -> {3} requires all three attributes. Any 2-element subset misses at least one, so the FD cannot fire.
- Even {0, 1}: closure = {0, 1} (cannot fire {0,1,2}->{3} without 2). Not a key.
- No 2-element subset determines A.
Answer: NO.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status