Skip to content

[Model] PrimeAttributeName #446

@isPANN

Description

@isPANN

Motivation

PRIME ATTRIBUTE NAME (P176) from Garey & Johnson, A4 SR28. A classical NP-complete problem from relational database theory. An attribute is "prime" if it belongs to at least one candidate key. Determining whether a given attribute is prime is essential for database normalization: the distinction between prime and non-prime attributes is the foundation of Second Normal Form (2NF) and Third Normal Form (3NF). The NP-completeness of this problem means that even this basic classification task is computationally intractable in general.

Associated rules:

  • R122: Minimum Cardinality Key -> Prime Attribute Name (this model is the target)

Definition

Name: PrimeAttributeName
Canonical name: Prime Attribute Name (also: Prime Attribute Testing)
Reference: Garey & Johnson, Computers and Intractability, A4 SR28, p.232

Mathematical definition:

INSTANCE: A set A of attribute names, a collection F of functional dependencies on A, and a specified name x in A.
QUESTION: Is x a "prime attribute name" for <A,F>, i.e., is there a key K for <A,F> such that x in K?

Variables

  • Count: num_attributes binary variables, one per attribute.
  • Per-variable domain: binary {0, 1} — whether the attribute is included in a candidate key K that contains x.
  • dims(): vec![2; num_attributes]
  • evaluate(): Let K = {i : config[i] = 1}. Compute the closure of K under F*. Return true iff (1) closure(K) = A, (2) K is minimal (no proper subset of K also has closure = A), and (3) query_attribute is in K.

Schema (data type)

Type name: PrimeAttributeName
Variants: none (no graph or weight parameters)

Field Type Description
num_attributes usize Number of attributes
dependencies Vec<(Vec<usize>, Vec<usize>)> Functional dependencies F; each pair (lhs, rhs) means lhs -> rhs
query_attribute usize The specified attribute x (index into the attribute set)

Size fields (getter methods for overhead expressions and declare_variants!):

  • num_attributes() — returns num_attributes
  • num_dependencies() — returns dependencies.len()
  • query_attribute() — returns query_attribute

Notes:

  • This is a satisfaction (decision) problem: Metric = bool, implementing SatisfactionProblem.
  • No weight type is needed.
  • No budget parameter is needed (unlike Minimum Cardinality Key).
  • The problem asks only whether x appears in ANY candidate key, not whether x appears in a key of bounded size.
  • An attribute is "non-prime" if it does not appear in any candidate key. Non-prime attributes are those that are functionally determined by every candidate key but never participate in one.

Complexity

  • Best known exact algorithm: Enumerate all subsets of A containing x, check each for key property (closure = A, minimality). Worst case: O(2^num_attributes * num_dependencies * num_attributes). Can terminate early when a key containing x is found.
  • NP-completeness: NP-complete [Lucchesi and Osborn, 1978], via transformation from MINIMUM CARDINALITY KEY.
  • declare_variants! complexity string: "2^num_attributes * num_dependencies * num_attributes"
  • Relationship to normal forms: An attribute x is prime iff it is relevant to 2NF/3NF decomposition. A relation is in 3NF iff for every non-trivial FD X -> Y, either X is a superkey or Y consists only of prime attributes.
  • References:
    • C. L. Lucchesi and S. L. Osborne (1978). "Candidate keys for relations." J. Computer and System Sciences, 17(2), pp. 270-279.

Extra Remark

Full book text:

INSTANCE: A set A of attribute names, a collection F of functional dependencies on A, and a specified name x ∈ A.
QUESTION: Is x a "prime attribute name" for <A,F>, i.e., is there a key K for <A,F> such that x ∈ K?
Reference: [Lucchesi and Osborne, 1977]. Transformation from MINIMUM CARDINALITY KEY.

Connection to normal forms: In database normalization theory:

  • A "prime attribute" is an attribute that belongs to at least one candidate key.
  • A "non-prime attribute" belongs to no candidate key.
  • 2NF requires that no non-prime attribute is partially dependent on any candidate key.
  • 3NF requires that for every non-trivial FD X -> A, either X is a superkey or A is a prime attribute.
  • Since determining whether an attribute is prime is NP-complete, checking 2NF and 3NF is also intractable in general.

How to solve

  • It can be solved by (existing) bruteforce -- enumerate all subsets of A containing x, check each for key property (closure = A, minimality).
  • It can be solved by reducing to integer programming -- binary variable per attribute; constraint x = 1; constraint that closure covers A; minimality constraint; feasibility check.
  • Other: Use Lucchesi-Osborne key enumeration with early termination when a key containing x is found. Can also reduce to SAT.

Example Instance

Instance 1 (YES — x is prime):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:

  • {0, 1} -> {2, 3, 4, 5}
  • {2, 3} -> {0, 1, 4, 5}
  • {0, 3} -> {1, 2, 4, 5}
    query_attribute = 3

Analysis of candidate keys:

  • {0, 1}: closure = {0,1,2,3,4,5} = A. Key. Does NOT contain 3.
  • {2, 3}: closure = {2,3,0,1,4,5} = A. Key. Contains 3.

Since {2, 3} is a candidate key containing 3, attribute 3 is prime.

Answer: YES.

Instance 2 (NO — x is not prime):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
Functional dependencies F:

  • {0, 1} -> {2, 3, 4, 5}

query_attribute = 3

Analysis:

  • {0, 1}: closure = {0,1,2,3,4,5} = A. Key. Minimal. Does NOT contain 3.
  • All other pairs: {0,2} closure = {0,2}. {0,3} closure = {0,3}. {1,2} closure = {1,2}. etc. None determine A.
  • Triples containing 3: {0,1,3} is a superkey but not minimal (since {0,1} is already a key).
  • The only candidate key is {0, 1}, which does not contain 3.

Answer: NO, attribute 3 is not prime.

Instance 3 (YES — non-trivial, multiple keys):
num_attributes = 8 (attributes: {0, 1, 2, 3, 4, 5, 6, 7})
Functional dependencies F:

  • {0, 1} -> {2}
  • {2, 3} -> {4, 5}
  • {0, 3} -> {6}
  • {1, 6} -> {7}
  • {4, 7} -> {0, 1}

query_attribute = 4

Analysis:

  • Key {0, 1, 3}: closure -> {0,1,2} -> {2,3,4,5} -> {0,3,6} -> {1,6,7} -> all = A. Key. Does NOT contain 4.
  • Key {3, 4, 7}: {4,7} -> {0,1} -> {2} -> {2,3} -> {4,5} -> {0,3} -> {6} -> {1,6} -> {7}. Closure = A. Contains 4. Minimal? Check {4,7}: closure = {4,7,0,1} -> {2} -> no more (no 3). Not a key. Check {3,4}: no FD fires. Check {3,7}: no FD fires. So {3,4,7} is minimal.

Since {3, 4, 7} is a candidate key containing 4, attribute 4 is prime.

Answer: YES.

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.modelA model problem to be implemented.

    Type

    No type

    Projects

    Status

    Ready

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions