Referential Integrity

What is Referential Integrity?

Referential integrity is a database concept that ensures relationships between tables remain consistent. It guarantees that a reference from one table to another is valid by ensuring that the referenced value exists in the target table.

The Foreign Key Rule

Referential integrity is enforced through the foreign key constraint, which states:

If a foreign key exists in a table, either the foreign key value must match a primary key value in the referenced table or it must be NULL (if allowed).

This rule maintains consistency by preventing:

  1. References to non-existent records (orphaned records)
  2. Accidental deletion of data that other records depend on

Components of Referential Integrity

1. Referenced Table (Parent Table)

  • Contains the primary key that is referenced by other tables
  • Usually represents the “one” side in a one-to-many relationship

2. Referencing Table (Child Table)

  • Contains the foreign key that references the primary key in the parent table
  • Usually represents the “many” side in a one-to-many relationship

3. Foreign Key Constraint

  • Defines the relationship between tables
  • Specifies which columns in the child table reference which columns in the parent table
  • Includes rules for what happens when the referenced data changes

Implementing Referential Integrity

Basic Foreign Key Syntax

CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    OrderDate DATE,
    CustomerID INT,
    TotalAmount DECIMAL(10,2),
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);

In this example:

  • Orders is the child table
  • Customers is the parent table
  • CustomerID in the Orders table is a foreign key that references the CustomerID in the Customers table

Composite Foreign Keys

A foreign key can consist of multiple columns that reference a composite primary key:

CREATE TABLE OrderDetails (
    OrderID INT,
    ProductID INT,
    Quantity INT,
    UnitPrice DECIMAL(10,2),
    PRIMARY KEY (OrderID, ProductID),
    FOREIGN KEY (OrderID) REFERENCES Orders(OrderID),
    FOREIGN KEY (ProductID) REFERENCES Products(ProductID)
);

Referential Actions

When a primary key value in the parent table is updated or deleted, there are several possible actions that can be taken on the corresponding foreign key values:

1. CASCADE

  • ON DELETE CASCADE: When a parent record is deleted, automatically delete all related child records
  • ON UPDATE CASCADE: When a primary key is updated, automatically update all related foreign keys
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    OrderDate DATE,
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

2. SET NULL

  • ON DELETE SET NULL: When a parent record is deleted, set the foreign key to NULL
  • ON UPDATE SET NULL: When a primary key is updated, set the foreign key to NULL
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    OrderDate DATE,
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
        ON DELETE SET NULL
        ON UPDATE SET NULL
);

3. SET DEFAULT

  • ON DELETE SET DEFAULT: When a parent record is deleted, set the foreign key to its default value
  • ON UPDATE SET DEFAULT: When a primary key is updated, set the foreign key to its default value
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT DEFAULT 0,
    OrderDate DATE,
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
        ON DELETE SET DEFAULT
        ON UPDATE SET DEFAULT
);

4. RESTRICT / NO ACTION

  • ON DELETE RESTRICT: Prevent deletion of a parent record if it has related child records
  • ON UPDATE RESTRICT: Prevent updates to a primary key if it has related foreign keys
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    OrderDate DATE,
    FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
        ON DELETE RESTRICT
        ON UPDATE RESTRICT
);

Note: NO ACTION is similar to RESTRICT but is checked at the end of the statement, while RESTRICT is checked immediately.

Choosing the Right Referential Action

The choice of referential action depends on business requirements:

ActionWhen to Use
CASCADEWhen child records should follow the parent’s fate
SET NULLWhen child records can exist without a parent
SET DEFAULTWhen child records should revert to a default parent
RESTRICTWhen operations on parent should be prevented if children exist

Referential Integrity Violations

Referential integrity is violated when:

  1. Inserting Invalid References: Adding a record with a foreign key value that doesn’t exist in the parent table
  2. Deleting Referenced Records: Deleting a parent record that has dependent child records (without appropriate action)
  3. Updating Referenced Keys: Changing a primary key value that is referenced by foreign keys (without appropriate action)

Example of a Violation:

-- Assuming there is no customer with CustomerID = 999
INSERT INTO Orders (OrderID, CustomerID, OrderDate)
VALUES (101, 999, '2023-03-15');
-- This will fail with a foreign key violation error

Benefits of Referential Integrity

  1. Data Consistency: Ensures relationships between tables are valid
  2. Prevents Orphaned Records: No references to non-existent data
  3. Maintains Business Rules: Enforces real-world relationships
  4. Centralized Rule Enforcement: Database enforces rules rather than application code
  5. Cascading Operations: Automates related updates and deletes

Referential Integrity Challenges

  1. Performance Impact: Checking constraints adds overhead to data operations
  2. Circular References: Complex when tables reference each other
  3. Distributed Databases: Harder to maintain across multiple database systems
  4. Bulk Operations: Can slow down large data loads or migrations

Implementation Approaches

1. Declarative Referential Integrity (DRI)

  • Implemented using FOREIGN KEY constraints in SQL
  • Enforced automatically by the DBMS
  • Most common and recommended approach

2. Procedural Referential Integrity

  • Implemented using triggers, stored procedures, or application code
  • Used when declarative constraints are insufficient
  • More flexible but requires more maintenance

Deferred Constraint Checking

In some DBMS, you can defer constraint checking until the end of a transaction:

SET CONSTRAINTS ALL DEFERRED;

-- Perform operations that temporarily violate referential integrity

COMMIT; -- Constraints are checked at this point

This is useful for operations where the database passes through an inconsistent state temporarily.

Referential Integrity in Database Design

When designing databases with referential integrity:

  1. Plan the relationships between entities carefully
  2. Choose appropriate referential actions based on business requirements
  3. Consider the impact on data operations, especially for large tables
  4. Document the relationships for clarity
  5. Test thoroughly with different scenarios