What is Referential Integrity?
Referential integrity is a database concept that ensures relationships between tables remain consistent. It guarantees that a reference from one table to another is valid by ensuring that the referenced value exists in the target table.
The Foreign Key Rule
Referential integrity is enforced through the foreign key constraint, which states:
If a foreign key exists in a table, either the foreign key value must match a primary key value in the referenced table or it must be NULL (if allowed).
This rule maintains consistency by preventing:
- References to non-existent records (orphaned records)
- Accidental deletion of data that other records depend on
Components of Referential Integrity
1. Referenced Table (Parent Table)
- Contains the primary key that is referenced by other tables
- Usually represents the “one” side in a one-to-many relationship
2. Referencing Table (Child Table)
- Contains the foreign key that references the primary key in the parent table
- Usually represents the “many” side in a one-to-many relationship
3. Foreign Key Constraint
- Defines the relationship between tables
- Specifies which columns in the child table reference which columns in the parent table
- Includes rules for what happens when the referenced data changes
Implementing Referential Integrity
Basic Foreign Key Syntax
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderDate DATE,
CustomerID INT,
TotalAmount DECIMAL(10,2),
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
);
In this example:
Ordersis the child tableCustomersis the parent tableCustomerIDin the Orders table is a foreign key that references the CustomerID in the Customers table
Composite Foreign Keys
A foreign key can consist of multiple columns that reference a composite primary key:
CREATE TABLE OrderDetails (
OrderID INT,
ProductID INT,
Quantity INT,
UnitPrice DECIMAL(10,2),
PRIMARY KEY (OrderID, ProductID),
FOREIGN KEY (OrderID) REFERENCES Orders(OrderID),
FOREIGN KEY (ProductID) REFERENCES Products(ProductID)
);
Referential Actions
When a primary key value in the parent table is updated or deleted, there are several possible actions that can be taken on the corresponding foreign key values:
1. CASCADE
- ON DELETE CASCADE: When a parent record is deleted, automatically delete all related child records
- ON UPDATE CASCADE: When a primary key is updated, automatically update all related foreign keys
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE,
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
ON DELETE CASCADE
ON UPDATE CASCADE
);
2. SET NULL
- ON DELETE SET NULL: When a parent record is deleted, set the foreign key to NULL
- ON UPDATE SET NULL: When a primary key is updated, set the foreign key to NULL
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE,
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
ON DELETE SET NULL
ON UPDATE SET NULL
);
3. SET DEFAULT
- ON DELETE SET DEFAULT: When a parent record is deleted, set the foreign key to its default value
- ON UPDATE SET DEFAULT: When a primary key is updated, set the foreign key to its default value
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT DEFAULT 0,
OrderDate DATE,
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
ON DELETE SET DEFAULT
ON UPDATE SET DEFAULT
);
4. RESTRICT / NO ACTION
- ON DELETE RESTRICT: Prevent deletion of a parent record if it has related child records
- ON UPDATE RESTRICT: Prevent updates to a primary key if it has related foreign keys
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE,
FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
ON DELETE RESTRICT
ON UPDATE RESTRICT
);
Note: NO ACTION is similar to RESTRICT but is checked at the end of the statement, while RESTRICT is checked immediately.
Choosing the Right Referential Action
The choice of referential action depends on business requirements:
| Action | When to Use |
|---|---|
| CASCADE | When child records should follow the parent’s fate |
| SET NULL | When child records can exist without a parent |
| SET DEFAULT | When child records should revert to a default parent |
| RESTRICT | When operations on parent should be prevented if children exist |
Referential Integrity Violations
Referential integrity is violated when:
- Inserting Invalid References: Adding a record with a foreign key value that doesn’t exist in the parent table
- Deleting Referenced Records: Deleting a parent record that has dependent child records (without appropriate action)
- Updating Referenced Keys: Changing a primary key value that is referenced by foreign keys (without appropriate action)
Example of a Violation:
-- Assuming there is no customer with CustomerID = 999
INSERT INTO Orders (OrderID, CustomerID, OrderDate)
VALUES (101, 999, '2023-03-15');
-- This will fail with a foreign key violation error
Benefits of Referential Integrity
- Data Consistency: Ensures relationships between tables are valid
- Prevents Orphaned Records: No references to non-existent data
- Maintains Business Rules: Enforces real-world relationships
- Centralized Rule Enforcement: Database enforces rules rather than application code
- Cascading Operations: Automates related updates and deletes
Referential Integrity Challenges
- Performance Impact: Checking constraints adds overhead to data operations
- Circular References: Complex when tables reference each other
- Distributed Databases: Harder to maintain across multiple database systems
- Bulk Operations: Can slow down large data loads or migrations
Implementation Approaches
1. Declarative Referential Integrity (DRI)
- Implemented using FOREIGN KEY constraints in SQL
- Enforced automatically by the DBMS
- Most common and recommended approach
2. Procedural Referential Integrity
- Implemented using triggers, stored procedures, or application code
- Used when declarative constraints are insufficient
- More flexible but requires more maintenance
Deferred Constraint Checking
In some DBMS, you can defer constraint checking until the end of a transaction:
SET CONSTRAINTS ALL DEFERRED;
-- Perform operations that temporarily violate referential integrity
COMMIT; -- Constraints are checked at this point
This is useful for operations where the database passes through an inconsistent state temporarily.
Referential Integrity in Database Design
When designing databases with referential integrity:
- Plan the relationships between entities carefully
- Choose appropriate referential actions based on business requirements
- Consider the impact on data operations, especially for large tables
- Document the relationships for clarity
- Test thoroughly with different scenarios