What is an Entity Set?
An entity set is a collection of similar entities that share the same properties or attributes. In database design, an entity set forms the foundation for creating tables in a relational database.
Think of the entity type behind an entity set as a template or blueprint that defines what information will be stored about a particular kind of object in the database. The entity set itself is the collection of entities that follow that template, and each individual entity within the set is an instance or occurrence of the entity type.
Characteristics of Entity Sets
An entity set has several important characteristics:
- Common Attributes: All entities in an entity set share the same attributes, though the values of these attributes differ for each entity instance.
- Identifiable Instances: Each entity in the set can be uniquely identified by the value of its key attribute(s).
- Well-Defined Boundary: There are clear criteria for determining which entities belong to the set and which do not.
- Relevant to Domain: Entity sets represent objects or concepts that are important to the business domain being modeled.
- Persistence: Entity sets typically represent data that needs to be stored persistently in the database.
Types of Entity Sets
Entity sets can be categorized in several ways:
Based on Dependency
1. Strong Entity Set
A strong entity set (also called a regular entity set) exists independently and has a primary key that uniquely identifies each entity instance.
Characteristics:
- Can exist on its own
- Has its own primary key
- Not dependent on other entity sets for identification
- Represented as a single rectangle in ER diagrams
Examples:
- Employee (with Employee_ID as primary key)
- Department (with Department_ID as primary key)
- Customer (with Customer_ID as primary key)
2. Weak Entity Set
A weak entity set cannot exist independently and depends on a strong entity set (called the owner or parent entity set) for its existence and identification.
Characteristics:
- Cannot exist without its owner entity set
- Does not have a complete primary key of its own
- Identified by its partial key plus the primary key of its owner
- Represented as a double rectangle in ER diagrams
- Connected to owner entity through an identifying relationship (double diamond)
Examples:
- Dependent (of an Employee) - identified by Dependent_Name plus Employee_ID
- Order_Line_Item (of an Order) - identified by Line_Number plus Order_ID
- Room (in a Building) - identified by Room_Number plus Building_ID
Based on Role
1. Base Entity Set
Represents the primary or fundamental entity types in the domain.
Examples:
- Person
- Product
- Location
2. Associative Entity Set
Represents a many-to-many relationship between other entity sets, often with attributes of its own.
Examples:
- Enrollment (connecting Student and Course)
- Order_Item (connecting Order and Product)
- Project_Assignment (connecting Employee and Project)
3. Subtype Entity Set
Represents a specialized version of a more general entity set.
Examples:
- Full_Time_Employee and Part_Time_Employee (subtypes of Employee)
- Checking_Account and Savings_Account (subtypes of Account)
- Fiction_Book and Non_Fiction_Book (subtypes of Book)
Entity Sets vs. Entities
It’s important to distinguish between entity sets and individual entities:
Entity Set
- A collection or set of similar entities
- Describes the structure (attributes) common to all instances
- Comparable to a class in object-oriented programming
- Maps to a table in a relational database
- Examples: Employee, Department, Product
Entity
- A single instance within an entity set
- Has specific values for its attributes
- Comparable to an object in object-oriented programming
- Maps to a row in a relational database table
- Examples: Employee “John Smith”, Department “Marketing”, Product “Laptop X1”
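The class/object analogy above can be sketched in a few lines of Python. This is only an illustration: the attribute names and the second employee ("Jane Doe", "Engineering") are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """Entity type: the structure shared by every Employee entity."""
    employee_id: int  # key attribute
    name: str
    department: str


# The entity set is the collection of all entities of that type.
employee_set = {
    Employee(1, "John Smith", "Marketing"),
    Employee(2, "Jane Doe", "Engineering"),  # hypothetical second entity
}

# A single entity is one member of the set, found through its key value.
john = next(e for e in employee_set if e.employee_id == 1)
print(john.name)  # prints "John Smith"
```

In a relational database the class would correspond to the table definition and each object to one row.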
Creating Entity Sets in ER Modeling
When defining entity sets in an ER model, follow these steps:
1. Identify Entity Sets
- Look for nouns in the requirements that represent important objects
- Consider what data needs to be stored persistently
- Determine if the object has attributes worth tracking
- Check if multiple instances of the object exist
2. Define Attributes
- Identify properties that describe each entity
- Determine the primary key attribute(s)
- Classify attributes (simple, composite, multi-valued, derived)
- Specify the domain (data type and constraints) for each attribute
3. Establish Relationships
- Identify how different entity sets relate to each other
- Determine the cardinality of relationships
- Identify whether participation is total or partial
- Create relationship sets to connect entity sets
4. Identify Weak Entity Sets
- Determine if any entity sets depend on others for identification
- Establish identifying relationships with owner entity sets
- Define partial keys for weak entity sets
Representing Entity Sets in ER Diagrams
Entity sets are represented differently depending on the ER notation used:
Chen Notation
- Strong entity sets: Rectangles with the entity set name inside
- Weak entity sets: Double rectangles
- Attributes: Ovals connected to the entity rectangle
- Key attributes: Underlined names in ovals
Crow’s Foot Notation
- Entity sets: Rectangles with the entity set name at the top
- Attributes: Listed inside the rectangle
- Primary key attributes: Marked with “PK” or underlined
- Weak entity sets: No single standard symbol; some tools show them with rounded corners or another tool-specific marking
IDEF1X Notation
- Independent entity sets: Rectangles with square corners
- Dependent entity sets: Rectangles with rounded corners
- Key attributes: Listed above a horizontal line inside the rectangle
- Non-key attributes: Listed below the line
Entity Set Constraints
Entity sets are subject to various constraints that maintain data integrity:
1. Domain Constraints
Restrict the values that attributes can take within an entity set.
Examples:
- Age attribute must be a positive integer
- Gender attribute must be one of a predefined set of values
- Salary attribute must be within a valid range
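As a rough sketch of how such domain constraints might be enforced, the example below uses Python's built-in sqlite3 module with CHECK constraints. The table layout, the allowed Gender codes, and the salary range are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        Employee_ID INTEGER PRIMARY KEY,
        -- positive integer (typeof guards against non-integer values)
        Age    INTEGER NOT NULL CHECK (typeof(Age) = 'integer' AND Age > 0),
        -- one of a predefined set of values (codes are hypothetical)
        Gender TEXT    NOT NULL CHECK (Gender IN ('F', 'M', 'X')),
        -- within a valid range (bounds are hypothetical)
        Salary REAL    NOT NULL CHECK (Salary BETWEEN 0 AND 1000000)
    )
""")


def try_insert(row):
    """Return True if the row satisfies every domain constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Employee (Employee_ID, Age, Gender, Salary) "
                "VALUES (?, ?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        return False


valid = try_insert((1, 34, "F", 52000.0))
bad_age = try_insert((2, -5, "M", 48000.0))          # violates Age > 0
bad_gender = try_insert((3, 41, "unknown", 48000.0))  # not an allowed value
```

Only the first row is stored; the other two are rejected by the database itself rather than by application code.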
2. Key Constraints
Ensure that each entity in an entity set can be uniquely identified.
Examples:
- Employee_ID must be unique within the Employee entity set
- Social_Security_Number must be unique for each Person
- The combination of First_Name and Last_Name could serve as a key only if no two entities can ever share the same full name, which rarely holds in practice
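A key constraint could be declared roughly as follows, again using sqlite3 as an assumed target. Person_ID is a hypothetical surrogate key added for the example, and the Social_Security_Number values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Person (
        Person_ID              INTEGER PRIMARY KEY,   -- hypothetical surrogate key
        Social_Security_Number TEXT NOT NULL UNIQUE,  -- candidate key
        First_Name             TEXT NOT NULL,
        Last_Name              TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO Person VALUES (1, '000-00-0001', 'John', 'Smith')")
# Two people may share a name as long as their key values differ.
conn.execute("INSERT INTO Person VALUES (2, '000-00-0002', 'John', 'Smith')")

try:
    # Reusing an existing Social_Security_Number violates the key constraint.
    conn.execute("INSERT INTO Person VALUES (3, '000-00-0001', 'Ann', 'Lee')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The second insert succeeds even though the name repeats, which is why a name combination is usually a poor choice of key.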
3. Entity Integrity Constraint
States that no primary key attribute can have a null value.
Example:
- Every Student must have a Student_ID value
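A minimal sketch of entity integrity in sqlite3 follows. One assumption worth flagging: SQLite, unlike most database systems, accepts NULL in a non-INTEGER primary key column unless NOT NULL is written explicitly, so the example states it. The Name column and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        Student_ID TEXT NOT NULL PRIMARY KEY,  -- NOT NULL stated explicitly for SQLite
        Name       TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO Student VALUES ('S001', 'Alice')")

try:
    # A student without a Student_ID cannot be identified, so it is rejected.
    conn.execute("INSERT INTO Student VALUES (NULL, 'Bob')")
    null_key_rejected = False
except sqlite3.IntegrityError:
    null_key_rejected = True
```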
4. Referential Integrity Constraint
Ensures that every reference from one entity set to another points to an entity that actually exists, so relationships between entity sets remain consistent.
Example:
- Every Department_ID in an Employee record must refer to a valid Department
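The Department_ID example could be enforced with a foreign key, sketched here with sqlite3. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection. The Name columns and sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

conn.execute("""
    CREATE TABLE Department (
        Department_ID INTEGER PRIMARY KEY,
        Name          TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE Employee (
        Employee_ID   INTEGER PRIMARY KEY,
        Name          TEXT NOT NULL,
        Department_ID INTEGER NOT NULL REFERENCES Department (Department_ID)
    )
""")

conn.execute("INSERT INTO Department VALUES (10, 'Marketing')")
conn.execute("INSERT INTO Employee VALUES (1, 'John Smith', 10)")  # valid reference

try:
    # Department 99 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO Employee VALUES (2, 'Jane Doe', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```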
Entity Sets in Database Implementation
When implementing a database based on an ER model, entity sets are transformed into tables:
1. For Strong Entity Sets
- Create a table with the same name as the entity set
- Define columns for all simple attributes
- Break down composite attributes into simple component columns
- Implement multi-valued attributes as separate tables
- Designate the primary key column(s)
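The steps above can be sketched for a Customer entity set as follows. The composite attribute (an address split into Street, City, Postal_Code) and the multi-valued attribute (Phone_Number) are hypothetical, as are the table name Customer_Phone and the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- Strong entity set: one table, primary key designated
    CREATE TABLE Customer (
        Customer_ID INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL,   -- simple attribute
        Street      TEXT,            -- composite attribute Address,
        City        TEXT,            -- broken into its simple
        Postal_Code TEXT             -- component columns
    );

    -- Multi-valued attribute: separate table, one row per value
    CREATE TABLE Customer_Phone (
        Customer_ID  INTEGER NOT NULL REFERENCES Customer (Customer_ID),
        Phone_Number TEXT    NOT NULL,
        PRIMARY KEY (Customer_ID, Phone_Number)
    );
""")

conn.execute(
    "INSERT INTO Customer VALUES (1, 'Acme Ltd', '1 Main St', 'Springfield', '12345')"
)
conn.executemany(
    "INSERT INTO Customer_Phone VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)

phones = [
    row[0]
    for row in conn.execute(
        "SELECT Phone_Number FROM Customer_Phone "
        "WHERE Customer_ID = 1 ORDER BY Phone_Number"
    )
]
```

After running this, `phones` holds both numbers for the single customer row, which a single column in Customer could not represent cleanly.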
2. For Weak Entity Sets
- Create a table with the same name as the weak entity set
- Include all attributes of the weak entity set
- Include the primary key of the owner entity set as a foreign key
- Define the primary key as the combination of the owner’s primary key and the weak entity’s partial key
- Create a foreign key constraint referencing the owner table
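One possible mapping of the Dependent weak entity set, sketched with sqlite3. The Relationship column and the sample rows are hypothetical, and `ON DELETE CASCADE` is a design assumption that models the existence dependency; other policies are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Employee (
        Employee_ID INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL
    );

    CREATE TABLE Dependent (
        Employee_ID    INTEGER NOT NULL,  -- owner's primary key
        Dependent_Name TEXT    NOT NULL,  -- partial key
        Relationship   TEXT,              -- hypothetical extra attribute
        PRIMARY KEY (Employee_ID, Dependent_Name),
        FOREIGN KEY (Employee_ID) REFERENCES Employee (Employee_ID)
            ON DELETE CASCADE             -- assumption: dependents go with the owner
    );
""")

conn.execute("INSERT INTO Employee VALUES (1, 'John Smith')")
conn.execute("INSERT INTO Employee VALUES (2, 'Jane Doe')")
# The same partial key is fine under different owners.
conn.execute("INSERT INTO Dependent VALUES (1, 'Sam', 'Child')")
conn.execute("INSERT INTO Dependent VALUES (2, 'Sam', 'Spouse')")

# Removing the owner removes its dependents as well.
conn.execute("DELETE FROM Employee WHERE Employee_ID = 1")
remaining = conn.execute(
    "SELECT Employee_ID, Dependent_Name FROM Dependent"
).fetchall()
```

Only the dependent owned by employee 2 is left in `remaining`, showing that a Dependent row cannot outlive its Employee.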
3. For Associative Entity Sets
- Create a table with columns for the primary keys of the connected entity sets
- Add columns for any attributes of the relationship
- Define the primary key, typically as the combination of the foreign keys (add further attributes if the same pair of entities can be related more than once)
- Create foreign key constraints to the connected tables
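A sketch of the Enrollment associative entity set connecting Student and Course, using sqlite3. The Grade attribute, the Name and Title columns, and all sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
    CREATE TABLE Course  (Course_ID  INTEGER PRIMARY KEY, Title TEXT NOT NULL);

    CREATE TABLE Enrollment (
        Student_ID INTEGER NOT NULL REFERENCES Student (Student_ID),
        Course_ID  INTEGER NOT NULL REFERENCES Course (Course_ID),
        Grade      TEXT,                      -- attribute of the relationship
        PRIMARY KEY (Student_ID, Course_ID)   -- combination of the foreign keys
    );
""")

conn.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
conn.executemany(
    "INSERT INTO Course VALUES (?, ?)", [(100, "Databases"), (200, "Networks")]
)
conn.executemany(
    "INSERT INTO Enrollment VALUES (?, ?, ?)",
    [(1, 100, "A"), (1, 200, "B"), (2, 100, None)],
)

# Resolve the many-to-many relationship through the associative table.
courses_for_alice = [
    row[0]
    for row in conn.execute("""
        SELECT Course.Title
        FROM Enrollment
        JOIN Course ON Course.Course_ID = Enrollment.Course_ID
        WHERE Enrollment.Student_ID = 1
        ORDER BY Course.Title
    """)
]
```

Here one student is linked to several courses and one course to several students, with Grade stored on the link itself.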
Best Practices for Designing Entity Sets
- Use Clear Naming Conventions:
  - Use singular nouns for entity set names (e.g., “Employee” not “Employees”)
  - Make names descriptive and self-explanatory
  - Be consistent in capitalization and formatting
- Choose Appropriate Granularity:
  - Avoid overly broad entity sets that try to represent too many concepts
  - Don’t create unnecessarily specific entity sets that could be combined
- Proper Attribute Assignment:
  - Assign attributes to the most appropriate entity set
  - Avoid duplication of attributes across entity sets
  - Use relationships rather than duplicating data
- Thoughtful Key Selection:
  - Choose primary keys that are stable and unlikely to change
  - Consider using surrogate keys (system-generated IDs) for simplicity
  - Ensure keys will uniquely identify entities even as the database grows
- Consider Future Growth:
  - Design entity sets that can accommodate future requirements
  - Allow for extensibility where appropriate
  - Balance current needs with flexibility for the future
Entity sets form the foundation of ER modeling and database design, providing a way to organize and structure data that reflects the real-world domain being modeled.