Attributes

What are Attributes?

Attributes are properties or characteristics that describe entities in a database. They represent the specific pieces of data that we want to store about each entity and form the basis for columns in database tables.

In simple terms, attributes are the details we need to know about the things (entities) we’re tracking in our database. For example, if we have a “Student” entity, attributes might include name, ID number, date of birth, and address.

Purpose of Attributes

Attributes serve several important purposes in database design:

Data Storage: Define what specific information needs to be stored
Entity Description: Provide detailed characteristics of entities
Differentiation: Help distinguish one entity instance from another
Search Criteria: Allow for querying and retrieving specific entities
Business Rules: Capture constraints and validation rules
Relationships: Sometimes used to establish connections between entities

Types of Attributes

Attributes can be classified into several types based on their characteristics:

1. Simple vs. Composite Attributes

Simple Attribute

Cannot be divided further into smaller components
Represents a single, atomic value
Examples: Age, Gender, Employee ID, Price

Composite Attribute

Can be divided into smaller constituent parts
Made up of multiple simple attributes
Represented in ER diagrams as an attribute with connected sub-attributes
Examples:
- Address (consists of Street, City, State, ZIP)
- Name (consists of First Name, Middle Name, Last Name)
- Date (consists of Day, Month, Year)

2. Single-valued vs. Multi-valued Attributes

Single-valued Attribute

Can hold only one value for each entity instance
The most common type of attribute
Examples: Date of Birth, Social Security Number, Employee ID

Multi-valued Attribute

Can hold multiple values for the same entity instance
Represented with a double oval in ER diagrams
Usually implemented as separate tables in relational databases
Examples:
- Phone Numbers (a person may have home, work, and mobile numbers)
- Skills (an employee may have multiple skills)
- Email Addresses (a user may have multiple email addresses)

3. Stored vs. Derived Attributes

Stored Attribute

Physically stored in the database
Basic attribute whose value is directly entered and maintained
Examples: Date of Birth, Name, Salary

Derived Attribute

Value calculated or derived from other attributes
Not physically stored but computed when needed
Represented with a dashed oval in ER diagrams
Examples:
- Age (derived from Date of Birth and current date)
- Total Price (derived from Quantity and Unit Price)
- Full Name (derived from First Name and Last Name)
- Years of Service (derived from Hire Date)

4. Key vs. Non-key Attributes

Key Attribute

Uniquely identifies each entity instance within an entity set
Used as a primary key in the corresponding database table
Underlined in ER diagrams
Examples: Student ID, Employee ID, Product Code, Social Security Number

Non-key Attribute

Does not uniquely identify an entity
Regular descriptive attribute
Examples: Name, Address, Color, Size

5. Required vs. Optional Attributes

Required Attribute

Must have a value for every entity instance
Cannot be null
Examples: Employee ID, Student Name, Product Code

Optional Attribute

May have no value for some entity instances
Can be null
Examples: Middle Name, Fax Number, Secondary Email

6. Domain-Specific Attributes

Domain-Specific Attribute

Has a specific defined domain or range of allowable values
Often implemented with check constraints in databases
Examples:
- Gender (domain: ‘Male’, ‘Female’, ‘Non-binary’)
- Status (domain: ‘Active’, ‘Inactive’, ‘Pending’)
- Grade (domain: ‘A’, ‘B’, ‘C’, ‘D’, ‘F’)

Attribute Domains

The domain of an attribute defines the set of allowable values that the attribute can take. It includes:

Data Type: The kind of data the attribute can store (e.g., integer, string, date)
Range or List: The specific values or range of values that are valid
Format: Any specific format requirements (e.g., phone number format)
Default Value: The value assigned if none is specified
Constraints: Rules that restrict the possible values

Examples of Domains:

Age: Integer values between 0 and 150
Email: String following a valid email format
US Zip Code: 5-digit or 9-digit format
Credit Card Type: One of ‘Visa’, ‘MasterCard’, ‘American Express’, ‘Discover’

Representation of Attributes in ER Diagrams

Attributes are represented differently in various ER notation styles:

Chen Notation

Attributes shown as ovals/ellipses connected to entity rectangles
Attribute name written inside the oval
Key attributes are underlined
Multi-valued attributes use double ovals
Derived attributes use dashed ovals
Composite attributes have their component attributes connected to them

Crow’s Foot Notation

Attributes listed inside the entity rectangle
Primary key attributes often listed at the top, sometimes marked with PK
Multi-valued attributes often not explicitly shown (implemented as separate tables)
Derived attributes sometimes marked with an asterisk (*) or shown in italics

IDEF1X Notation

Key attributes listed above a horizontal line in the entity box
Non-key attributes listed below the line
Foreign keys often marked with FK
Derived attributes typically not shown

Determining Appropriate Attributes

When designing a database, consider these guidelines for selecting appropriate attributes:

1. Relevance

Only include attributes that are relevant to the purpose of the database
Ask: “Will this information be needed for reporting or decision-making?“

2. Atomicity

Break down attributes to their most basic components when appropriate
Consider how the data will be used to determine level of detail

3. Redundancy

Avoid storing the same information in multiple places
Use derived attributes for data that can be calculated

4. Storage Efficiency

Consider the storage implications of different attribute choices
Balance between storage efficiency and query performance

5. Future Requirements

Consider potential future needs when defining attributes
Allow for flexibility where appropriate

Attribute Implementation in Relational Databases

When implementing attributes in a relational database:

1. Simple Attributes

Become columns in the corresponding table
Data type is determined by the attribute’s domain

2. Composite Attributes

Can be implemented in multiple ways:
- As separate columns for each component (e.g., street, city, state)
- As a single column for the entire composite attribute
- As separate columns plus a calculated column for the full value

3. Multi-valued Attributes

Typically implemented as a separate table
The new table includes:
- The primary key of the main entity
- The multi-valued attribute
- A primary key that combines both columns

4. Derived Attributes

Can be implemented as:
- Virtual columns or computed columns in databases that support them
- Views
- Application logic
- Stored procedures or functions
- Triggers that maintain calculated values

5. Key Attributes

Implemented as primary key columns
Often indexed for performance
May be surrogate keys (system-generated) or natural keys (based on business data)

Common Attribute Design Issues

1. Overloaded Attributes

Problem: Using one attribute to store multiple types of information
Example: Using “Phone” to store both home and work numbers
Solution: Create separate attributes for each type of information

2. Splitting Attributes Inappropriately

Problem: Breaking down attributes too far
Example: Splitting “Name” into too many components (prefix, first, middle, last, suffix)
Solution: Balance atomicity with practicality based on how the data will be used

3. Encoding Information in Attributes

Problem: Embedding multiple pieces of information in a single value
Example: Employee ID “IT-1067-NY” (department, number, location)
Solution: Separate the encoded information into distinct attributes

4. Inconsistent Data Types

Problem: Using different data types for similar attributes across entities
Example: “Price” as decimal in one entity and integer in another
Solution: Standardize data types for conceptually similar attributes

5. Naming Inconsistencies

Problem: Using different names for the same concept
Example: “Customer_ID” in one entity and “Cust_Num” in another
Solution: Adopt and enforce naming conventions

Best Practices for Attribute Design

Use Descriptive Names:
- Choose clear, self-explanatory names
- Follow consistent naming conventions
- Avoid abbreviations unless they are standard and well-understood
Define Domains Clearly:
- Specify data types, lengths, and constraints
- Document allowed values and formats
- Include validation rules
Consider Performance:
- Understand how attribute choices affect query performance
- Choose appropriate data types for efficiency
- Consider indexing needs for attributes used in searches
Plan for Nulls:
- Decide which attributes can have null values
- Consider using default values instead of nulls where appropriate
- Understand the implications of nulls for queries and constraints
Document Attributes:
- Maintain a data dictionary with attribute definitions
- Include business rules and validation criteria
- Document the source of derived attributes
Balance Flexibility and Control:
- Design attributes to accommodate future needs
- Use constraints to maintain data integrity
- Consider using enumerated types for limited value sets

Attributes form the fundamental building blocks of database design, defining exactly what information will be stored about each entity. Careful attention to attribute design ensures that a database can efficiently and accurately represent the real-world objects and concepts it is intended to model.