Data

What is Data?

Data refers to raw, unorganized facts and figures that carry little meaning on their own. Data can take the form of numbers, text, images, audio, video, or other formats. In the context of databases, data represents known facts about entities that can be recorded and that have implicit meaning, which becomes explicit only once the facts are placed in context.

Types of Data

  1. Structured Data: Organized in a predefined format

    • Examples: Rows in a relational table, spreadsheet cells, phone numbers, dates
    • Easy to search and analyze
    • Usually stored in relational databases
  2. Semi-structured Data: Has some organizational properties but doesn’t conform to a rigid structure

    • Examples: XML files, JSON documents, email
    • More flexible than structured data
    • Often stored in NoSQL databases
  3. Unstructured Data: Has no predefined format or organization

    • Examples: Text documents, social media posts, images, videos
    • Most difficult to search and analyze
    • Typically stored in file systems, object stores, or NoSQL databases rather than in relational tables
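
The three categories above can be illustrated with a minimal Python sketch. The sample values (names, phone numbers, the email address) are invented for illustration only; the point is how the amount of predefined structure changes the way each kind of data is queried.

```python
import csv
import io
import json

# Structured: every row follows the same predefined columns.
structured_csv = "id,name,phone\n1,Ada,555-0100\n2,Grace,555-0101\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: keys describe the values, but records may differ in shape.
semi_structured = json.loads(
    '[{"name": "Ada", "tags": ["math"]},'
    ' {"name": "Grace", "email": "grace@example.com"}]'
)

# Unstructured: free text with no predefined fields.
unstructured = "Met Ada and Grace today. Grace mentioned the compiler project."

# Structured data supports direct lookup by column name.
phones = [row["phone"] for row in rows]

# Semi-structured data needs a fallback, because a key may be absent.
emails = [doc.get("email") for doc in semi_structured]

# Unstructured data can only be scanned, here with a plain substring count.
mentions = unstructured.count("Grace")

print(phones, emails, mentions)
```

Note that the lookup gets less reliable at each step: a column always exists, a key may be missing, and free text offers nothing to look up at all.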

Data Hierarchy

Data in a database is typically organized in a hierarchy, listed here from the smallest unit to the largest:

  1. Bit: The smallest unit of data (0 or 1)
  2. Byte: A collection of 8 bits
  3. Field/Attribute: A single piece of information (e.g., name, age)
  4. Record/Tuple: A collection of related fields (e.g., all information about one person)
  5. Table/Relation: A collection of records of the same type
  6. Database: A collection of related tables or other data structures
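
As a rough sketch of the hierarchy above, the following Python snippet walks from a single byte up to a small database. It uses an in-memory SQLite database from the standard library; the `person` table and its columns are hypothetical names chosen for this example.

```python
import sqlite3

# Bit and byte: the character "A" occupies one byte, which is 8 bits.
byte_value = "A".encode("ascii")
bits = format(byte_value[0], "08b")

# Database: a collection of related tables (here, just one).
conn = sqlite3.connect(":memory:")

# Table/Relation: records of the same type, with two fields (name, age).
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")

# Record/Tuple: all the fields describing one person.
conn.execute("INSERT INTO person VALUES (?, ?)", ("John", 25))
record = conn.execute("SELECT name, age FROM person").fetchone()

# Field/Attribute: a single piece of information within the record.
field = record[1]
conn.close()

print(bits, record, field)
```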

Data Properties

Good data has several important properties:

  1. Accuracy: Data should correctly represent the real-world entity or event
  2. Completeness: All required data should be present
  3. Consistency: The same fact should not be recorded with conflicting values in different parts of the database
  4. Timeliness: Data should be up to date
  5. Validity: Data should conform to the defined format or rules
  6. Uniqueness: Each entity should be recorded only once, with no unnecessary duplicates
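
Some of these properties can be checked mechanically. The sketch below is one possible approach, not a standard routine: the sample records, the simplified email pattern, and the `STALE_BEFORE` cutoff are all assumptions made for illustration. Accuracy and consistency are left out because they generally require comparison against the real world or against other tables.

```python
import re
from datetime import date

# Hypothetical sample records.
records = [
    {"id": 1, "email": "ada@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": "not-an-email", "updated": date(2024, 5, 2)},
    {"id": 2, "email": "grace@example.com", "updated": date(2019, 1, 1)},
    {"id": 3, "email": None, "updated": date(2024, 5, 3)},
]

# Deliberately simplified pattern; real email validation is more involved.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
STALE_BEFORE = date(2023, 1, 1)  # assumed cutoff for "up to date"


def check(records):
    """Return (id, violated_property) pairs for the given records."""
    problems = []
    seen_ids = set()
    for r in records:
        if r["email"] is None:
            problems.append((r["id"], "completeness"))
        elif not EMAIL_RE.match(r["email"]):
            problems.append((r["id"], "validity"))
        if r["updated"] < STALE_BEFORE:
            problems.append((r["id"], "timeliness"))
        if r["id"] in seen_ids:
            problems.append((r["id"], "uniqueness"))
        seen_ids.add(r["id"])
    return problems


problems = check(records)
print(problems)
```

In practice a DBMS can enforce several of these rules itself through constraints, so application-level checks like this are usually a complement rather than a replacement.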

Data vs. Information

It’s important to understand the difference between data and information:

  • Data: Raw facts without context (e.g., “25”)
  • Information: Data that has been processed, organized, or given context to make it meaningful (e.g., “John is 25 years old”)

In a DBMS, raw data is stored in databases and then transformed into useful information through queries and processing.
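
The step from data to information can be sketched with a small query. This example assumes an in-memory SQLite database and a hypothetical `person` table; the second row ("Mary", 31) is invented so that an aggregate has something to work on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("John", 25), ("Mary", 31)],
)

# Data: a bare value with no context.
raw_value = conn.execute(
    "SELECT age FROM person WHERE name = ?", ("John",)
).fetchone()[0]

# Information: the same value, given context by the query and its columns.
name, age = conn.execute(
    "SELECT name, age FROM person WHERE name = ?", ("John",)
).fetchone()
sentence = f"{name} is {age} years old"

# Information can also come from processing many values at once.
average_age = conn.execute("SELECT AVG(age) FROM person").fetchone()[0]
conn.close()

print(raw_value, sentence, average_age)
```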

The Role of Data in DBMS

In a DBMS:

  • Data is stored in an organized manner
  • Data is protected from unauthorized access
  • Data integrity is maintained
  • Duplicate data is minimized
  • Data can be shared among multiple users
  • Data can be easily retrieved using queries
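
Two of the points above, integrity and minimal duplication, can be sketched with declarative constraints. The example assumes SQLite via Python's standard library and a hypothetical `person` table. Access control and multi-user sharing are not shown, since SQLite has no user accounts; server-based systems handle those separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " email TEXT PRIMARY KEY,"                 # no two rows may share an email
    " age INTEGER NOT NULL CHECK (age >= 0)"   # age must be present and non-negative
    ")"
)
conn.execute("INSERT INTO person VALUES (?, ?)", ("john@example.com", 25))

rejected = []
candidates = [
    ("john@example.com", 40),   # duplicate key
    ("mary@example.com", -3),   # violates the CHECK rule
]
for row in candidates:
    try:
        conn.execute("INSERT INTO person VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        # The DBMS refuses the row instead of storing bad data.
        rejected.append(row)

stored = conn.execute("SELECT email, age FROM person").fetchall()
conn.close()

print(stored, rejected)
```

Declaring the rules in the schema means every application that writes to the table is held to them, which is the main reason to let the DBMS enforce integrity rather than each program separately.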

Understanding data is fundamental to working with databases, as the primary purpose of any DBMS is to effectively store, manage, and retrieve data.