Introduction to File Systems

Definition

File System: Software layer managing storage and retrieval of data on disk. Provides abstraction from raw blocks to logical files and directories.

Need for File System

Raw Disk Problems

Raw disk: Just blocks (512B or 4KB units)
Block 0, Block 1, Block 2, ..., Block N

Problems:
❌ No structure (which blocks belong together?)
❌ No naming (which block is "myfile.txt"?)
❌ No protection (can delete others' data)
❌ No convenience (manage blocks yourself?)

Solution: File System

File Abstraction

File

Named collection of related data:

File: myfile.txt
Content: "Hello world!"
Size: 12 bytes
Created: 2024-01-15
Modified: 2024-01-16
Permissions: read/write

Directories

Hierarchical organization:

/home/user/documents/
├── file1.txt
├── file2.doc
└── photos/
    ├── vacation.jpg
    └── family.png

Abstraction Benefits

User perspective:
Open file "myfile.txt"
Read data
Close file

Actual work:
1. Find file (directory lookup)
2. Find blocks on disk
3. Allocate buffer
4. Read blocks from disk
5. Copy to user memory
6. Free buffer

Transparent to user!
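
The three user-level steps map onto three calls in most languages. A minimal Python sketch, assuming a scratch file in a temporary directory (the temporary directory is only there to keep the example self-contained):

```python
import os
import tempfile

# Create a scratch file so the example is self-contained.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "myfile.txt")
with open(path, "w") as f:
    f.write("Hello world!")

# User perspective: open, read, close.
f = open(path, "r")   # directory lookup and inode load happen inside the OS
data = f.read()       # block lookup, buffering, and copying are hidden
f.close()             # resources are released

print(data)
print(len(data), "bytes")
```

None of the six "actual work" steps appear in the program; the file system and OS perform them behind these calls.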

File System Components

1. Boot Block

System startup code:

Disk Layout:
┌─────────────────┐
│ Boot Block      │ Block 0 (startup code)
├─────────────────┤
│ Superblock      │ File system metadata
├─────────────────┤
│ Inode List      │ File metadata
├─────────────────┤
│ Data Blocks     │ Actual file data
└─────────────────┘

2. Superblock

File system metadata:

Superblock contains:
- Total blocks count
- Block size
- Inode count
- Free blocks count
- Free inodes count
- Block group size
- Timestamps
- Mount count
- Dirty bit (clean shutdown?)
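
To make the idea of fixed on-disk metadata concrete, here is a sketch that serializes a few of the fields listed above. The layout (six 32-bit integers plus a one-byte dirty flag) and all field values are assumptions chosen for illustration; no real file system uses exactly this format.

```python
import struct

# Hypothetical on-disk layout (not a real file system's format):
# six unsigned 32-bit integers followed by a one-byte dirty flag, little-endian.
SUPERBLOCK_FORMAT = "<6IB"
FIELDS = ("total_blocks", "block_size", "inode_count",
          "free_blocks", "free_inodes", "mount_count", "dirty")

def pack_superblock(sb: dict) -> bytes:
    """Serialize the superblock fields into raw bytes."""
    return struct.pack(SUPERBLOCK_FORMAT, *(sb[name] for name in FIELDS))

def unpack_superblock(raw: bytes) -> dict:
    """Parse raw bytes back into named fields."""
    size = struct.calcsize(SUPERBLOCK_FORMAT)
    values = struct.unpack(SUPERBLOCK_FORMAT, raw[:size])
    return dict(zip(FIELDS, values))

# Illustrative values only.
sb = {"total_blocks": 262144, "block_size": 4096, "inode_count": 65536,
      "free_blocks": 200000, "free_inodes": 60000, "mount_count": 3, "dirty": 0}
raw = pack_superblock(sb)
print(len(raw), "bytes on disk")
print(unpack_superblock(raw) == sb)
```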

3. Inode Table

File metadata:

Inode (Index Node):
- File size
- Owner (user ID)
- Permissions (read/write/execute)
- Timestamps (access, modification, inode change; some file systems also store creation time)
- Disk block locations
- Link count
- Type (file/directory/symlink)

Example inode:
Inode 1234:
  Size: 4096 bytes
  Owner: user_id 500
  Permissions: 644 (rw-r--r--)
  Blocks: [100, 101, 102, 103] (four 1KB blocks)
  Links: 1
  Type: Regular file

4. Data Blocks

Actual file content:

File: myfile.txt
Content: "Hello world! This is a test."

Stored in blocks (tiny 16-byte blocks, for illustration only):
Block 100: "Hello world! Thi"
Block 101: "s is a test."

The file's inode points to blocks [100, 101]
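
The split shown above can be reproduced with a toy model. This is a sketch under stated assumptions: the "disk" is a Python dict, blocks are 16 bytes to match the illustration, and the `Inode` class and helper names are hypothetical.

```python
from dataclasses import dataclass

BLOCK_SIZE = 16  # tiny blocks, matching the illustration

@dataclass
class Inode:
    size: int      # file size in bytes
    blocks: list   # block numbers, in file order

def write_file(disk: dict, start_block: int, content: bytes) -> Inode:
    """Split content into BLOCK_SIZE chunks and store them in consecutive blocks."""
    blocks = []
    for i in range(0, len(content), BLOCK_SIZE):
        block_no = start_block + i // BLOCK_SIZE
        disk[block_no] = content[i:i + BLOCK_SIZE]
        blocks.append(block_no)
    return Inode(size=len(content), blocks=blocks)

def read_file(disk: dict, inode: Inode) -> bytes:
    """Follow the inode's block list and reassemble the content."""
    data = b"".join(disk[b] for b in inode.blocks)
    return data[:inode.size]

disk = {}  # block number -> bytes; stands in for the raw device
inode = write_file(disk, 100, b"Hello world! This is a test.")
print(inode.blocks)
print(disk[100], disk[101])
print(read_file(disk, inode))
```

The inode holds only the size and block list; the content lives entirely in the data blocks.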

5. Free Space Management

Track available blocks:

Methods:
1. Bitmap: One bit per block (1=free, 0=used)
   1GB disk, 4KB blocks = 262,144 blocks
   Bitmap: 262,144 bits = 32KB

2. Free list: Linked list of free blocks

3. Free block pool: Cache a small set of free block numbers in memory for fast allocation
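
The bitmap method and its size arithmetic can be sketched as follows. The class and method names are hypothetical, and the first-fit scan is one simple allocation policy among several.

```python
class BlockBitmap:
    """One bit per block; 1 = free, 0 = used."""

    def __init__(self, total_blocks: int):
        self.total_blocks = total_blocks
        # Start with every block free: all bits set to 1.
        self.bits = bytearray(b"\xff" * ((total_blocks + 7) // 8))

    def is_free(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Return the lowest-numbered free block and mark it used."""
        for block in range(self.total_blocks):
            if self.is_free(block):
                self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
                return block
        raise OSError("no free blocks")

    def free(self, block: int) -> None:
        """Mark a block free again."""
        self.bits[block // 8] |= 1 << (block % 8)

total_blocks = (1 * 1024**3) // (4 * 1024)   # 1GB disk, 4KB blocks
bitmap = BlockBitmap(total_blocks)
print(total_blocks, "blocks")
print(len(bitmap.bits) // 1024, "KB bitmap")

first = bitmap.allocate()
second = bitmap.allocate()
bitmap.free(first)
reused = bitmap.allocate()   # the freed block is handed out again
print(first, second, reused)
```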

File Access Methods

Sequential Access

Read file from beginning to end:

Reading file "document.txt":
Byte 0: Read
Byte 1: Read
Byte 2: Read
...
End: Read

Efficient on disk (little or no seeking when blocks are laid out contiguously)

Direct Access

Jump to specific location:

Random access database (fixed 1000-byte records):
Record 1000: Seek to byte 1000000
Record 2000: Seek to byte 2000000

Slow on spinning disks (each jump may require a seek)
Modern systems reduce the cost with caching
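
With fixed-size records, the byte offset of record n is simply n times the record size. A small sketch, assuming 1000-byte records and a scratch file with made-up contents:

```python
import os
import tempfile

RECORD_SIZE = 1000  # assumption: fixed-size records, so record n starts at n * 1000

path = os.path.join(tempfile.mkdtemp(), "records.db")

# Build a small file of 5 records; each is its number padded to RECORD_SIZE bytes.
with open(path, "wb") as f:
    for n in range(5):
        f.write(str(n).encode().ljust(RECORD_SIZE, b"."))

def read_record(f, n: int) -> bytes:
    """Jump straight to record n without reading the records before it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)

with open(path, "rb") as f:
    rec3 = read_record(f, 3)
    rec1 = read_record(f, 1)   # jumping backwards works just as well

print(rec3[:4], rec1[:4])
```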

Indexed Access

Use index structure:

Index:
"Smith" → Block 50
"Jones" → Block 100
"Brown" → Block 75

Lookup Smith → Read block 50
Fast for large files!
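
A toy version of the lookup above: the index maps a key to a block number, so finding a record costs one index lookup plus one block read. The record contents and the `read_block` helper are made up for illustration.

```python
# Index: key -> block number (values taken from the example above).
index = {"Smith": 50, "Jones": 100, "Brown": 75}

# Stand-in for the disk: block number -> record stored in that block.
disk = {50: "Smith record", 100: "Jones record", 75: "Brown record"}

blocks_read = []

def read_block(block_no: int) -> str:
    """Simulate one disk read and record which block was touched."""
    blocks_read.append(block_no)
    return disk[block_no]

def lookup(key: str) -> str:
    """One index lookup, then one block read; no scan of the data file."""
    return read_block(index[key])

result = lookup("Smith")
print(result)
print(blocks_read)
```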

File Types (Unix/Linux)

Regular File

-rw-r--r--  user  file.txt
(- = regular file)

Contains arbitrary data
Text or binary

Symbolic Link

lrwxrwxrwx  user  link → /path/to/file
(l = link)

Points to another file

Block Device

brw-rw----  root  /dev/sda1
(b = block device)

Disk, partition (block-addressable)

Character Device

crw-rw-rw-  root  /dev/tty
(c = character device)

Terminal, serial port (byte-stream I/O, one character at a time)
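
The type letter shown by `ls -l` comes from the type bits in the file's mode. A sketch using Python's standard `stat` module, assuming a Unix-like system where symbolic links can be created; the `file_type` helper is hypothetical.

```python
import os
import stat
import tempfile

def file_type(path: str) -> str:
    """Classify a path by the type bits of its mode (lstat does not follow symlinks)."""
    mode = os.lstat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISDIR(mode):
        return "directory"
    return "other"

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "file.txt")
link = os.path.join(workdir, "link")
with open(target, "w") as f:
    f.write("data")
os.symlink(target, link)

for p in (target, link):
    # stat.filemode renders the mode the way `ls -l` does; the first character is the type.
    print(stat.filemode(os.lstat(p).st_mode), file_type(p))
```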

File Operations

Create

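Create file "newfile.txt":
1. Allocate a free inode (mark it as used)
2. Initialize inode metadata (owner, permissions, timestamps, size 0)
3. Add directory entry (name → inode number)
4. Write updated inode and directory to disk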

Open

Open file "myfile.txt":
1. Find file in directory
2. Load inode into memory
3. Check permissions (can I read/write?)
4. Create file descriptor (handle)
5. Initialize file pointer (position 0)

Read

Read 100 bytes from open file:
1. Check file pointer position
2. Determine which block needed
3. Load block if not in cache
4. Copy bytes to user buffer
5. Update file pointer
6. Return bytes read

Write

Write 100 bytes to open file:
1. Check file pointer position
2. Determine which block needed
3. Allocate new block if needed
4. Copy bytes to block (in cache)
5. Mark block dirty
6. Update file pointer
7. Update file size if the file grew
8. Update modification time

Close

Close file:
1. Flush buffered data (the OS may write dirty blocks to disk later; fsync forces it)
2. Free file descriptor
3. Mark updated metadata for write-back
4. Release resources
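
The operations above correspond closely to the low-level calls in Python's `os` module, which wrap the underlying system calls. A sketch, assuming a scratch file in a temporary directory:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "newfile.txt")

# Create + open: O_CREAT creates the file if it does not exist; a descriptor is returned.
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)

# Write: the file pointer starts at 0 and advances by the number of bytes written.
written = os.write(fd, b"Hello world!")
pos_after_write = os.lseek(fd, 0, os.SEEK_CUR)

# Read: move the file pointer back to the start, then read 5 bytes.
os.lseek(fd, 0, os.SEEK_SET)
first = os.read(fd, 5)
pos_after_read = os.lseek(fd, 0, os.SEEK_CUR)

# Optional: ask the OS to push dirty blocks to the device before closing.
os.fsync(fd)

# Close: release the descriptor.
os.close(fd)

print(written, pos_after_write, first, pos_after_read)
print(os.stat(path).st_size)
```

Calling `os.lseek(fd, 0, os.SEEK_CUR)` moves nothing; it only reports the current file pointer, which makes the pointer updates in the read and write steps visible.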

File Names

Conventions

Case sensitivity:
Unix/Linux: Typically case-sensitive (file.txt ≠ FILE.TXT)
Windows: Case-insensitive by default, but case-preserving (file.txt and FILE.TXT name the same file)

Extensions:
.txt, .doc, .jpg (conventions, not enforced)
(Unix doesn't require extensions)

Max length:
255 bytes or characters per name typical (varies by file system)

Paths

Absolute path: /home/user/documents/file.txt
(starts with /)

Relative path: documents/file.txt
(from current directory)

Current: .
Home: ~ (shell shorthand for the user's home directory)
Parent: ..

File Protection

Ownership

File attributes:
Owner: user_id (initially the user who created the file)
Group: group_id (owning group)

Three levels:
Owner permissions
Group permissions
Others permissions

Permissions

Unix permissions: rwx (read, write, execute)

Examples:
755: rwxr-xr-x (owner: all, group: read+execute, others: read+execute)
644: rw-r--r-- (owner: read+write, group: read, others: read)
700: rwx------ (owner: all, group/others: none)

Execute on files: Run as program
Execute on directories: Can enter directory
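
Each octal digit encodes one rwx triplet (read = 4, write = 2, execute = 1). The sketch below decodes the three examples and applies a simplified access check. Both helper names are hypothetical, and the check ignores the superuser and access control lists.

```python
def mode_to_string(mode: int) -> str:
    """Render the nine permission bits as rwx triplets for owner, group, others."""
    out = []
    for shift in (6, 3, 0):   # owner, group, others
        triplet = (mode >> shift) & 0o7
        out.append("r" if triplet & 4 else "-")
        out.append("w" if triplet & 2 else "-")
        out.append("x" if triplet & 1 else "-")
    return "".join(out)

def can_access(mode: int, file_uid: int, file_gid: int,
               uid: int, gids: set, want: int) -> bool:
    """Simplified check: exactly one triplet applies (owner, else group, else others)."""
    if uid == file_uid:
        triplet = (mode >> 6) & 0o7
    elif file_gid in gids:
        triplet = (mode >> 3) & 0o7
    else:
        triplet = mode & 0o7
    return (triplet & want) == want

for m in (0o755, 0o644, 0o700):
    print(oct(m)[2:], mode_to_string(m))

# The owner (uid 500) may write a 644 file; an unrelated user may only read it.
print(can_access(0o644, 500, 100, uid=500, gids={100}, want=2))
print(can_access(0o644, 500, 100, uid=501, gids={200}, want=2))
print(can_access(0o644, 500, 100, uid=501, gids={200}, want=4))
```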

File System Structure

Single-Level Directory

All files in one directory:
/
├── file1.txt
├── file2.txt
└── file3.txt

Problem: Name conflicts (two file1.txt?)
Used only by early systems

Two-Level Directory

Each user has own directory:
/user1/
├── file1.txt
└── file2.txt

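/user2/
├── file1.txt
└── file2.txt

Solves name conflicts between users (both can have file1.txt)
Problem: No subdirectories within a user's directory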

Hierarchical (Tree)

Modern structure:
/
├── home/
│   ├── user1/
│   │   ├── documents/
│   │   └── downloads/
│   └── user2/
├── var/
├── etc/
└── usr/
    └── bin/

Most flexible, most complex
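
In a tree, a path is resolved one component at a time: each directory maps names to inode numbers, and the lookup descends from the root. A toy sketch of that walk; the inode numbers, including the choice of 2 for the root, are made up for illustration.

```python
# Toy directory table: inode number -> {name: child inode number}.
ROOT = 2
directories = {
    2: {"home": 3, "var": 4, "etc": 5, "usr": 6},
    3: {"user1": 7, "user2": 8},
    7: {"documents": 9, "downloads": 10},
    6: {"bin": 11},
    4: {}, 5: {}, 8: {}, 9: {}, 10: {}, 11: {},
}

def lookup(path: str) -> int:
    """Walk an absolute path one component at a time, starting at the root."""
    inode = ROOT
    for name in path.strip("/").split("/"):
        if name == "":
            continue   # the path was just "/"
        entries = directories.get(inode)
        if entries is None or name not in entries:
            raise FileNotFoundError(path)
        inode = entries[name]
    return inode

print(lookup("/"))
print(lookup("/home/user1/documents"))
print(lookup("/usr/bin"))
```

Deeper paths cost more directory lookups, which is part of the added complexity of the hierarchical design.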

Summary

A file system abstracts disk blocks into named files and directories, providing protection, convenience, and organization. Key components: boot block, superblock, inode table, data blocks, and free space management. Inodes store file metadata; directories provide hierarchical structure. File operations: create, open, read, write, close. Permissions enforce access control. Modern systems use hierarchical directories. Understanding file systems is essential for appreciating how the OS manages persistent data. Different file systems (NTFS, ext4, APFS) implement these concepts differently, but the principles are the same.