Distributed Systems

Definition

A distributed system is a collection of independent computers, connected by a network, that communicate and coordinate with each other to achieve a common goal. To the user, the collection appears as a single coherent system, even though the computers may be geographically dispersed and each has its own processor, memory, and clock.

Basic Concept

In a distributed system:

  • Multiple independent computers (nodes) with separate memories and processors
  • Connected via network (LAN, WAN, Internet)
  • No shared memory - only message passing communication
  • No shared clock - each computer has own clock
  • Autonomous operation - each can work independently
  • Transparent to user - appears as single system
  • Loosely coupled - computers operate independently

Key Differences from Parallel Systems

Aspect          Parallel          Distributed
Memory          Shared            Separate
Coupling        Tight             Loose
Communication   Shared memory     Network messages
Clock           Synchronized      Independent
Location        Same place        Different locations
Failure         Affects others    May be isolated

System Architectures

1. Client-Server Architecture

Structure:

  • One or more powerful server computers providing services
  • Multiple client computers requesting services
  • Centralized resource management

How it works:

  1. Client sends request to server
  2. Server processes request
  3. Server sends response back to client
  4. Client receives response
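
The four steps above can be sketched with TCP sockets on a single machine. This is a minimal illustration, not a production server: the function names (serve_once, request, demo), the loopback address, and the "HELLO" reply format are assumptions made for the example, and the server handles exactly one client before stopping.

```python
import socket
import threading

def serve_once(server_sock):
    """Server side: accept one client, read its request, send a response."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)             # step 2: server processes the request
        conn.sendall(b"HELLO " + data)     # step 3: server sends the response

def request(host, port, payload):
    """Client side: send a request and block until the response arrives."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)              # step 1: client sends the request
        return sock.recv(1024)             # step 4: client receives the response

def demo():
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    reply = request("127.0.0.1", port, b"client-1")
    worker.join()
    server_sock.close()
    return reply
```

A real server would loop over accept() and usually handle each client in its own thread or task; a single recv() is only adequate here because the messages are tiny.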

Characteristics:

  • Clear separation of roles
  • Server is centralized
  • Multiple clients
  • Request-response communication (the client typically waits for the reply)

Examples:

  • Web servers (client = browser, server = web server)
  • Email systems (client = email client, server = mail server)
  • Database servers
  • File servers

Advantages:

  • Centralized management
  • Easy to understand
  • Security at server
  • Easy to scale clients

Disadvantages:

  • Server can become a performance bottleneck
  • Server is a single point of failure: if it fails, all clients are affected

2. Peer-to-Peer (P2P) Architecture

Structure:

  • All nodes are equal peers
  • No central server
  • Each node acts as both client and server
  • Direct communication between peers

How it works:

  • Peer A requests data from Peer B
  • Peer B provides data
  • Peer B requests data from Peer C
  • All peers cooperate
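
The dual client/server role of each peer can be illustrated with an in-process simulation. This is only a sketch: the Peer class, its serve and fetch methods, and the sample data are hypothetical, and direct method calls stand in for network messages.

```python
class Peer:
    """A node that both serves its own data and requests data from other peers."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = dict(data or {})

    def serve(self, key):
        # Server role: answer a request coming from another peer.
        return self.data.get(key)

    def fetch(self, other, key):
        # Client role: ask another peer directly and keep a local copy,
        # so this peer can serve the same item to others later.
        value = other.serve(key)
        if value is not None:
            self.data[key] = value
        return value

a = Peer("A")
b = Peer("B", {"song": b"chunk-1"})
c = Peer("C", {"book": b"chunk-2"})

a.fetch(b, "song")   # Peer A requests data from Peer B
b.fetch(c, "book")   # Peer B requests data from Peer C
```

After these calls Peer A also holds "song", so it can now serve that item itself; this is what lets load spread across peers as more of them join.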

Characteristics:

  • No hierarchy
  • Decentralized
  • Shared resources
  • Self-organizing

Examples:

  • BitTorrent (file sharing)
  • Blockchain (cryptocurrency)
  • Skype (VoIP; early versions used a P2P overlay)
  • Instant messaging

Advantages:

  • Highly scalable
  • No single point of failure
  • Better load distribution
  • Robust and resilient

Disadvantages:

  • More complex
  • Harder to control quality
  • Security challenges
  • Harder to implement

3. Hybrid Architecture

Structure:

  • Combines client-server with P2P
  • Some centralized components
  • Some distributed components

Example:

  • Torrent downloading (P2P) with tracker server (client-server)

Communication in Distributed Systems

Message Passing

With no shared memory, message passing is the only way for independent computers to communicate.

Components:

  • Sender: Computer initiating communication
  • Message: Data to be transmitted
  • Network: Medium of transmission
  • Receiver: Computer receiving message

Communication Types

Synchronous Communication:

  • Sender waits for receiver response
  • Sender blocked until response arrives
  • Simple but slower

Asynchronous Communication:

  • Sender doesn’t wait for response
  • Sender continues working
  • Faster but complex coordination
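
The difference between the two styles can be sketched with a thread pool. In this illustration remote_call is a hypothetical stand-in for a network request (the sleep imitates network delay), and the function names are assumptions for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def remote_call(x):
    """Hypothetical remote operation; the sleep stands in for network delay."""
    time.sleep(0.05)
    return x * 2

def synchronous(values):
    # The sender blocks on each call before issuing the next one.
    return [remote_call(v) for v in values]

def asynchronous(values):
    # The sender submits every request up front and collects replies later.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(remote_call, v) for v in values]
        # ... the sender is free to do other work here ...
        return [f.result() for f in futures]
```

Both functions return the same results; the asynchronous version overlaps the waiting time of the calls, at the cost of having to track outstanding requests.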

Reliable Communication:

  • Lost messages are detected and retransmitted
  • Example: TCP
  • Slower due to acknowledgments and retransmissions

Unreliable Communication:

  • Messages may be lost
  • UDP protocol
  • Faster but less safe
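
Reliable delivery is usually built on top of an unreliable channel using acknowledgments and retransmission. The simulation below sketches a stop-and-wait scheme under stated assumptions: lossy_send is a hypothetical channel that drops packets at random, a lost packet or lost acknowledgment is treated as a timeout, and the sequence-number check stands in for the receiver's duplicate detection. It is not how TCP is actually implemented.

```python
import random

def lossy_send(packet, loss_rate, rng):
    """Hypothetical unreliable channel: returns the packet, or None if lost."""
    return None if rng.random() < loss_rate else packet

def send_reliably(messages, loss_rate=0.3, seed=1, max_tries=50):
    """Stop-and-wait: resend each message until its acknowledgment arrives."""
    rng = random.Random(seed)
    delivered = []       # what the receiver has accepted, in order
    transmissions = 0
    for seq, msg in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            packet = lossy_send((seq, msg), loss_rate, rng)
            if packet is None:
                continue                       # data lost: timeout, resend
            if packet[0] == len(delivered):    # receiver ignores duplicates
                delivered.append(packet[1])
            ack = lossy_send(packet[0], loss_rate, rng)
            if ack is not None:
                break                          # ack arrived: next message
        else:
            raise TimeoutError(f"message {seq} was never acknowledged")
    return delivered, transmissions
```

The receiver ends up with every message exactly once and in order, but the number of transmissions grows with the loss rate, which is the cost of reliability noted above.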

Network Protocols

TCP:

  • Reliable ordered delivery
  • Used for email, web, file transfer
  • Slower, but lost data is retransmitted

UDP:

  • Fast unreliable delivery
  • Used for video, audio streaming
  • Fast but may lose data

HTTP/HTTPS:

  • Web protocol
  • Client-server over TCP
  • Request-response pattern

RPC (Remote Procedure Call):

  • Call functions on remote computer
  • Appears like local call
  • Abstracts network complexity
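
The idea behind RPC can be sketched with a client stub that serializes each call and a server-side dispatcher that executes it. Everything here is an assumption made for illustration: the JSON message format, the PROCEDURES table, and the RpcProxy class are hypothetical, and the transport is a plain function call rather than a network connection.

```python
import json

# Procedures the "remote" side exposes (hypothetical examples).
PROCEDURES = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

def handle_request(raw):
    """Server side: decode the request, run the procedure, encode the result."""
    call = json.loads(raw.decode("utf-8"))
    result = PROCEDURES[call["method"]](*call["params"])
    return json.dumps({"result": result}).encode("utf-8")

class RpcProxy:
    """Client stub: turns a method call into a serialized request."""

    def __init__(self, transport):
        self._transport = transport   # callable: request bytes -> response bytes

    def __getattr__(self, method):
        def call(*params):
            raw = json.dumps({"method": method, "params": params}).encode("utf-8")
            reply = self._transport(raw)
            return json.loads(reply.decode("utf-8"))["result"]
        return call

# An in-process transport stands in for the network in this sketch.
remote = RpcProxy(handle_request)
```

With this stub, remote.add(2, 3) reads like a local call even though the arguments travel as a serialized message. A real RPC system would also have to deal with timeouts, lost messages, and server errors, which a local call never faces.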

Distributed System Challenges

1. Heterogeneity

Problem: Different hardware, software, networks

  • Different processors (Intel, ARM, etc.)
  • Different operating systems (Windows, Linux, etc.)
  • Different network technologies
  • Different file formats and protocols

Solution: Standardized protocols and middleware

  • Use standard TCP/IP
  • Use standard file formats
  • Use middleware to hide differences
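
One concrete heterogeneity problem is byte order: different processors may lay out the same integer differently in memory. A common remedy is to agree on a fixed wire format. The sketch below uses Python's struct module with network (big-endian) byte order; the message layout (a 32-bit sensor id followed by a 64-bit float) is a made-up example.

```python
import struct

# "!" selects network (big-endian) byte order with standard sizes and no padding.
# "I" is an unsigned 32-bit integer, "d" is a 64-bit float.
WIRE_FORMAT = "!Id"

def encode_reading(sensor_id, value):
    """Pack a reading into 12 bytes that look the same on every machine."""
    return struct.pack(WIRE_FORMAT, sensor_id, value)

def decode_reading(data):
    """Unpack the 12-byte wire format back into (sensor_id, value)."""
    return struct.unpack(WIRE_FORMAT, data)
```

Because both sides use the same explicit format, the bytes on the wire do not depend on the sender's or receiver's native byte order.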

2. Transparency

Goal: Hide distributed nature from users

Types:

  • Location Transparency: User doesn’t know where resource is
  • Migration Transparency: Resources can move without affecting users
  • Replication Transparency: Multiple copies appear as one
  • Concurrency Transparency: Multiple users don’t interfere

3. Reliability

Problem: Components may fail

  • Computer crashes
  • Network disconnection
  • Software errors
  • Data corruption

Solutions:

  • Redundancy: Duplicate computers and data
  • Replication: Multiple copies of data
  • Backup: Regular backups
  • Error Detection: Find and fix errors
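
Error detection is often done by attaching a checksum to each message. The sketch below uses CRC-32 from Python's zlib module; the frame and unframe helpers and the 4-byte trailer layout are assumptions for the example. A checksum like this detects accidental corruption but does not repair it, and it offers no protection against deliberate tampering.

```python
import zlib

def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def unframe(data: bytes) -> bytes:
    """Return the payload, or raise ValueError if the checksum does not match."""
    payload, checksum = data[:-4], int.from_bytes(data[-4:], "big")
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch: data was corrupted")
    return payload
```

On a mismatch the receiver would typically discard the message and rely on retransmission or another replica to obtain a clean copy.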

4. Synchronization and Ordering

Problem: No shared clock, so ordering is difficult

  • Which event happened first?
  • Need to coordinate actions

Solutions:

  • Logical (Lamport) Clocks: Order events with counters that respect causality
  • Vector Clocks: Track causality and detect concurrent events
  • Physical Timestamps: Synchronize clocks with NTP (agreement is approximate, not exact)
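
Logical and vector clocks can be sketched in a few lines. This is a minimal illustration: the class and function names are chosen for the example, and vector clocks are represented as plain lists with one counter per process.

```python
class LamportClock:
    """Logical clock: a counter that respects the happened-before order."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance for a local event or a message send; returns the timestamp."""
        self.time += 1
        return self.time

    def receive(self, sender_time):
        """On receipt, move past both the local time and the sender's timestamp."""
        self.time = max(self.time, sender_time) + 1
        return self.time

def happened_before(v1, v2):
    """Vector clocks: v1 precedes v2 if v1 <= v2 element-wise and v1 != v2."""
    return all(a <= b for a, b in zip(v1, v2)) and v1 != v2

def concurrent(v1, v2):
    """Two distinct events are concurrent if neither happened before the other."""
    return v1 != v2 and not happened_before(v1, v2) and not happened_before(v2, v1)

p, q = LamportClock(), LamportClock()
t_send = p.tick()            # p sends a message stamped with its clock
q.tick()
q.tick()                     # q has two local events of its own
t_recv = q.receive(t_send)   # the receive is always stamped later than the send
```

A Lamport clock guarantees that a send is timestamped before its receive, but two unrelated events can still get ordered timestamps. Vector clocks add the ability to tell that two events are concurrent.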

5. Security

Problems:

  • Network eavesdropping
  • Unauthorized access
  • Data corruption
  • Denial of service attacks

Solutions:

  • Encryption: Scramble data in transit
  • Authentication: Verify user identity
  • Authorization: Control access rights
  • Firewall: Filter network traffic
  • Intrusion Detection: Detect attacks
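
As one small example of these defenses, a message authentication code lets the receiver check that a message came from a holder of a shared key and was not altered in transit. The sketch below uses HMAC-SHA256 from Python's standard library; the sign and verify helper names are assumptions, and how the two parties obtain the shared key is outside the scope of the example. This covers authentication and integrity only; it does not encrypt the message.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)
```

The sender transmits the message together with its tag; a receiver that shares the key recomputes the tag and rejects the message if the two differ.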

6. Fault Tolerance and Availability

Problem: Failures happen in distributed systems

Availability: Percentage of time system is operational

  • 99% availability = ~3.65 days downtime per year
  • 99.9% availability = ~8.8 hours downtime per year
  • 99.99% availability = ~53 minutes downtime per year
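
Downtime figures like these follow from a simple calculation: downtime = (1 - availability) x time in a year. The helper below is a small sketch that assumes a 365-day year (8760 hours); the function name is chosen for the example.

```python
HOURS_PER_YEAR = 365 * 24   # 8760, ignoring leap years

def downtime_hours_per_year(availability_percent):
    """Expected downtime per year, in hours, for a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR
```

For example, 99% availability gives 0.01 x 8760 = 87.6 hours (about 3.65 days), and each additional "nine" divides the downtime by ten.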

Fault Tolerance: System continues despite failures

  • Replication: Multiple copies continue service
  • Redundancy: Backup systems take over
  • Graceful Degradation: Reduced service continues
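
Failover to a replica can be sketched as a loop that tries each copy in turn. The names here (ReplicaDown, read_with_failover, and the two sample replicas) are hypothetical, and each replica is modeled as a plain function instead of a remote server.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot be reached."""

def read_with_failover(replicas, key):
    """Try each replica in order and return the first successful answer."""
    for replica in replicas:
        try:
            return replica(key)
        except ReplicaDown:
            continue               # this copy failed: fall back to the next one
    raise ReplicaDown("all replicas failed")

def crashed_primary(key):
    raise ReplicaDown("primary is down")

def healthy_backup(key):
    return {"x": 42}.get(key)
```

A call such as read_with_failover([crashed_primary, healthy_backup], "x") still succeeds because the backup answers; service is lost only when every replica is down.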

Distributed System Advantages

  1. Resource Sharing - Share expensive resources across network
  2. Cost Effective - Use cheaper computers instead of mainframe
  3. Reliability - System survives individual failures
  4. Performance - Distribute processing load
  5. Scalability - Easy to add more computers
  6. Flexibility - Can add/remove computers dynamically
  7. Accessibility - Access resources from anywhere
  8. Incremental Growth - Grow system gradually

Distributed System Disadvantages

  1. Complexity - Much more complex than single computer systems
  2. Difficult Debugging - Hard to diagnose problems across network
  3. Network Dependency - Depends on reliable network
  4. Security Issues - More vulnerable to attacks
  5. Synchronization - Coordinating is difficult
  6. Data Consistency - Keeping copies identical is hard
  7. Latency - Network communication is much slower than local memory access
  8. Overhead - Message passing overhead
  9. Difficult Testing - Hard to test all scenarios

Real Examples of Distributed Systems

Internet

  • Largest distributed system
  • Billions of computers
  • TCP/IP based
  • DNS for name resolution

World Wide Web

  • Distributed information system
  • HTTP protocol
  • Millions of servers
  • Client browsers request resources

Cloud Computing

  • AWS (Amazon Web Services)
  • Google Cloud
  • Microsoft Azure
  • On-demand computing resources

Email Systems

  • Distributed message delivery
  • SMTP for sending
  • IMAP/POP3 for retrieving mail from the server
  • Messages routed through network

Content Delivery Networks

  • Akamai, Cloudflare
  • Replicate content across world
  • Serve from nearest location
  • Reduce latency

Databases

  • Distributed databases (MongoDB, Cassandra)
  • Data replicated across nodes
  • Queries served by the nodes that hold the requested data
  • Synchronization across replicas

Blockchain and Cryptocurrency

  • Bitcoin, Ethereum
  • Decentralized ledger
  • Every full node holds a copy of the ledger
  • Consensus mechanism

Distributed Operating System Functions

1. Process Management

  • Create processes on different computers
  • Process migration between computers
  • Remote execution of programs
  • Load balancing across computers
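
Load balancing, the last item above, can be illustrated with a simple greedy policy: place each new task on the node that currently has the least work. This is only one possible policy, and the function name, task costs, and node names are assumptions for the example.

```python
def assign_tasks(task_costs, node_names):
    """Greedy load balancing: give each task to the least-loaded node so far."""
    load = {name: 0 for name in node_names}
    placement = {}
    for task, cost in task_costs.items():
        target = min(load, key=load.get)   # ties go to the first node listed
        load[target] += cost
        placement[task] = target
    return placement, load
```

Real schedulers also weigh factors such as data locality, migration cost, and changing load, which this sketch ignores.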

2. Memory Management

  • Distributed memory allocation
  • Cache management across network
  • Distributed virtual memory
  • Page replacement across nodes

3. File Management

  • Distributed file systems (NFS, AFS)
  • Replicate files across computers
  • Consistency maintenance
  • Access control

4. Communication Management

  • Message passing infrastructure
  • Network protocol management
  • Data serialization
  • Latency hiding

5. Synchronization

  • Distributed mutual exclusion
  • Deadlock detection
  • Clock synchronization
  • Concurrency control

Types of Distributed Systems

Cluster Computing

  • Multiple computers in same location
  • High-speed network connection
  • Tightly coupled
  • Shared storage
  • Example: Supercomputer clusters

Grid Computing

  • Computers in different locations
  • Variable speed network
  • Loosely coupled
  • Shared resources
  • Example: SETI@home (volunteer computing)

Cloud Computing

  • On-demand computing resources
  • Shared infrastructure
  • Pay per use
  • Elastic scaling

Edge Computing

  • Processing at network edge
  • Closer to data sources
  • Lower latency
  • Reduced bandwidth

Important Concepts

Consistency

All copies of a data item hold the same value, so every read sees the most recent write.

Availability

System accessible when needed.

Partition Tolerance

System continues despite network partition.

CAP Theorem

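A distributed system cannot guarantee all three of Consistency, Availability, and Partition tolerance at the same time. Because network partitions cannot be ruled out in practice, the real choice arises during a partition: either stay consistent (and refuse some requests) or stay available (and risk returning stale data).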

Eventual Consistency

All copies become consistent eventually, not immediately.
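
One simple way replicas can converge is a last-writer-wins merge, where each value carries a timestamp and the newer one is kept. The sketch below assumes each replica stores key -> (timestamp, value) pairs and that timestamps are comparable across replicas; the merge function and sample data are hypothetical, and real systems use a variety of other reconciliation schemes.

```python
def merge(local, remote):
    """Last-writer-wins: for each key keep the entry with the larger timestamp."""
    merged = dict(local)
    for key, (ts, value) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged

# Two replicas that have diverged after independent updates.
replica_a = {"x": (1, "old"), "y": (5, "kept")}
replica_b = {"x": (2, "new"), "z": (3, "added")}

# One exchange of state in each direction brings both to the same contents.
replica_a, replica_b = merge(replica_a, replica_b), merge(replica_b, replica_a)
```

Between the update and the exchange, a read from replica_a would still have returned "old" for x; that window of staleness is the "eventual" part.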

Quorum

Minimum number of nodes that must agree before an operation (such as a read or write) is accepted; often a majority.
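
Two common quorum rules can be written down directly. With N replicas, a majority quorum is floor(N/2) + 1, and a read quorum R and write quorum W are guaranteed to overlap when R + W > N. The function names below are chosen for the example.

```python
def majority(n):
    """Smallest number of nodes that forms a majority of n."""
    return n // 2 + 1

def quorums_overlap(n, r, w):
    """True if every read quorum of size r shares a node with every write quorum of size w."""
    return r + w > n
```

For example, with N = 3, choosing R = 2 and W = 2 guarantees that every read contacts at least one replica that saw the latest write, while R = 1 and W = 1 does not.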

Exam Important Points

  1. Define distributed system
  2. Difference from parallel systems
  3. Client-server vs P2P vs hybrid architecture
  4. Communication in distributed systems
  5. Main challenges (heterogeneity, transparency, reliability, synchronization, security)
  6. Advantages and disadvantages
  7. Real examples (Internet, Web, Cloud, Email)
  8. Fault tolerance and availability
  9. CAP theorem
  10. Distributed system functions