SYLLABUS
Module – 1 (Distributed systems basics and Computation model)
Distributed System – Definition, Relation to computer system components, Motivation, Primitives for distributed communication, Design issues, Challenges and applications. A model of distributed computations – Distributed program, Model of distributed executions, Models of communication networks, Global state of a distributed system, Cuts of a distributed computation, Past and future cones of an event, Models of process communications.
Module – 2 (Election algorithm, Global state and Termination detection)
Logical time – A framework for a system of logical clocks, Scalar time, Vector time. Leader election algorithm – Bully algorithm, Ring algorithm. Global state and snapshot recording algorithms – System model and definitions, Snapshot algorithm for FIFO channels – Chandy Lamport algorithm. Termination detection – System model of a distributed computation, Termination detection using distributed snapshots, Termination detection by weight throwing,Spanning-tree-based algorithm.
Module – 3 (Mutual exclusion and Deadlock detection)
Distributed mutual exclusion algorithms – System model, Requirements of mutual exclusion algorithm. Lamport’s algorithm, Ricart–Agrawala algorithm, Quorum-based mutual exclusion algorithms – Maekawa’s algorithm. Token-based algorithm – Suzuki–Kasami’s broadcast algorithm. Deadlock detection in distributed systems – System model, Deadlock handling strategies, Issues in deadlock detection, Models of deadlocks.
Module – 4 (Distributed shared memory and Failure recovery)
Distributed shared memory – Abstraction and advantages. Shared memory mutual exclusion – Lamport’s bakery algorithm. Check pointing and rollback recovery – System model, consistent and inconsistent states, different types of messages, Issues in failure recovery, checkpoint based recovery, log based roll back recovery.
Module – 5 (Consensus and Distributed file system)
Consensus and agreement algorithms – Assumptions, The Byzantine agreement and other problems, Agreement in (message-passing) synchronous systems with failures – Consensus algorithm for crash failures. Distributed file system – File service architecture, Case studies: Sun Network File System, Andrew File System, Google File System.
(Note: Proof of correctness and performance analysis are not expected for any of the algorithms in the syllabus).
Text Books