DEPARTMENT OF COMPUTING

CS 3410: Distributed Systems

Topics

Go, RPC
Go examples
more Go, CAP, replicated state machines
TCP, sockets, clusters
coherent caching, CAP
transactions, 2-phase commit
time, clocks, snapshots
peer to peer
concurrency, actors
databases
big data
SOA, microservices
eventual consistency

Resources

Syllabus
Examples from class
Effective Go
Recommended book: The Go Programming Language
Go package docs
Screencast on setting up Go and vim-go
TCP videos
RPC demo app in Go using Go RPC
Paxos assignment slides
RPC chat assignment

Recommended papers

The Google File System
Bigtable: A Distributed Storage System for Structured Data
Paxos
- Skim the original Paxos paper: The Part-Time Parliament
- Read the simplified version in detail: Paxos Made Simple
- We will use this bare-bones protocol description for our assignment: Paxos in 25 lines
- See how Paxos is implemented in modern systems: Paxos vs Raft: Have we reached consensus on distributed consensus?
The Chubby lock service for loosely-coupled distributed systems
Spanner: Google’s Globally-Distributed Database
Calvin: Fast Distributed Transactions for Partitioned Database Systems
- Recommended to skim first: The Case for Determinism in Database Systems
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
Dynamo: Amazon’s Highly-available Key-value Store
MapReduce: Simplified Data Processing on Large Clusters
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3

Additional reading

Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System
Practical Byzantine Fault Tolerance
Impossibility of Distributed Consensus with One Faulty Process
The Byzantine Generals Problem
Session Guarantees for Weakly Consistent Replicated Data
CAP Twelve Years Later: How the “Rules” Have Changed
Distributed Snapshots: Determining Global States of Distributed Systems
Life beyond Distributed Transactions: an Apostate’s Opinion
Scale and Performance in a Distributed File System (AFS)
Petal: Distributed Virtual Disks (Ethan)
On Designing and Deploying Internet-Scale Services
Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
PNUTS: Yahoo!’s hosted data serving platform (Lily, Linda)
Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing
High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads
Twitter Heron: Stream Processing at Scale
Large-scale Incremental Processing Using Distributed Transactions and Notifications (Dason, Joe, Luke)
F1: A Distributed SQL Database That Scales (Christian, Carter)
Paxos Made Live—An Engineering Perspective
Flexible Paxos: Quorum intersection revisited
Large-scale cluster management at Google with Borg (Trenonn, Braden, Sasha)
Time, Clocks, and the Ordering of Events in a Distributed System
Exploiting virtual synchrony in distributed systems
Conflict-free Replicated Data Types
Foundational distributed systems papers
Hall of fame awards. These are systems papers that have been recognized as especially important, although only some of them are distributed systems papers.

Last Updated 01/13/2026