CS 3410: Distributed Systems
Topics
- Go, RPC
- Go examples
- more Go, CAP, replicated state machines
- TCP, sockets, clusters
- coherent caching, CAP
- transactions, 2-phase commit
- time, clocks, snapshots
- peer to peer
- concurrency, actors
- databases
- big data
- SOA, microservices
- eventual consistency
Resources
- Syllabus
- Examples from class
- Effective Go
- Recommended book: The Go Programming Language
- Go package docs
- Screencast on setting up Go and vim-go
- TCP videos
- RPC demo app in Go using Go RPC
- Paxos assignment slides
- RPC chat assignment
Recommended papers
- The Google File System
- Bigtable: A Distributed Storage System for Structured Data
- Paxos
  - Skim the original Paxos paper: The Part-Time Parliament
  - Read the simplified version in detail: Paxos Made Simple
  - We will use this bare-bones protocol description for our assignment: Paxos in 25 lines
  - See how Paxos is implemented in modern systems: Paxos vs Raft: Have we reached consensus on distributed consensus?
- The Chubby lock service for loosely-coupled distributed systems
- Spanner: Google’s Globally-Distributed Database
- Calvin: Fast Distributed Transactions for Partitioned Database Systems
  - Recommended to skim first: The Case for Determinism in Database Systems
- Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
- Dynamo: Amazon’s Highly-available Key-value Store
- MapReduce: Simplified Data Processing on Large Clusters
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
- Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3
Additional reading
- Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System
- Practical Byzantine Fault Tolerance
- Impossibility of Distributed Consensus with One Faulty Process
- The Byzantine Generals Problem
- Session Guarantees for Weakly Consistent Replicated Data
- CAP Twelve Years Later: How the “Rules” Have Changed
- Distributed Snapshots: Determining Global States of Distributed Systems
- Life beyond Distributed Transactions: an Apostate’s Opinion
- Scale and Performance in a Distributed File System (AFS)
- Petal: Distributed Virtual Disks (Ethan)
- On Designing and Deploying Internet-Scale Services
- Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
- PNUTS: Yahoo!’s Hosted Data Serving Platform (Lily, Linda)
- Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing
- High-Availability at Massive Scale: Building Google’s Data Infrastructure for Ads
- Twitter Heron: Stream Processing at Scale
- Large-scale Incremental Processing Using Distributed Transactions and Notifications (Dason, Joe, Luke)
- F1: A Distributed SQL Database That Scales (Christian, Carter)
- Paxos Made Live—An Engineering Perspective
- Flexible Paxos: Quorum intersection revisited
- Large-scale cluster management at Google with Borg (Trenonn, Braden, Sasha)
- Time, Clocks, and the Ordering of Events in a Distributed System
- Exploiting virtual synchrony in distributed systems
- Conflict-free Replicated Data Types
- Foundational distributed systems papers
- Hall of fame awards. These are systems papers that have been recognized as especially important, though note that only some of them are distributed systems papers.
Last Updated 01/13/2026

