← Back to Databases

Database Replication Explained: Master-Slave and Multi-Master

Database replication copies data across multiple servers to improve availability, distribute read load, and enable disaster recovery. Master-slave replication uses one primary server for writes, while multi-master allows writes on multiple nodes—each with different tradeoffs in consistency, complexity, and failure resilience.

What is Database Replication?

Database replication is the process of maintaining synchronized copies of data across separate database instances. It's the backbone of modern distributed systems, ensuring your application can survive server failures and handle traffic spikes without going dark.

At its core, replication works by capturing changes from a source database and applying them to target databases. The source is called the primary (or master), and the targets are replicas (or slaves). The magic happens in the replication lag—the time between a write on the primary and when it reaches replicas.

Without replication, a single database failure means total data loss or extended downtime. With it, you're buying redundancy and read scalability for the cost of eventual consistency and operational complexity.

Master-Slave Replication Architecture

Master-slave (now often called primary-replica) is the traditional model. One primary server accepts all writes, and one or more replica servers receive those writes asynchronously or synchronously.

How Master-Slave Works

The flow is straightforward:

  1. Client sends a write query to the master
  2. Master executes the write and records it to its binary log (or similar transaction log)
  3. Slave connects to master, reads new log entries, and applies them locally
  4. Read queries can be directed to slaves to distribute load

Most relational databases like MySQL, PostgreSQL, and MariaDB implement master-slave natively. It's simple, battle-tested, and works well for read-heavy workloads.

Setting Up MySQL Master-Slave

Here's a practical example with MySQL:

On the Master:

# my.cnf configuration
[mysqld]
server-id = 1
log_bin = mysql-bin
binlog_format = ROW

# Create replication user
CREATE USER 'repl_user'@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';

# Get master status
SHOW MASTER STATUS;
# Output: File: mysql-bin.000001, Position: 154

On the Slave:

# my.cnf configuration
[mysqld]
server-id = 2
relay-log = mysql-relay-bin

# Configure replication
CHANGE MASTER TO
  MASTER_HOST='master-ip',
  MASTER_USER='repl_user',
  MASTER_PASSWORD='password',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=154;

# Start replication
START SLAVE;

# Check status
SHOW SLAVE STATUS\G

Strengths of Master-Slave

Weaknesses and Challenges

Multi-Master Replication Architecture

Multi-master (also called active-active) lets two or more nodes accept writes simultaneously. Changes from any master are replicated to all others. It's attractive for distributed systems but introduces complexity.

How Multi-Master Works

Instead of one primary, every node is a primary:

  1. Client writes to any master node
  2. That master logs the write and replicates it to peer masters
  3. Peer masters apply the same write locally
  4. All nodes eventually hold identical data

MySQL Group Replication, PostgreSQL's Patroni with etcd, and Galera Cluster are popular implementations. They use consensus algorithms to prevent conflicts.

Conflict Resolution Strategies

Multi-master risks write conflicts. Two masters might execute conflicting changes. Here's how they're handled:

Last-Write-Wins (LWW)

Whichever write has the newest timestamp wins. Loser's change is silently discarded.

# Simulating LWW conflict
Master A: UPDATE users SET status='active' WHERE id=1 at timestamp=1000
Master B: UPDATE users SET status='inactive' WHERE id=1 at timestamp=1005

# Result: inactive (B's write wins)

Custom Application Logic

Your app detects conflicts and decides which write to keep. More complex but gives you control.

Operational Transformation (OT)

Advanced systems like CRDTs (Conflict-free Replicated Data Types) ensure conflicts resolve deterministically without losing data. Databases like DynamoDB and Cassandra use variants of this.

Pro tip: Most production multi-master setups avoid conflicts by partitioning writes. One master handles user accounts, another handles orders. This gives you the benefits without the pain.

Strengths of Multi-Master

Weaknesses and Challenges

Master-Slave vs Multi-Master: Side-by-Side Comparison

Aspect Master-Slave Multi-Master
Write location Master only Any node
Consistency Strong (master truth) Eventual
Failover Manual or automated Automatic
Conflict handling N/A (single writer) Required (LWW, OT, etc.)
Write latency Low Low (local writes)
Operational complexity Low-moderate High
Best for Read-heavy, single region Multi-region, high availability

Replication Lag: The Hidden Challenge

Replication lag is the delay between a write on the primary and when it appears on replicas. It's measured in milliseconds or seconds, but it matters.

If your application reads from a replica right after writing, you might get stale data. This is called a read-after-write inconsistency.

Solutions include:

When to Use Each Model

Choose Master-Slave If:

Choose Multi-Master If: