MySQL Master-Slave Replication Principles

1. Setting the Slave Node as Read-Only

  • Node B, even if not directly accessed, should be set to read-only (readonly) mode.

  • Reasons:

    1. Some operational queries might be executed on the slave. Setting to readonly prevents mishaps.
    2. To prevent inconsistencies during the switch-over due to bugs or dual writes.
    3. Readonly status can be used to determine the role of the node.
  • A long connection is maintained between Slave B and Master A.

  • Replication Process:

    1. On Slave B, execute the change master command with Master A’s IP, port, username, password, and the position in binlog.
    2. Execute the start slave command on Slave B to initiate the io_thread and sql_thread.
    3. Master A verifies credentials and sends the appropriate binlog entries to Slave B.
    4. Binlogs received by Slave B are saved as relay logs.
    5. The sql_thread on Slave B processes and executes commands from the relay logs.

Binlog Formats Comparison

1. Statement Format

  • Records the original SQL statement.

2. Row Format

  • Replaces the SQL statement with two events: Table_map and Delete_rows.
  • Ensures consistent replication by capturing the exact primary key of the changed row.

3. Mixed Format

  • Combines Statement and Row formats.
  • MySQL automatically determines the format based on the possibility of causing inconsistencies.

Replication Loop Issues

  • Replication loops can occur when binlogs are replicated back to the originator.
  • Solutions:
    1. Each server must have a unique server id.
    2. Binlogs created by a slave during replication should have the same server id as the original binlog.
    3. Nodes should discard binlogs with their own server id.

Master-Slave Delay

  • Important Timestamps:

    1. Master A finishes a transaction and writes to binlog (T1).
    2. Slave B receives the binlog (T2).
    3. Slave B executes the transaction (T3).
  • Delay = T3 - T1.

  • Causes of Delay:

    1. The slave’s hardware might be inferior.
    2. Read operations during replication can cause additional load on the slave.
    3. Queries on the slave consuming excessive resources.
  • Solutions:

    1. One master to multiple slaves to distribute read loads.
    2. Output binlog to external systems like Hadoop for statistical queries.
    3. Large transactions cause delays. Avoid long-running transactions.