Replica Set Data Synchronization

In order to maintain up-to-date copies of the shared data set, secondary members of a replica set sync or replicate data from other members. MongoDB uses two forms of data synchronization: initial sync to populate new members with the full data set, and replication to apply ongoing changes to the entire data set.

Initial Sync

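Initial sync copies the full data set to a member. A member performs an initial sync when it has no data, for example when it is newly added to the replica set, or when its data has fallen so far behind that it can no longer catch up through replication.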

Process

When you perform an initial sync, MongoDB:

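  1. Clones all databases. The mongod queries every collection in each source database and inserts all of the data into its own copies of those collections. It also builds the _id index.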

    Changed in version 3.4: Applies all changes to the data set. Using the oplog, the mongod updates its data so that the data set reflects the current state.

    在 3.4 版更改: Initial sync pulls newly added oplog records during the data copy. Ensure that the target member has enough disk space in the local database to temporarily store these oplog records for the duration of this data copy stage.

  2. The mongod finishes building all indexes, and the member then enters its normal state, i.e. SECONDARY.

    When the initial sync finishes, the member transitions from STARTUP2 to SECONDARY.
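
The process above can be illustrated with a toy model. The sketch below is not MongoDB's implementation: it works on plain Python dictionaries, and the function name `initial_sync` and the tuple layout of the oplog entries are assumptions made for illustration only. It shows the order of events: copy every collection, buffer the oplog records written during the copy, apply them in order, then change state.

```python
import copy


def initial_sync(source_collections, oplog_during_copy):
    """Toy model of initial sync on plain dicts (hypothetical helper).

    source_collections: {collection_name: {doc_id: document}}, the source
        data as it looked when the copy started.
    oplog_during_copy: list of (op, collection_name, doc_id, document)
        tuples written on the source while the copy was running.
    """
    state = "STARTUP2"

    # Clone every collection from the source into local copies.
    local = {
        name: copy.deepcopy(docs)
        for name, docs in source_collections.items()
    }

    # Oplog records pulled during the copy are stored temporarily,
    # which is why the member needs spare disk space for them.
    buffered = list(oplog_during_copy)

    # Apply the buffered changes in their original order.
    for op, name, doc_id, document in buffered:
        collection = local.setdefault(name, {})
        if op in ("insert", "update"):
            collection[doc_id] = document
        elif op == "delete":
            collection.pop(doc_id, None)

    # Once the data set is current, the member becomes a secondary.
    state = "SECONDARY"
    return local, state
```

Calling `initial_sync` with a snapshot and a short list of buffered operations returns the caught-up data together with the final state string.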

See Resync a Member of a Replica Set for more information on initial sync.

Fault Tolerance

To recover from transient network or operation failures, initial sync has built-in retry logic.

Changed in version 3.4: MongoDB 3.4 improves the retry logic to be more resilient to intermittent failures on the network.

Replication

Secondary members replicate data continuously after the initial sync. Secondary members copy the oplog from their sync from source and apply these operations in an asynchronous process.

Secondaries may automatically change their sync from source as needed based on changes in the ping time and state of other members’ replication.

Changed in version 3.2: MongoDB 3.2 replica set members with 1 vote cannot sync from members with 0 votes.

Secondaries avoid syncing from delayed members and hidden members.

If a secondary member has members[n].buildIndexes set to true, it can only sync from other members where buildIndexes is true. Members where buildIndexes is false can sync from any other member, barring other sync restrictions. buildIndexes is true by default.
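
The sync source restrictions listed above can be collected into a single check. This is a minimal sketch, not the server's selection algorithm: the `Member` class, its field names (`delay_secs`, `build_indexes`) and the function `can_sync_from` are hypothetical, and the sketch treats "avoid" as a hard exclusion for simplicity. It ignores ping time and replication state.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical, simplified view of one replica set member."""
    name: str
    votes: int = 1
    hidden: bool = False
    delay_secs: int = 0          # greater than 0 means a delayed member
    build_indexes: bool = True   # true by default, as in the text


def can_sync_from(member, source):
    """Return True if `member` may use `source` as its sync source."""
    if member.name == source.name:
        return False
    # 3.2 rule: a voting member cannot sync from a non-voting member.
    if member.votes > 0 and source.votes == 0:
        return False
    # Secondaries avoid hidden and delayed members (simplified to a ban).
    if source.hidden or source.delay_secs > 0:
        return False
    # A member that builds indexes needs a source that builds them too.
    if member.build_indexes and not source.build_indexes:
        return False
    return True
```

A member with `build_indexes=False` passes the last check against any source, which matches the rule that such members can sync from any other member, barring the other restrictions.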

Multithreaded Replication

MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups batches by namespace (MMAPv1) or by document id (WiredTiger) and simultaneously applies each group of operations using a different thread. MongoDB always applies write operations to a given document in their original write order.

While applying a batch, MongoDB blocks all read operations. As a result, secondary read queries can never return data that reflect a state that never existed on the primary.
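
To make the grouping idea concrete, here is a small sketch that groups a batch by document id and applies each group on its own thread. It is an illustration under stated assumptions, not MongoDB's code: the function `apply_batch`, the `(doc_id, field, value)` tuple layout, and the dictionary used as a data store are all hypothetical, and the sketch does not model the blocking of reads during a batch.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_batch(store, batch, workers=4):
    """Apply `batch` to `store`, one thread per document group.

    store: {doc_id: {field: value}}
    batch: list of (doc_id, field, value) in original write order.
    """
    # Group operations by document id, keeping their relative order.
    groups = defaultdict(list)
    for doc_id, field, value in batch:
        groups[doc_id].append((field, value))

    # Create missing documents up front so worker threads only ever
    # touch the one document that belongs to their group.
    for doc_id in groups:
        store.setdefault(doc_id, {})

    def apply_group(doc, ops):
        # Operations on one document run sequentially, in write order.
        for field, value in ops:
            doc[field] = value

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(apply_group, store[doc_id], ops)
            for doc_id, ops in groups.items()
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker

    return store
```

Because every document belongs to exactly one group, different documents can be updated concurrently while writes to the same document keep their original order.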

Pre-Fetching Indexes to Improve Replication Throughput

Note

Applies to MMAPv1 only.

With the MMAPv1 storage engine, MongoDB fetches memory pages that hold affected data and indexes to help improve the performance of applying oplog entries. This pre-fetch stage minimizes the amount of time MongoDB holds write locks while applying oplog entries. By default, secondaries pre-fetch all indexes.

Optionally, you can disable all pre-fetching or only pre-fetch the index on the _id field. See the secondaryIndexPrefetch setting for more information.