
FAQ: MongoDB Storage

This document addresses common questions regarding MongoDB’s storage system.

Storage Engine Fundamentals

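Memory mapping assigns a file to a block of virtual memory with a direct byte-for-byte correlation. Once mapped, MongoDB interacts with the data through the relationship between the file and memory, as if the file were in memory.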

A storage engine is the part of a database that is responsible for managing how data is stored, both in memory and on disk. Many databases support multiple storage engines, where different engines perform better for specific workloads. For example, one storage engine might offer better performance for read-heavy workloads, and another might support higher throughput for write operations.

Can you mix storage engines in a replica set?

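MongoDB uses memory-mapped files to manage and interact with its data. MongoDB maps data files to memory as it accesses documents, so the data being accessed is mapped into memory.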

When designing these multi-storage engine deployments, consider the following:

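  • Page faults can occur as MongoDB reads from or writes data to parts of its data files that are not currently located in physical memory. In contrast, operating system page faults happen when physical memory is exhausted and pages of physical memory are swapped to disk.

  • If there is free memory, the operating system can find the page on disk and load it into memory directly. However, if there is no free memory, the operating system must: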

WiredTiger Storage Engine

How much compression does WiredTiger provide?

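Page faults occur when MongoDB needs access to data that is not currently in active memory. A “hard” page fault refers to situations when MongoDB must access a disk to access the data.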

To what size should I set the WiredTiger internal cache?

With WiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache.

Starting in 3.4, the WiredTiger internal cache, by default, will use the larger of either:

  • 50% of RAM minus 1 GB, or
  • 256 MB.
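The default described above can be sketched as a small calculation. This is only an illustration, not MongoDB's own code: the function name is hypothetical, and it assumes that "50% of RAM minus 1 GB" means half of the RAM that remains after subtracting 1 GB.

```javascript
// Sketch: estimate the default WiredTiger internal cache size (MongoDB 3.4+).
// Assumption: "50% of RAM minus 1 GB" is read as 0.5 * (RAM - 1 GB).
// defaultWiredTigerCacheGB is a hypothetical helper, not a MongoDB API.
function defaultWiredTigerCacheGB(totalRamGB) {
  var halfOfRamMinusOne = 0.5 * (totalRamGB - 1);
  var floorGB = 256 / 1024; // 256 MB expressed in GB
  // The default is the larger of the two candidates.
  return Math.max(halfOfRamMinusOne, floorGB);
}

var cacheOn16GB = defaultWiredTigerCacheGB(16); // 7.5
var cacheOn1GB = defaultWiredTigerCacheGB(1);   // 0.25, the 256 MB floor
```

On small machines the 256 MB floor wins, which is why the second call returns 0.25 rather than 0.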

Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes. Data in the filesystem cache is compressed.

To adjust the size of the WiredTiger internal cache, see storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB. Avoid increasing the WiredTiger internal cache size above its default value.

Note

The storage.wiredTiger.engineConfig.cacheSizeGB setting limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache.

To accommodate the additional consumers of RAM, you may have to decrease WiredTiger internal cache size.

The default WiredTiger internal cache size value assumes that there is a single mongod instance per machine. If a single machine contains multiple MongoDB instances, then you should decrease the setting to accommodate the other mongod instances.

If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container.

To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the serverStatus command.
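As a rough sketch of how the wiredTiger.cache statistics might be read, the helper below computes how full the cache is. The helper name is hypothetical, the sample numbers are invented, and the field names are assumed to match those reported by your server version; check the actual serverStatus output before relying on them.

```javascript
// Sketch: summarize cache usage from the wiredTiger.cache sub-document
// returned by serverStatus. summarizeWiredTigerCache is a hypothetical helper.
// Assumption: the field names below match the output of your MongoDB version.
function summarizeWiredTigerCache(cache) {
  var used = cache["bytes currently in the cache"];
  var max = cache["maximum bytes configured"];
  var dirty = cache["tracked dirty bytes in the cache"];
  return {
    usedFraction: used / max,   // share of the configured cache in use
    dirtyFraction: dirty / max  // share of the cache holding unwritten changes
  };
}

// In the mongo shell the input would come from:
//   summarizeWiredTigerCache(db.serverStatus().wiredTiger.cache)
// The document below uses made-up numbers for illustration only.
var sampleCache = {
  "bytes currently in the cache": 536870912,
  "maximum bytes configured": 1073741824,
  "tracked dirty bytes in the cache": 268435456
};
var cacheSummary = summarizeWiredTigerCache(sampleCache);
```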

How frequently does WiredTiger write to disk?

MongoDB configures WiredTiger to create checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds or 2 gigabytes of journal data.

For journal data, MongoDB writes to disk according to the following intervals or condition:

  • New in version 3.2: Every 50 milliseconds.

  • MongoDB sets checkpoints to occur in WiredTiger on user data at an interval of 60 seconds or when 2 GB of journal data has been written, whichever occurs first.

  • If the write operation includes a write concern of j: true, WiredTiger forces a sync of the WiredTiger journal files.

  • Because MongoDB uses a journal file size limit of 100 MB, WiredTiger creates a new journal file approximately every 100 MB of data. When WiredTiger creates a new journal file, WiredTiger syncs the previous journal file.
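The thresholds listed above can be expressed as a simple model. This is an illustrative sketch, not WiredTiger's implementation: the function and constant names are hypothetical, and sizes assume 1 GB = 1024^3 bytes and 1 MB = 1024^2 bytes.

```javascript
// Sketch: a toy model of the checkpoint and journal-file thresholds.
// All names here are hypothetical; this is not WiredTiger code.
var CHECKPOINT_INTERVAL_SECS = 60;
var CHECKPOINT_JOURNAL_BYTES = 2 * 1024 * 1024 * 1024; // 2 GB (assumed binary units)
var JOURNAL_FILE_LIMIT_BYTES = 100 * 1024 * 1024;      // 100 MB (assumed binary units)

// A checkpoint is due when either threshold is reached, whichever comes first.
function checkpointDue(secsSinceLast, journalBytesSinceLast) {
  return secsSinceLast >= CHECKPOINT_INTERVAL_SECS ||
         journalBytesSinceLast >= CHECKPOINT_JOURNAL_BYTES;
}

// Approximate number of journal files needed for a given volume of journal data.
function journalFilesNeeded(journalBytes) {
  return Math.ceil(journalBytes / JOURNAL_FILE_LIMIT_BYTES);
}

var dueByTime = checkpointDue(60, 0);                         // true
var dueBySize = checkpointDue(10, 3 * 1024 * 1024 * 1024);    // true
var notDueYet = checkpointDue(10, 1024 * 1024);               // false
var filesFor250MB = journalFilesNeeded(250 * 1024 * 1024);    // 3
```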

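The working set represents the total body of data that the application uses in the course of normal operation. Often this is a subset of the total data size, but the specific size of the working set depends on how the database is actually being used at the moment.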

What are memory mapped files?

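If you run a query that requires MongoDB to scan every document in a collection, the working set will expand to include every document. Depending on physical memory size, this may cause documents in the working set to “page out,” or to be removed from physical memory by the operating system. The next time MongoDB needs to access these documents, MongoDB may incur a hard page fault.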

How do memory mapped files work?

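If you run a query that requires MongoDB to scan every document in a collection, the working set in memory will include all of the documents that are actively used.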

Memory mapping assigns files to a block of virtual memory with a direct byte-for-byte correlation. MongoDB memory maps data files to memory as it accesses documents. Unaccessed data is not mapped to memory.

Once mapped, the relationship between file and memory allows MongoDB to interact with the data in the file as if it were memory.

How frequently does MMAPv1 write to disk?

In the default configuration for the MMAPv1 storage engine, MongoDB writes to the data files on disk every 60 seconds and writes to the journal files roughly every 100 milliseconds.

To change the interval for writing to the data files, use the storage.syncPeriodSecs setting. For the journal files, see storage.journal.commitIntervalMs setting.

These values represent the maximum amount of time between the completion of a write operation and when MongoDB writes to the data files or to the journal files. In many cases MongoDB and the operating system flush data to disk more frequently, so that the above values represent a theoretical maximum.

Why are the files in my data directory larger than the data in my database?

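The data files in your data directory, which is /data/db by default, may be larger than the data in the database. Consider the following possible causes: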

Preallocated data files

MongoDB preallocates its data files to avoid filesystem fragmentation, and because of this, the size of these files does not necessarily reflect the size of your data.

The storage.mmapv1.smallFiles option will reduce the size of these files, which may be useful if you have many small databases on disk.

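On Unix-like systems, mongod preallocates an additional data file and initializes the disk space to 0. Preallocated files help significantly in avoiding delays when a new database is created.

You can disable preallocation by setting preallocDataFiles to false. However, do not change preallocDataFiles as a general practice; use it only for testing with small databases that you frequently drop.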

The default allocation is approximately 5% of disk space on 64-bit installations. In most cases, you should not need to resize the oplog. See Oplog Sizing for more information.

The journal

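If mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a capped collection in the local database. The default allocation is approximately 5% of disk space on 64-bit installations; see Oplog Sizing for more information. In most cases, you should not need to resize the oplog. If you do, see Change the Size of the Oplog.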

Empty records

The data directory contains the journal files, which store on disk the write operations that MongoDB applies to the database. See Journaling.

To allow MongoDB to more effectively reuse the space, you can de-fragment your data. To de-fragment, use the compact command. The compact command requires up to 2 gigabytes of extra disk space to run. Do not use compact if you are critically low on disk space. For more information on its behavior and other considerations, see compact.

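MongoDB maintains lists of empty records in data files as it deletes documents and collections. MongoDB can reuse this space, but does not return it to the operating system.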

How do I reclaim disk space?

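Use compact to de-fragment allocated storage. After de-fragmenting, MongoDB can use the allocated space more effectively. compact requires up to 2 gigabytes of extra disk space to run. Do not use compact if you are critically low on disk space.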

Note

You do not need to reclaim disk space for MongoDB to reuse freed space. See Empty records for information on reuse of freed space.

repairDatabase

You can use repairDatabase on a database to rebuild the database, de-fragmenting the associated storage in the process.

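Rebuilding the database with repairDatabase de-fragments its storage and makes the space from deleted data usable again. repairDatabase requires at least 2 gigabytes of extra disk space to run. Do not use repairDatabase if you are low on disk space.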

Warning

Do not use repairDatabase if you are critically low on disk space.

repairDatabase will block all other operations and may take a long time to complete.

You can only run repairDatabase on a standalone mongod instance.

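A repair with repairDatabase needs enough free disk space to hold both the old and the new database files. repairDatabase blocks other operations and may take some time to complete.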

Resync the Member of the Replica Set

For a secondary member of a replica set, you can perform a resync of the member by: stopping the secondary member to resync, deleting all data and subdirectories from the member’s data directory, and restarting.

For details, see Resync a Member of a Replica Set.

What is the working set?

The working set represents the total body of data that the application uses in the course of normal operation. Often this is a subset of the total data size, but the specific size of the working set depends on actual moment-to-moment use of the database.

If you run a query that requires MongoDB to scan every document in a collection, the working set will expand to include every document. Depending on physical memory size, this may cause documents in the working set to “page out,” or to be removed from physical memory by the operating system. The next time MongoDB needs to access these documents, MongoDB may incur a hard page fault.

For best performance, the majority of your active set should fit in RAM.

What are page faults?

With the MMAPv1 storage engine, page faults can occur as MongoDB reads from or writes data to parts of its data files that are not currently located in physical memory. In contrast, operating system page faults happen when physical memory is exhausted and pages of physical memory are swapped to disk.

If there is free memory, then the operating system can find the page on disk and load it to memory directly. However, if there is no free memory, the operating system must:

  • find a page in memory that is stale or no longer needed, and write the page to disk.
  • read the requested page from disk and load it into memory.

This process, on an active system, can take a long time, particularly in comparison to reading a page that is already in memory.
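The effect of a full collection scan on page faults can be illustrated with a toy simulation. This is a sketch only: the function name is hypothetical, page numbers stand in for pieces of the data files, and least-recently-used replacement is an assumption made for illustration, since real operating systems use more elaborate policies.

```javascript
// Toy model: count hard page faults for a sequence of page accesses, given a
// fixed number of page frames in RAM. countPageFaults is a hypothetical helper.
// Assumption: least-recently-used (LRU) replacement; real kernels differ.
function countPageFaults(accesses, frameCount) {
  var resident = []; // pages in memory, least recently used first
  var faults = 0;
  accesses.forEach(function (page) {
    var idx = resident.indexOf(page);
    if (idx !== -1) {
      resident.splice(idx, 1);   // hit: remove so it can be re-added as newest
    } else {
      faults += 1;               // miss: the page must be read from disk
      if (resident.length >= frameCount) {
        resident.shift();        // no free frame: evict the oldest page
      }
    }
    resident.push(page);         // the accessed page is now the most recent
  });
  return faults;
}

// A small working set that fits in 4 frames faults only on first touch.
var hotSetFaults = countPageFaults([1, 2, 3, 1, 2, 3, 1, 2, 3], 4);     // 3
// A scan over 8 pages pushes the hot pages out, so they fault again afterwards.
var scanFaults = countPageFaults([1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3], 4); // 11
```

In the second run every access is a fault, which mirrors how a scan of every document can page out the working set.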

See Page Faults for more information.

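To view the size of the data allocated for an index, use the mongo shell as described under “How can I check the size of indexes for a collection?”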

Page faults occur when MongoDB, with the MMAPv1 storage engine, needs access to data that isn’t currently in active memory. A “hard” page fault refers to situations when MongoDB must access a disk to access the data. A “soft” page fault, by contrast, merely moves memory pages from one list to another, such as from an operating system file cache.


Can I manually pad documents to prevent moves during updates?

Changed in version 3.0.0.

With the MMAPv1 storage engine, an update can cause a document to move on disk if the document grows in size. To minimize document movements, MongoDB uses padding.

You should not have to pad manually because by default, MongoDB uses Power of 2 Sized Allocations to add padding automatically. The Power of 2 Sized Allocations ensures that MongoDB allocates document space in sizes that are powers of 2, which helps ensure that MongoDB can efficiently reuse free space created by document deletion or relocation as well as reduce the occurrences of reallocations in many cases.
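To make the idea of power-of-2 sizing concrete, the sketch below rounds a document size up to the next power of 2 and reports the padding that results. It is a simplified illustration with a hypothetical function name; it ignores record headers and any minimum or maximum allocation sizes that the storage engine may apply.

```javascript
// Sketch: round a size up to the next power of 2, as a simplified model of
// Power of 2 Sized Allocations. powerOf2Allocation is a hypothetical helper;
// record headers and engine-specific size limits are not modeled.
function powerOf2Allocation(documentBytes) {
  var size = 1;
  while (size < documentBytes) {
    size *= 2; // double until the document fits
  }
  return size;
}

var allocated = powerOf2Allocation(700);  // 1024
var padding = allocated - 700;            // 324 bytes of room to grow
var exactFit = powerOf2Allocation(1024);  // 1024, already a power of 2
```

Because freed records come in a small set of sizes, a deleted 1024-byte record can be reused by any later document that rounds up to 1024 bytes.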

However, if you must pad a document manually, you can add a temporary field to the document and then $unset the field, as in the following example.

Warning

Do not manually pad documents in a capped collection. Applying manual padding to a document in a capped collection can break replication. Also, the padding is not preserved if you re-sync the MongoDB instance.

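// Temporary filler whose only purpose is to reserve extra space in the record.
var myTempPadding = [ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                      "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                      "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                      "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"];

// Insert the document together with the filler field.
db.myCollection.insert( { _id: 5, paddingField: myTempPadding } );

// Remove the filler field so that its space is available for later growth.
db.myCollection.update( { _id: 5 },
                        { $unset: { paddingField: "" } }
                      )

// Add the real data, which can grow into the space the filler occupied.
db.myCollection.update( { _id: 5 },
                        { $set: { realField: "Some text that I might have needed padding for" } }
                      )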

Data Storage Diagnostics

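You can use the db.collection.stats() method with an index namespace. A list of namespaces can be retrieved from the mongo shell.

Check the value of indexSizes in the output of db.collection.stats(), for example: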

db.orders.stats();

MongoDB also provides the following methods to return specific sizes for the collection:

If your server runs out of disk space for data files while it is running, the server log reports the condition.

// Print storage statistics for every database on the server.
db._adminCommand("listDatabases").databases.forEach(function (d) {
   var mdb = db.getSiblingDB(d.name);
   printjson(mdb.stats());
})

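The server remains in this state and blocks all writes, including deletes. Reads still work. To delete some data and compact it with the compact command, you must restart the server first.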

// Print storage statistics for every collection in every database.
db._adminCommand("listDatabases").databases.forEach(function (d) {
   var mdb = db.getSiblingDB(d.name);
   mdb.getCollectionNames().forEach(function(c) {
      var s = mdb[c].stats();
      printjson(s);
   })
})

How can I check the size of indexes for a collection?

To view the size of the data allocated for an index, use the db.collection.stats() method and check the indexSizes field in the returned document.
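As a small sketch of working with the indexSizes field, the helper below adds up the per-index sizes from a stats document. The helper name, the index names, and the byte counts in the sample are invented for illustration.

```javascript
// Sketch: sum the per-index sizes found in the indexSizes field of the
// document returned by db.collection.stats(). totalIndexBytes is a
// hypothetical helper, not a MongoDB API.
function totalIndexBytes(stats) {
  var total = 0;
  Object.keys(stats.indexSizes).forEach(function (indexName) {
    total += stats.indexSizes[indexName];
  });
  return total;
}

// In the mongo shell the input would come from, for example:
//   totalIndexBytes(db.orders.stats())
// The sample document below uses made-up index names and sizes.
var sampleStats = { indexSizes: { "_id_": 8176, "customer_1": 16352 } };
var sampleTotal = totalIndexBytes(sampleStats); // 24528
```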

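If your server runs out of disk space for journal files while it is running, the server process will exit. By default, mongod creates journal files under the storage.dbPath directory. You can place the journal files on a different disk by using a filesystem mount or a symlink.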

The db.stats() method in the mongo shell returns the current state of the “active” database. For the description of the returned fields, see dbStats Output.