mongos

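In a sharded cluster, mongos routes queries and writes to the shards. With mongos, applications get a single entry point to the cluster and never need to connect to the individual shards directly.

mongos tracks which data lives on which shard by caching the cluster metadata from the config servers, and it uses this metadata to route application reads and writes to the appropriate shards. mongos holds no persistent state, so instances can be restarted or added at any time without losing data or disrupting the cluster, and it consumes few system resources.

The most common practice is to run mongos on the same systems as the application, but running it on the shards or on other dedicated machines is also possible.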

Routing And Results Process

In a sharded cluster, mongos routes a query using the following steps:

  1. Determine the list of shards that must receive the query.
  2. Establish a cursor on all targeted shards.

The mongos then merges the data from each of the targeted shards and returns the result document. Certain query modifiers, such as sorting, are performed on a shard such as the primary shard before mongos retrieves the results.
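The routing steps above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: the `SHARDS` dictionary, the `matches` helper, and `route_query` are hypothetical names invented for illustration, and plain Python lists stand in for real shards and cursors.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory "shards": shard name -> documents stored there.
SHARDS: Dict[str, List[dict]] = {
    "shardA": [{"_id": 1, "user": "ann"}, {"_id": 2, "user": "bob"}],
    "shardB": [{"_id": 3, "user": "cat"}],
    "shardC": [{"_id": 4, "user": "dan"}],
}


def matches(doc: dict, query: dict) -> bool:
    """Equality-only matching; real query operators are out of scope here."""
    return all(doc.get(field) == value for field, value in query.items())


def route_query(
    query: dict, target_shards: Optional[List[str]] = None
) -> Tuple[List[str], List[dict]]:
    # Step 1: determine the shards that must receive the query.
    # Without targeting information, fall back to every shard.
    targets = sorted(target_shards) if target_shards else sorted(SHARDS)
    # Step 2: establish a "cursor" (a generator) on each targeted shard.
    cursors = [
        (doc for doc in SHARDS[name] if matches(doc, query)) for name in targets
    ]
    # Finally, merge the data from each targeted shard into one result.
    merged = [doc for cursor in cursors for doc in cursor]
    return targets, merged
```

Calling `route_query({"user": "cat"}, target_shards=["shardB"])` contacts only `shardB`, while `route_query({})` contacts all three simulated shards.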

Changed in version 3.2: In some cases, when the shard key or a prefix of the shard key is part of the query, mongos performs a targeted operation, routing the query to a subset of shards in the cluster. Which shards are targeted depends on how chunks are distributed across the cluster.

mongos performs a broadcast operation for queries that do not include the shard key, routing them to all shards that store the collection. Some queries that do include the shard key may still result in a broadcast operation, depending on the distribution of data in the cluster and the selectivity of the query.

How mongos Handles Query Modifiers

Sorting

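If the result of the query is not sorted, mongos opens a result cursor that round-robins results from the cursors on all the shards.

If the query specifies sorted results using sort(), mongos passes the $orderby option to all the shards. When mongos receives the results, it performs a merge sort before returning them to the application.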
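The two result-assembly strategies can be sketched in a few lines of Python. This is an illustration only, not how mongos is implemented: `round_robin` and `merge_sorted` are hypothetical helper names, each shard's result is modeled as a list, and the merge assumes every shard has already sorted its own results on the same field.

```python
import heapq
from itertools import zip_longest

_MISSING = object()  # sentinel for shards that run out of results early


def round_robin(shard_results):
    """Unsorted case: take one document from each shard cursor in turn."""
    merged = []
    for batch in zip_longest(*shard_results, fillvalue=_MISSING):
        merged.extend(doc for doc in batch if doc is not _MISSING)
    return merged


def merge_sorted(shard_results, field):
    """Sorted case: merge per-shard results that are already sorted on `field`."""
    return list(heapq.merge(*shard_results, key=lambda doc: doc[field]))


shard_a = [{"n": 1}, {"n": 4}]
shard_b = [{"n": 2}, {"n": 3}, {"n": 5}]

print([d["n"] for d in round_robin([shard_a, shard_b])])        # [1, 2, 4, 3, 5]
print([d["n"] for d in merge_sorted([shard_a, shard_b], "n")])  # [1, 2, 3, 4, 5]
```

Because each shard returns its portion in order, the merge only ever compares the current head of each cursor, so the router never needs to hold and re-sort the full result set.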

Limits

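If the query limits the size of the result set using limit(), mongos passes that limit to all the shards and then re-applies the limit to the merged results before returning them to the application.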

Skips

If the query specifies a number of records to skip using the skip() cursor method, the mongos cannot pass the skip to the shards, but rather retrieves unskipped results from the shards and skips the appropriate number of documents when assembling the complete result.

When used in conjunction with a limit(), the mongos will pass the limit plus the value of the skip() to the shards to improve the efficiency of these operations.
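The reason the shards receive the limit plus the skip value can be seen in a small simulation. The sketch below assumes a sorted query and models each shard as a Python list; `shard_find` and `mongos_find` are hypothetical names, not MongoDB APIs.

```python
import heapq
from itertools import islice


def shard_find(docs, field, limit):
    """What one simulated shard returns: its own documents, sorted and limited."""
    return sorted(docs, key=lambda d: d[field])[:limit]


def mongos_find(shards, field, skip, limit):
    # The skip itself cannot be pushed down, because the router cannot know
    # in advance how many of the skipped documents live on each shard.
    # Each shard is therefore asked for at most skip + limit documents.
    per_shard_limit = skip + limit
    partials = [shard_find(docs, field, per_shard_limit) for docs in shards]
    merged = heapq.merge(*partials, key=lambda d: d[field])
    # The skip and the final limit are applied while assembling the result.
    return list(islice(merged, skip, skip + limit))


shards = [
    [{"n": 5}, {"n": 1}, {"n": 9}],
    [{"n": 2}, {"n": 3}, {"n": 8}],
]
print(mongos_find(shards, "n", skip=2, limit=2))  # [{'n': 3}, {'n': 5}]
```

In the worst case all of the skipped and returned documents sit on a single shard, which is why `skip + limit` is the smallest per-shard bound that still guarantees a correct result.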

To detect whether the application is connected to a mongos, use the isMaster command. When the application is connected to a mongos, isMaster returns a document whose msg field contains the string isdbgrid. For example:

{
   "ismaster" : true,
   "msg" : "isdbgrid",
   "maxBsonObjectSize" : 16777216,
   "ok" : 1
}

If the application is instead connected to a mongod, the returned document does not include the isdbgrid string.
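In application code, the check reduces to inspecting the `msg` field of the reply. The helper below is a minimal sketch: `is_mongos` is a hypothetical name, and it works on a reply that has already been fetched. With a driver such as PyMongo the reply could come from `client.admin.command("isMaster")`, which is assumed here rather than executed.

```python
def is_mongos(is_master_reply: dict) -> bool:
    """Return True if an isMaster reply identifies the server as a mongos."""
    return is_master_reply.get("msg") == "isdbgrid"


# Reply shaped like the mongos example above.
mongos_reply = {
    "ismaster": True,
    "msg": "isdbgrid",
    "maxBsonObjectSize": 16777216,
    "ok": 1,
}
# Illustrative mongod-style reply: no "isdbgrid" message.
mongod_reply = {"ismaster": True, "maxBsonObjectSize": 16777216, "ok": 1}

print(is_mongos(mongos_reply))  # True
print(is_mongos(mongod_reply))  # False
```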

Query Isolation

Generally, the fastest queries in a sharded environment are those that mongos routes to a single shard, using the shard key and the cluster metadata from the config servers. These targeted operations use the shard key value to locate the shard or subset of shards that satisfy the query document.

For best performance, use targeted operations whenever possible. Although some operations must be broadcast, include the shard key in your operations wherever you can so that mongos can target them.

Broadcast Operations

mongos instances broadcast queries to all shards for the collection unless the mongos can determine which shard or subset of shards stores this data.

[Figure: Read operations to a sharded cluster. The query criteria do not include the shard key, so the query router mongos must broadcast the query to all shards for the collection.]

Once the mongos has received a response from all shards, it merges the data and returns the result document. The performance of a broadcast operation depends on the overall load of the cluster, as well as variables like network latency, individual shard load, and the number of documents returned per shard. Whenever possible, favor operations that result in targeted operations over those that result in a broadcast operation.

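Multi-document update operations are always broadcast to all shards.

Unless the operation specifies the complete shard key, remove() is always a broadcast operation.

Targeted Operations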

mongos can route queries that include the shard key or the prefix of a compound shard key to a specific shard or set of shards. mongos uses the shard key value to locate the chunk whose range includes the shard key value and directs the query at the shard containing that chunk.

[Figure: Read operations to a sharded cluster. The query criteria include the shard key, so the query router mongos can target the query to the appropriate shard or shards.]

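For example, if the shard key is:

{ a: 1, b: 1, c: 1 }

mongos can route queries that include the full shard key or either of the following shard key prefixes to a specific shard or set of shards:

{ a: 1 }
{ a: 1, b: 1 }

All single update() operations (including upsert operations) and remove() operations target one shard.

In a sharded cluster, operations that specify the |single-modification-operation-option| option must include the shard key or _id in the request. Operations of this kind that include neither return an error.

Depending on the distribution of data in the cluster and the selectivity of the query, mongos may still perform a broadcast operation to fulfill these queries.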
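The prefix rule and the chunk lookup described in this section can be sketched as follows. Everything here is an illustrative assumption: `usable_prefix` and `shard_for` are hypothetical helpers, the chunk table is invented, and the lookup covers only a range-sharded collection keyed on a single field `a`, with each chunk including its lower bound and excluding its upper bound.

```python
import bisect


def usable_prefix(shard_key_fields, query):
    """Longest leading run of shard key fields that the query specifies."""
    prefix = []
    for field in shard_key_fields:
        if field not in query:
            break
        prefix.append(field)
    return prefix


def routing_mode(shard_key_fields, query):
    """'targeted' if the query covers a shard key prefix, else 'broadcast'."""
    return "targeted" if usable_prefix(shard_key_fields, query) else "broadcast"


# Hypothetical chunk table: chunk i covers [lower_bound[i], lower_bound[i + 1]).
CHUNK_LOWER_BOUNDS = [float("-inf"), 100, 200]
CHUNK_SHARDS = ["shardA", "shardB", "shardC"]


def shard_for(a_value):
    """Locate the chunk whose range includes a_value and return its shard."""
    index = bisect.bisect_right(CHUNK_LOWER_BOUNDS, a_value) - 1
    return CHUNK_SHARDS[index]


shard_key = ["a", "b", "c"]
print(routing_mode(shard_key, {"a": 1}))          # targeted
print(routing_mode(shard_key, {"a": 1, "b": 2}))  # targeted
print(routing_mode(shard_key, {"b": 2, "c": 3}))  # broadcast
print(shard_for(150))                             # shardB
```

A query on `{ b: 2, c: 3 }` falls back to a broadcast in this sketch because it skips the leading field `a`, so no chunk range can be derived from it.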

Index Use

Depending on how data is distributed in the cluster and how selective the query is, mongos may route the request to multiple shards to fulfill the query.

If the query includes multiple sub-expressions that reference the fields indexed by the shard key and the secondary index, the mongos can route the queries to a specific shard, and the shard will use the index that allows it to fulfill the query most efficiently.

Sharded Cluster Security

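Sharding operates at the collection level. You can shard multiple collections within a database, and you can have multiple databases with sharding enabled. In production deployments, however, some databases and collections will use sharding while others will not.

Regardless of the data architecture of your sharded cluster, always use mongos to access the cluster's data, even for unsharded data.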

Cluster Users

When you configure sharding, use enableSharding to enable sharding for a database before using shardCollection to enable sharding for a collection in that database.
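The required ordering can be made explicit by building the two command documents in sequence. This is a sketch only: `sharding_commands` is a hypothetical helper, the database, collection, and shard key names are invented for illustration, and the documents are constructed here rather than sent to a cluster. With a driver, each document would be run against the admin database through a mongos.

```python
def sharding_commands(database: str, collection: str, shard_key: dict) -> list:
    """Build the admin command documents in the order they must be run."""
    namespace = f"{database}.{collection}"
    return [
        # 1. Enable sharding for the database first.
        {"enableSharding": database},
        # 2. Only then shard a collection inside that database.
        {"shardCollection": namespace, "key": shard_key},
    ]


# Hypothetical names: database "records", collection "people", key "zipcode".
for command in sharding_commands("records", "people", {"zipcode": 1}):
    print(command)
```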

With RBAC enforced, clients must specify a --username, --password, and --authenticationDatabase when connecting to the mongos in order to access cluster resources.

Each cluster has its own cluster users. These users cannot be used to access individual shards.

See Enable Auth for a tutorial on adding users to an RBAC-enabled MongoDB deployment.
