Collation
Collation lets you specify language-specific rules for string comparison, such as rules for letter case and accent marks.
You can specify collation for a collection or a view, an index, or specific operations that support collation.
Collation Document
A collation document has the following fields:
{
  locale: <string>,
  caseLevel: <boolean>,
  caseFirst: <string>,
  strength: <int>,
  numericOrdering: <boolean>,
  alternate: <string>,
  maxVariable: <string>,
  backwards: <boolean>,
  normalization: <boolean>
}
When specifying collation, the locale field is mandatory; all other collation fields are optional. For descriptions of the fields, see Collation Document.
Default collation parameter values vary depending on which locale you specify. For a complete list of default collation parameters and the locales they are associated with, see Collation Default Parameters.
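As a rough illustration of the "locale is mandatory, everything else is optional" rule, the following sketch checks the shape of a collation document on the client side. The helper name `checkCollation` and the `COLLATION_FIELD_TYPES` map are hypothetical and not part of MongoDB; the server performs its own validation, and this sketch only checks field names and JavaScript types for the fields documented on this page.

```javascript
// Hypothetical client-side helper (not a MongoDB API): reports problems
// with the shape of a collation document.
const COLLATION_FIELD_TYPES = {
  caseLevel: "boolean",
  caseFirst: "string",
  strength: "number",
  numericOrdering: "boolean",
  alternate: "string",
  maxVariable: "string",
  backwards: "boolean",
  normalization: "boolean",
};

function checkCollation(doc) {
  const problems = [];

  // locale is the only mandatory field.
  if (typeof doc.locale !== "string" || doc.locale.length === 0) {
    problems.push("locale is required and must be a non-empty string");
  }

  // Every other field is optional, but must have the documented type.
  for (const [field, value] of Object.entries(doc)) {
    if (field === "locale") continue;
    const expected = COLLATION_FIELD_TYPES[field];
    if (expected === undefined) {
      problems.push(`unknown field: ${field}`);
    } else if (typeof value !== expected) {
      problems.push(`${field} must be a ${expected}`);
    }
  }
  return problems;
}
```

An empty result means the document has the expected shape; it does not guarantee that the locale or the option values are ones the server supports.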
Field | Type | Description |
---|---|---|
locale | string | The ICU locale. See Supported Languages and Locales for a list of supported locales. To specify simple binary comparison, set locale to "simple". |
strength | integer | Optional. The level of comparison to perform. Corresponds to ICU Comparison Levels. Possible values are: 1 (primary: compares base characters only, ignoring diacritics and case); 2 (secondary: compares base characters and diacritics); 3 (tertiary: also compares case and letter variants; this is the default); 4 (quaternary: for limited use cases, such as considering punctuation when levels 1 to 3 ignore it, or processing Japanese text); 5 (identical: a tie breaker). See ICU Collation: Comparison Levels for details. |
caseLevel | boolean | Optional. Flag that determines whether to include case comparison at strength level 1 or 2. If true, include case comparison: with strength: 1, collation compares base characters and case; with strength: 2, collation compares base characters, diacritics (and possibly other secondary differences), and case. If false, do not include case comparison at level 1 or 2. The default is false. For more information, see ICU Collation: Case Level. |
caseFirst | string | Optional. Field that determines the sort order of case differences during tertiary level comparisons. Possible values are: "upper" (uppercase sorts before lowercase); "lower" (lowercase sorts before uppercase); "off" (the default; similar to "lower", with slight differences). |
numericOrdering | boolean | Optional. Flag that determines whether to compare numeric strings as numbers or as strings. If true, compare as numbers; for example, "10" is greater than "2". If false, compare as strings; for example, "10" is less than "2". The default is false. |
alternate | string | Optional. Field that determines whether collation should consider whitespace and punctuation as base characters for purposes of comparison. Possible values are: "non-ignorable" (whitespace and punctuation are considered base characters); "shifted" (whitespace and punctuation are not considered base characters and are only distinguished at strength levels greater than 3). See ICU Collation: Comparison Levels for more information. The default is "non-ignorable". |
maxVariable | string | Optional. Field that determines which characters are considered ignorable when alternate: "shifted". Has no effect if alternate: "non-ignorable". Possible values are: "punct" (both whitespace and punctuation are ignorable and not considered base characters); "space" (only whitespace is ignorable and not considered a base character). |
backwards | boolean | Optional. Flag that determines whether strings with diacritics sort from the back of the string, as in some French dictionary orderings. If true, compare from back to front. If false, compare from front to back. The default is false. |
normalization | boolean | Optional. Flag that determines whether to check if text requires normalization and to perform normalization. Generally, the majority of text does not require this normalization processing. If true, check whether text is fully normalized and perform normalization before comparing. If false, do not check. The default is false. See http://userguide.icu-project.org/collation/concepts#TOC-Normalization for details. |
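To get a feel for what strength and numericOrdering do, it can help to try the same ideas outside the database. The sketch below uses JavaScript's built-in `Intl.Collator`, which is not MongoDB's API and uses different option names. The mapping assumed here (`numeric` for numericOrdering, `sensitivity: "base"` for strength 1, `sensitivity: "accent"` for strength 2) is an approximation for illustration only, and results from the server may differ in edge cases.

```javascript
// Analogy only: Intl.Collator is the JavaScript runtime's collator,
// not the MongoDB server's. Option names differ from the collation document.

// numericOrdering: false vs. true
const asStrings = new Intl.Collator("en", { numeric: false });
const asNumbers = new Intl.Collator("en", { numeric: true });

const stringOrder = ["10", "2", "1"].sort(asStrings.compare);  // ["1", "10", "2"]
const numericOrder = ["10", "2", "1"].sort(asNumbers.compare); // ["1", "2", "10"]

// Roughly strength 1: base characters only, diacritics and case ignored.
const primary = new Intl.Collator("en", { sensitivity: "base" });
// Roughly strength 2: base characters and diacritics, case ignored.
const secondary = new Intl.Collator("en", { sensitivity: "accent" });

const equalAtPrimary = primary.compare("cafe", "CAFÉ") === 0;            // true
const accentDiffersAtSecondary = secondary.compare("cafe", "café") !== 0; // true
const caseIgnoredAtSecondary = secondary.compare("cafe", "CAFE") === 0;   // true
```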
Operations that Support Collation
You can specify collation for the following operations:
Commands | mongo Shell Methods |
---|---|
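create | db.createCollection(), db.createView() |
createIndexes | db.collection.createIndex() |
aggregate | db.collection.aggregate() |
distinct | db.collection.distinct() |
findAndModify | db.collection.findAndModify(), db.collection.findOneAndDelete(), db.collection.findOneAndReplace(), db.collection.findOneAndUpdate() |
find | cursor.collation() to specify collation for db.collection.find() |
mapReduce | db.collection.mapReduce() |
delete | db.collection.deleteOne(), db.collection.deleteMany(), db.collection.remove() |
update | db.collection.update(), db.collection.updateOne(), db.collection.updateMany(), db.collection.replaceOne() |
 | Individual update, replace, and delete operations in db.collection.bulkWrite(). |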
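For the commands listed above, collation is supplied as a field of the command document (or, for createIndexes, of each index specification). The following is a minimal sketch of three such documents; the collection name `people`, the field `name`, the index name `name_fr`, and the chosen collation values are all hypothetical and used only for illustration.

```javascript
// Hypothetical collation, collection, and field names for illustration.
const frenchCollation = { locale: "fr", strength: 2 };

// create: sets the default collation for a new collection.
const createCmd = { create: "people", collation: frenchCollation };

// createIndexes: collation is given per index specification.
const createIndexesCmd = {
  createIndexes: "people",
  indexes: [{ key: { name: 1 }, name: "name_fr", collation: frenchCollation }],
};

// find: collation applies to this one query, including its sort.
const findCmd = {
  find: "people",
  filter: { name: "cafe" },
  sort: { name: 1 },
  collation: frenchCollation,
};

// In the mongo shell, each document could be passed to db.runCommand(),
// for example db.runCommand(findCmd). The shell-helper form of findCmd is:
//   db.people.find({ name: "cafe" }).sort({ name: 1 }).collation(frenchCollation)
```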
Behavior
Some collation locales have variants, which employ special language-specific rules. To specify a locale variant, use the following syntax:
{ "locale" : "<locale code>@collation=<variant>" }
For example, to use the pinyin variant of the Chinese collation:
{ "locale" : "zh@collation=pinyin" }
For a complete list of all collation locales and their variants, see Collation Locales.
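The variant syntax above is plain string concatenation, so it is easy to build programmatically. The helper below is a hypothetical sketch (the name `collationForLocale` is not part of MongoDB); it does not check whether the locale or variant is actually supported.

```javascript
// Hypothetical helper: builds a collation document for a locale and an
// optional variant, using the "<locale code>@collation=<variant>" syntax.
function collationForLocale(localeCode, variant) {
  const locale = variant ? `${localeCode}@collation=${variant}` : localeCode;
  return { locale };
}

const pinyin = collationForLocale("zh", "pinyin"); // { locale: "zh@collation=pinyin" }
const french = collationForLocale("fr");           // { locale: "fr" }
```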