Scaling Database Operations

MongoDB Index type and Properties

Chandrika Pokuri
MongoDB, Open Source, optimisation, Performance, troubleshooting
December 19, 2022, 11 Minutes

I’m hoping that after reading the previous blog
(https://wordpress.com/post/mydbops.wordpress.com/11749), you will have a better
understanding of indexes and the guidelines for building one.

In this blog, I am going to share index types and their characteristics.

(https://mydbops.files.wordpress.com/2022/11/image-2.png)

1. Index Types
   1. Single Field Index
   2. Compound Index
      1. Sort Order
   3. Multikey Index
      1. Limitations of MultiKey Index
         1. Compound Multikey Indexes
   4. Geospatial Index
      1. 2dsphere
      2. 2d
      3. Limitations of the Geospatial Index
   5. Text Indexes
      1. Limitations of Text Indexes
      2. Performance consideration with text indexes
   6. Hashed Indexes
      1. Limitations of Hashed Index
   7. Wildcard Indexes
      1. Create a Wildcard Index on All Fields
      2. Limitations of Wildcard Indexes
   8. Hidden Index
      1. Limitation of Hidden Index
2. Index Properties
   1. Unique Index
      1. Unique Partial Indexes
      2. Limitation of Unique Index
   2. Partial Index
      1. Limitations of Partial Index
   3. Sparse Indexes
      1. Sparse Index and Incomplete Results
      2. Comparison Of Partial Indexes with Sparse Indexes
3. Hybrid Index Build

Index Types

Let's walk through the index types that were available prior to version 4.2.

(https://mydbops.files.wordpress.com/2022/11/screenshot-2022-11-22-at-9.38.17-pm.png)

Single Field Index


MongoDB supports single field indexes. A single field index is an index on a single field of a document.

Regardless of the specified index order, the unique characteristics of a single field index
allow traversal in both ascending and descending order.

For example, consider an inventory collection with the following documents.

{ _id: 5, type: "food", item: "aaa", ratings: [ 6.5, 8, 9 ] }

{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9.2 ] }

{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 1.5, 8 ] }

{ _id: 8, type: "food", item: "ddd", ratings: [ 9.9, 5.3 ] }

{ _id: 9, type: "food", item: "eee", ratings: [ 4.5, 9, 5 ] }

db.inventory.createIndex({item:1},{background:true})

Compound Index

A compound index is a single index that includes multiple fields. A compound index works
by storing a subset of a collection’s data in a sorted B-Tree data structure.

A compound index can have a maximum of 32 fields indexed.

For example, consider a people collection with the following documents.

{ name: "Ram", type: "food", item: "aaa", ratings: [ 6.5, 8, 9 ] }

{ name: "Jhonson", type: "food", item: "bbb", ratings: [ 5, 9.2 ] }

{ name: "James", type: "food", item: "ccc", ratings: [ 9, 1.5, 8 ] }

{ name: "Priya", type: "food", item: "ddd", ratings: [ 9.9, 5.3 ] }

{ name: "Kalyan", type: "food", item: "eee", ratings: [ 4.5, 9, 5 ] }


db.people.createIndex({name:1, item:-1},{background:true})

This compound index creates a sorted B-Tree structure where records are first stored by
name in ascending order. The sorted records then store nested item values in descending
order.
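
To make this ordering concrete, here is a minimal plain-JavaScript sketch. It is a toy model, not MongoDB's actual B-Tree code. The compareKeys helper is hypothetical, and the extra "Ram"/"zzz" entry is an assumption added only so that the descending item order within one name becomes visible.

```javascript
// Toy model of key order in a { name: 1, item: -1 } compound index.
// compareKeys is a hypothetical helper, not a MongoDB API.
function compareKeys(a, b) {
  if (a.name !== b.name) return a.name < b.name ? -1 : 1; // name ascending
  if (a.item !== b.item) return a.item < b.item ? 1 : -1; // item descending
  return 0;
}

const keys = [
  { name: "Ram", item: "aaa" },
  { name: "James", item: "ccc" },
  { name: "Ram", item: "zzz" }, // assumed extra entry, for illustration only
  { name: "Priya", item: "ddd" },
];

// Copy before sorting so the input array is left untouched.
const ordered = [...keys].sort(compareKeys);
console.log(ordered.map((k) => `${k.name}/${k.item}`).join(", "));
// James/ccc, Priya/ddd, Ram/zzz, Ram/aaa
```

Entries are grouped by name first, and only within the same name does the descending item order apply.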

Sort Order

Indexes store references to fields in either ascending (1) or descending (-1) sort order. For
single-field indexes, the sort order of keys doesn’t matter because MongoDB can traverse
the index in either direction. However, for compound indexes, sort order can matter in
determining whether the index can support a sort operation.
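
The rule can be sketched as a small check. This is a simplified model under stated assumptions: indexSupportsSort is a hypothetical helper, it only considers sorts that form a prefix of the index keys, and it ignores cases where equality conditions in the query let MongoDB use a non-prefix part of the index.

```javascript
// Hypothetical helper: can this index key pattern return documents already
// in the requested order? Simplified rule: the sort keys must be a prefix of
// the index keys, and the directions must all match the index or all be the
// exact inverse of it (the index can be walked backwards).
function indexSupportsSort(indexSpec, sortSpec) {
  const indexKeys = Object.entries(indexSpec);
  const sortKeys = Object.entries(sortSpec);
  if (sortKeys.length === 0 || sortKeys.length > indexKeys.length) return false;

  let orientation = null; // 1 = forward scan, -1 = reverse scan
  for (let i = 0; i < sortKeys.length; i++) {
    const [sortField, sortDir] = sortKeys[i];
    const [indexField, indexDir] = indexKeys[i];
    if (sortField !== indexField) return false;
    const current = sortDir * indexDir;
    if (orientation === null) orientation = current;
    else if (orientation !== current) return false;
  }
  return true;
}

const index = { name: 1, item: -1 };
console.log(indexSupportsSort(index, { name: 1, item: -1 })); // true
console.log(indexSupportsSort(index, { name: -1, item: 1 })); // true
console.log(indexSupportsSort(index, { name: 1, item: 1 }));  // false
```

The third sort fails because it would need name walked forwards and item walked backwards at the same time, which a single scan of this index cannot provide.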

(https://mydbops.files.wordpress.com/2022/10/image-with-description.jpeg)

(https://mydbops.files.wordpress.com/2022/10/image-with-description-2.jpeg)

Multikey Index

When you index a field that contains an array value, MongoDB creates an index key for each element of the array. These multikey indexes support efficient queries against array fields.

Multikey indexes can be built over arrays that contain nested documents as well as scalar values such as strings and numbers.

For example, consider an inventory collection with the following documents.

{ _id: 5, type: "food", item: "aaa", ratings: [ 6.5, 8, 9 ], category: [ "Indian" ] }

{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9.2 ], category: [ "Chinese" ] }

{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 1.5, 8 ], category: [ "Indian", "Chinese" ] }

{ _id: 8, type: "food", item: "ddd", ratings: [ 9.9, 5.3 ], category: [ "Indian" ] }

{ _id: 9, type: "food", item: "eee", ratings: [ 4.5, 9, 5 ], category: [ "Indian" ] }

db.inventory.createIndex( { ratings: 1 },{background:true} )

Here, ratings and category are array fields.
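
As a rough illustration of "one index key per array element", the following plain-JavaScript sketch builds the entries a multikey index on ratings would conceptually hold for three of the sample documents. The helpers buildMultikeyEntries and findIds are hypothetical names, and the model skips details such as empty arrays and nested documents.

```javascript
// Toy model: a multikey index holds one entry per array element,
// each pointing back to the document it came from.
function buildMultikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const elements = Array.isArray(value) ? value : [value];
    for (const key of elements) entries.push({ key, id: doc._id });
  }
  // Keep entries sorted by key, then by document id.
  entries.sort((a, b) => a.key - b.key || a.id - b.id);
  return entries;
}

// Equality lookup: every document that has the value somewhere in the array.
function findIds(entries, key) {
  return entries.filter((e) => e.key === key).map((e) => e.id);
}

const inventory = [
  { _id: 5, ratings: [6.5, 8, 9] },
  { _id: 6, ratings: [5, 9.2] },
  { _id: 7, ratings: [9, 1.5, 8] },
];

const ratingEntries = buildMultikeyEntries(inventory, "ratings");
console.log(ratingEntries.length);      // 8 entries for 3 documents
console.log(findIds(ratingEntries, 9)); // [ 5, 7 ]
```

Three documents produce eight entries, which is why multikey indexes on long arrays can grow much larger than the collection's document count suggests.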

Limitations of MultiKey Index

Compound Multikey Indexes

1. MongoDB does not permit a compound multikey index in which more than one indexed field of a document is an array.

Consider the inventory collection.

db.inventory.createIndex( { ratings: 1, category: 1 }, { background: true } )

Here, both ratings and category are array fields, so this index cannot be created.

If a compound multikey index already exists, you cannot insert a document that would violate this restriction.
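
The restriction is evaluated per document, which a short sketch can show. The helper violatesCompoundMultikeyRule is hypothetical, and the document with _id 10 is an assumed example in which category is a plain string rather than an array.

```javascript
// Hypothetical check mirroring the rule: for a compound index, at most one
// of the indexed fields may hold an array in any single document.
function violatesCompoundMultikeyRule(doc, indexedFields) {
  const arrayFields = indexedFields.filter((field) => Array.isArray(doc[field]));
  return arrayFields.length > 1;
}

const indexedFields = ["ratings", "category"];

const bothArrays = { _id: 5, ratings: [6.5, 8, 9], category: ["Indian"] };
const oneArray = { _id: 10, ratings: [7, 8], category: "Indian" }; // assumed example

console.log(violatesCompoundMultikeyRule(bothArrays, indexedFields)); // true
console.log(violatesCompoundMultikeyRule(oneArray, indexedFields));   // false
```

In other words, a compound index on { ratings: 1, category: 1 } is possible in principle, but only while no document stores arrays in both fields.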

2. You are not allowed to specify a multikey index as the shard key index.

3. In MongoDB, hashed indexes are not multikey indexes.

4. The multikey index cannot support the $expr operator.

Geospatial Index

MongoDB's geospatial indexing allows you to efficiently execute spatial queries on a collection that contains geospatial shapes and points.

To showcase the geospatial features and compare the two index types, consider a restaurants collection with the following documents.

{ name: "Ram", item: "aaa", location: [ -50.456, 9.6789 ] }

{ name: "Jhonson", item: "bbb", location: [ -79.7761, 8.3456 ] }

{ name: "James", item: "ccc", location: [ 65.6413, 10.3421 ] }

(https://www.mongodb.com/docs/manual/geospatial-queries/#2dsphere)

2dsphere

Queries that compute geometries on an earth-like sphere are supported by 2dsphere indexes. To create a 2dsphere index, use the db.collection.createIndex() method and specify the string literal "2dsphere" as the index type.

db.restaurants.createIndex({ location: "2dsphere" }, { background: true })

2d

Queries that compute geometries on a two-dimensional plane are supported by 2d indexes. To create a 2d index, use the db.collection.createIndex() method, specifying the location field as the key and the string literal "2d" as the index type.

db.restaurants.createIndex({ location: "2d" }, { background: true })
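
The difference between the two geometry models can be illustrated outside MongoDB. The sketch below compares a great-circle distance (the kind of calculation a sphere-based model implies) with a flat-plane distance for the same coordinate pairs. It is not MongoDB's implementation: the function names are hypothetical, the haversine formula is a standard approximation, and the 6371 km Earth radius is an assumption. Coordinates are taken as [longitude, latitude], as in the sample documents.

```javascript
const EARTH_RADIUS_KM = 6371; // assumed mean Earth radius
const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance between two [longitude, latitude] pairs (haversine).
function sphericalDistanceKm([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Flat-plane (Euclidean) distance, expressed in coordinate units.
function planarDistance([x1, y1], [x2, y2]) {
  return Math.hypot(x2 - x1, y2 - y1);
}

const ram = [-50.456, 9.6789];
const jhonson = [-79.7761, 8.3456];

console.log(sphericalDistanceKm(ram, jhonson)); // kilometres along the sphere
console.log(planarDistance(ram, jhonson));      // plain coordinate units
```

The planar result is in raw coordinate units and ignores the curvature of the Earth, which is why a 2d index suits flat coordinate systems while 2dsphere suits real-world longitude and latitude data.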

Limitations of the Geospatial Index

You cannot use a geospatial index as a shard key when sharding a collection.

Text Indexes

Any field whose value is a string or an array of string elements can be included in text
indexes. A collection can only have one text search index, although that index can have
several fields.

To drop a text index, pass the name of the index to the db.collection.dropIndex("TextIndexname") method.

For example, consider we have a stores collection with the following documents.

{ name: "Ram", description: "luggage storage", phoneno: 7103456421, Area: "chennai" }

{ name: "Jhonson", description: "Absolutely everything is new", phoneno: 9104356782, Area: "chennai" }

{ name: "James", description: "Confirmation of the storage", phoneno: 8978316736, Area: "Chennai" }

db.stores.createIndex({name:"text"},{background:true})

We can also create a compound text index on multiple fields.

db.stores.createIndex({name:"text",description:"text"},{background:true})

We can also create compound indexes that mix text and traditional index keys.

db.stores.createIndex({name:"text",phoneno:1},
{background:true})

Limitations of Text Indexes

1. At most one text index is allowed per collection.
2. You cannot use hint() with a query that includes a $text expression.
3. Sort operations cannot obtain their sort order from a text index, so combining a text index with a sort does not return results in index order.

Performance consideration with text indexes

Text indexes can be very large and can take a long time to build.
MongoDB recommends having enough memory on your system to keep the text index in memory; otherwise, searches may involve significant I/O.

Hashed Indexes

A hashed index maintains entries with hashes of the values of the indexed field (often the _id field). Hashed values are distributed more randomly across their range, but hashed indexes only support equality matches and cannot support range-based queries.

For example, consider we have an inventory collection with the following documents.

{ _id: 5, type: "food", item: "aaa", ratings: [ 6.5, 8, 9 ] }

{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9.2 ] }

{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 1.5, 8 ] }

{ _id: 8, type: "food", item: "ddd", ratings: [ 9.9, 5.3 ] }

{ _id: 9, type: "food", item: "eee", ratings: [ 4.5, 9, 5 ] }


db.inventory.createIndex( { _id: "hashed" }, { background: true } )
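
To see why a hashed index can answer equality matches but not range queries, consider this plain-JavaScript sketch. Everything in it is an assumption for illustration: the 32-bit FNV-1a function is only a stand-in for MongoDB's internal hashing, buildHashedIndex and findEqual are hypothetical helpers, and the item field is indexed instead of _id to keep the values readable.

```javascript
// Stand-in hash (32-bit FNV-1a). MongoDB uses its own hashing internally.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Toy hashed index: hash of the field value -> documents with that hash.
function buildHashedIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const hash = fnv1a(String(doc[field]));
    if (!index.has(hash)) index.set(hash, []);
    index.get(hash).push(doc);
  }
  return index;
}

// Equality match: hash the search value and jump straight to its bucket.
function findEqual(index, field, value) {
  const bucket = index.get(fnv1a(String(value))) || [];
  return bucket.filter((doc) => doc[field] === value); // guard against collisions
}

const items = [
  { _id: 5, item: "aaa" },
  { _id: 6, item: "bbb" },
  { _id: 7, item: "ccc" },
];

const hashedIndex = buildHashedIndex(items, "item");
console.log(findEqual(hashedIndex, "item", "ccc")); // [ { _id: 7, item: 'ccc' } ]
```

A range query such as item between "aaa" and "bbb" gets no help from this structure, because the hash values carry no information about the ordering of the original strings.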

Hashed indexes tend to be smaller than scalar indexes because only a hash of the key is stored instead of the full key. For example, in a simple test with 100k documents, we added hashed and scalar indexes on a string field, firstName. As shown below, the hashed index is considerably smaller than the scalar index.

"indexSizes" : {

"_id_" : 811008, "firstName_1" : 4415488, "firstName_hashed" : 1490944 }

Limitations of Hashed Index

Hashed indexes do not support arrays.
Before MongoDB 4.4, hashed indexes cannot be compound indexes. Starting in 4.4, a compound index may include a single hashed field.
You cannot add unique constraints on hashed indexes.

Wildcard Indexes

MongoDB 4.2 introduces wildcard indexes for supporting queries against unknown or
arbitrary fields.

A wildcard index is a type of filter that automatically matches any field, sub-document or
array in a collection and indexes those matches.

For example, suppose we have a data collection with the following documents.

{ _id: 5, Data: { A: 2145, B: 740, C: 2013 } }

{ _id: 6, Data: { A: 1405, B: 6740, C: 213 } }

{ _id: 7, Data: { A: 1245, B: 7140, C: 1213 } }

{ _id: 8, Data: { A: 1465, B: 7030, C: 21673 } }

Queries might be issued against any of the attributes in the Data sub-document. Furthermore, the application may add new fields that we cannot predict. Without wildcard indexes, we would need to create a separate index for each field to optimise performance.

db.data.createIndex({_id:1,"Data.A":1},{background:true})

db.data.createIndex({_id:1,"Data.B":1},{background:true})

db.data.createIndex({_id:1,"Data.C":1},{background:true})

This requires creating many indexes, and even that only works if we know for sure that no new fields will be added under Data.

In this scenario, wildcard indexes help.

The following operation creates a wildcard index on the Data field.

db.data.createIndex( { "Data.$**" : 1 } )

The statement creates an index on every field in the Data sub-document, including fields that an application adds after the index is created.

Create a Wildcard Index on All Fields


(https://www.mongodb.com/docs/manual/core/index-wildcard/#create-a-wildcard-index-on-all-fields)

To index the value of all fields in a document (excluding _id), specify "$**" as the index key.

db.collection.createIndex( { "$**" : 1 } )

With this wildcard index, MongoDB indexes all fields for each document in the collection. If
a given field is a nested document or array, the wildcard index recurses into the
document/array and stores the value for all fields in the document/array.
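
The recursion can be pictured with a short plain-JavaScript sketch that lists the (path, value) pairs such an index would conceptually cover. This is a simplified model, not MongoDB's implementation: wildcardEntries is a hypothetical helper, it only handles JSON-like values, and the tags field is an assumed addition to show how an array is handled.

```javascript
const isPlainObject = (value) =>
  value !== null && typeof value === "object" && !Array.isArray(value);

// Toy model: walk a document and collect [path, value] pairs for every
// field, descending into sub-documents and arrays along the way.
function wildcardEntries(doc, prefix = "") {
  const entries = [];
  for (const [field, value] of Object.entries(doc)) {
    if (prefix === "" && field === "_id") continue; // _id is excluded
    const path = prefix ? `${prefix}.${field}` : field;
    if (isPlainObject(value)) {
      entries.push(...wildcardEntries(value, path)); // descend into sub-document
    } else if (Array.isArray(value)) {
      for (const element of value) {
        if (isPlainObject(element)) entries.push(...wildcardEntries(element, path));
        else entries.push([path, element]); // array element stored under the field path
      }
    } else {
      entries.push([path, value]);
    }
  }
  return entries;
}

const sample = { _id: 5, Data: { A: 2145, B: 740, C: 2013 }, tags: ["x", "y"] };
console.log(wildcardEntries(sample));
// [ ['Data.A', 2145], ['Data.B', 740], ['Data.C', 2013], ['tags', 'x'], ['tags', 'y'] ]
```

A new field such as Data.D would simply show up as one more path, with no index definition change needed.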

Limitations of Wildcard Indexes

1. MongoDB cannot use a non-wildcard index to satisfy one part of a query predicate and a wildcard index to satisfy another.
2. MongoDB cannot use one wildcard index to satisfy one part of a query predicate and another wildcard index to satisfy another.
3. Even if a single wildcard index could support multiple query fields, MongoDB can use the wildcard index to support only one of the query fields. All remaining fields are resolved without an index.
4. Wildcard indexes don't support compound, TTL, Text, Geospatial, Hashed and Unique indexes.

Hidden Index

MongoDB 4.4 introduces Hidden indexes.

A hidden index is simply a regular index that is not visible to the query planner. When
evaluating the execution plans, MongoDB ignores such kinds of indexes.

Building an index in MongoDB is quite expensive, particularly for large collections or when you don't have enough available memory. Hiding an index is useful for testing different execution plans without actually dropping the index. You can hide or unhide an index at any time at no cost to the database.

To hide an index, featureCompatibilityVersion must be set to 4.4 or later.

For example, consider an inventory collection with the following documents.

{ _id: 5, type: "food", item: "aaa", ratings: [ 6.5, 8, 9 ] }

{ _id: 6, type: "food", item: "bbb", ratings: [ 5, 9.2 ] }

{ _id: 7, type: "food", item: "ccc", ratings: [ 9, 1.5, 8 ] }

{ _id: 8, type: "food", item: "ddd", ratings: [ 9.9, 5.3 ] }

{ _id: 9, type: "food", item: "eee", ratings: [ 4.5, 9, 5 ] }

To create the hidden Index:

db.inventory.createIndex({type:1},{hidden:true})

To Hide an Existing Index:

db.inventory.hideIndex( {type: 1} )

To unhide the hidden index:

db.inventory.unhideIndex( {type: 1} )

Limitation of Hidden Index

You cannot hide the _id index.
During writes, MongoDB maintains hidden indexes the same way as any regular index.
A hidden index is immediately available once it is unhidden.
A unique index enforces its uniqueness constraint even when hidden.
A TTL index expires documents even when hidden.

Index Properties

Unique Index

A unique index ensures that the indexed fields do not store duplicate values, i.e. it enforces uniqueness for the indexed fields. By default, MongoDB creates a unique index on the _id field during the creation of a collection.

Often, you want to ensure that the values of a field are unique across documents in a
collection, such as a code or a username.

Use the createIndex() method with the option { unique: true } to create a unique index, including a compound unique index.

db.people.createIndex( { code:1} ,
{unique:true,background:true})

Unique Partial Indexes

Partial indexes index only the documents in a collection that match a specific filter expression. If you specify both a partial filter expression and a unique constraint, the unique constraint applies only to documents that match the filter expression.
Partial indexes, including unique partial indexes, were introduced in MongoDB 3.2.
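
A small plain-JavaScript model shows how the constraint behaves. It is only a sketch: wouldViolateUnique is a hypothetical helper, and the active flag used as the filter is an assumption made up for this illustration.

```javascript
// Toy model of a unique partial index: the uniqueness check only looks at
// documents that match the filter, because only those are in the index.
function wouldViolateUnique(existingDocs, newDoc, keyField, matchesFilter) {
  if (!matchesFilter(newDoc)) return false; // not indexed, so never checked
  return existingDocs.some(
    (doc) => matchesFilter(doc) && doc[keyField] === newDoc[keyField]
  );
}

const isActive = (doc) => doc.active === true; // assumed filter expression

const people = [
  { code: "A1", active: true },
  { code: "B2", active: false },
];

// Duplicate code among documents that match the filter: rejected.
console.log(wouldViolateUnique(people, { code: "A1", active: true }, "code", isActive)); // true
// Same code, but the new document is outside the filter: accepted.
console.log(wouldViolateUnique(people, { code: "A1", active: false }, "code", isActive)); // false
// The existing B2 document is not in the index, so it cannot cause a conflict.
console.log(wouldViolateUnique(people, { code: "B2", active: true }, "code", isActive)); // false
```

The practical consequence is that duplicates can still exist in the collection as long as at most one of them matches the filter.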

Limitation of Unique Index

1. MongoDB cannot create a unique index on the specified index fields if the collection
already contains data that would violate the unique constraint for the index.
2. You may not specify a unique constraint on a hashed index.

Partial Index

Partial indexes only index the documents in a collection that meet a specified filter
expression. By indexing a subset of the documents in a collection, partial indexes have lower
storage requirements and reduced performance costs for index creation and maintenance.

The partialFilterExpression option accepts a document that specifies the filter condition
using:

equality expressions (i.e. field: value or using the $eq operator),
$exists: true expression,
$gt, $gte, $lt, $lte expressions,
$type expressions,
$and operator at the top-level only

Example:

db.contacts.createIndex( { name: 1 }, { partialFilterExpression: { email: { $exists: true } }, background: true } )
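
To illustrate which documents end up in such an index, here is a plain-JavaScript sketch. The helper buildPartialIndex and the sample contacts are assumptions for illustration, and the hasEmail predicate only approximates { email: { $exists: true } }.

```javascript
// Toy model of a partial index: only documents that pass the filter get an
// entry, and the entry is keyed on a different field than the filter uses.
function buildPartialIndex(docs, keyField, matchesFilter) {
  return docs
    .filter(matchesFilter)
    .map((doc) => ({ key: doc[keyField], id: doc._id }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const hasEmail = (doc) => "email" in doc; // approximates { email: { $exists: true } }

const contacts = [
  { _id: 1, name: "Ram", email: "ram@example.com" },
  { _id: 2, name: "James" }, // no email, so it is left out of the index
  { _id: 3, name: "Priya", email: "priya@example.com" },
];

const nameIndex = buildPartialIndex(contacts, "name", hasEmail);
console.log(nameIndex);
// [ { key: 'Priya', id: 3 }, { key: 'Ram', id: 1 } ]
```

Two of the three contacts are indexed, which is where the lower storage and maintenance cost comes from.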

Limitations of Partial Index

In versions of MongoDB earlier than 5.0, you cannot create multiple partial indexes that use the same key pattern with different partialFilterExpressions.

You cannot specify both the partialFilterExpression option and the sparse option.

_id indexes cannot be partial indexes.

Shard key indexes cannot be partial indexes.

Sparse Indexes

The sparse property of an index ensures that the index only contains entries for documents
that have the indexed field. The index skips documents that do not have the indexed field.

Example:

db.addresses.createIndex( { "Area": 1 }, { sparse: true } )

Sparse Index and Incomplete Results

If a sparse index would result in an incomplete result set for queries and sort operations,
MongoDB will not use that index unless a hint() explicitly specifies the index.

For example, the query { x: { $exists: false } } will not use a sparse index on the x field unless explicitly hinted. See "Sparse Index On A Collection Cannot Return Complete Results" in the MongoDB manual for an example that details the behaviour.

Changed in version 3.4.

If you include a hint() that specifies a sparse index when you perform a count() of all
documents in a collection (i.e. with an empty query predicate), the sparse index is used even
if the sparse index results in an incorrect count.

db.collection.insertOne( { _id: 1, y: 1 } )

db.collection.createIndex( { x: 1 }, { sparse: true } )

db.collection.find().hint( { x: 1 } ).count()

To obtain the correct count, do not hint() with a sparse index when performing a count of all
documents in a collection.

db.collection.find().count()

db.collection.createIndex( { y: 1 } )

db.collection.find().hint( { y: 1 } ).count()
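
The reason for the incorrect count becomes clear in a toy model of the three count operations above. The buildIndex helper is hypothetical, and counting index entries is only an approximation of what a hinted count does.

```javascript
// Toy model: a sparse index skips documents that lack the field, while a
// regular index stores a null key for them.
function buildIndex(docs, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    if (field in doc) entries.push({ key: doc[field], id: doc._id });
    else if (!sparse) entries.push({ key: null, id: doc._id });
  }
  return entries;
}

const docs = [{ _id: 1, y: 1 }];

const sparseOnX = buildIndex(docs, "x", { sparse: true });
const regularOnY = buildIndex(docs, "y");

console.log(sparseOnX.length);  // 0: counting via the sparse index misses the document
console.log(docs.length);       // 1: counting the collection itself is correct
console.log(regularOnY.length); // 1: a non-sparse index covers every document
```

Because the only document has no x field, the sparse index on x is empty, and a count forced through it reports zero.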

Comparison Of Partial Indexes with Sparse Indexes

Partial indexes should be preferred over sparse indexes. Partial indexes provide the
following benefits:

Greater control over which documents are indexed.
A superset of the functionality offered by sparse indexes.

Sparse indexes select documents to index only based on the existence of the indexed field, or
for compound indexes, the existence of the indexed fields.

The partial index can also specify filter expressions on fields other than the index key. For
example, the following operation creates a partial index, where the index is on the
“name” field but the filter expression is on the “email” field.

db.contacts.createIndex( { name: 1 }, { partialFilterExpression: { email: { $exists: true } }, background: true } )

You are strongly encouraged to consider partial indexes if you have one or more of these use cases.

Hybrid Index Build

As of MongoDB 4.2, there are no longer separate foreground and background index builds, just one hybrid index build.
This is the only index build type in MongoDB.
We can now build an index quickly and without the need to lock the database.

Comparison to Foreground and Background Builds

Foreground Index Builds | Background Index Builds
Foreground index builds were fast and produced more efficient index data structures. | Background index builds were slower and had less efficient results.
Lock the entire database for the duration of the index build. | Incremental approach. Periodically locks the database.
Foreground index builds blocked all read-write access to the parent database of the collection being indexed for the duration of the build. | Background index builds allowed all read-write access to the parent database of the collection being indexed for the duration of the build.
db.inventory.createIndex( { ratings: 1 } ) | db.inventory.createIndex( { ratings: 1 }, { background: true } )

Starting in MongoDB 4.2, index builds obtain an exclusive lock on only the collection
being indexed during the start and end of the build process to protect metadata changes.
The rest of the build process uses the yielding behaviour of background index builds to
maximize read-write access to the collection during the build.
4.2 index builds still produce efficient index data structures despite the more permissive
locking behaviour.

We hope that after reading these blogs you have an idea about the types of indexes and their properties. In an upcoming blog, we will cover terminating an in-progress index build. Happy Learning!!!
