Computer scienceSoftware development foundationsSQL and DatabasesDBMSMongoDBOptimization

Explain

7 minutes read

MongoDB is a popular NoSQL database that has been widely adopted for its flexibility and scalability. One of the key features of MongoDB is its ability to perform complex queries using a variety of operators. In this article, we will focus on the "explain" operator and how it can be used to optimize queries.

Explain essence

The explain operator is a command that can be used to provide detailed information about how a query is executed. When we use it within a query, the explain operator returns a document that describes the steps taken by the query optimizer to execute the query. This information can be usefull to optimize the query and improve its performance.

Let's look at it's syntax with an example below:

db.collection.find(query).explain()

In this example, db.collection is the name of the collection to query and query is the query object to execute. When we call the explain, MongoDB will return a document that describes how the query will be executed.

Explain structure

The document returned by the explain operator contains a lot of information about the query execution. Here are some of the key fields:

"queryPlanner": This field describes how the query planner will execute the query. It includes information about the indexes that will be used, the number of documents that will be scanned, and the number of documents that will be returned.
"winningPlan": This field describes the plan that the query planner has chosen to execute the query. It includes information about the indexes that will be used, the order in which the documents will be scanned, and any filters that will be applied.
"executionStats": This field describes the statistics for the query execution. It includes information about the number of documents that were scanned, the time taken to execute the query, and the amount of memory used.
"serverInfo": This field describes information about the MongoDB server, such as the version number and the operating system.

Query example

Let's say we have a collection called products that contains information about various products. We want to analyze the performance of a query that finds all products with a price greater than or equal to 100, sorted by price in descending order.

Here's what the code structure would look like:

db.products.find(
    { price: { $gte: 100 } }
).sort(
    { price: -1 }
).explain()

In this example, we're using the find() method to search for all products with a price greater than or equal to 100. We're also using the sort() method to sort the results by price in descending order. Finally, we're calling the explain() method to analyze the performance of the query.

When we call explain(), MongoDB will provide us with detailed information about how the query was executed. This information can be very helpful in optimizing the performance of our queries. Here's an example of what the output might look like:

{
    "queryPlanner": {
        "plannerVersion": 1,
        "namespace": "test.products",
        "indexFilterSet": false,
        "parsedQuery": {
            "price": { "$gte": 100 }
        },
        "winningPlan": {
            "stage": "SORT",
            "sortPattern": { "price": -1 },
            "inputStage": {
                "stage": "SORT_KEY_GENERATOR",
                "inputStage": {
                    "stage": "COLLSCAN",
                    "filter": {
                        "price": { "$gte": 100 }
                    },
                    "direction": "forward"
                }
            }
        },
        "rejectedPlans": []
    },
    "serverInfo": {
        "host": "localhost",
        "port": 27017,
        "version": "4.4.4",
        "gitVersion": "8db30a63db1aabf0a0c1af2d8b4b4c0a1f801388"
    },
    "ok": 1
}

In this output, we can see information about the query planner, the namespace being queried, the parsed query, and the winning plan that was chosen to execute the query. We can also see server information and whether the query was successful.

Query stages

If we look at the query stages in the output of the explain() method in MongoDB, we can gain insights into how the query is being executed and how we can optimize it for better performance. Here are some examples of what we can learn from the different query stages:

Query Stage	Description	Insights
COLLSCAN	Scans the entire collection to find matching documents.	This can be slow for large collections, so we may want to consider creating an index to speed up the query.
IXSCAN	Uses an index to find matching documents.	This is generally faster than a COLLSCAN, but the index must be appropriately designed for the query to be effective. If we see an IXSCAN stage followed by a FETCH stage, it means that the query is using the index to find documents, but is still fetching them from the database. In this case, we may want to consider adding the desired fields to the index to avoid the FETCH stage.
FETCH	Retrieves matching documents from the database.	This can be slow if there are a large number of documents to retrieve, so you may want to consider optimizing your query to reduce the number of documents being fetched.
SORT	Sorts the results according to a specified sort order.	Sorting can be expensive for large result sets, so we may want to consider adding an index that includes the sort order.
PROJECTION	Projects or selects only certain fields from the documents in the collection.	This can be useful for reducing network traffic and improving query performance, especially if the selected fields are much smaller than the full document.
SHARD_MERGE	Merges result from multiple shards in a sharded cluster.	This can be slow, so we may want to consider optimizing our sharding strategy or reducing the amount of data being queried.
COUNT	Counts the number of matching documents.	This can be useful for determining the cardinality of a query, but it may not always be necessary.
SKIP	Skips a certain number of matching documents, based on a specified skip value.	This can be useful for pagination, but it can be slow if the skip value is large.
LIMIT	Limits the number of matching documents returned, based on a specified limit value.	This can be useful for reducing network traffic and improving query performance, especially if the limit value is much smaller than the full result set.
SORT_KEY_GENERATOR	Generates a sort key for the SORT stage.	This can be useful if we need to sort on a calculated field or if the sort order is not available in an index. However, it can be slow if the sort key generation is complex or involves a large number of documents.

Explain verbosity level

The explain() method accepts an optional verbosity parameter that controls the amount of detail included in the output.

If we pass in a verbosity value of "queryPlanner", the output will include more detailed information on how the query was planned, such as which indexes were considered and how the query optimizer decided on the winning plan.

Here is an example:

db.myCollection.find({ name: "John" }).explain("queryPlanner")

If we pass in a verbosity value of "executionStats", the output will include detailed information on the execution of the query, such as how many documents were returned, how long the query took to execute, and how many resources were used during the execution.

See the code below:

db.myCollection.find({ name: "John" }).explain("executionStats")

If we pass in a verbosity value of "allPlansExecution", the output will include information on all possible query plans, as well as their associated execution statistics. This can be useful for comparing the performance of different query plans.

The request will look as follows:

db.myCollection.find({ name: "John" }).explain("allPlansExecution")

We should note that higher verbosity levels can produce more detailed information, but may also result in larger output and longer query execution times.

Conclusion

To sum up, MongoDB's explain operator is an effective tool that provides valuable information about query performance. By using the explain operator, developers can gain insights into how queries are executed and identify areas for optimization. The output of the explainoperator includes details about the query planner, the winning plan, and server information. Understanding the different query stages and their implications on query performance can help developers optimize queries to achieve better performance.

So, now that we know how the operator works – it is high time for practice!

10 learners liked this piece of theory. 0 didn't like it. What about you?

Report a typo