Optimize an Azure Cosmos DB for NoSQL container’s indexing policy for common operations

For write-heavy workloads or workloads with large JSON objects, it can be advantageous to optimize the indexing policy to only index properties that you know you will want to use in your queries.

In this lab, we will use a test .NET application to insert a large JSON item into an Azure Cosmos DB for NoSQL container using the default indexing policy, and then using an indexing policy that has been tuned slightly.

Prepare your development environment

If you have not already cloned the lab code repository for DP-420 to the environment where you’re working on this lab, follow these steps to do so. Otherwise, open the previously cloned folder in Visual Studio Code.

  1. Start Visual Studio Code.

    📝 If you are not already familiar with the Visual Studio Code interface, review the Getting Started documentation.

  2. Open the command palette and run Git: Clone to clone the https://github.com/microsoftlearning/dp-420-cosmos-db-dev GitHub repository in a local folder of your choice.

    💡 You can use the CTRL+SHIFT+P keyboard shortcut to open the command palette.

  3. Once the repository has been cloned, open the local folder you selected in Visual Studio Code.

Create an Azure Cosmos DB for NoSQL account

Azure Cosmos DB is a cloud-based NoSQL database service that supports multiple APIs. When provisioning an Azure Cosmos DB account for the first time, you select which API the account will support (for example, the API for MongoDB or the API for NoSQL). Once the Azure Cosmos DB for NoSQL account has finished provisioning, you can retrieve its endpoint and key and use them to connect to the account using the Azure SDK for .NET or any other SDK of your choice.

  1. In a new web browser window or tab, navigate to the Azure portal (portal.azure.com).

  2. Sign in to the portal using the Microsoft credentials associated with your subscription.

  3. Select + Create a resource, search for Cosmos DB, and then create a new Azure Cosmos DB for NoSQL account resource with the following settings, leaving all remaining settings at their default values:

    Setting           Value
    Subscription      Your existing Azure subscription
    Resource group    Select an existing or create a new resource group
    Account Name      Enter a globally unique name
    Location          Choose any available region
    Capacity mode     Serverless

    📝 Your lab environments may have restrictions preventing you from creating a new resource group. If that is the case, use the existing pre-created resource group.

  4. Wait for the deployment task to complete before continuing with this task.

  5. Go to the newly created Azure Cosmos DB account resource and navigate to the Data Explorer pane.

  6. In the Data Explorer pane, select New Container.

  7. In the New Container popup, enter the following values for each setting, and then select OK:

    Setting          Value
    Database id      Create new | cosmicworks
    Container id     products
    Partition key    /categoryId

  8. Back in the Data Explorer pane, expand the cosmicworks database node and then observe the products container node within the hierarchy.

  9. In the resource blade, navigate to the Keys pane.

  10. This pane contains the connection details and credentials necessary to connect to the account from the SDK. Specifically:

    1. Notice the URI field. You will use this endpoint value later in this exercise.

    2. Notice the PRIMARY KEY field. You will use this key value later in this exercise.

  11. Return to Visual Studio Code.

Run the test .NET application using the default indexing policy

This lab has a pre-built test .NET application that will take a large JSON object and create a new item in the Azure Cosmos DB for NoSQL container. Once the single write operation is complete, the application will output the item’s unique identifier and RU charge to the console window.

  1. In the Explorer pane, browse to the 23-index-optimization folder.

  2. Open the context menu for the 23-index-optimization folder and then select Open in Integrated Terminal to open a new terminal instance.

    📝 This command will open the terminal with the starting directory already set to the 23-index-optimization folder.

  3. Build the project using the dotnet build command (docs.microsoft.com/dotnet/core/tools/dotnet-build):

     dotnet build
    

    📝 You may see a compiler warning that the endpoint and key variables are currently unused. You can safely ignore this warning, as you will use these variables later in this task.

  4. Close the integrated terminal.

  5. Open the script.cs code file.

  6. Locate the string variable named endpoint. Set its value to the endpoint of the Azure Cosmos DB account you created earlier.

     string endpoint = "<cosmos-endpoint>";
    

    📝 For example, if your endpoint is https://dp420.documents.azure.com:443/, then the C# statement would be: string endpoint = "https://dp420.documents.azure.com:443/";

  7. Locate the string variable named key. Set its value to the key of the Azure Cosmos DB account you created earlier.

     string key = "<cosmos-key>";
    

    📝 For example, if your key is fDR2ci9QgkdkvERTQ==, then the C# statement would be: string key = "fDR2ci9QgkdkvERTQ==";

  8. Save the script.cs code file.

  9. In Visual Studio Code, open the context menu for the 23-index-optimization folder and then select Open in Integrated Terminal to open a new terminal instance.

  10. Build and run the project using the dotnet run command:

     dotnet run
    
  11. Observe the output from the terminal. The item’s unique identifier and the operation’s request charge (in RUs) should be printed to the console.

  12. Build and run the project at least two more times using the dotnet run command. Observe the RU charge in the console output:

     dotnet run
    
  13. Leave the integrated terminal open.

    📝 You will re-use this terminal later in this exercise. It’s important to leave the terminal open so you can compare the original and updated RU charges.
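The write that the test application performs can be sketched roughly as follows. This is a minimal illustration rather than the lab's actual script.cs: it assumes the Microsoft.Azure.Cosmos NuGet package, and the Product class and its sample values are hypothetical stand-ins for the lab's much larger JSON item.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical, heavily simplified stand-in for the lab's large JSON item.
public class Product
{
    public string id { get; set; } = "";
    public string name { get; set; } = "";
    public string categoryId { get; set; } = "";
    public string categoryName { get; set; } = "";
}

public static class Program
{
    public static async Task Main()
    {
        // Same placeholders as in script.cs; replace with your account's values.
        string endpoint = "<cosmos-endpoint>";
        string key = "<cosmos-key>";

        using CosmosClient client = new CosmosClient(endpoint, key);
        Container container = client.GetContainer("cosmicworks", "products");

        Product item = new Product
        {
            id = Guid.NewGuid().ToString(),
            name = "Sample product",            // illustrative value
            categoryId = "sample-category-id",  // partition key value (illustrative)
            categoryName = "Sample category"    // illustrative value
        };

        // A single write; the response carries the request charge for this operation.
        ItemResponse<Product> response =
            await container.CreateItemAsync(item, new PartitionKey(item.categoryId));

        Console.WriteLine($"Item id:\t{response.Resource.id}");
        Console.WriteLine($"Request charge:\t{response.RequestCharge:0.00} RUs");
    }
}
```

The RequestCharge value on the response is the figure to compare between runs; an item this small will show a much smaller charge than the lab's large item.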

Update the indexing policy and rerun the .NET application

This lab scenario assumes that future queries will focus primarily on the name and categoryName properties. To optimize writes of the large JSON item, you will create an indexing policy that starts by excluding all paths and then selectively includes only those two paths.
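The lab applies this policy through the Azure portal in the steps that follow. For reference, the same exclude-everything-then-include approach could also be applied from code. The sketch below assumes the Microsoft.Azure.Cosmos NuGet package and the cosmicworks database and products container created earlier; it is an illustration, not part of the lab's repository.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class Program
{
    public static async Task Main()
    {
        // Placeholders; replace with your account's endpoint and key.
        using CosmosClient client = new CosmosClient("<cosmos-endpoint>", "<cosmos-key>");
        Container container = client.GetContainer("cosmicworks", "products");

        // Read the current container definition so that only the indexing policy changes.
        ContainerProperties properties = (await container.ReadContainerAsync()).Resource;

        properties.IndexingPolicy.IndexingMode = IndexingMode.Consistent;
        properties.IndexingPolicy.Automatic = true;

        // Include only the two properties the scenario expects to query on.
        properties.IndexingPolicy.IncludedPaths.Clear();
        properties.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/name/?" });
        properties.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/categoryName/?" });

        // Exclude every other path, plus the system _etag property.
        properties.IndexingPolicy.ExcludedPaths.Clear();
        properties.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/*" });
        properties.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/\"_etag\"/?" });

        // Replace the container definition with the updated indexing policy.
        await container.ReplaceContainerAsync(properties);
        Console.WriteLine("Indexing policy updated.");
    }
}
```

Reading the existing ContainerProperties first, rather than constructing a new instance, keeps the partition key and other container settings unchanged when the container is replaced.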

  1. Return to your web browser.

  2. Within the Azure Cosmos DB account resource, navigate to the Data Explorer pane.

  3. In the Data Explorer, expand the cosmicworks database node, expand the products container node, and then select Settings.

  4. In the Settings tab, navigate to the Indexing Policy section.

  5. Observe the default indexing policy:

     {
       "indexingMode": "consistent",
       "automatic": true,
       "includedPaths": [
         {
           "path": "/*"
         }
       ],
       "excludedPaths": [
         {
           "path": "/\"_etag\"/?"
         }
       ]
     }    
    
  6. Replace the indexing policy with this modified JSON object and then Save the changes:

     {
       "indexingMode": "consistent",
       "automatic": true,
       "includedPaths": [
         {
           "path": "/name/?"
         },
         {
           "path": "/categoryName/?"
         }
       ],
       "excludedPaths": [
         {
           "path": "/*"
         },
         {
           "path": "/\"_etag\"/?"
         }
       ]
     }
    
  7. Return to Visual Studio Code. Return to the open terminal.

  8. Build and run the project at least two more times using the dotnet run command. Observe the new RU charge in the console output, which should be significantly lower than the original charge. Because you are no longer indexing every property of the item, each write spends far fewer RUs updating the index. This can, however, cost you greatly if your queries need to filter on properties that are not indexed.

     dotnet run
    

    📝 If you are not seeing an updated RU charge, you may need to wait a couple of minutes.

  9. Return to your web browser.

    📝 If the Indexing Policy page is not open, go to Data Explorer, expand the cosmicworks database node, expand the products container node, select Settings and navigate to the Indexing Policy section.

  10. Replace the indexing policy with this modified JSON object and then Save the changes:

     {
       "indexingMode": "none"
     }
    
  11. Close your web browser window or tab.

  12. Return to Visual Studio Code. Return to the open terminal.

  13. Build and run the project at least two more times using the dotnet run command. Observe the new RU charge in the console output, which should be much lower than the original charge. How can this be? This script measures the RUs consumed when you write the item, and with no index there is no overhead for maintaining one. The flip side is that while your writes will consume fewer RUs, your reads will be very costly.

     dotnet run
    

    📝 If you are not seeing an updated RU charge, you may need to wait a couple of minutes.

  14. Close Visual Studio Code.
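To explore the read-side trade-off described in the steps above, you could measure the request charge of queries as well as writes. The sketch below is an assumption-laden illustration, not part of the lab: it assumes the Microsoft.Azure.Cosmos NuGet package, it assumes the tuned policy (name and categoryName included) is in place, and the description property and the literal filter values are hypothetical. With indexingMode set to none, queries may be rejected unless scans are explicitly enabled, so that case is not covered here.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class Program
{
    public static async Task Main()
    {
        // Placeholders; replace with your account's endpoint and key.
        using CosmosClient client = new CosmosClient("<cosmos-endpoint>", "<cosmos-key>");
        Container container = client.GetContainer("cosmicworks", "products");

        // "name" is an included path in the tuned policy.
        await PrintQueryChargeAsync(container,
            "SELECT * FROM p WHERE p.name = 'Sample product'");

        // "description" is a hypothetical property that the tuned policy does not index.
        await PrintQueryChargeAsync(container,
            "SELECT * FROM p WHERE p.description = 'Sample description'");
    }

    // Runs a query to completion and prints the total request charge across all pages.
    private static async Task PrintQueryChargeAsync(Container container, string sql)
    {
        double totalCharge = 0;

        using FeedIterator<dynamic> iterator =
            container.GetItemQueryIterator<dynamic>(new QueryDefinition(sql));

        while (iterator.HasMoreResults)
        {
            FeedResponse<dynamic> page = await iterator.ReadNextAsync();
            totalCharge += page.RequestCharge;
        }

        Console.WriteLine($"{totalCharge:0.00} RUs\t{sql}");
    }
}
```

With only a handful of items in the container the difference between the two charges may be small; the gap is expected to grow with the amount of data that a query on a non-indexed property has to scan.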