Build a semantic search application with Azure Cosmos DB for NoSQL

In this exercise, you implement vector similarity search using Azure Cosmos DB for NoSQL. Vector search enables semantic matching by comparing high-dimensional vector representations of text, finding relevant results even when exact terms don't match. You configure a container with vector embedding and indexing policies, load support tickets with pre-computed embeddings, and execute similarity queries using the VectorDistance function. This pattern provides a foundation for building AI applications that perform semantic search, such as finding similar support cases to help resolve customer issues faster.
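
To make the idea of comparing vector representations concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made-up stand-ins for real embeddings (the exercise uses 256 dimensions), and `cosine_similarity` is a hypothetical helper written for illustration, not part of the Cosmos DB SDK.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (illustrative values only).
login_query = [0.9, 0.1, 0.0]
password_ticket = [0.8, 0.2, 0.1]
shipping_ticket = [0.0, 0.1, 0.9]

print(cosine_similarity(login_query, password_ticket))
print(cosine_similarity(login_query, shipping_ticket))
```

The login query scores much higher against the password ticket than against the shipping ticket, even though no words are compared at all; only the direction of the vectors matters.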

Tasks performed in this exercise:

Download project starter files and configure the deployment script
Deploy an Azure Cosmos DB for NoSQL account with vector search capability
Build Python functions for vector similarity search
Create a container with vector embedding and indexing policies
Test vector search using a Flask web application

This exercise takes approximately 30 minutes to complete.

Before you start

To complete the exercise, you need:

An Azure subscription with the permissions to deploy the necessary Azure services. If you don't already have one, you can sign up for one.
Visual Studio Code on one of the supported platforms.
The latest version of the Azure CLI.
Python 3.12 or greater.

Download project starter files and deploy Azure services

In this section you download the project starter files and use a script to deploy the necessary services to your Azure subscription. The Cosmos DB account deployment takes a few minutes to complete.

Open a browser and enter the following URL to download the starter file. The file will be saved in your default download location.
```
https://github.com/MicrosoftLearning/mslearn-azure-ai/raw/main/downloads/python/cosmosdb-implement-vector-python.zip
```
Copy or move the file to the location on your system where you want to work on the project. Then unzip the file into a folder.
Launch Visual Studio Code (VS Code) and select File > Open Folder... in the menu, then choose the folder containing the project files.
The project contains deployment scripts for both Bash (azdeploy.sh) and PowerShell (azdeploy.ps1). Open the appropriate file for your environment and change the two values at the top of the script to meet your needs, then save your changes. Note: Do not change anything else in the script.
```
"<your-resource-group-name>" # Resource Group name
"<your-azure-region>" # Azure region for the resources
```
In the menu bar select Terminal > New Terminal to open a terminal window in VS Code.
Run the following command to log in to your Azure account. Answer the prompts to select your Azure account and subscription for the exercise.
```
az login
```
Run the following command to ensure your subscription has the necessary resource provider for the exercise.
```
az provider register --namespace Microsoft.DocumentDB
```

Create resources in Azure

In this section you run the deployment script to deploy the Cosmos DB account with vector search capability.

Make sure you are in the root directory of the project and run the appropriate command in the terminal to launch the deployment script.

Bash
```
bash azdeploy.sh
```
PowerShell
```
./azdeploy.ps1
```
When the script menu appears, enter 1 to launch the Create Cosmos DB account option. This creates the Cosmos DB for NoSQL account with the EnableNoSQLVectorSearch capability and a database. Note: Deployment can take 5-10 minutes to complete.

IMPORTANT: Leave the terminal running the deployment open for the duration of the exercise. You can move on to the next section of the exercise while the deployment continues in the terminal.

Complete the apps

In this section you complete the Python code for both the vector search functions and the container setup script. The vector functions perform similarity searches using the VectorDistance function, while the setup script creates the container with the necessary vector policies.

Complete the vector search functions

In this section you complete the vector_functions.py file by adding functions that perform vector similarity search. These functions use the VectorDistance function to calculate similarity between query vectors and ticket embeddings. A support application could use these functions to find similar tickets when a new issue is reported.
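
Conceptually, a similarity query asks the database to score every ticket against the query vector and return the closest matches. The following sketch does the same thing by brute force over an in-memory list. It only illustrates the ranking logic, since Cosmos DB relies on its vector index rather than a full scan. The names `rank_by_similarity` and `tickets`, and the two-dimensional vectors, are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def rank_by_similarity(query_vector, documents, top_n=5):
    """Return the top_n documents, most similar first, with their scores."""
    scored = (
        {**doc, "similarity_score": cosine(query_vector, doc["embedding"])}
        for doc in documents
    )
    return heapq.nlargest(top_n, scored, key=lambda d: d["similarity_score"])


# Made-up tickets with tiny embeddings, for illustration only.
tickets = [
    {"id": "t1", "content": "Cannot sign in", "embedding": [0.9, 0.1]},
    {"id": "t2", "content": "Charged twice", "embedding": [0.1, 0.9]},
    {"id": "t3", "content": "Password reset fails", "embedding": [0.8, 0.3]},
]

for hit in rank_by_similarity([1.0, 0.0], tickets, top_n=2):
    print(hit["id"], round(hit["similarity_score"], 3))
```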

Open the client/vector_functions.py file in VS Code.

Search for the BEGIN STORE VECTOR DOCUMENT FUNCTION comment and add the following code directly after the comment. This function stores a support ticket with its vector embedding for similarity search.

def store_vector_document(
    document_id: str,
    chunk_id: str,
    content: str,
    embedding: list,
    metadata: dict | None = None
) -> dict:
    """Store a document with its vector embedding for similarity search."""
    container = get_container()

    # Build the document structure with embedding for vector search
    # The 'id' field is required by Cosmos DB and must be unique within the partition
    # The 'documentId' field is our partition key - chunks from the same source document
    # are stored together for efficient retrieval
    # The 'embedding' field contains the vector that will be used for similarity search
    document = {
        "id": chunk_id,
        "documentId": document_id,
        "content": content,
        "embedding": embedding,  # 256-dimensional vector for similarity search
        "metadata": metadata or {},
        "createdAt": datetime.utcnow().isoformat(),
        "chunkIndex": metadata.get("chunkIndex", 0) if metadata else 0
    }

    # upsert_item inserts if new, updates if exists (based on id + partition key)
    # This is idempotent - safe to call multiple times with the same data
    response = container.upsert_item(body=document)

    # Request Units (RUs) measure the cost of database operations in Cosmos DB
    # Tracking RU consumption helps optimize queries and estimate costs
    ru_charge = response.get_response_headers()['x-ms-request-charge']

    return {
        "chunk_id": chunk_id,
        "document_id": document_id,
        "ru_charge": float(ru_charge)
    }

Search for the BEGIN VECTOR SIMILARITY SEARCH FUNCTION comment and add the following code directly after the comment. This function finds tickets most similar to a query using vector distance.

def vector_similarity_search(
    query_embedding: list,
    top_n: int = 5
) -> list:
    """
    Find documents most similar to the query using vector distance.

    Uses the VectorDistance function to calculate cosine similarity between
    the query embedding and document embeddings stored in Cosmos DB.
    Results are ordered from most similar to least similar.
    """
    container = get_container()

    # The VectorDistance function returns a similarity score between two vectors
    # With the cosine distance function the score ranges from -1 (least similar)
    # to +1 (most similar)
    # ORDER BY VectorDistance returns the most similar results first
    # The @queryVector parameter contains our 256-dimensional query embedding
    query = """
        SELECT TOP @topN
            c.id,
            c.documentId,
            c.content,
            c.metadata,
            VectorDistance(c.embedding, @queryVector) AS similarityScore
        FROM c
        ORDER BY VectorDistance(c.embedding, @queryVector)
    """

    items = container.query_items(
        query=query,
        parameters=[
            {"name": "@topN", "value": top_n},
            {"name": "@queryVector", "value": query_embedding}
        ],
        enable_cross_partition_query=True
    )

    return [
        {
            "chunk_id": item["id"],
            "document_id": item["documentId"],
            "content": item["content"],
            "metadata": item["metadata"],
            "similarity_score": item["similarityScore"]
        }
        for item in items
    ]

Search for the BEGIN FILTERED VECTOR SEARCH FUNCTION comment and add the following code directly after the comment. This function combines vector similarity search with metadata filtering for hybrid queries.

def filtered_vector_search(
    query_embedding: list,
    category: str | None = None,
    top_n: int = 5
) -> list:
    """
    Combine vector similarity search with metadata filtering.

    This hybrid approach first filters documents by category (or other metadata),
    then ranks the filtered results by vector similarity. This is useful for
    narrowing results to a specific domain before applying semantic search.
    """
    container = get_container()

    # Build WHERE clause for metadata filtering
    # The filter is applied BEFORE vector ranking, reducing the search space
    where_clause = ""
    parameters = [
        {"name": "@topN", "value": top_n},
        {"name": "@queryVector", "value": query_embedding}
    ]

    if category:
        where_clause = "WHERE c.metadata.category = @category"
        parameters.append({"name": "@category", "value": category})

    # Filtered vector search: apply metadata filter, then rank by similarity
    query = f"""
        SELECT TOP @topN
            c.id,
            c.documentId,
            c.content,
            c.metadata,
            VectorDistance(c.embedding, @queryVector) AS similarityScore
        FROM c
        {where_clause}
        ORDER BY VectorDistance(c.embedding, @queryVector)
    """

    items = container.query_items(
        query=query,
        parameters=parameters,
        enable_cross_partition_query=True
    )

    return [
        {
            "chunk_id": item["id"],
            "document_id": item["documentId"],
            "content": item["content"],
            "metadata": item["metadata"],
            "similarity_score": item["similarityScore"]
        }
        for item in items
    ]

Save your changes to the vector_functions.py file.
Take a few minutes to review all of the code in the file.
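
One detail in `filtered_vector_search` worth isolating while you review is how the query text is assembled: only a fixed `WHERE` fragment is interpolated into the string, while the category value itself travels as a query parameter. The sketch below separates that step into a hypothetical helper named `build_filtered_query` so it can be checked without a database connection. The shortened `SELECT` list is a simplification for the example.

```python
def build_filtered_query(query_embedding, category=None, top_n=5):
    """Return (query_text, parameters) for a filtered vector search."""
    parameters = [
        {"name": "@topN", "value": top_n},
        {"name": "@queryVector", "value": query_embedding},
    ]
    where_clause = ""
    if category:
        # Only this constant fragment is placed in the query text;
        # the caller-supplied value is passed as the @category parameter.
        where_clause = "WHERE c.metadata.category = @category"
        parameters.append({"name": "@category", "value": category})

    query = f"""
        SELECT TOP @topN
            c.id,
            VectorDistance(c.embedding, @queryVector) AS similarityScore
        FROM c
        {where_clause}
        ORDER BY VectorDistance(c.embedding, @queryVector)
    """
    return query, parameters


# A hostile-looking category value never reaches the query text.
query, params = build_filtered_query([0.1, 0.2], category="billing' OR 1=1 --")
print("OR 1=1" in query)
print(params[-1])
```

Keeping values out of the query string this way means the category can come straight from user input without the query text changing shape.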

Complete the setup container code

In this section you complete the setup_container.py script used to create a Cosmos DB container with vector embedding and indexing policies. These policies enable the VectorDistance function for similarity search.

Open the client/setup_container.py file in VS Code.

Search for the BEGIN CREATE VECTOR CONTAINER FUNCTION comment and review the code. This function creates a container configured for vector search:

def create_vector_container():
    """
    Create a container with vector embedding and indexing policies.

    The vector embedding policy defines:
    - path: JSON path where vector embeddings are stored
    - dataType: Data type for vector components (float32)
    - distanceFunction: Similarity metric (cosine: +1 = most similar, -1 = least similar)
    - dimensions: Number of dimensions in each vector (256)

    The indexing policy includes:
    - Standard indexing for all paths except embeddings
    - DiskANN vector index for efficient similarity search
    """
    database = get_database()
    container_name = os.environ.get("COSMOS_CONTAINER", "vectors")

    # Define the vector embedding policy
    # This tells Cosmos DB how to handle vector data at the /embedding path
    vector_embedding_policy = {
        "vectorEmbeddings": [
            {
                "path": "/embedding",
                "dataType": "float32",
                "distanceFunction": "cosine",
                "dimensions": 256
            }
        ]
    }

    # Define the indexing policy with vector index
    # - DiskANN provides efficient approximate nearest neighbor search
    # - Exclude /embedding/* from standard indexing (vectors use their own index)
    indexing_policy = {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [
            {"path": "/*"}
        ],
        "excludedPaths": [
            {"path": "/embedding/*"}
        ],
        "vectorIndexes": [
            {
                "path": "/embedding",
                "type": "diskANN"
            }
        ]
    }

    # Create the container with vector policies
    # partition_key determines how data is distributed across physical partitions
    container = database.create_container_if_not_exists(
        id=container_name,
        partition_key=PartitionKey(path="/documentId"),
        indexing_policy=indexing_policy,
        vector_embedding_policy=vector_embedding_policy
    )

    return container

Take a moment to understand the key configuration elements:

| Policy | Setting | Purpose |
|---|---|---|
| vectorEmbeddings | path: /embedding | Location where vector data is stored |
| vectorEmbeddings | dimensions: 256 | Must match your embedding model output |
| vectorEmbeddings | distanceFunction: cosine | Similarity metric for VectorDistance |
| vectorIndexes | type: diskANN | Efficient approximate nearest neighbor algorithm |
| excludedPaths | /embedding/* | Vectors use the specialized vector index, not the standard index |
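
Two mistakes are easy to make with these settings: defining a vector index on a path that has no matching embedding policy, and uploading vectors whose length differs from `dimensions`. The sketch below checks both on plain dictionaries shaped like the policies above. The helper names `check_policies` and `check_embedding` are hypothetical, and the checks run locally; they are not SDK features.

```python
def check_policies(vector_embedding_policy, indexing_policy):
    """Return a list of problems found when cross-checking the two policies."""
    embedding_paths = {
        entry["path"] for entry in vector_embedding_policy["vectorEmbeddings"]
    }
    problems = []
    for index in indexing_policy.get("vectorIndexes", []):
        if index["path"] not in embedding_paths:
            problems.append(
                f"vector index on {index['path']} has no embedding policy"
            )
    return problems


def check_embedding(embedding, vector_embedding_policy, path="/embedding"):
    """Raise ValueError if the vector length differs from the policy's dimensions."""
    for entry in vector_embedding_policy["vectorEmbeddings"]:
        if entry["path"] == path:
            if len(embedding) != entry["dimensions"]:
                raise ValueError(
                    f"expected {entry['dimensions']} dimensions, got {len(embedding)}"
                )
            return
    raise ValueError(f"no embedding policy defined for {path}")


embedding_policy = {
    "vectorEmbeddings": [
        {"path": "/embedding", "dataType": "float32",
         "distanceFunction": "cosine", "dimensions": 256}
    ]
}
index_policy = {"vectorIndexes": [{"path": "/embedding", "type": "diskANN"}]}

print(check_policies(embedding_policy, index_policy))
check_embedding([0.0] * 256, embedding_policy)  # passes silently
```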

Next, you finalize the Azure resource deployment.

Complete the Azure resource deployment

In this section you return to the deployment script to configure Entra ID access and retrieve the connection information.

When the Create Cosmos DB account operation has completed, enter 2 to launch the Configure Entra ID access option. This assigns your user account the necessary role to access the Cosmos DB data plane.
Enter 3 to launch the Check deployment status option. Verify the Cosmos DB account shows as ready with the vector search capability enabled.
Enter 4 to launch the Retrieve connection info option. This creates a file with the necessary environment variables.
Enter 5 to exit the deployment script.
Run the following command to load the environment variables into your terminal session from the file created in a previous step.

Bash
```
source .env
```
PowerShell
```
. .\.env.ps1
```
Note: Keep the terminal open. If you close it and open a new terminal, you need to run the command again to reload the environment variables.
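
Before moving on, it can help to confirm that the variables are visible to Python. The sketch below assumes the three variable names listed in the troubleshooting section of this exercise (`COSMOS_ENDPOINT`, `COSMOS_DATABASE`, `COSMOS_CONTAINER`). The `missing_env_vars` helper is hypothetical and is not part of the starter files.

```python
import os

REQUIRED_VARS = ("COSMOS_ENDPOINT", "COSMOS_DATABASE", "COSMOS_CONTAINER")


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Note that setup_container.py falls back to the container name `vectors` when `COSMOS_CONTAINER` is not set, so a missing value there is not necessarily fatal.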

Next, you set up the Python environment and create the vector container.

Set up the Python environment

In this section you create a Python virtual environment and install the dependencies needed for both the container setup script and the Flask application.

Run the following command to navigate to the client directory.
```
cd client
```
Run the following command to create a virtual environment for the Python scripts. Depending on your environment the command might be python or python3.
```
python -m venv .venv
```
Run the following command to activate the Python environment. Note: On Linux/macOS, use the Bash command. On Windows, use the PowerShell command. If using Git Bash on Windows, use source .venv/Scripts/activate.

Bash
```
source .venv/bin/activate
```
PowerShell
```
.\.venv\Scripts\Activate.ps1
```
Run the following command to install the Python dependencies. This installs the flask, azure-cosmos, and azure-identity libraries.
```
pip install -r requirements.txt
```

Next, you create the vector container with the required policies.

Create the vector container

In this section you run the setup script to create the Cosmos DB container with the vector policies you reviewed earlier.

Run the following command to execute the setup script and create the container. Ensure you are still in the client directory with the virtual environment activated.
```
python setup_container.py
```
Verify the output shows the container was created successfully with the vector policies configured.

Next, you test the vector search functions using the Flask application.

Test the vector search functions with the Flask app

In this section you start the Flask web application and use its interface to test the vector search functions you created. The app provides a visual way to load sample support tickets and execute vector similarity searches.

Ensure you are still in the client directory with the virtual environment activated. You should see (.venv) in your terminal prompt.
Run the following command to start the Flask application.
```
python app.py
```
Open a browser and navigate to http://127.0.0.1:5000 to view the application.

Load sample data

In this section you use the app to load sample support tickets with pre-computed embeddings into the Cosmos DB container. The sample data includes 12 support tickets across different categories (billing, technical, account, shipping), each with a 256-dimensional embedding vector. The app calls the store_vector_document() function you created in vector_functions.py.

In the Load Sample Data section, select Load Vector Data. This inserts tickets with their pre-computed embeddings from the sample_vectors.json file.
Verify that the success message appears in the Results section showing the number of tickets loaded and the total RU (Request Unit) charge.
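
The total RU figure in the success message is presumably the sum of the per-write charges. As a rough illustration of how such a summary could be computed from the dictionaries that `store_vector_document()` returns, consider the sketch below. The `summarize_load` helper and the RU values are made up for the example; actual charges depend on document size and indexing.

```python
def summarize_load(results):
    """Aggregate per-document results into a count and total RU charge."""
    total_ru = sum(r["ru_charge"] for r in results)
    return {"tickets_loaded": len(results), "total_ru": round(total_ru, 2)}


# Hypothetical return values in the shape produced by store_vector_document().
sample_results = [
    {"chunk_id": "ticket-001", "document_id": "doc-1", "ru_charge": 12.5},
    {"chunk_id": "ticket-002", "document_id": "doc-2", "ru_charge": 13.0},
    {"chunk_id": "ticket-003", "document_id": "doc-3", "ru_charge": 12.75},
]

print(summarize_load(sample_results))
```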

Vector similarity search

In this section you perform semantic searches using pre-computed query vectors. The app calls the vector_similarity_search() function you created in vector_functions.py.

In the Vector Similarity Search section, select I can't login to my account from the Select Query dropdown.
Keep the default Top 5 results and select Search.
Review the results showing tickets ranked by similarity score. Notice that tickets about authentication and account access appear first, even though they may use different terminology than the query.
Try selecting different queries such as My payment was charged twice or Package hasn't arrived yet to see how the semantic search finds relevant support cases.

Filtered vector search

In this section you combine metadata filtering with vector similarity ranking. The app calls the filtered_vector_search() function you created in vector_functions.py. You observe how filtering narrows results to a specific category.

In the Filtered Vector Search section, select I can't login to my account from the Select Query dropdown.
Select technical from the Filter by Category dropdown.
Select Search with Filter to execute the filtered search.
Review the results. Notice that only tickets with the technical category are returned, ranked by similarity to the query.
Try the same query with the account category to see different results that are still semantically relevant but limited to account-related issues.
Return to the terminal and press Ctrl+C to stop the Flask application.

Summary

In this exercise, you implemented vector similarity search using Azure Cosmos DB for NoSQL. You deployed an Azure Cosmos DB account with the EnableNoSQLVectorSearch capability and configured Entra ID authentication. You created a container using the Python SDK with vector embedding and indexing policies that enable the VectorDistance function. You built Python functions that store support tickets with embeddings, perform vector similarity search, and combine vector search with metadata filters. You tested the workflow using a Flask web application. This pattern enables applications to perform semantic search over support data, finding similar tickets based on meaning rather than exact keyword matches.

Clean up resources

Now that you've finished the exercise, you should delete the cloud resources you created to avoid unnecessary resource usage.

Run the following command in the VS Code terminal to delete the resource group, and all resources in the group. Replace <rg-name> with the name you chose earlier in the exercise. The command will launch a background task in Azure to delete the resource group.
```
az group delete --name <rg-name> --no-wait --yes
```

CAUTION: Deleting a resource group deletes all resources contained within it. If you chose an existing resource group for this exercise, any existing resources outside the scope of this exercise will also be deleted.

Troubleshooting

If you encounter issues during this exercise, try these steps:

Flask app fails to start

Ensure the Python virtual environment is activated (you should see (.venv) in your terminal prompt)
Ensure dependencies are installed: pip install -r requirements.txt
Ensure environment variables are loaded by running source .env (Bash) or . .\.env.ps1 (PowerShell)
Ensure you are in the client directory when running python app.py

Authentication or access denied errors

Ensure Entra ID access was configured by running the deployment script option 2
Verify your user has both the Contributor role and the Cosmos DB Built-in Data Contributor role
Ensure COSMOS_ENDPOINT is set correctly in your terminal session

Vector search returns no results or errors

Verify the vector container was created by running python setup_container.py
Ensure the container has the vector embedding policy configured (check status with deployment script option 3)
Verify sample tickets were loaded before running searches

setup_container.py fails

Ensure the Python virtual environment is activated
Ensure environment variables are set (COSMOS_ENDPOINT, COSMOS_DATABASE, COSMOS_CONTAINER)
If the container already exists, the script uses the existing container

Cosmos DB operations fail

Verify the Cosmos DB account is ready by running the deployment script option 3
Ensure the database was created during deployment
Check that the account has the EnableNoSQLVectorSearch capability

Environment variable issues

Ensure the .env file was created by running the deployment script option 4
Run source .env (Bash) or . .\.env.ps1 (PowerShell) after creating a new terminal
Verify variables are set by running echo $COSMOS_ENDPOINT (Bash) or $env:COSMOS_ENDPOINT (PowerShell)

Python venv activation issues

On Linux/macOS, use: source .venv/bin/activate
On Windows PowerShell, use: .\.venv\Scripts\Activate.ps1
If the activate script is missing, install (or reinstall) the python3-venv package and recreate the virtual environment