Develop a Content Understanding client application
In this exercise, you use the Azure Content Understanding Python SDK to create an analyzer that extracts information from business cards. You’ll then develop a client application that uses the analyzer to extract contact details from scanned business cards.
This exercise takes approximately 30 minutes.
Create a Microsoft Foundry resource and project
The features we’re going to use in this exercise require a Microsoft Foundry resource and project.
- In a web browser, open the Microsoft Foundry portal at https://ai.azure.com and sign in using your Azure credentials. Close any tips or quick start panes that are opened the first time you sign in.
- Make sure the New Foundry toggle is on so that you’re using Foundry (new).
- Select the project name in the upper-left corner, and then select Create new project.
- Give your project a name and expand Advanced options to specify the following settings:
- Subscription: Your Azure subscription
- Resource group: Create or select a resource group
- Location: Choose one of the following supported regions:*
- Australia East
- East US
- East US 2
- Japan East
- North Europe
- South Central US
- Southeast Asia
- Sweden Central
- UK South
- West Europe
- West US
- West US 3
*Azure Content Understanding is available in selected regions. See the region support documentation for the latest availability.
- Select Create and wait for your project to be created. This will create a project and the parent resource.
- Once created, select the project name at the top of the page, and select Project details. On that page, follow the link to the parent resource. Leave this browser tab open.
Configure Content Understanding models and connection
Content Understanding analyzes content with OpenAI models that are deployed in your project. You need to deploy these models before using analyzers, and set up the connection between Content Understanding and your Foundry resource. The easiest way to do both is through Content Understanding Studio.
- In a new tab, navigate to Content Understanding Studio at https://contentunderstanding.ai.azure.com/home and sign in with your credentials.
- Select the settings gear icon on the top navigation bar, and select + Add resource.
- Select your subscription and resource group where you created your Foundry resource, then select your Foundry resource name from the dropdown. This resource is the parent resource to the project you previously created.
- Ensure the Enable auto-deployment box is checked, then select Next and Save to create the configuration.
- Wait while it deploys the required models for Content Understanding.
Prepare the development environment
You’ll use Visual Studio Code as your development environment.
- Start Visual Studio Code.
- Open the Command Palette (press Ctrl+Shift+P), type Git: Clone, and select it.
- In the URL bar, paste the following repository URL and press Enter:
    https://github.com/microsoftlearning/mslearn-ai-information-extraction
- Choose a local folder to clone into, and then when prompted, select Open to open the cloned repository in VS Code.
- In the VS Code Explorer pane, navigate to Labfiles/02-content-understanding-api. The folder contains two scanned business card images as well as the Python code files you need to build your app.
- Open a new terminal and navigate to the app folder:
    cd Labfiles/02-content-understanding-api
- Install the required libraries:
    python -m venv labenv
    labenv\Scripts\activate
    pip install -r requirements.txt azure-ai-contentunderstanding
- In the VS Code Explorer pane, open the .env file in the Labfiles/02-content-understanding-api folder.
- In the file, replace the YOUR_ENDPOINT placeholder with your Microsoft Foundry resource endpoint and ensure that ANALYZER_NAME is set to businesscardanalyzer.
Tip: You can also find the endpoint in the Azure portal by navigating to your Microsoft Foundry resource and viewing Resource Management > Keys and Endpoint.
- Save the file (CTRL+S).
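Before running the scripts, it can help to confirm that the settings were saved as intended. The following is a minimal sketch using only the Python standard library, not part of the lab files. It assumes the .env file uses simple KEY=VALUE lines and that the endpoint is stored under a variable named ENDPOINT; that name is hypothetical, so use whichever key your .env file actually defines.

```python
# Minimal stdlib sketch: parse KEY=VALUE text and check the lab settings.
# ENDPOINT is a hypothetical variable name; match it to your own .env file.

def parse_env(text):
    """Return a dict of KEY=VALUE pairs, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def check_settings(settings, endpoint_key="ENDPOINT"):
    """Return a list of problems found; an empty list means the settings look fine."""
    problems = []
    endpoint = settings.get(endpoint_key, "")
    if not endpoint or endpoint == "YOUR_ENDPOINT":
        problems.append(f"{endpoint_key} is empty or still holds the placeholder")
    elif not endpoint.startswith("https://"):
        problems.append(f"{endpoint_key} should be an https:// URL")
    if settings.get("ANALYZER_NAME") != "businesscardanalyzer":
        problems.append("ANALYZER_NAME should be businesscardanalyzer")
    return problems


# Illustrative input only: an unedited file would be flagged twice.
sample = "ENDPOINT=YOUR_ENDPOINT\nANALYZER_NAME=\n"
for problem in check_settings(parse_env(sample)):
    print("Warning:", problem)
```

To check your real file, you could pass the text of .env (read with open(".env").read()) to parse_env instead of the sample string.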
Create an analyzer with the Python SDK
Now you’ll use the Content Understanding Python SDK to create an analyzer that can extract information from images of business cards.
- In the VS Code Explorer pane, open the biz-card.json file and review its contents. This JSON defines an analyzer schema for a business card, specifying the fields to extract (Company, Name, Title, Email, Phone).
- Open the create-analyzer.py file in VS Code.
- Review the code, which:
- Imports the ContentUnderstandingClient and AzureKeyCredential from the Azure Content Understanding SDK.
- Loads the analyzer schema from the biz-card.json file.
- Retrieves the endpoint, key, and analyzer name from the environment configuration file.
- Calls a function named create_analyzer, which is currently not implemented.
- In the create_analyzer function, find the comment Create a Content Understanding analyzer and add the following code (being careful to maintain the correct indentation):
    # Create a Content Understanding analyzer
    print(f"Creating {analyzer}")
    # Create the Content Understanding client
    client = ContentUnderstandingClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    )
    # Parse the schema JSON into a ContentAnalyzer object
    analyzer_definition = json.loads(schema)
    # Create the analyzer using the SDK (long-running operation)
    poller = client.begin_create_analyzer(
        analyzer_id=analyzer,
        resource=analyzer_definition,
        allow_replace=True
    )
    # Wait for the operation to complete
    result = poller.result()
    print(f"Analyzer '{analyzer}' created successfully.")
    print(f"Status: {result['status'] if isinstance(result, dict) else 'Succeeded'}")
- Review the code you added, which:
- Creates a ContentUnderstandingClient instance with the endpoint and a DefaultAzureCredential.
- Parses the analyzer schema JSON.
- Uses begin_create_analyzer to start the long-running operation to create the analyzer.
- Calls .result() to wait for the operation to complete.
Note: The SDK handles polling automatically through the LROPoller pattern, so no manual polling is needed.
- Save the file (CTRL+S).
- In the VS Code terminal (make sure the virtual environment is still activated and you’re in the Labfiles/02-content-understanding-api folder), run the Python code:
    python create-analyzer.py
- Review the output from the program, which should indicate that the analyzer has been created.
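To make the long-running operation pattern more concrete, here is a conceptual sketch of the kind of loop that a poller's .result() call saves you from writing. It does not call Azure and does not reproduce the SDK's internals: FakeOperation and wait_for_result are hypothetical names invented for this illustration, and the "Running" and "Succeeded" status strings are assumptions.

```python
import time


class FakeOperation:
    """Hypothetical stand-in for a service-side operation (not part of the SDK)."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done

    def get_status(self):
        # Each status check moves the fake operation one step closer to completion.
        if self._remaining > 0:
            self._remaining -= 1
            return "Running"
        return "Succeeded"


def wait_for_result(operation, interval=0.01, max_polls=50):
    """Check the status repeatedly until the operation is no longer running."""
    for attempt in range(1, max_polls + 1):
        status = operation.get_status()
        if status != "Running":
            return status, attempt
        time.sleep(interval)  # pause between status checks
    raise TimeoutError("operation did not finish in time")


status, attempts = wait_for_result(FakeOperation(polls_until_done=3))
print(f"{status} after {attempts} status checks")
```

In the lab code, begin_create_analyzer returns a poller and poller.result() blocks until the operation finishes, so a hand-written loop like this one is unnecessary.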
Analyze content using the Python SDK
Now that you’ve created an analyzer, you can consume it from a client application through the Content Understanding Python SDK.
- In VS Code, open the read-card.py file.
- Review the code, which:
- Imports the ContentUnderstandingClient and AzureKeyCredential from the SDK.
- Identifies the image file to be analyzed, with a default of biz-card-1.png.
- Retrieves the endpoint and key from the environment configuration file.
- Calls a function named analyze_card, which is currently not implemented.
- In the analyze_card function, find the comment Use Content Understanding to analyze the image and add the following code (being careful to maintain the correct indentation):
    # Use Content Understanding to analyze the image
    print(f"Analyzing {image_file}")
    # Create the Content Understanding client
    client = ContentUnderstandingClient(
        endpoint=endpoint,
        credential=DefaultAzureCredential()
    )
    # Read the image data
    with open(image_file, "rb") as file:
        image_data = file.read()
    # Submit the image for analysis
    print("Submitting request...")
    poller = client.begin_analyze_binary(
        analyzer_id=analyzer,
        binary_input=image_data
    )
    # Wait for the analysis to complete
    result = poller.result()
    print("Analysis succeeded:\n")
    # Save JSON results to a file
    output_file = "results.json"
    with open(output_file, "w") as json_file:
        json.dump(dict(result), json_file, indent=4, default=str)
        print(f"Response saved in {output_file}\n")
    # Iterate through the contents and extract fields
    for content in result.contents:
        if hasattr(content, 'fields') and content.fields:
            for field_name, field_data in content.fields.items():
                value = field_data.value if hasattr(field_data, 'value') else None
                print(f"{field_name}: {value}")
- Review the code you added, which:
- Creates a ContentUnderstandingClient instance.
- Reads the content of the image file as bytes.
- Calls begin_analyze_binary to submit the image to the analyzer (the SDK handles the asynchronous polling automatically).
- Calls .result() to wait for and retrieve the analysis results.
- Saves the JSON response and parses the extracted fields.
- Save the file (CTRL+S).
- In the VS Code terminal, run the Python code:
    python read-card.py biz-card-1.png
- Review the output from the program, which should show the values for the fields in the following business card:

- Run the program again with a different business card:
    python read-card.py biz-card-2.png
- Review the results, which should reflect the values in this business card:

- To view the full JSON response that was returned, open the results.json file in VS Code, or run the following command in the terminal:
    cat results.json
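If you want to work with the saved response programmatically rather than just read it, a sketch along the following lines may help. It is not part of the lab files, and the JSON layout is an assumption: it presumes a top-level "contents" list whose items have a "fields" mapping, with each field's value stored under a key that begins with "value" (for example, valueString). The sample data is invented for illustration, so compare it with your own results.json and adjust the key names as needed.

```python
import json

# Hypothetical sample, shaped the way the loop in analyze_card walks the result:
# a "contents" list whose items each carry a "fields" mapping.
SAMPLE = """
{
  "contents": [
    {
      "fields": {
        "Company": {"type": "string", "valueString": "Contoso"},
        "Email": {"type": "string", "valueString": "someone@contoso.com"},
        "Phone": {"type": "string"}
      }
    }
  ]
}
"""


def extract_fields(document):
    """Flatten the response into a {field name: value} dict."""
    extracted = {}
    for content in document.get("contents") or []:
        for name, data in (content.get("fields") or {}).items():
            # Assumption: the value sits under a key that starts with "value".
            value = next(
                (v for k, v in data.items() if k.startswith("value")), None
            )
            extracted[name] = value
    return extracted


fields = extract_fields(json.loads(SAMPLE))
for name, value in fields.items():
    print(f"{name}: {value}")
```

A field that the analyzer could not find (Phone in this sample) comes back as None, which mirrors the fallback in the lab's own loop.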
Clean up
If you’ve finished working with the Content Understanding service, you should delete the resources you have created in this exercise to avoid incurring unnecessary Azure costs.
- In the Azure portal, delete the resource group you created for this exercise.