Challenge 1: Create an Azure Machine Learning job

Challenge scenario

To automate machine learning workflows, you can define machine learning tasks in scripts. To execute any workflow consisting of Python scripts, use Azure Machine Learning jobs. Azure Machine Learning jobs store all metadata of a workflow, including input parameters and output metrics. By running scripts as jobs, it’s easier to track and manage your machine learning models.

Prerequisites

If you haven’t already done so, complete the previous challenge before you continue.

Objectives

By completing this challenge, you’ll learn how to:

  • Define an Azure Machine Learning job in YAML.
  • Run an Azure Machine Learning job with the CLI v2.

Important! Each challenge is designed to let you explore how to implement DevOps principles when working with machine learning models. Some instructions may be intentionally vague, inviting you to think about your own preferred approach. If, for example, the instructions ask you to create an Azure Machine Learning workspace, it’s up to you to explore and decide how you want to create it. To get the best learning experience, make each challenge as simple or as demanding as you want.

Challenge Duration

  • Estimated Time: 30 minutes

Instructions

In the src/model folder, you’ll find a Python script that reads CSV files from a folder and uses the data to train a classification model. In the src folder, you’ll find a YAML file that defines a job. Some values are missing from the YAML file; it’s up to you to complete it.

  • Create an Azure Machine Learning workspace and a compute instance.
  • Use the CLI (v2) to create a registered data asset with the following configuration:
    • Name: diabetes-dev-folder
    • Path: The data folder in the experimentation folder which contains the CSV file to train the model. The path should point to the folder, not to the specific file.
    Hint: Using the CLI (v2), you can create a data asset by defining the configuration in a YAML file or by specifying the configuration in the CLI command.
  • Complete the job.yml file to define the Azure Machine Learning job to run the train.py script, with the registered data asset as input.
  • Use the CLI (v2) to run the job.
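
The two YAML definitions mentioned in the steps above can be sketched as follows. This is a minimal sketch, not the reference solution. The file names data-asset.yml and src/job-sketch.yml are hypothetical (the sketch deliberately does not overwrite the provided job.yml), the --training_data argument is an assumed name for whatever parameter train.py actually accepts, and the environment and compute values are placeholders to fill in yourself. The data path assumes the file is written at the repository root.

```shell
# Data asset definition. Name and folder come from the challenge text;
# the file name data-asset.yml is a hypothetical choice.
# The path is resolved relative to the location of this YAML file.
cat > data-asset.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: diabetes-dev-folder
type: uri_folder
path: experimentation/data
EOF

# Command job definition, written next to the provided job.yml.
# ASSUMPTION: train.py takes the input folder as --training_data;
# check the script's argument parser and adjust the name.
# <environment-name> and <compute-instance-name> are placeholders.
mkdir -p src
cat > src/job-sketch.yml <<'EOF'
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: model
command: >-
  python train.py
  --training_data ${{inputs.training_data}}
inputs:
  training_data:
    type: uri_folder
    path: azureml:diabetes-dev-folder@latest
environment: azureml:<environment-name>@latest
compute: azureml:<compute-instance-name>
experiment_name: diabetes-training
description: Train a classification model on the diabetes data
EOF
```

With a default resource group and workspace configured, az ml data create --file data-asset.yml registers the data asset, and az ml job create --file followed by the job file submits the job. The same data asset can also be created without a YAML file by passing --name, --type and --path directly to az ml data create.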

Tip: Whether you’re working from the Cloud Shell, a compute instance, or a local terminal, make sure to update the Azure Machine Learning extension for the CLI to the latest version.
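
One possible way to follow this tip and prepare the workspace and compute instance from a terminal is sketched below. It assumes you are already signed in with az login. All resource names, the region, and the VM size are hypothetical examples, and creating these resources through the CLI is only one of several valid approaches.

```shell
# Hypothetical names; replace them with your own values.
RESOURCE_GROUP="rg-mlops-challenge"
WORKSPACE="mlw-mlops-challenge"
COMPUTE_NAME="ci-mlops-challenge"   # must be unique within the Azure region
LOCATION="westeurope"

# Install the ml extension, or upgrade it if it is already installed.
az extension add --name ml --upgrade --yes

# Create a resource group and a workspace inside it.
az group create --name "$RESOURCE_GROUP" --location "$LOCATION"
az ml workspace create --name "$WORKSPACE" --resource-group "$RESOURCE_GROUP"

# Store defaults so later az ml commands can omit these two arguments.
az configure --defaults group="$RESOURCE_GROUP" workspace="$WORKSPACE"

# Create a compute instance (the size is an example; choose one your
# subscription has quota for).
az ml compute create --name "$COMPUTE_NAME" --type ComputeInstance --size Standard_DS11_v2
```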

Success criteria

To complete this challenge successfully, you should be able to show:

  • A successfully completed job in the Azure Machine Learning workspace. The job should contain all input parameters and output metrics for the model you trained.
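
To check this criterion from the terminal rather than the studio, a sketch along these lines may help. It assumes the default resource group and workspace are already configured and that the completed job definition is src/job.yml; the JOB_NAME variable is a hypothetical helper.

```shell
# Submit the job and capture the generated job name.
JOB_NAME=$(az ml job create --file src/job.yml --query name --output tsv)

# Follow the logs until the job finishes.
az ml job stream --name "$JOB_NAME"

# Print the final status; a successful run reports "Completed".
az ml job show --name "$JOB_NAME" --query status --output tsv
```

The logged input parameters and output metrics themselves are easiest to review on the job’s page in the Azure Machine Learning studio.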

Note: If you’ve used a compute instance for experimentation, remember to stop the compute instance when you’re done.
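
If you prefer the CLI for this, stopping the instance could look like the following sketch. The compute name is a hypothetical example, and the default resource group and workspace are assumed to be configured.

```shell
# Hypothetical name; use the name of your own compute instance.
COMPUTE_NAME="ci-mlops-challenge"

# Stop the compute instance so it no longer incurs compute charges.
az ml compute stop --name "$COMPUTE_NAME"
```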

Useful resources