Perform hyperparameter tuning with a sweep job
Hyperparameters are variables that affect how a model is trained, but which can’t be derived from the training data. Choosing the optimal hyperparameter values for model training can be difficult, and usually involves a great deal of trial and error.
In this exercise, you’ll use Azure Machine Learning to tune hyperparameters by performing multiple training trials in parallel.
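As a rough illustration of what a sweep does, the sketch below expands a small search space into trials, runs them with limited concurrency, and keeps the trial with the best score. It uses only the Python standard library and is not the Azure Machine Learning API: the search space, the `run_trial` scoring function, and the function names are all hypothetical stand-ins for a real training script and metric.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical search space: every combination of values is one trial.
SEARCH_SPACE = {
    "reg_rate": [0.01, 0.1, 1.0],
    "max_iter": [100, 200],
}


def run_trial(params):
    """Stand-in for one training run; returns the params and a made-up score."""
    # Toy objective that peaks at reg_rate=0.1 and max_iter=200.
    # A real trial would train a model and log an evaluation metric instead.
    score = 1.0 - abs(params["reg_rate"] - 0.1) - (200 - params["max_iter"]) / 1000
    return params, score


def grid(space):
    """Expand the search space into a list of hyperparameter dictionaries."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


def sweep(space, max_concurrent_trials=2):
    """Run every trial, at most max_concurrent_trials at a time, and return the best."""
    trials = grid(space)
    with ThreadPoolExecutor(max_workers=max_concurrent_trials) as pool:
        results = list(pool.map(run_trial, trials))
    # The "primary metric" here is the score, and the goal is to maximize it.
    return max(results, key=lambda result: result[1])


if __name__ == "__main__":
    best_params, best_score = sweep(SEARCH_SPACE)
    print("Best trial:", best_params, "score:", round(best_score, 3))
```

A sweep job in Azure Machine Learning follows the same overall shape, but each trial runs as a separate job on a compute cluster rather than as a thread on one machine.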
Before you start
You’ll need an Azure subscription in which you have administrative-level access.
Provision an Azure Machine Learning workspace
An Azure Machine Learning workspace provides a central place for managing all resources and assets you need to train and manage your models. You can interact with the Azure Machine Learning workspace through the studio, Python SDK, and Azure CLI.
You’ll use the Azure CLI to provision the workspace and necessary compute, and you’ll use the Python SDK to run a command job.
Create the workspace and compute resources
To create the Azure Machine Learning workspace, a compute instance, and a compute cluster, you’ll use the Azure CLI. All necessary commands are grouped in a shell script for you to execute.
- In a browser, open the Azure portal at https://portal.azure.com/, signing in with your Microsoft account.
- Select the [>_] (Cloud Shell) button at the top of the page to the right of the search box. This opens a Cloud Shell pane at the bottom of the portal.
- Select Bash if asked. The first time you open the Cloud Shell, you are asked to choose the type of shell you want to use (Bash or PowerShell).
- Check that the correct subscription is specified and that No storage account required is selected. Select Apply.
- In the terminal, enter the following commands to clone this repo:
    rm -r azure-ml-labs -f
    git clone https://github.com/MicrosoftLearning/mslearn-azure-ml.git azure-ml-labs
  Use SHIFT + INSERT to paste your copied code into the Cloud Shell.
- After the repo has been cloned, enter the following commands to change to the folder for this lab and run the setup.sh script it contains:
    cd azure-ml-labs/Labs/09
    ./setup.sh
  Ignore any (error) messages that say that the extensions were not installed.
- Wait for the script to complete - this typically takes around 5-10 minutes.
Clone the lab materials
When you’ve created the workspace and necessary compute resources, you can open the Azure Machine Learning studio and clone the lab materials into the workspace.
- In the Azure portal, navigate to the Azure Machine Learning workspace named mlw-dp100-….
- Select the Azure Machine Learning workspace, and in its Overview page, select Launch studio. The Azure Machine Learning studio opens in a new browser tab.
- Close any pop-ups that appear in the studio.
- Within the Azure Machine Learning studio, navigate to the Compute page and verify that the compute instance and cluster you created in the previous section exist. The compute instance should be running, and the cluster should be idle with 0 nodes running.
- In the Compute instances tab, find your compute instance, and select the Terminal application.
- In the terminal, install the Python SDK on the compute instance by running the following commands:
    pip uninstall azure-ai-ml
    pip install azure-ai-ml
  Ignore any (error) messages that say that the packages couldn’t be found and uninstalled.
- Run the following command to clone a Git repository containing notebooks, data, and other files to your workspace:
    git clone https://github.com/MicrosoftLearning/mslearn-azure-ml.git azure-ml-labs
- When the command has completed, in the Files pane, click ↻ to refresh the view and verify that a new Users/your-user-name/azure-ml-labs folder has been created.
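If you want to confirm that the SDK installation succeeded before opening the notebook, a small check like the following can be run with Python on the compute instance. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the lab materials.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    sdk_version = installed_version("azure-ai-ml")
    print("azure-ai-ml:", sdk_version or "not installed")
```

If the script reports that the package is not installed, rerun the pip install command in the same terminal.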
Tune hyperparameters with a sweep job
Now that you have all the necessary resources, you can run the notebook to submit a sweep job.
- Open the Labs/09/Hyperparameter tuning.ipynb notebook.
  Select Authenticate and follow the necessary steps if a notification appears asking you to authenticate.
- Verify that the notebook uses the Python 3.8 - AzureML kernel.
- Run all cells in the notebook.
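For orientation, defining a sweep job with the Python SDK (azure-ai-ml) generally looks like the sketch below. It is not a copy of the notebook: the compute name, environment, data asset, training script, its arguments, the metric name, and the trial limits are all assumptions that you would replace with the values used in your workspace. The SDK imports sit inside the function so that the module loads even where azure-ai-ml is not installed.

```python
# All values below are hypothetical placeholders, not taken from the lab notebook.
COMPUTE_NAME = "aml-cluster"
ENVIRONMENT = "<your-environment>@latest"
TRAINING_DATA = "azureml:<your-data-asset>:1"
PRIMARY_METRIC = "training_accuracy_score"
REG_RATE_VALUES = [0.01, 0.1, 1.0]
SWEEP_LIMITS = {"max_total_trials": 3, "max_concurrent_trials": 2, "timeout": 7200}


def build_sweep_job():
    """Build a sweep job that tries each regularization rate as a separate trial."""
    from azure.ai.ml import Input, command
    from azure.ai.ml.sweep import Choice

    # Base command job; train.py and its arguments are assumed to exist in ./src.
    base_job = command(
        code="./src",
        command=(
            "python train.py "
            "--training_data ${{inputs.training_data}} "
            "--reg_rate ${{inputs.reg_rate}}"
        ),
        inputs={
            "training_data": Input(type="uri_file", path=TRAINING_DATA),
            "reg_rate": 0.01,
        },
        environment=ENVIRONMENT,
        compute=COMPUTE_NAME,
    )

    # Replace the fixed input with a search space, then turn the job into a sweep.
    job_for_sweep = base_job(reg_rate=Choice(values=REG_RATE_VALUES))
    sweep_job = job_for_sweep.sweep(
        compute=COMPUTE_NAME,
        sampling_algorithm="grid",
        primary_metric=PRIMARY_METRIC,
        goal="Maximize",
    )
    sweep_job.set_limits(**SWEEP_LIMITS)
    return sweep_job


def submit_sweep_job():
    """Submit the sweep job using the workspace configuration on the compute instance."""
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient.from_config(credential=DefaultAzureCredential())
    return ml_client.jobs.create_or_update(build_sweep_job())
```

Calling `submit_sweep_job()` requires a workspace connection and would start real trials on the cluster, so it is left for you to call deliberately. The `max_concurrent_trials` limit controls how many trials run in parallel.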
Delete Azure resources
When you finish exploring Azure Machine Learning, you should delete the resources you’ve created to avoid unnecessary Azure costs.
- Close the Azure Machine Learning studio tab and return to the Azure portal.
- In the Azure portal, on the Home page, select Resource groups.
- Select the rg-dp100-… resource group.
- At the top of the Overview page for your resource group, select Delete resource group.
- Enter the resource group name to confirm you want to delete it, and select Delete.