Running experiments in Azure Machine Learning
Machine Learning is primarily about training models that you can use to provide predictive services to applications. In this exercise, you will learn to run experiments in Azure Machine Learning from Azure Databricks.
Prerequisites
Before starting this lab, complete the Getting Started with Azure Databricks lab to set up your Azure Databricks environment and import the data and notebooks you require.
Install libraries on the Azure Databricks Cluster
The notebooks you will run depend on certain Python libraries that must be installed on your cluster. The following steps walk you through adding these dependencies.
- From within the Azure Databricks workspace, from the Clusters section, select your cluster. Make sure the state of the cluster is Running.
- Select the Libraries link and then select Install New.
- In the Library Source, select PyPI, and in the Package text box, type azureml-sdk[databricks] and select Install.
- Next, install sklearn-pandas==2.1.0
- Next, install azureml-mlflow
Note: If the packages fail to install, create a cluster with an older runtime (for example, ML runtime 9.1) and install the packages on that new cluster.
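Once the libraries show as installed, a quick check from a notebook cell attached to the cluster can confirm that Python can see them. The following is a minimal sketch using only the standard library. The helper name installed_versions is hypothetical, and the distribution names are assumed to match the package names typed above without the [databricks] extra.

```python
from importlib import metadata

# Distribution names from the install steps above. The "[databricks]" extra
# is an install-time option and is not part of the distribution name.
REQUIRED = ["azureml-sdk", "sklearn-pandas", "azureml-mlflow"]


def installed_versions(packages):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so missing libraries are easy to spot.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED points back to the Libraries tab of the cluster, where the install status and error message are shown.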
Deploy an Azure Machine Learning workspace
- If you have already created an Azure Machine Learning workspace in your subscription, you can skip to the section Run an experiment in Azure Machine Learning.
- In the Azure portal, create a new resource: Machine Learning.
- In the Create Machine Learning Workspace dialog that appears, provide the following values:
  - Subscription: Choose your Azure subscription.
  - Resource group: Select the resource group in which you deployed your Azure Databricks workspace.
  - Workspace Name: aml-ws
  - Region: Choose the region closest to you (it is OK if the Azure Databricks workspace and the Azure Machine Learning workspace are in different regions).
- Review and complete the creation of the Azure Machine Learning workspace.
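The values chosen in the dialog are the same three values a notebook needs to locate the workspace later. As a minimal sketch, the snippet below collects them into a config.json file, which is the file format that Workspace.from_config() in the Azure Machine Learning SDK reads. The function names and the placeholder subscription ID and resource group are hypothetical. Only the workspace name aml-ws comes from the steps above.

```python
import json
from pathlib import Path


def workspace_config(subscription_id, resource_group, workspace_name="aml-ws"):
    """Build the dictionary stored in an Azure ML workspace config.json file."""
    return {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }


def write_workspace_config(subscription_id, resource_group,
                           workspace_name="aml-ws", directory="."):
    """Write config.json into `directory` and return the path to the file."""
    path = Path(directory) / "config.json"
    config = workspace_config(subscription_id, resource_group, workspace_name)
    path.write_text(json.dumps(config, indent=2))
    return path


# Placeholder values for illustration only; substitute your own.
print(json.dumps(workspace_config("<your-subscription-id>", "<your-resource-group>"), indent=2))
```

The notebook in this lab may obtain the workspace in a different way, for example by asking for these values directly, so treat the file as a convenience rather than a requirement.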
Run an experiment in Azure Machine Learning
In this exercise, you will learn how to run experiments in Azure Machine Learning from the Azure Databricks environment.
- In a web browser, open your Azure Databricks workspace.
- If your cluster is not running, on the Compute page, select your cluster and use the ▶ Start button to start it.
- In the Azure Databricks Workspace, using the command bar on the left, select Workspace. Then select Users, and ☗ your_user_name. Then in the folder named 04 - Integrating Azure Databricks and Azure Machine Learning, open the 1.0 Running experiments in Azure Machine Learning notebook.
- Attach the notebook to your cluster. Then read the notes in the notebook, running each code cell in turn.
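For orientation before opening the notebook, the general shape of logging an experiment run with the Azure Machine Learning SDK can be sketched as follows. This is not the notebook's code. It is an assumption-laden outline that uses the SDK's Experiment class with its start_logging method, and the run's log and complete methods. The function name log_metrics_to_experiment, the experiment name, and the metric shown in the usage comment are hypothetical.

```python
def log_metrics_to_experiment(workspace, experiment_name, metrics):
    """Start a run in the named experiment, log each metric, then complete it.

    `workspace` is an azureml.core.Workspace object and `metrics` is a dict
    mapping metric names to numeric values.
    """
    # Imported inside the function so that this file can still be loaded on a
    # machine where the Azure ML SDK is not installed.
    from azureml.core import Experiment

    experiment = Experiment(workspace=workspace, name=experiment_name)
    run = experiment.start_logging()  # creates an interactive run
    for name, value in metrics.items():
        run.log(name, value)          # record one named metric value
    run.complete()                    # mark the run as finished
    return run


# Usage on the cluster (hypothetical names and values):
#   from azureml.core import Workspace
#   ws = Workspace.from_config()
#   log_metrics_to_experiment(ws, "my-experiment", {"rmse": 3.2})
```

Completed runs and their logged metrics then appear under the experiment in the Azure Machine Learning workspace.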
Clean-up
If you’re finished working with Azure Databricks for now, in the Azure Databricks workspace, on the Compute page, select your cluster and select ■ Terminate to shut it down. Otherwise, leave it running for the next exercise.