Understanding the workspace interior
User Roles
- Reader- This role lets you view everything, but it cannot change any data or perform any action that would alter the state of the resource.
- Contributor- This role lets you view and change everything except the user roles and rights on the resource.
- Owner- This role permits you to perform any action on a specific resource.
- AzureML Data Scientist- This role can perform all actions within the workspace except creating or deleting compute resources and modifying the workspace settings.
- AzureML Metrics Writer- This role can only write metrics to the workspace.
Experiments
To keep track of the iterations of our model training, we define each iteration as a run and group runs under a construct called an experiment, which collects all the information concerning a specific model we want to train. To achieve this, we connect every training script run to a specific experiment.
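A minimal sketch of this grouping, assuming the azureml-core SDK (v1) is installed, a `config.json` for your workspace is present in the working directory, and `"diabetes-training"` is a hypothetical experiment name:

```python
def get_or_create_experiment(experiment_name="diabetes-training"):
    """Return a handle to an experiment that will collect our runs."""
    from azureml.core import Workspace, Experiment

    ws = Workspace.from_config()  # loads workspace details from config.json
    # Experiments are created lazily: referencing a name that does not
    # exist yet creates it when the first run is submitted to it.
    return Experiment(workspace=ws, name=experiment_name)
```

The import is deferred into the function so the sketch only touches the cloud when actually called.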
Datasets & Datastores
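As a hedged sketch of how the two relate in the v1 SDK: a datastore is the storage attachment, and a dataset is a registered, versioned reference to data inside it. The CSV path and dataset name below are hypothetical:

```python
def register_tabular_dataset(dataset_name="weather-data"):
    """Register a versioned tabular dataset from the default datastore."""
    from azureml.core import Workspace, Dataset

    ws = Workspace.from_config()
    datastore = ws.get_default_datastore()  # usually "workspaceblobstore"
    dataset = Dataset.Tabular.from_delimited_files(
        path=(datastore, "weather/*.csv"))  # hypothetical path pattern
    # Re-registering under the same name creates a new version.
    return dataset.register(workspace=ws, name=dataset_name,
                            create_new_version=True)
```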
Compute Targets
- Compute instance- A single virtual machine, generally used for development, as a notebook server, or as a target for training and inference.
- Compute cluster- A multi-node cluster of machines, typically used for complex training and as a production environment for inference.
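A sketch of provisioning a compute cluster with the v1 SDK, assuming azureml-core is installed and a workspace `config.json` exists; the cluster name and VM size are placeholder choices:

```python
def get_or_create_cluster(cluster_name="cpu-cluster"):
    """Reuse an existing compute cluster, or provision one if missing."""
    from azureml.core import Workspace
    from azureml.core.compute import ComputeTarget, AmlCompute
    from azureml.core.compute_target import ComputeTargetException

    ws = Workspace.from_config()
    try:
        return ComputeTarget(workspace=ws, name=cluster_name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(
            vm_size="STANDARD_DS3_V2",  # hypothetical VM size
            min_nodes=0,                # scale to zero when idle
            max_nodes=4)
        target = ComputeTarget.create(ws, cluster_name, config)
        target.wait_for_completion(show_output=True)
        return target
```

Setting `min_nodes=0` lets the cluster deallocate all nodes between runs, which keeps idle cost at zero.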
Environments
- Curated environments are predefined environments that ship with typical runtimes and ML frameworks.
- System-managed environments (the default behavior) build environments starting from a base image, with dependency management handled by Conda.
- User-managed environments build environments either by starting from a base image and adding Docker steps (while you handle all libraries and dependencies yourself) or by supplying a completely custom Docker image.
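A sketch of the system-managed flavor, assuming azureml-core is installed; the environment name and package list are illustrative:

```python
def make_sklearn_environment(env_name="sklearn-env"):
    """Define a system-managed environment whose dependencies Conda resolves."""
    from azureml.core import Environment
    from azureml.core.conda_dependencies import CondaDependencies

    env = Environment(name=env_name)
    # System-managed: Azure ML builds the image and resolves these
    # packages with Conda; we never touch the Dockerfile ourselves.
    env.python.conda_dependencies = CondaDependencies.create(
        conda_packages=["scikit-learn"],
        pip_packages=["azureml-defaults"])
    return env
```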
Runs
- A training script- The script that performs the actual ML training. Azure ML takes your source folder with all source files, zips it, and sends it to the compute target.
- An environment- The ML environment described previously.
- A compute target- The compute instance or cluster on which the run will be executed.
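The three pieces above come together in a run configuration. A hedged sketch with the v1 SDK, assuming a workspace `config.json`, a hypothetical `./src/train.py` script, and an existing cluster named `"cpu-cluster"`:

```python
def submit_training_run():
    """Bundle script, environment, and compute target into one run."""
    from azureml.core import Workspace, Experiment, Environment, ScriptRunConfig

    ws = Workspace.from_config()
    env = Environment.get(workspace=ws, name="AzureML-Minimal")  # curated env
    src = ScriptRunConfig(source_directory="./src",  # zipped and shipped by Azure ML
                          script="train.py",
                          compute_target="cpu-cluster",
                          environment=env)
    # Submitting attaches the run to an experiment, as described earlier.
    return Experiment(ws, "sketch-experiment").submit(src)
```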
During and after the execution, the run tracks and collects the following information:
- Log files- The log files generated during the execution, plus any statements we add to the logging.
- Metrics- Standard run metrics and any type of object (values, images, and tables) that we want to track specifically during the run.
- Snapshots- A copy of the source directory containing our training scripts (taken from the ZIP file that we already require for the run configuration).
- Output files- The files generated by the algorithm (the model) and any files we additionally want to attach to the run.
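Inside a training script, this tracking is driven through the run context. A sketch assuming azureml-core is available and a hypothetical `outputs/model.pkl` file exists:

```python
def log_run_artifacts(accuracy, model_path="outputs/model.pkl"):
    """Record metrics and attach a file to the current run."""
    from azureml.core import Run

    run = Run.get_context()           # the run this script executes under
    run.log("accuracy", accuracy)     # a standard scalar metric
    run.log_list("losses", [0.8, 0.5, 0.3])  # a list-valued metric
    # Files written under ./outputs are uploaded automatically at the end;
    # upload_file attaches anything else explicitly.
    run.upload_file(name="model.pkl", path_or_stream=model_path)
    run.complete()
```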
Registered Models
Hence, the model registry allows you to keep track of the different results you achieved through training and also helps you deploy different versions of the model to production, development, and test environments.
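A sketch of registering a trained model with the v1 SDK, assuming a workspace `config.json`; the model name, file path, and tag are hypothetical:

```python
def register_model_version(model_path="outputs/model.pkl"):
    """Register a model file; same name means a new version."""
    from azureml.core import Workspace, Model

    ws = Workspace.from_config()
    # Registering under an existing name bumps the version automatically,
    # which is how the registry tracks successive training results.
    return Model.register(workspace=ws,
                          model_name="diabetes-model",  # hypothetical name
                          model_path=model_path,
                          tags={"stage": "development"})
```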
Deployments & Deployment Endpoints
However, if you want to abstract multiple model deployments behind a common endpoint, you can define an endpoint service: a separate service in Azure ML that offers a common domain for multiple model deployments, performs Secure Socket Layer (SSL)/Transport Layer Security (TLS) termination, and permits traffic allocation between deployments.
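A sketch of a single-model deployment to Azure Container Instances with the v1 SDK, assuming a registered model named `"diabetes-model"` and a hypothetical `score.py` entry script exist:

```python
def deploy_model_to_aci(service_name="diabetes-service"):
    """Deploy a registered model behind a scoring web service."""
    from azureml.core import Workspace, Environment
    from azureml.core.model import Model, InferenceConfig
    from azureml.core.webservice import AciWebservice

    ws = Workspace.from_config()
    model = Model(ws, name="diabetes-model")          # hypothetical model
    env = Environment.get(ws, name="AzureML-Minimal") # curated environment
    inference_config = InferenceConfig(entry_script="score.py",  # hypothetical
                                       environment=env)
    deployment_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                           memory_gb=1)
    service = Model.deploy(ws, service_name, [model],
                           inference_config, deployment_config)
    service.wait_for_deployment(show_output=True)
    return service
```

ACI suits development and testing; production deployments with TLS termination and traffic splitting would instead target a managed endpoint or AKS.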
Pipelines
Pipelines are used to facilitate workflows and bring automation to every step of the ML chain.
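A sketch of a two-step pipeline with the v1 SDK, assuming a workspace `config.json`, a cluster named `"cpu-cluster"`, and hypothetical `prep.py` and `train.py` scripts under `./src`:

```python
def build_two_step_pipeline():
    """Chain a data-prep step and a training step into one workflow."""
    from azureml.core import Workspace
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import PythonScriptStep

    ws = Workspace.from_config()
    prep = PythonScriptStep(name="prepare-data",
                            script_name="prep.py",      # hypothetical script
                            source_directory="./src",
                            compute_target="cpu-cluster")
    train = PythonScriptStep(name="train-model",
                             script_name="train.py",    # hypothetical script
                             source_directory="./src",
                             compute_target="cpu-cluster")
    train.run_after(prep)  # explicit ordering when no data dependency exists
    return Pipeline(workspace=ws, steps=[train])
```

Once built, a pipeline can be submitted to an experiment or published so it can be triggered on a schedule or via REST.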