Performing Data Analysis & Visualization (Part 1 of 3)
To read part 2, please click here
To read part 3, please click here
Technical Requirements
The following Python libraries and versions will be used to perform data pre-processing as well as high-dimensional visualizations:
- azureml-sdk 1.34.0
- azureml-widgets 1.34.0
- azureml-dataprep 2.20.0
- pandas 1.3.2
- numpy 1.19.5
- scikit-learn 0.24.2
- seaborn 0.11.2
- plotly 5.3.1
- umap_learn 0.5.1
- statsmodels 0.13.0
- missingno 0.5.0
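Before starting, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch using only the Python standard library (Python 3.8+). The helper name check_versions and the EXPECTED dictionary are illustrative assumptions, not part of any of the listed libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the list above. pip treats "umap_learn" and
# "umap-learn" as the same distribution name.
EXPECTED = {
    "azureml-sdk": "1.34.0",
    "azureml-widgets": "1.34.0",
    "azureml-dataprep": "2.20.0",
    "pandas": "1.3.2",
    "numpy": "1.19.5",
    "scikit-learn": "0.24.2",
    "seaborn": "0.11.2",
    "plotly": "5.3.1",
    "umap-learn": "0.5.1",
    "statsmodels": "0.13.0",
    "missingno": "0.5.0",
}


def check_versions(expected):
    """Return {package: (wanted, installed_or_None, matches)} for each entry."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {installed} [{status}]")
```

A mismatch is not necessarily a problem, since nearby versions may behave the same, but it is a useful first thing to check if an example does not run as described.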
Understanding Data Exploration Techniques
Data exploration is an important analytical step for understanding whether your data is, at the very least, informative enough to build an ML model. The tasks we can perform depend on the type of dataset in which the data is stored. The two dataset types are given below:
- TabularDataset - This class provides methods for performing basic transformations on tabular data and converting it into known formats, such as pandas (https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset).
- FileDataset - This class primarily offers filtering methods on file metadata (https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.filedataset).
Both of the above datasets can be registered to the Azure Machine Learning Dataset Registry for further use after preprocessing.
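As a rough illustration of the TabularDataset workflow described above, the sketch below creates a tabular dataset from a CSV file on a datastore, converts it to a pandas DataFrame for exploration, and registers it. It assumes the azureml-core package from the requirements list and an existing workspace and datastore. The function name load_and_register_csv, the file path, and the dataset name are hypothetical placeholders.

```python
def load_and_register_csv(workspace, datastore, csv_path, dataset_name):
    """Create a TabularDataset from a CSV, preview it in pandas, and register it.

    workspace and datastore are assumed to be existing azureml.core objects;
    csv_path is the file's path relative to the datastore root.
    """
    # Imported lazily so this function can be defined without the SDK installed.
    from azureml.core import Dataset

    # A (datastore, relative_path) tuple points at the file on the datastore.
    tabular = Dataset.Tabular.from_delimited_files(path=(datastore, csv_path))

    # Convert to pandas for local exploration.
    df = tabular.to_pandas_dataframe()

    # Register the dataset so it can be reused later by name.
    registered = tabular.register(
        workspace=workspace,
        name=dataset_name,
        create_new_version=True,
    )
    return df, registered


# Example usage (requires a configured workspace; names are placeholders):
# from azureml.core import Workspace
# ws = Workspace.from_config()
# df, ds = load_and_register_csv(
#     ws, ws.get_default_datastore(), "data/sample.csv", "sample-dataset"
# )
# print(df.head())
```

A FileDataset would follow a similar pattern with Dataset.File.from_files, except that it references files rather than parsing them into rows and columns.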