Performing Data Analysis & Visualization (Part 1 of 3)

 





To read part 2, please click here
To read part 3, please click here



Technical Requirements

The following Python libraries and versions will be used to perform data pre-processing as well as high-dimensional visualizations:
  1. azureml-sdk 1.34.0
  2. azureml-widgets 1.34.0
  3. azureml-dataprep 2.20.0
  4. pandas 1.3.2
  5. numpy 1.19.5
  6. scikit-learn 0.24.2
  7. seaborn 0.11.2
  8. plotly 5.3.1
  9. umap_learn 0.5.1
  10. statsmodels 0.13.0
  11. missingno 0.5.0
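
Before starting, it can help to confirm that the active environment matches the versions listed above. The following is a minimal sketch using only the Python standard library (Python 3.8 or later); the helper name check_versions and the PINNED dictionary are illustrative and not part of any of the listed libraries. It assumes the packages were installed under their PyPI names, so umap_learn is looked up as umap-learn.

```python
from importlib import metadata

# Versions taken from the list above; keys are assumed PyPI distribution names.
PINNED = {
    "azureml-sdk": "1.34.0",
    "azureml-widgets": "1.34.0",
    "azureml-dataprep": "2.20.0",
    "pandas": "1.3.2",
    "numpy": "1.19.5",
    "scikit-learn": "0.24.2",
    "seaborn": "0.11.2",
    "plotly": "5.3.1",
    "umap-learn": "0.5.1",  # listed above as umap_learn
    "statsmodels": "0.13.0",
    "missingno": "0.5.0",
}


def check_versions(pinned):
    """Return {package: (wanted, found, matches)} for each pinned package.

    `found` is None when the package is not installed.
    """
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


# Print a short report: one line per package.
for name, (wanted, found, ok) in check_versions(PINNED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: wanted {wanted}, found {found} [{status}]")
```

A mismatch does not necessarily mean the code in this series will fail, but it is the first thing to check if results differ from those shown.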

Understanding Data Exploration Techniques

Data exploration is an important analytical step for understanding whether your data is, at the very least, informative enough to build an ML model. The tasks we can perform depend on the type of dataset in which the data is stored. The two dataset types are given below:
  • TabularDataset: This class provides methods for performing basic transformations on tabular data and converting it into known formats, such as pandas (https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset).

  • FileDataset: This class primarily offers filtering methods on file metadata (https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.filedataset).

Both of the above datasets can be registered to the Azure Machine Learning Dataset Registry for further use after preprocessing.
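
As a rough illustration of the two dataset types and of registration, the sketch below wraps the azureml-core calls in two small helper functions. The helper names, the dataset names, and the file paths in the usage comments are hypothetical; the sketch also assumes that a workspace config.json is available locally and that the files already sit in a datastore. The SDK import is placed inside each function so the file can be loaded even where the SDK is not installed.

```python
def register_tabular_dataset(workspace, datastore, relative_path, name):
    """Create a TabularDataset from a delimited file and register it.

    `relative_path` is the path of the file inside `datastore`.
    """
    # Imported lazily so this module loads without the Azure ML SDK present.
    from azureml.core import Dataset

    tabular = Dataset.Tabular.from_delimited_files(
        path=[(datastore, relative_path)]
    )
    # create_new_version=True adds a new version if the name already exists.
    return tabular.register(
        workspace=workspace, name=name, create_new_version=True
    )


def register_file_dataset(workspace, datastore, glob_pattern, name):
    """Create a FileDataset from files matching a pattern and register it."""
    from azureml.core import Dataset

    files = Dataset.File.from_files(path=[(datastore, glob_pattern)])
    return files.register(
        workspace=workspace, name=name, create_new_version=True
    )


# Example usage (requires an Azure ML workspace; paths and names are made up):
#
#   from azureml.core import Workspace
#   ws = Workspace.from_config()
#   store = ws.get_default_datastore()
#   tab = register_tabular_dataset(ws, store, "data/sample.csv", "sample-raw")
#   df = tab.to_pandas_dataframe()   # convert to pandas for exploration
#   img = register_file_dataset(ws, store, "images/*.png", "sample-images")
```

Keeping the workspace and datastore as arguments, rather than looking them up inside the helpers, makes the same functions usable from a notebook or from a pipeline step.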







