Understanding End-to-End Machine Learning Process (Part 3 of 5)
Excavating Data & Sources

When you start an ML project, you might realize that you need additional data points to improve the quality of your results. The following options give an overview of where to acquire additional data:

In-house data sources - If the project is run in or with a company, look internally first. In-house data is free of cost, often standardized, and it is easier to find someone who knows the data and how to obtain it. However, it can be difficult to find what you are looking for, because such data is often poorly documented and its quality may be questionable due to bias in the data.

Open data sources - You can also use freely available datasets, because they are often very large (terabytes (TB) of data), can cover different time periods, and are generally well structured and doc