Data science is the process of extracting knowledge from data. The most common concepts used in data science are:
- Data: Data is the raw material a data scientist works with. It can be numerical or categorical, and structured or unstructured.
- Dataset: A dataset is a body of data that has been collected, organized, and formatted so that it can be analyzed, for example by a machine learning algorithm.
- Machine Learning Algorithm: A machine learning algorithm is a procedure that learns patterns from training datasets in order to make predictions or decisions, without a human explicitly programming the rules.
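
The three terms above can be illustrated with a minimal Python sketch. The values, the variable names (`hours_studied`, `exam_scores`), and the helper functions `fit_line` and `predict` are all hypothetical and chosen only for illustration. Ordinary least squares stands in here as one simple example of a machine learning algorithm; it is not the only one the definition covers.

```python
# Data: raw numerical observations (made-up values, for illustration only).
hours_studied = [1.0, 2.0, 3.0, 4.0]
exam_scores = [52.0, 54.0, 56.0, 58.0]

# Dataset: the same data organized as (feature, label) pairs
# so that an algorithm can consume it.
dataset = list(zip(hours_studied, exam_scores))


def fit_line(pairs):
    """Learn a slope and intercept from the dataset by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to an input the algorithm has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Machine learning algorithm: the rule "score = 2 * hours + 50" is never
# written by hand; it is learned from the training dataset.
model = fit_line(dataset)
prediction = predict(model, 5.0)
print(model, prediction)
```

The relationship between hours and scores is recovered from the pairs in `dataset` rather than coded directly, which is the sense in which the algorithm is not "explicitly programmed".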