The act of partitioning available data into two pieces, commonly for cross-validation purposes, is known as data splitting. A portion of the information is used to create a predictive model. one to assess the model's performance, and the other to assess the model's performance
The major goal of dividing the dataset into a validation set is to avoid overfitting, which occurs when a model gets extremely good at categorising samples in the training set but is unable to generalise and make accurate classifications on data it has never seen before.
To detect the behaviour of a machine learning model, we must use observations that were not used during the training phase. The model's judgement would be influenced otherwise.