Namaskar, I am Kushal from LearnVern.
In today's session on machine learning we will see the second part of the interview questions. So let us start with the first of the ten questions to be discussed today.
The first question is to mention the difference between data mining and machine learning.
This is a conceptual question and can be asked at any level, from fresher to experienced. The answer is that data mining is about identifying patterns in data. Machine learning, on the other hand, is about giving a machine the ability to learn automatically, without human intervention: we develop a system or an algorithm so that we do not have to supply the logic explicitly again and again. That is the difference between the two. There is also a relation between them. Data mining is a broad concept, and in place of the algorithms normally used in data mining we can also use machine learning algorithms. So that is the answer to the first question. Let us now move forward to the second question.
Our next question is: what is overfitting in machine learning? In machine learning we have data, and on that data an algorithm applies its logic and creates a function that best fits it. If the function is not a good fit, there are two possibilities: it is either overfit or underfit. In overfitting, the model explores the training data so deeply and absorbs it so completely that it gives the correct output only for that data; for any other data you give it, the output is wrong or essentially random. In this case the model fails to identify the actual relationship between input and output. As an example, suppose I want to predict the performance of each student and I start collecting parameters. One parameter is how many hours they study, the second is whether their parents are literate, then I take a third, a fourth, a fifth, a tenth, a fifteenth, and in total I take thirty parameters.
With so many parameters the chance of overfitting increases manifold, because the algorithm is not able to map input to output correctly. So overfitting can happen when the data is too small, and it can also happen when the features are too many.
In this case, if you use the algorithm to predict on the data it was trained on, the answers will be correct, but on any test data or new input it will give some arbitrary value, and you will not be able to tell whether it has learned any relation internally at all. This is the problem with overfitting.
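To make the symptom concrete, here is a minimal sketch, not part of the original session. It compares a model that only memorizes its training pairs with a simple straight-line fit. The data values, the function names, and the choice of 0.0 as the answer for unseen inputs are all assumptions made for illustration.

```python
# Toy data roughly following y = 2x, with small made-up deviations.
train_pairs = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
test_pairs = [(6, 12.0), (7, 13.9), (8, 16.1)]


def fit_memorizer(pairs):
    """Overfit model: remembers every training pair exactly."""
    table = dict(pairs)
    # Unseen inputs get 0.0 because no general relation was learned (assumption).
    return lambda x: table.get(x, 0.0)


def fit_line(pairs):
    """Simple model: least-squares line y = a*x + b."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def mean_squared_error(model, pairs):
    """Average squared difference between prediction and true output."""
    return sum((model(x) - y) ** 2 for x, y in pairs) / len(pairs)


memorizer = fit_memorizer(train_pairs)
line = fit_line(train_pairs)

print("memorizer train MSE:", mean_squared_error(memorizer, train_pairs))
print("memorizer test MSE: ", mean_squared_error(memorizer, test_pairs))
print("line train MSE:     ", mean_squared_error(line, train_pairs))
print("line test MSE:      ", mean_squared_error(line, test_pairs))
```

The memorizer has zero error on the training data and a very large error on the test data, while the line has a small error on both. That gap between training and test error is the usual sign of overfitting.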
So let us now move to our next question: if there is a problem of overfitting, how can you avoid it? I told you two reasons for this problem, and the first is that the data is very small. If the data is small, you can collect more data and train with that, and the chances of overfitting will reduce. The next option is to use cross-validation techniques. In its simplest form, you set aside twenty, twenty-five, or thirty percent of the data for testing and use the remaining eighty, seventy-five, or seventy percent for training. In other words, you keep some data separate so that you can test your model on it. These are the two techniques you can use.
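What is described above is a single hold-out split. The k-fold form of cross-validation repeats the same idea so that every row is used for testing exactly once. Here is a minimal sketch, assuming ten rows and five folds; the helper name k_fold_indices and the fixed seed are hypothetical choices for illustration.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    folds = [indices[i::k] for i in range(k)]  # k folds of roughly equal size
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_rows=10, k=5))
for train_idx, test_idx in splits:
    print("train:", sorted(train_idx), "test:", sorted(test_idx))
```

In each round the model would be trained on the train indices and scored on the test indices, and the scores averaged over the rounds.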
Now the next question is: what is inductive machine learning? If you are hearing the word inductive for the first time, do not worry. Inductive means learning by examples. Inductive machine learning is the process of learning in which a system tries to induce a general rule from a set of observed instances.
This is exactly what we do in machine learning. We have a data set made up of input and output pairs, X and Y, and with a formula or an algorithm we try to find the relation between input and output. The algorithm induces a general rule, and on the basis of that rule it makes predictions on new data. So inductive learning is involved in many machine learning algorithms.
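As a small illustration of inducing a rule from examples, the sketch below learns a single threshold from made-up pairs of hours studied and pass or fail. The data and the function name induce_threshold are assumptions for illustration only; real algorithms induce much richer rules.

```python
# Hypothetical observed instances: (hours studied, passed?)
examples = [(1, False), (2, False), (3, False), (5, True), (6, True), (8, True)]


def induce_threshold(pairs):
    """Induce the rule 'passed if hours >= t' that agrees with the most examples."""
    xs = sorted(x for x, _ in pairs)
    # Candidate thresholds: midpoints between neighbouring observed inputs.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]

    def correct(t):
        # Number of examples the rule with threshold t classifies correctly.
        return sum((x >= t) == y for x, y in pairs)

    return max(candidates, key=correct)


threshold = induce_threshold(examples)
print("induced rule: passed if hours >=", threshold)
print("prediction for 7 hours:", 7 >= threshold)
print("prediction for 2.5 hours:", 2.5 >= threshold)
```

The rule was never given explicitly. It comes out of the examples and is then used to predict for inputs that were not in the data.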
Now let us move forward towards the next question. This is our fifth question.
What are the five popular algorithms of machine learning? Here you have a lot of choices, and it is better to name the algorithms that you have studied well and keep practicing with. As examples, I will give you five names. First, decision trees, which are tree-based algorithms and include ID3 and CART. Second, neural networks, such as feed-forward networks trained with backpropagation; these are the algorithms of deep learning. Third, probabilistic networks, which are probability based; the Bayesian classifier we discussed belongs here. These also come under supervised learning and are quite efficient. Fourth, nearest neighbor, which is also a supervised algorithm. Here we choose a number of neighbors: if you choose five, we look at the five nearest neighbors of the new data point, and the class that occurs most often among those five is declared as the prediction. Fifth, support vector machines. These work for linear data, and because the kernel function can be changed, they also perform well on polynomial and other non-linear data. So these were five examples of popular algorithms.
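The nearest neighbor vote described above can be sketched in a few lines. This is a minimal illustration, assuming two-dimensional points, Euclidean distance, and made-up class labels; the function name knn_predict is hypothetical.

```python
from collections import Counter
import math


def knn_predict(labeled_points, query, k=5):
    """Classify `query` by majority vote among its k nearest labeled points."""
    by_distance = sorted(labeled_points, key=lambda item: math.dist(item[0], query))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]


# Hypothetical 2-D points with class labels.
labeled_points = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"), ((1.1, 1.3), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"), ((5.1, 5.3), "B"),
]

print(knn_predict(labeled_points, (1.0, 1.2)))
print(knn_predict(labeled_points, (5.0, 5.2)))
```

With k set to five, each query sees four neighbors from its own cluster and one from the other, so the majority class wins.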
Now we move to the sixth question: what are the three stages to build hypotheses or models in machine learning? You could break the process into three, four, or five stages, any split you like is possible, but if exactly three have been asked for, you should answer in a pinpoint way.
The first stage is model building, the second is model testing, and the third is applying the model, which means using it in real time. In model building we create the model object. In model testing we fit and train it, and once training is done we test it. When the test results satisfy us, we start applying the model to real-time data. So these are the three stages for making a hypothesis or a model in machine learning. Now let us go to the next question, which is question number 7.
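Before moving on, the three stages from question six can be sketched with a deliberately tiny model. The class MidpointClassifier, the data, and the 0.9 accuracy bar are all hypothetical choices for illustration, not something the session prescribes.

```python
class MidpointClassifier:
    """Hypothetical 1-D model: predicts 1 when x is above the midpoint of the two class means."""

    def __init__(self):
        self.threshold = None

    def fit(self, xs, ys):
        zeros = [x for x, y in zip(xs, ys) if y == 0]
        ones = [x for x, y in zip(xs, ys) if y == 1]
        self.threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, x):
        return 1 if x > self.threshold else 0


# Stage 1: model building (create the object and train it).
model = MidpointClassifier().fit([1, 2, 3, 7, 8, 9], [0, 0, 0, 1, 1, 1])

# Stage 2: model testing (score on data that was not used for training).
test_xs, test_ys = [2.5, 4, 6, 8.5], [0, 0, 1, 1]
accuracy = sum(model.predict(x) == y for x, y in zip(test_xs, test_ys)) / len(test_xs)
print("test accuracy:", accuracy)

# Stage 3: applying the model (only if the test result is acceptable).
if accuracy >= 0.9:
    print("prediction for new input 7.2:", model.predict(7.2))
```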
What are the training set and the test set? Think of it this way: you are sitting in this discussion and listening to everything, so in a way you are being trained. When you later sit in an interview or a technical discussion, that will be a test for you, because today you sat through the training. In the same way, the algorithm is first given training.
The data that we use for training is called the training set or training data set. When training has finished we test the model, and the data with which we test is called the test set. If we have only one data set, we normally split it: we keep twenty or twenty-five percent of the data aside for testing and use the remaining data for training the algorithm. So this is the difference between the training set and the test set. Now we will move to the next question, which is question number eight.
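Before moving on, here is a minimal sketch of the split just described, assuming twenty rows and a twenty-five percent test share. The helper name hold_out_split and the seed are hypothetical; libraries offer ready-made equivalents.

```python
import random


def hold_out_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of `rows` and hold out `test_fraction` of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # shuffle so the test set is not just the first rows
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))
train_set, test_set = hold_out_split(rows, test_fraction=0.25)
print(len(train_set), "training rows,", len(test_set), "test rows")
```

The two sets never share a row, which is what lets the test set act as an honest check on the trained model.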
Here the question is: what is the difference between artificial intelligence and machine learning? These two are also related to each other, but there is a distinct boundary between them.
Designing and developing algorithms that learn behaviors from empirical data is known as machine learning. Artificial intelligence, in addition to machine learning, also covers other aspects such as knowledge representation, natural language processing, planning, and robotics. So artificial intelligence is broader and machine learning is narrower. Please understand the context in which I say this: artificial intelligence covers the bigger picture, and the complete product that we get is an AI product, whereas machine learning is an ability that can be used in that product to make it perform better. Machine learning is the part that gives the device, controller, or computer its ability to learn. So this is the difference between the two.
Now we will move ahead to the next question, question number nine: what is model selection in machine learning? First of all, understand that a model simply means the algorithm, or whatever name you have given it locally. Model selection means that if you have many algorithms available, you decide which one to use. Model selection is used in machine learning, and it is also used in statistics and data mining, because, as we just discussed, machine learning algorithms are applied in statistics and in data mining as well. So model selection is deciding which model, which logic, which algorithm should run. Now the next question is: what is a perceptron in machine learning?
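Before moving on to the perceptron, here is a minimal sketch of one common way to do model selection: score each candidate on held-out validation data and keep the one with the lowest error. The session does not say which criterion to use, so the mean absolute error, the validation data, and the three hand-written candidates are all assumptions for illustration.

```python
# Hypothetical validation data: (input, true output).
validation = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8)]

# Candidate models (hand-written stand-ins for trained algorithms).
candidates = {
    "constant": lambda x: 6.0,
    "linear": lambda x: 2 * x + 1,
    "quadratic": lambda x: x * x,
}


def validation_error(model):
    """Mean absolute error of `model` on the validation data."""
    return sum(abs(model(x) - y) for x, y in validation) / len(validation)


scores = {name: validation_error(model) for name, model in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print("selected model:", best)
```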
What is a perceptron? A perceptron is a single unit, you can say a unit of execution. It is a supervised learning algorithm for binary classifiers, where a binary classifier is a function that decides whether an input, represented by a vector of numbers, belongs to a particular class or not. Because it is supervised, your data is labeled. The perceptron takes some input from that data and gives some output. Perceptrons are used especially in deep learning, where multi-layer perceptrons are used. So this was our tenth question, and there are many more questions like this which you can note down during the course.
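To make the perceptron from the tenth question concrete, here is a minimal sketch of a single perceptron trained with the classic error-correction rule on the logical AND function. The learning rate, the number of epochs, and the function names are assumptions chosen for illustration.

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Train a single perceptron: weights move only when a prediction is wrong."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation > 0 else 0
            error = target - output  # 0 when correct, +1 or -1 when wrong
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, inputs):
    """Binary decision: 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


# Labeled data for logical AND: output is 1 only when both inputs are 1.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print(predictions)
```

AND is linearly separable, so a single perceptron can learn it. A function such as XOR is not, which is one reason multi-layer perceptrons are used.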
If you have any queries or comments, click the discussion button below the video and post there. This way, you will be able to connect with fellow learners and discuss the course. Our team will also try to solve your query.