Namaskar, I am Kushal from LearnVern.
This tutorial is a continuation of the last one, so let's watch ahead. In this machine learning tutorial we will understand variance and bias, the relation between them, and in what way we should adjust variance and bias. So let's get started and first understand variance.
Variance: when we implement an algorithm on a dataset, we also evaluate it, and for evaluation we keep the data sampled beforehand. After sampling, let us assume we have training data and testing data in several different sets, like the K-fold sampling we learnt; suppose we have ten such sets. How our model differs across these ten sets, that is, how much its behaviour changes from one training set to another, is what we call variance.
Now let us move towards bias. Again, whenever we implement an algorithm, for bias we do not look at the training sets but at the predictor and the target, meaning the input and the output: how strongly the algorithm can build the relation between them. This is how we check bias. So these two, in a way, denote the error, that is, how accurate our model is and how much error it has.
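As a quick aside (a standard result stated here for clarity, not verbatim from the lecture): for squared-error loss, the expected error of a model decomposes as

    expected loss = bias² + variance + irreducible noise

The mlxtend outputs we will see later follow exactly this pattern, with average expected loss equal to average bias plus average variance (for example, 14.096 + 17.440 = 31.536 in the decision tree regressor run below; mlxtend's reported "bias" for MSE is the squared-bias term, so the quantities add up directly, noise aside).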
Now here are some important things that we should understand. If there is high variance, that means overfitting is taking place, and in the same manner, if there is high bias, then underfitting is taking place. So high variance is a problem and high bias is also a problem.
So we want that variance should be low and bias should also be low. This is a challenging task, and we call it a tradeoff, that is to say, how a tradeoff can be established between variance and bias; this we call the bias-variance tradeoff. Now, to implement this we have a library named mlxtend, and this is what we will use.
Now when I was exploring, I noticed that an old version of mlxtend was installed, due to which I first uninstalled the old version of mlxtend and then installed the library again. Here you will see bias_variance_decomp from mlxtend.evaluate; basically, I am using this module. Previously the installed library was version 0.14, which was giving me some issues, and that is why I uninstalled it and installed it again. You will observe that after re-installing, the library version has updated to 0.19. This way I have updated the mlxtend package and installed the latest version.
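For reference, here is a minimal sketch of the upgrade steps in a notebook cell; the lecture does not show the exact commands, only that the old version was removed and the latest one installed.

```python
# Hypothetical upgrade commands for a Colab/Jupyter cell; the lecture only
# shows that the old mlxtend (0.14) was removed and the latest version
# (0.19 at the time) installed, not the exact commands used.
!pip uninstall -y mlxtend
!pip install -U mlxtend
```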
Now there is a request here to restart the runtime, so I will restart the runtime as well. Now let us walk through the code. This is the housing.csv dataset, and on it we will now calculate bias and variance. You will observe that from sklearn.model_selection I have imported train_test_split, along with LinearRegression from linear_model, and from mlxtend.evaluate I have imported the bias_variance_decomp module. We have already implemented these algorithms before, which is why I am not explaining that section of the code. Here we can see that after making the model, we calculate mse, bias and variance, that is, mean squared error, bias and variance, with the help of the bias_variance_decomp function.
Now here we passed the model, X_train, y_train, then X_test and y_test, and because we want to take MSE, in loss we mentioned 'mse'. After this, the number of rounds is 200 and the random seed value we have given as 1. Let us now execute it and see. After executing, we get the output in which the MSE, the mean squared error, is 22.418; the bias is high and the variance is low.
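A minimal sketch of the steps just described, assuming the housing data from housing.csv has already been loaded into NumPy arrays X (features) and y (target); the loading code and the exact split ratio are not shown in the lecture, so those parts are assumptions.

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from mlxtend.evaluate import bias_variance_decomp

# Assumes X and y are NumPy arrays holding the housing features and target.
# The 70/30 split ratio is an assumption; the lecture does not state it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

model = LinearRegression()

# Decompose the test error into bias and variance with the parameters
# named in the lecture: MSE loss, 200 rounds, random seed 1.
mse, bias, var = bias_variance_decomp(
    model, X_train, y_train, X_test, y_test,
    loss='mse', num_rounds=200, random_seed=1)

print('MSE: %.3f' % mse)        # lecture output: 22.418
print('Bias: %.3f' % bias)
print('Variance: %.3f' % var)
```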
So this is the result we got, and we have to strike a balance between bias and variance. Now let us see one more example. That was our example of linear regression; in a similar manner, let us see the example of a decision tree classifier. This is the iris data, which we have practised many times, and on it let us see what bias and variance are attained. Here you can see that the average expected loss is 0.062, average bias is 0.022, and variance is 0.040. Now, if in place of the decision tree classifier, which is a single algorithm, I put a bagging classifier, meaning I use an ensemble technique, then what will happen? Let's watch it once.
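A hedged sketch of the iris example; the 0-1 loss is the natural choice for a classifier, though the lecture does not name the loss or the split parameters explicitly, so those are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from mlxtend.evaluate import bias_variance_decomp

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

tree = DecisionTreeClassifier(random_state=1)

# 0-1 loss decomposition for the single decision tree.
loss, bias, var = bias_variance_decomp(
    tree, X_train, y_train, X_test, y_test,
    loss='0-1_loss', random_seed=1)

# Lecture output: loss 0.062, bias 0.022, variance 0.040
print('Average expected loss: %.3f' % loss)
print('Average bias: %.3f' % bias)
print('Average variance: %.3f' % var)
```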
So we used the bagging classifier and then tried calculating. It is taking some time, so let us have some patience. Our expectation is that the variance in the output will be lower, and with this expectation we will wait for the result; after that we will look at the same example with a decision tree regressor as well.
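A sketch of the same decomposition with a BaggingClassifier wrapped around the decision tree, continuing from the arrays above; the ensemble size n_estimators=100 is an assumption, as the lecture does not state it.

```python
from sklearn.ensemble import BaggingClassifier

# Ensemble of decision trees; bagging averages over bootstrap samples,
# which is expected to reduce the variance component.
bag = BaggingClassifier(DecisionTreeClassifier(random_state=1),
                        n_estimators=100, random_state=1)

loss, bias, var = bias_variance_decomp(
    bag, X_train, y_train, X_test, y_test,
    loss='0-1_loss', random_seed=1)

print('Average expected loss: %.3f' % loss)
print('Average bias: %.3f' % bias)
print('Average variance: %.3f' % var)
```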
So now here we can see the output. Earlier we had an average expected loss of 0.062, and the current average expected loss is 0.048; the average bias was 0.022 earlier and is still 0.022, but our variance has become lower. This is how we try to achieve balance: it is an optimization in which the average bias, instead of increasing, stays at 0.022, while the average variance has reduced. We want bias to be low and variance to also be low. In the same manner, let us now perform this for a decision tree regressor, again on the Boston housing data. Here we can see that the average expected loss is 31.536, which is a high value; the average bias is 14.096 and the average variance is 17.440. Now let us execute this with a bagging regressor and see whether there is an improvement; here also we expect that there may be a reduction in variance. So let us wait for this execution and see what we get in the results. Till the results are attained, let us do a recap: bias and variance both measure the error, and we want variance to be low and bias to also be low. With this knowledge, whenever you calculate bias and variance, you must try to keep them low.
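A hedged sketch of the regression-tree comparison, reusing the housing arrays X_train, y_train, X_test, y_test from the linear regression example; n_estimators=100 for the bagging ensemble is again an assumption.

```python
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

# Single decision tree; lecture output: 31.536 / 14.096 / 17.440.
tree_reg = DecisionTreeRegressor(random_state=1)
loss, bias, var = bias_variance_decomp(
    tree_reg, X_train, y_train, X_test, y_test,
    loss='mse', num_rounds=200, random_seed=1)
print('Tree  -> loss %.3f, bias %.3f, var %.3f' % (loss, bias, var))

# Bagging regressor over the same tree; results discussed below.
bag_reg = BaggingRegressor(DecisionTreeRegressor(random_state=1),
                           n_estimators=100, random_state=1)
loss_b, bias_b, var_b = bias_variance_decomp(
    bag_reg, X_train, y_train, X_test, y_test,
    loss='mse', num_rounds=200, random_seed=1)
print('Bagged -> loss %.3f, bias %.3f, var %.3f' % (loss_b, bias_b, var_b))
```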
So let us now see the output for this. Here we can see that the average expected loss attained is 18.620, whereas earlier the average expected loss was 31.536, so there is a reduction in loss. The average bias is 15.461 against 14.096 earlier, so the average bias has increased slightly, but at the same time the average variance has become 3.159, whereas earlier the average variance was 17.440. So this is how we try to adjust bias and variance.
So today's session ends here, and the parts ahead we will see in the next session. So keep learning, remain motivated. Thank you.
If you have any queries or comments, click the discussion button below the video and post them there. This way, you will be able to connect with fellow learners and discuss the course. Also, our team will try to solve your query.