Thompson Sampling is a method that balances exploration and exploitation to maximise the total reward gained from a task. It is sometimes referred to as Probability Matching or Posterior Sampling.
Thompson Sampling, named after William R. Thompson, is a heuristic for selecting actions in the multi-armed bandit problem that addresses the exploration-exploitation dilemma. At each step it selects the action that maximises the expected reward with respect to a belief drawn at random from the posterior distribution.
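As a concrete sketch, here is Bernoulli Thompson Sampling with Beta posteriors: each round, one belief is sampled per arm from its posterior, the arm with the highest sample is played, and the posterior is updated with the observed reward. The arm probabilities and function name are illustrative, not from the original.

```python
import random

def thompson_sampling(true_probs, n_rounds=10000, seed=0):
    """Bernoulli Thompson Sampling with Beta(1, 1) priors.

    `true_probs` are the (unknown to the agent) success rates of
    each arm; the agent only sees sampled 0/1 rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1] * n_arms  # 1 + successes observed on each arm
    beta = [1] * n_arms   # 1 + failures observed on each arm
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one belief per arm from its Beta posterior,
        # then act greedily with respect to that random belief.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.2, 0.5, 0.7])
```

Over many rounds the posterior for the best arm concentrates, so its samples win most comparisons and it receives the bulk of the pulls, while the other arms are still tried occasionally (exploration).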
Thompson Sampling is better suited to maximising long-term total reward, whereas UCB-1 yields allocations more akin to an A/B test. UCB-1 also behaves more consistently from trial to trial than Thompson Sampling, which incurs greater noise from the random sampling step in the algorithm.
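For comparison, a minimal UCB-1 sketch on the same Bernoulli setting might look like the following (again with illustrative arm probabilities). Note that, unlike Thompson Sampling, the arm choice here is a deterministic function of the observed history, which is why UCB-1 behaves more consistently across trials.

```python
import math
import random

def ucb1(true_probs, n_rounds=10000, seed=0):
    """UCB-1 on Bernoulli arms: play each arm once, then always pick
    the arm with the highest upper confidence bound on its mean."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms
    means = [0.0] * n_arms  # running mean reward per arm
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1  # initialisation: play each arm once
        else:
            # Deterministic choice: empirical mean + exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]),
            )
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts

counts = ucb1([0.2, 0.5, 0.7])
```

The exploration bonus shrinks as an arm is played more, so weaker arms keep receiving a trickle of pulls, which is what gives UCB-1 its more A/B-test-like allocation.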
A variation of Thompson Sampling has also been studied for nonparametric reinforcement learning over a countable class of general stochastic environments, which may be non-Markov, non-ergodic, and partially observable.