Topic 17 Flashcards
(11 cards)
Bayes' Theorem
P(Y|X) = P(X|Y)P(Y) / P(X), where P(Y|X) is the posterior probability (the description given all the available data), P(X|Y) is the likelihood (how well Y predicts the data), P(Y) is the prior probability (based on prior information) and P(X) is the normalising constant
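A quick numeric sketch of the formula, using hypothetical numbers (a 95% likelihood, a 1% prior, and an evidence term of 0.058):

```python
# Worked Bayes' theorem example with hypothetical values.
# posterior = likelihood * prior / normalising constant
p_x_given_y = 0.95   # likelihood P(X|Y)
p_y = 0.01           # prior P(Y)
p_x = 0.058          # normalising constant P(X)

posterior = p_x_given_y * p_y / p_x
print(round(posterior, 3))  # ≈ 0.164
```

Note how a strong likelihood still yields a modest posterior when the prior is small.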
Product Rule for Independent Events
P(X|Y) = P(x1, x2, … , xn|Y) = P(x1|Y)P(x2|Y)…P(xn|Y)
Conditional Probability
The probability that A and B are both true equals the probability that A is true times the probability that B is true given that A is true: P(A and B) = P(A) x P(B|A)
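A tiny numeric check of this rule, using the standard two-aces card draw as a hypothetical example:

```python
# P(A and B) = P(A) * P(B|A): drawing two aces from a 52-card deck
# without replacement.
p_a = 4 / 52          # P(first card is an ace)
p_b_given_a = 3 / 51  # P(second is an ace | first was an ace)

p_a_and_b = p_a * p_b_given_a
print(round(p_a_and_b, 4))  # 12/2652 ≈ 0.0045
```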
Naive Bayes General Formula
P(Y|x1, … , xn) ∝ P(Y) Π i=1,n P(xi|Y)
Y With Maximum Probability
Ypred = argmax_Y P(Y) Π i=1,n P(xi|Y)
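A minimal sketch of the argmax rule, assuming hypothetical pre-computed priors and per-feature likelihood tables for a spam/ham task. Log probabilities turn the product into a sum and avoid underflow:

```python
import math

# Hypothetical class priors and per-word likelihoods (assumed values).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"offer": 0.8, "meeting": 0.1},
    "ham":  {"offer": 0.2, "meeting": 0.7},
}

def predict(features):
    scores = {}
    for y, prior in priors.items():
        score = math.log(prior)          # log P(Y)
        for x in features:
            score += math.log(likelihoods[y][x])  # + log P(xi|Y)
        scores[y] = score
    return max(scores, key=scores.get)   # argmax over classes

print(predict(["offer"]))    # spam wins: 0.4*0.8 > 0.6*0.2
print(predict(["meeting"]))  # ham wins:  0.6*0.7 > 0.4*0.1
```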
Laplace/ Plus One Smoothing
Adding 1 to all occurrences / counts so that no probability is 0, since a single zero factor would zero out the whole Naive Bayes product
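A short sketch with hypothetical word counts, showing how add-one smoothing rescues a word never seen in a class:

```python
# Laplace (add-one) smoothing: without it, an unseen word gets
# probability 0 and zeroes out the whole product.
counts = {"offer": 3, "meeting": 0}  # hypothetical counts in one class
vocab_size = len(counts)
total = sum(counts.values())

def smoothed(word):
    # (count + 1) / (total + vocabulary size)
    return (counts[word] + 1) / (total + vocab_size)

print(smoothed("meeting"))  # (0+1)/(3+2) = 0.2, no longer zero
```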
Multinomial Naive Bayes
Document classification (eg. assigning an article to sports, politics etc or spam vs not spam)
Bernoulli Naive Bayes
Similar to Multinomial, but the features / predictors are Boolean variables (eg. the tennis weather example)
Gaussian Naive Bayes
When features are not discrete but take continuous values (assumed to be sampled from a Gaussian distribution); estimate a normal distribution with some mean and SD for each feature
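A sketch of the per-feature estimation step, using hypothetical training values for one feature in one class:

```python
import math

# Estimate mean and variance for one feature/class, then evaluate
# the normal density as the likelihood P(x|Y).
train = [4.9, 5.1, 5.0, 5.2]  # hypothetical training values
mean = sum(train) / len(train)
var = sum((x - mean) ** 2 for x in train) / len(train)

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Density (not probability) at a test value; densities can exceed 1.
print(gaussian_pdf(5.0, mean, var))
```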
Naive Bayes Advantages
* It is easy and fast to predict the class of a test dataset
* When the independence assumption holds, a Naive Bayes classifier performs well compared to other models
* It performs well with categorical input variables compared to numerical variables
Naive Bayes Disadvantage
Independent predictor assumption (in practice the variables are rarely fully independent)