What Everybody Ought to Know About Naive Bayes Theorem
Naive Bayes is a supervised classification algorithm based on probabilistic logic, and one of the simplest machine learning algorithms of all. Logistic regression is another classification algorithm; it models the posterior probability by learning an input-to-output mapping, creating a discriminative model.
- Conditional Probability
- Independent Events vs. Mutually Exclusive Events
- Bayes Theorem with an Example
- Naive Bayes Algorithm
- Laplace Smoothing
- Implementation of Naive Bayes in scikit-learn
- Pros & Cons
- Summary
Conditional Probability
Conditional probability is a measure of the probability of an event occurring given that another event has already occurred.
Suppose Ramesh wants to play cricket, but he also has work to do. There are two events: event A, that Ramesh plays cricket, and event B, that he has work to do. The question "what is the probability that he plays cricket given that he has work?" is the conditional probability P(A|B).
For example,
Event A is drawing a Queen first, and Event B is drawing a Queen second.
For the first card, the chance of drawing a Queen is 4 out of 52 (there are 4 Queens in a deck of 52 cards):
P(A) = 4/52
But after removing a Queen from the deck, the second card drawn is less likely to be a Queen (only 3 of the 51 remaining cards are Queens):
P(B|A) = 3/51
And so:
P(A ∩ B) = P(A) × P(B|A) = (4/52) × (3/51) = 12/2652 = 1/221
So the chance of drawing 2 Queens is 1 in 221, or about 0.5%.
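The arithmetic above can be double-checked with a short script. This is a sketch: the exact value is 4/52 × 3/51, and a Monte Carlo draw from a simulated deck should land close to it.

```python
from fractions import Fraction
import random

# Exact probability of drawing two Queens without replacement.
p_first = Fraction(4, 52)               # P(A): first card is a Queen
p_second_given_first = Fraction(3, 51)  # P(B|A): second is a Queen given the first was
p_both = p_first * p_second_given_first
print(p_both)  # 1/221

# Monte Carlo check: draw two cards from a 52-card deck many times.
deck = ["Q"] * 4 + ["x"] * 48
random.seed(0)
trials = 200_000
hits = sum(random.sample(deck, 2) == ["Q", "Q"] for _ in range(trials))
print(hits / trials)  # close to 1/221 ≈ 0.0045
```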
Independent Events vs. Mutually Exclusive Events
Two events A and B are independent if and only if they occur without affecting each other; both events can occur simultaneously. In that case:
P(A|B) = P(A)
P(B|A) = P(B)
Suppose there are two dice, D1 and D2. Then:
P(D1 = 6 | D2 = 3) = P(D1 = 6)
There is no relationship between the two occurrences: getting a 6 on D1 has no impact on getting a 3 on D2, so the two events are independent of each other.
Two events are mutually exclusive (or disjoint) if they cannot both occur at the same time.
Suppose the event is getting a 3 on D1; then on that same roll we cannot also get a 5 on D1, because D1 already shows 3.
P(A|B) = P(B|A) = 0
P(A ∩ B) = P(B ∩ A) = 0
This means the two events cannot occur simultaneously.
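Both properties can be verified by enumerating the 36 equally likely outcomes of rolling two dice (a small illustrative check):

```python
from itertools import product

# All 36 equally likely outcomes of rolling two dice D1, D2.
outcomes = list(product(range(1, 7), repeat=2))

# Independence: P(D1=6 | D2=3) should equal P(D1=6).
given_d2_3 = [o for o in outcomes if o[1] == 3]
p_d1_6_given = sum(o[0] == 6 for o in given_d2_3) / len(given_d2_3)
p_d1_6 = sum(o[0] == 6 for o in outcomes) / len(outcomes)
print(p_d1_6_given, p_d1_6)  # both 1/6

# Mutual exclusivity: a single die cannot show 3 and 5 at once.
p_both_faces = sum(o[0] == 3 and o[0] == 5 for o in outcomes) / len(outcomes)
print(p_both_faces)  # 0.0
```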
Bayes Theorem with an Example
Bayes' theorem states:
P(A|B) = P(B|A) P(A) / P(B)
(Image: step-by-step derivation of the Bayes theorem equations.)
Suppose we have 3 machines A1, A2, A3 that produce an item, with the following production probabilities and defect rates:
P(A1) = 0.2
P(A2) = 0.3
P(A3) = 0.5
B represents the event that an item is defective:
P(B|A1) = 0.05
P(B|A2) = 0.03
P(B|A3) = 0.01
If an item is chosen at random, what is the probability that it was produced by Machine 3, given that the item is defective, P(A3|B)?
P(B) = P(B|A1) P(A1) + P(B|A2) P(A2) + P(B|A3) P(A3)
P(B) = (0.05)(0.2) + (0.03)(0.3) + (0.01)(0.5)
P(B) = 0.024
So 2.4% of the factory's total output is defective.
P(A3|B) = P(B|A3) P(A3) / P(B)
= (0.01)(0.50) / 0.024
= 5/24
≈ 0.21
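The same calculation, written out as code (a direct transcription of the law of total probability and Bayes' rule for the three machines):

```python
# Total probability and Bayes' rule for the three-machine example.
priors = {"A1": 0.2, "A2": 0.3, "A3": 0.5}          # P(machine)
defect_rate = {"A1": 0.05, "A2": 0.03, "A3": 0.01}  # P(defective | machine)

# P(B): overall probability that a randomly chosen item is defective.
p_defective = sum(priors[m] * defect_rate[m] for m in priors)
print(round(p_defective, 3))  # 0.024

# P(A3 | B): probability the item came from machine 3 given it is defective.
p_a3_given_defective = defect_rate["A3"] * priors["A3"] / p_defective
print(round(p_a3_given_defective, 2))  # 0.21
```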
Naive Bayes Algorithm
Given a vector x of n features, where each xi is an independent variable, and classes Ck, we want the posterior:
P(Ck | x1, x2, …, xn)
By Bayes' theorem:
P(Ck | x) = P(x | Ck) P(Ck) / P(x)
Suppose we are solving a binary classification problem; then we must find the two conditional probabilities P(C1|x) and P(C2|x). Whichever of the two is higher is taken as the maximum a posteriori prediction.
P(Ck, x1, x2, …, xn) = P(x1, x2, …, xn, Ck)
which can be rewritten as follows, using the chain rule for repeated applications of the definition of conditional probability:
P(x1, x2, …, xn, Ck) = P(x1 | x2, …, xn, Ck) · P(x2, …, xn, Ck)
= P(x1 | x2, …, xn, Ck) · P(x2 | x3, …, xn, Ck) · P(x3, …, xn, Ck)
= …
= P(x1 | x2, …, xn, Ck) · P(x2 | x3, …, xn, Ck) ⋯ P(xn-1 | xn, Ck) · P(xn | Ck) · P(Ck)
Assume that all features in x are mutually independent, conditional on the class Ck:
P(xi | xi+1, …, xn, Ck) = P(xi | Ck)
Thus, the joint model can be expressed as
P(Ck | x1, …, xn) = (1/Z) · P(Ck) · ∏ P(xi | Ck)
where Z = P(x) is a scaling factor dependent only on x1, …, xn, that is, a constant if the values of the feature variables are known.
The discussion so far has derived the independent-feature model, that is, the naive Bayes probability model. The naive Bayes classifier combines this model with a decision rule. One common rule is to pick the hypothesis that is most probable; this is known as the maximum a posteriori, or MAP, decision rule. The corresponding classifier, a Bayes classifier, is the function that assigns the class label with the highest posterior probability.
Example
Say you have 1000 fruits, each of which is either a 'banana', an 'orange' or 'other'. These are the 3 possible classes of the Y variable.
We have data for the following X variables, all of which are binary (1 or 0):
- Long
- Sweet
- Yellow
The first few rows of the training dataset look like this:
(Image: training dataset.)
For the sake of computing the probabilities, let's aggregate the training data to form a counts table like this:
(Image: aggregated counts table.)
So the objective of the classifier is to predict whether a given fruit is a 'Banana', 'Orange' or 'Other' when only the 3 features (long, sweet and yellow) are known.
Let's say you are given a fruit that is long, sweet and yellow: can you predict what fruit it is?
This is the same as predicting Y when only the X variables of the testing data are known. Let's solve it by hand using Naive Bayes.
The idea is to compute 3 probabilities, that is, the probability of the fruit being a banana, an orange, or other. Whichever fruit type gets the highest probability wins.
All the information needed to calculate these probabilities is present in the above tabulation.
Step 1: Compute the 'prior' probabilities for each class of fruit.
That is, the proportion of each fruit class out of all the fruits in the population. You can provide the priors from prior information about the population; otherwise, they can be computed from the training data.
For this case, let's compute them from the training data. Out of 1000 records, you have 500 Bananas, 300 Oranges and 200 Others, so the respective priors are 0.5, 0.3 and 0.2.
P(Y=Banana) = 500 / 1000 = 0.50
P(Y=Orange) = 300 / 1000 = 0.30
P(Y=Other) = 200 / 1000 = 0.20
Step 2: Compute the probability of the evidence that goes in the denominator.
This is the product of the marginal probabilities P(xj) over all features. This step is optional, because the denominator is the same for all the classes and so does not affect which class wins.
P(x1=Long) = 500 / 1000 = 0.50
P(x2=Sweet) = 650 / 1000 = 0.65
P(x3=Yellow) = 800 / 1000 = 0.80
Step 3: Compute the likelihood of the evidence that goes in the numerator.
It is the product of the conditional probabilities of the 3 features. Referring back to the formula, this is P(X1 | Y=k); here X1 is 'Long' and k is 'Banana', i.e. the probability that a fruit is long given that it is a banana. In the above table there are 500 Bananas, and 400 of them are long. So P(Long | Banana) = 400/500 = 0.8.
Here, I have done it for Banana alone.
Likelihood of the evidence for Banana:
P(x1=Long | Y=Banana) = 400 / 500 = 0.80
P(x2=Sweet | Y=Banana) = 350 / 500 = 0.70
P(x3=Yellow | Y=Banana) = 450 / 500 = 0.90
So the overall likelihood of the evidence for Banana = 0.8 × 0.7 × 0.9 = 0.504
Step 4: Substitute all 3 quantities into the Naive Bayes formula to get the probability that the fruit is a banana:
P(Banana | Long, Sweet, Yellow) = 0.504 × 0.5 / (0.5 × 0.65 × 0.80) ≈ 0.97
Similarly, you can compute the probabilities for 'Orange' and 'Other'. The denominator is the same for all 3 cases, so computing it is optional.
Clearly, Banana gets the highest probability, so that will be our predicted class.
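The four steps above can be reproduced in a few lines. Banana's row of the counts table comes from the text; the Orange and Other rows are illustrative assumptions, chosen only so that the column totals stated earlier (500 long, 650 sweet, 800 yellow) add up.

```python
# Counts table for the fruit example.  Banana's feature counts come from the
# text; the Orange and Other rows are assumed for illustration so that the
# column totals (500 long, 650 sweet, 800 yellow) match.
counts = {
    #          total, long, sweet, yellow
    "Banana": (500, 400, 350, 450),
    "Orange": (300,   0, 150, 300),
    "Other":  (200, 100, 150,  50),
}
n_total = 1000

def posterior_numerator(cls):
    total, n_long, n_sweet, n_yellow = counts[cls]
    prior = total / n_total
    # Naive independence: multiply the per-feature conditional probabilities.
    likelihood = (n_long / total) * (n_sweet / total) * (n_yellow / total)
    return prior * likelihood

scores = {cls: posterior_numerator(cls) for cls in counts}
print(scores)                       # Banana ≈ 0.252, Orange = 0.0, Other ≈ 0.019
print(max(scores, key=scores.get))  # Banana
```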
Laplace Smoothing
In statistics, Laplace smoothing is a technique for smoothing categorical data. It is introduced to solve the zero-probability problem, and it also helps deal with overfitting. Suppose a word in a test document is not present anywhere in the training data; then we find P(C="Yes"|Textq) = 0 and P(C="No"|Textq) = 0.
Textq = {w1, w2, w3, w4, W}
The training data contains w1, w2, w3 and w4, but not W. So if we evaluate P(C="Yes"|Textq) or P(C="No"|Textq), we get
P(C="Yes"|Textq) = P(C="No"|Textq) = 0 …………. condition (i)
because P(W|C="Yes") and P(W|C="No") have no probability mass for this new word. With the probability forced to zero, the model is effectively overfitting the training data: it can only identify and classify text that appears in the training data.
If the given dataset is imbalanced, the model underfits and is biased towards the majority class. To overcome this, we can use two different weights for binary classification, giving more weight to the minority class to balance the dataset:
for P(C="Yes") we have λ1
for P(C="No") we have λ2 …………. condition (ii)
To deal with condition (i) and condition (ii), we use Laplace smoothing.
By applying this method, the prior probability and conditional probability can be written as:
P(y = Ck) = (Nk + λ) / (N + Kλ)
P(xj = a | y = Ck) = (Nk,a + λ) / (Nk + Aλ)
where K denotes the number of different values of y and A denotes the number of different values of aj. Usually λ in the formula equals 1.
By applying Laplace smoothing, the prior probability and conditional probability in the previous example can be written accordingly. (Image: smoothed probabilities for the previous example.)
Here λ is a hyper-parameter that trades off overfitting and underfitting.
When the value of λ decreases, the model tends to overfit, because it gives very little probability mass to unseen words or to imbalanced data.
When the value of λ increases, the model tends to underfit.
λ thus acts as a tug-of-war between overfitting and underfitting.
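A minimal sketch of Laplace-smoothed word probabilities (the counts and vocabulary are made up for illustration; only the formula matters):

```python
# Laplace-smoothed conditional probabilities for a toy text problem.
# Training counts for class "Yes"; the test word W is unseen in training.
counts_yes = {"w1": 3, "w2": 2, "w3": 1, "w4": 4}  # illustrative word counts
n_yes = sum(counts_yes.values())                   # total words in "Yes" docs
vocab_size = 5                                     # w1..w4 plus W

def smoothed_p(word, counts, n, lam=1.0):
    # (count + lambda) / (N + lambda * |vocabulary|)
    return (counts.get(word, 0) + lam) / (n + lam * vocab_size)

print(smoothed_p("W", counts_yes, n_yes))           # unseen word: 1/15, not 0
print(smoothed_p("w4", counts_yes, n_yes))          # 5/15
print(smoothed_p("W", counts_yes, n_yes, lam=100))  # large lambda flattens towards uniform
```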
Implementation of Naive Bayes in scikit-learn
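The original post embedded a notebook here. A minimal sketch using scikit-learn's MultinomialNB for spam filtering (assuming scikit-learn is installed; the tiny corpus is invented for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus for spam filtering.
texts = [
    "win money now", "limited offer win prize", "cheap money offer",
    "meeting at noon", "project update attached", "lunch tomorrow today",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# CountVectorizer turns text into word counts; MultinomialNB applies
# Bayes' rule with Laplace smoothing (the alpha parameter, default 1.0).
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(texts, labels)

print(model.predict(["win a cheap prize", "see you at the meeting"]))
```

On this toy data the prediction leans towards spam for the first message and ham for the second, since their tokens appear only in the respective training classes.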
Applications of the Naive Bayes Algorithm
Real-time prediction: As Naive Bayes is super fast, it can be used for making predictions in real time.
Multi-class prediction: The algorithm can predict the posterior probability of multiple classes of the target variable.
Text classification / spam filtering / sentiment analysis: Naive Bayes classifiers are mostly used in text classification (due to their good results on multi-class problems and the independence assumption) and have a higher success rate here than many other algorithms. As a result, they are widely used in spam filtering (identifying spam email) and sentiment analysis (in social media analysis, to identify positive and negative customer sentiment).
Recommendation systems: A Naive Bayes classifier together with algorithms like collaborative filtering makes a recommendation system that uses machine learning and data mining techniques to filter unseen information and predict whether a user would like a given resource.
Pros & Cons
Pros:
- It is easy and fast to predict the class of a test data set, and it also performs well in multi-class prediction.
- When the assumption of independence holds, a Naive Bayes classifier performs better than other models such as logistic regression, and you need less training data.
- It performs well with categorical input variables compared to numerical variables. For numerical variables, a Gaussian (normal) distribution is assumed (a bell curve, which is a strong assumption).
Cons:
- If a categorical variable has a category in the test data set that was not observed in the training data set, the model will assign it a 0 (zero) probability and will be unable to make a prediction. This is often known as "zero frequency". To solve it we can use a smoothing technique; one of the simplest is Laplace estimation.
- On the other side, naive Bayes is also known as a bad estimator, so the probability outputs from predict_proba are not to be taken too seriously.
- Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
Summary
In this blog, you learned:
- What conditional probability is, and the different types of events used in Bayes' theorem.
- How Bayes' theorem is applied in the Naive Bayes algorithm.
- How the Naive Bayes algorithm deals with overfitting and underfitting.
- How to implement the algorithm with scikit-learn.
- The applications of the Naive Bayes algorithm.
- The pros and cons of the algorithm.
Translated from: https://medium.com/@jeevansinghchauhan247/what-everybody-ought-to-know-about-naive-bayes-theorem-51a9673ef226