Guide to Classification on Imbalanced Datasets


Balance within the imbalance to balance what’s imbalanced — Amadou Jarou Bah


Disclaimer: This is a comprehensive tutorial on handling imbalanced datasets. Whilst these approaches remain valid for multiclass classification, the main focus of this article will be on binary classification for simplicity.


Introduction

As any seasoned data scientist or statistician will be aware of, datasets are rarely distributed evenly across attributes of interest. Let’s imagine we are tasked with discovering fraudulent credit card transactions — naturally, the vast majority of these transactions will be legitimate, and only a very small proportion will be fraudulent. Similarly, if we are testing individuals for cancer, or for the presence of a virus (COVID-19 included), the positive rate will (hopefully) be only a small fraction of those tested. More examples include:


  • An e-commerce company predicting which users will buy items on their platform
  • A manufacturing company analyzing produced materials for defects
  • Spam email filtering trying to differentiate ‘ham’ from ‘spam’
  • Intrusion detection systems examining network traffic for malware signatures or atypical port activity
  • Companies predicting churn rates amongst their customers
  • Number of clients who closed a specific account in a bank or financial organization
  • Prediction of telecommunications equipment failures
  • Detection of oil spills from satellite images
  • Insurance risk modeling
  • Hardware fault detection

One usually has far fewer data points from the adverse class. This is unfortunate, as we care a lot about avoiding misclassifying elements of this class.


In actual fact, it is pretty rare to have perfectly balanced data in classification tasks. Oftentimes the items we are interested in analyzing are inherently ‘rare’ events, which by their very rarity are difficult to predict. This presents a curious problem for aspiring data scientists, since many data science programs do not properly address how to handle imbalanced datasets despite their prevalence in industry.


When does a dataset become ‘imbalanced’?

The notion of an imbalanced dataset is a somewhat vague one. Generally, a dataset for binary classification with a 49–51 split between the two variables would not be considered imbalanced. However, if we have a dataset with a 90–10 split, it seems obvious to us that this is an imbalanced dataset. Clearly, the boundary for imbalanced data lies somewhere between these two extremes.


In some sense, the term ‘imbalanced’ is a subjective one and it is left to the discretion of the data scientist. In general, a dataset is considered to be imbalanced when standard classification algorithms — which are inherently biased towards the majority class (further details in a previous article) — return suboptimal solutions because of that bias. A data scientist may look at a 45–55 split dataset and judge that this is close enough that measures do not need to be taken to correct for the imbalance. However, the more imbalanced the dataset becomes, the greater the need to correct for this imbalance.


In a concept-learning problem, the data set is said to present a class imbalance if it contains many more examples of one class than the other.


As a result, these classifiers tend to ignore small classes while concentrating on classifying the large ones accurately.


Imagine you are working for Netflix and are tasked with determining which customers will churn (a customer ‘churning’ means they will stop using your services or using your products).


In an ideal world (at least for the data scientist), our training and testing datasets would be close to fully balanced, with around 50% of the dataset containing individuals that will churn and 50% who will not. In this case, a 90% accuracy will more or less indicate a 90% accuracy on both the positively and negatively classed groups. Our errors will be evenly split across both groups. In addition, we have roughly the same number of points in both classes, which, by the law of large numbers, reduces the overall variance within each class. This is great for us: accuracy is an informative metric in this situation and we can continue with our analysis unimpeded.


A dataset with an even 50–50 split across the binary response variable. There is no majority class in this example.

As you may have suspected, most people that already pay for Netflix don't have a 50% chance of stopping their subscription every month. In fact, the percentage of people that will churn is rather small, closer to a 90–10 split. How does the presence of this dataset imbalance complicate matters?


Assuming a 90–10 split, we now have a very different data story to tell. Giving this data to an algorithm without any further consideration will likely result in an accuracy close to 90%. This seems pretty good, right? It’s about the same as what we got previously. If you try putting this model into production your boss will probably not be so happy.


An imbalanced dataset with a 90–10 split. False positives will be much larger than false negatives. Variance in the minority set will be larger due to fewer data points. The majority class will dominate algorithmic predictions without any correction for imbalance.

Given the prevalence of the majority class (the 90% class), our algorithm will likely regress to a prediction of the majority class. The algorithm can pretty closely maximize its accuracy (our scoring metric of choice) by arbitrarily predicting that the majority class occurs every time. This is a trivial result and provides close to zero predictive power.


(Left) A balanced dataset with the same number of items in the positive and negative class; the number of false positives and false negatives in this scenario are roughly equivalent and result in little classification bias. (Right) An imbalanced dataset with around 5% of samples being in the negative class and 95% of samples being in the positive class (this could be the number of people that pay for Netflix that decide to quit during the next payment cycle).

Predictive accuracy, a popular choice for evaluating the performance of a classifier, might not be appropriate when the data is imbalanced and/or the costs of different errors vary markedly.

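To see this concretely, here is a minimal sketch with sklearn on a hypothetical synthetic 90–10 dataset: a trivial classifier that always predicts the majority class already scores close to 90% accuracy while having zero predictive power.

from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic 90-10 imbalanced binary dataset
X, y = make_classification(n_samples=10000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# A 'classifier' that always predicts the majority class
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print(accuracy_score(y_test, baseline.predict(X_test)))  # ~0.9, yet zero predictive power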

Visually, this dataset might look something like this:


Machine learning algorithms by default assume that data is balanced. In classification, this corresponds to a comparable number of instances of each class. Classifiers learn better from a balanced distribution. It is up to the data scientist to correct for imbalances, which can be done in multiple ways.


Different Types of Imbalance

We have clearly shown that imbalanced datasets pose some additional challenges compared to standard datasets. To further complicate matters, there are different types of imbalance that can occur in a dataset.


(1) Between-Class


A between-class imbalance occurs when there is an imbalance in the number of data points contained within each class. An example of this is shown below:


An illustration of between-class imbalance. We have a large number of data points for the red class but relatively few for the white class.

An example of this would be a mammography dataset, which uses images known as mammograms to predict breast cancer. Consider the number of mammograms related to positive and negative cancer diagnoses:


The vast majority of samples (>90%) are negative, whilst relatively few (<10%) are positive.

Note that given enough data samples in both classes the accuracy will improve as the sampling distribution is more representative of the data distribution, but by virtue of the law of large numbers, the majority class will have inherently better representation than the minority class.


(2) Within-Class


A within-class imbalance occurs when the dataset has balanced between-class data but one of the classes is not well represented in some regions. An example of this is shown below:


An illustration of within-class imbalance. We have a large number of data points for both classes but the number of data points in the white class in the top left corner is very sparse, which can result in similar complications as between-class imbalance for predictions in those regions.

(3) Intrinsic and Extrinsic


An intrinsic imbalance is due to the nature of the dataset, while extrinsic imbalance is related to time, storage, and other factors that limit the dataset or the data analysis. Intrinsic characteristics are relatively simple and are what we commonly see, but extrinsic imbalance can exist separately and can also work to increase the imbalance of a dataset.


For example, companies often use intrusion detection systems that analyze packets of data sent in and out of networks in order to detect malware or malicious activity. Depending on whether you analyze all data or just data sent through specific ports or specific devices, this will significantly influence the imbalance of the dataset (most network traffic is likely legitimate). Similarly, if log files or data packets related to suspected malicious behavior are commonly stored but normal logs are not (or only a select few types are stored), then this can also influence the imbalance of the dataset. Likewise, if logs were only stored during a normal working day (say, 9 AM–5 PM) instead of 24 hours, this will also affect the imbalance.


Further Complication of Imbalance

There are a couple more difficulties introduced by imbalanced datasets. Firstly, we have class overlapping. This is not always a problem, but it can often arise in imbalanced learning problems and cause headaches. Class overlapping is illustrated in the below dataset.


Example of class overlapping. Some of the positive data points (stars) are intermixed with the negative data points (circles), which would lead an algorithm to construct an imperfect decision boundary.

Class overlapping occurs in normal classification problems, so what is the additional issue here? Well, the class more represented in overlap regions tends to be better classified by methods based on global learning (on the full dataset). This is because the algorithm is able to get a more informed picture of the data distribution of the majority class.


In contrast, the class less represented in such regions tends to be better classified by local methods. If we take k-NN as an example, as the value of k increases the method becomes increasingly global, and as k decreases it becomes increasingly local. It can be shown that performance at low values of k is better on the minority class, and worse at high values of k. This shift in accuracy is not exhibited for the majority class because it is well-represented at all points.


This suggests that local methods may be better suited for studying the minority class. One method to correct for this is the CBO Method. The CBO Method uses cluster-based resampling to identify ‘rare’ cases and resample them individually, so as to avoid the creation of small disjuncts in the learned hypothesis. This is a method of oversampling — a topic that we will discuss in detail in the following section.


CBO Method. Once the training examples of each class have been clustered, oversampling starts. In the majority class, all the clusters, except for the largest one, are randomly oversampled so as to get the same number of training examples as the largest cluster.

Correcting Dataset Imbalance

There are several techniques to control for dataset imbalance. There are two main types of techniques to handle imbalanced datasets: sampling methods, and cost-sensitive methods.


The simplest and most commonly used of these are sampling methods called oversampling and undersampling, which we will go into more detail on.


Oversampling/Undersampling


Simply stated, oversampling involves generating new data points for the minority class, and undersampling involves removing data points from the majority class. This acts to somewhat reduce the extent of the imbalance in the dataset.


What does undersampling look like? We continually remove like-samples in close proximity until both classes have the same number of data points.


Undersampling. Imagine you are analysing a dataset for fraudulent transactions. Most of the transactions are not fraudulent, creating a fundamentally imbalanced dataset. In the scenario of undersampling, we will take fewer samples from the majority class to help reduce the extent of this imbalance.

Is undersampling a good idea? Undersampling is recommended by many statistical researchers but is only good if enough data points are available on the undersampled class. Also, since the majority class will end up with the same number of points as the minority class, the statistical properties of the distributions will become ‘looser’ in a sense. However, we have not artificially distorted the data distribution with this method by adding in artificial data points.


Illustration of undersampling. Like-samples in close proximity are removed in an attempt to increase the sparsity of the data distribution.
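As a minimal sketch, undersampling can be done with the imbalanced-learn package (assumed installed); note that its RandomUnderSampler drops majority samples at random rather than by proximity. Reusing the hypothetical X, y from the earlier sketch:

from collections import Counter
from imblearn.under_sampling import RandomUnderSampler

# Randomly discard majority-class samples until the classes are balanced
X_under, y_under = RandomUnderSampler(random_state=0).fit_resample(X, y)
print(Counter(y_under))  # both classes now have the same number of points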

What does oversampling look like? In short, the opposite of undersampling. We artificially add data points to our dataset to make the number of instances in each class balanced.


Oversampling. In the scenario of oversampling, we will oversample from the minority class to help reduce the extent of this imbalance.

How do we generate these samples? The most common way is to generate points that are close in dataspace proximity to existing samples or are ‘between’ two samples, as illustrated below.


Illustration of oversampling.

As you may have suspected, there are some downsides to adding artificial data points. Firstly, you risk overfitting, especially if synthetic points are generated around points that are noise — you end up exacerbating this noise by adding reinforced measurements. In addition, adding these values randomly can also contribute additional noise to our model.


SMOTE (Synthetic minority oversampling technique)


Luckily for us, we don’t have to write an algorithm for randomly generating data points for the purpose of oversampling. Instead, we can use the SMOTE algorithm.


How does SMOTE work? SMOTE generates new samples in between existing data points based on their local density and their borders with the other class. Not only does it perform oversampling, but it can subsequently use cleaning techniques (undersampling — more on this shortly) to remove redundancy at the end. Below is an illustration of how SMOTE works when studying class data.


An illustration of how SMOTE functions. The instance on the left is isolated and is thus considered noise by the algorithm. No additional data points are generated in its proximity, or, if they are, they will be in very close proximity to the singular point. The two clusters in the center and right have several data points, indicating that it is less likely that these points correspond to random noise. Thus, a larger cluster (empirical data distribution) can be drawn by the algorithm from which additional samples can be generated.

The algorithm for SMOTE is as follows. For each minority sample:


– Find its k-nearest minority neighbours
– Randomly select j of these neighbours
– Randomly generate synthetic samples along the lines joining the minority sample and its j selected neighbours (j depends on the amount of oversampling desired)
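In practice you rarely implement this by hand; a minimal sketch using the SMOTE implementation from the imbalanced-learn package (again on the hypothetical X, y from earlier) looks like this:

from collections import Counter
from imblearn.over_sampling import SMOTE

# k_neighbors is the k from the algorithm above
X_smote, y_smote = SMOTE(k_neighbors=5, random_state=0).fit_resample(X, y)
print(Counter(y_smote))  # minority class oversampled up to the majority count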

Informed vs. Random Oversampling


Using random oversampling (with replacement) of the minority class has the effect of making the decision region for the minority class very specific. In a decision tree, it would cause a new split and often lead to overfitting. SMOTE’s informed oversampling generalizes the decision region for the minority class. As a result, larger and less specific regions are learned, thus paying attention to minority class samples without causing overfitting.


Drawbacks of SMOTE


Overgeneralization. SMOTE’s procedure can be dangerous since it blindly generalizes the minority area without regard to the majority class. This strategy is particularly problematic in the case of highly skewed class distributions since, in such cases, the minority class is very sparse with respect to the majority class, thus resulting in a greater chance of class mixture.


Inflexibility. The number of synthetic samples generated by SMOTE is fixed in advance, thus not allowing for any flexibility in the re-balancing rate.


Another potential issue is that SMOTE might introduce artificial minority class examples too deeply into the majority class space. This drawback can be resolved by hybridization: combining SMOTE with undersampling algorithms. One of the most famous of these is Tomek Links. Tomek links are pairs of instances of opposite classes that are each other’s nearest neighbors. In other words, they are pairs of opposing instances that are very close together.


Tomek’s algorithm looks for such pairs and removes the majority instance of the pair. The idea is to clarify the border between the minority and majority classes, making the minority region(s) more distinct. Scikit-learn has no built-in modules for doing this, though there are some independent packages (e.g., TomekLink, imbalanced-learn).


Thus, Tomek’s algorithm is an undersampling technique that acts as a data cleaning method for SMOTE to regulate against redundancy. As you may have suspected, there are many additional undersampling techniques that can be combined with SMOTE to perform the same function. A comprehensive list of these functions can be found in the functions section of the imbalanced-learn documentation.

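As a sketch of this hybridization, imbalanced-learn provides a combined sampler that applies SMOTE and then removes Tomek links:

from imblearn.combine import SMOTETomek

# Oversample with SMOTE, then clean the class boundary by removing Tomek links
X_res, y_res = SMOTETomek(random_state=0).fit_resample(X, y)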

An additional example is Edited Nearest Neighbors (ENN). ENN removes any example whose class label differs from the classes of at least two of its neighbors. ENN removes more examples than Tomek links does and can also remove examples from both classes.


Other more nuanced versions of SMOTE include Borderline SMOTE, SVMSMOTE, and KMeansSMOTE, and more nuanced versions of the undersampling techniques applied in concert with SMOTE are Condensed Nearest Neighbor (CNN), Repeated Edited Nearest Neighbor, and Instance Hardness Threshold.


Cost-Sensitive Learning

We have discussed sampling techniques and are now ready to discuss cost-sensitive learning. In many ways, the two approaches are analogous — the main difference being that in cost-sensitive learning we perform under- and over-sampling by altering the relative weighting of individual samples.


Upweighting. Upweighting is analogous to oversampling and works by increasing the weight of one of the classes while keeping the weight of the other class at one.


Down-weighting. Down-weighting is analogous to undersampling and works by decreasing the weight of one of the classes while keeping the weight of the other class at one.


An example of how this can be performed using sklearn is via the sklearn.utils.class_weight module; the resulting weights can be applied to any sklearn classifier that accepts them (and within Keras):


import numpy as np
from sklearn.utils import class_weight
weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y_train), y=y_train)
class_weights = dict(enumerate(weights))  # Keras expects a {class index: weight} dict
model.fit(X_train, y_train, class_weight=class_weights)

In this case, we have set the weighting to be ‘balanced’, meaning that the classes receive weights inversely proportional to their relative number of points — this is what I would recommend unless you have a good reason for setting the values yourself. If you have three classes and want to weight one of them 10x higher and another 20x higher (because there are 10x and 20x fewer of these points in the dataset than the majority class), then we can rewrite this as:


class_weight = {0: 0.1, 1: 1.0, 2: 2.0}
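A dictionary like this can then be passed directly to estimators that accept a class_weight parameter — for example, a sketch with a random forest:

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(class_weight=class_weight, random_state=0).fit(X_train, y_train)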

Some authors claim that cost-sensitive learning is slightly more effective than random or directed over- or under-sampling, although all approaches are helpful, and directed oversampling is close to cost-sensitive learning in efficacy. Personally, when I am working on a machine learning problem I will use cost-sensitive learning because it is much simpler to implement and communicate to others. However, there may be additional aspects of using sampling techniques that provide superior results of which I am not aware.


Assessment Metrics

In this section, I outline several metrics that can be used to analyze the performance of a classifier trained to solve a binary classification problem. These include (1) the confusion matrix, (2) binary classification metrics, (3) the receiver operating characteristic curve, and (4) the precision-recall curve.


Confusion Matrix

Despite what you may have garnered from its name, a confusion matrix is decidedly unconfusing. A confusion matrix is the most basic form of assessment of a binary classifier. Given the prediction outputs of our classifier and the true response variable, a confusion matrix tells us how many of our predictions are correct for each class, and how many are incorrect. The confusion matrix provides a simple visualization of the performance of a classifier based on these factors.


Here is an example of a confusion matrix:


Hopefully what this is showing is relatively clear. The TP cell tells us the number of true positives: the number of positive samples that I predicted were positive.

The TN cell tells us the number of true negatives: the number of negative samples that I predicted were negative.

The FP cell tells us the number of false positives: the number of negative samples that I predicted were positive.

The FN cell tells us the number of false negatives: the number of positive samples that I predicted were negative.

These numbers are very important as they form the basis of the binary classification metrics discussed next.

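Before moving on, a minimal sketch of extracting these four counts with sklearn (y_pred being hypothetical model predictions on binary 0/1 labels):

from sklearn.metrics import confusion_matrix

# For binary labels, ravel() returns the cells in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()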

Binary Classification Metrics

There are a plethora of single-value metrics for binary classification. As such, only a few of the most commonly used ones and their different formulations are presented here; more details on scoring metrics, and on their relation to confusion matrices and ROC curves (discussed in the next section), can be found in the sklearn documentation.


Arguably the most important five metrics for binary classification are: (1) precision, (2) recall, (3) F1 score, (4) accuracy, and (5) specificity.

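For reference, in terms of the confusion-matrix counts from the previous section, these are defined as:

precision   = TP / (TP + FP)
recall      = TP / (TP + FN)
F1          = 2 × precision × recall / (precision + recall)
accuracy    = (TP + TN) / (TP + TN + FP + FN)
specificity = TN / (TN + FP)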

Precision. Precision provides us with the answer to the question “Of all my positive predictions, what proportion of them are correct?”. If you have an algorithm that predicts all of the positive class correctly but also has a large portion of false positives, the precision will be small. It makes sense why this is called precision since it is a measure of how ‘precise’ our predictions are.


Recall. Recall provides us with the answer to a different question: “Of all of the positive samples, what proportion did I predict correctly?”. Instead of false positives, we are now interested in false negatives. These are items that our algorithm missed, and they are often the most egregious errors (e.g. failing to diagnose someone who actually has cancer, failing to discover malware when it is present, or failing to spot a defective item). The name ‘recall’ also makes sense in this circumstance, as we are seeing how many of the samples the algorithm was able to pick up on.


It should be clear that these questions, whilst related, are substantially different to each other. It is possible to have a very high precision and simultaneously have a low recall, and vice versa. For example, if you predicted the majority class every time, you would have 100% recall on the majority class, but you would then get a lot of false positives from the minority class.


One other important point to make is that precision and recall can be determined for each individual class. That is, we can talk about the precision of class A, or the precision of class B, and they will have different values — when doing this, we assume that the class we are interested in is the positive class, regardless of its numeric value.



F1 Score. The F1 score is a single-value metric that combines precision and recall by using the harmonic mean (a fancy type of averaging). In its general form, the Fβ score, the β parameter is a strictly positive value that is used to describe the relative importance of recall to precision. A larger β value puts a higher emphasis on recall than precision, whilst a smaller value puts less emphasis. If the value is 1, precision and recall are treated with equal weighting.

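The accompanying formula image did not survive here; for reference, the standard definition is:

Fβ = (1 + β²) × precision × recall / (β² × precision + recall)

with the F1 score being the special case β = 1.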

What does a high F1 score mean? It suggests that both the precision and recall have high values — this is good and is what you would hope to see upon generating a well-functioning classification model on an imbalanced dataset. A low value indicates that either precision or recall is low, and may be a cause for concern. Good F1 scores are generally lower than good accuracies (in many situations, an F1 score of 0.5 would be considered pretty good, such as when predicting breast cancer from mammograms).


Specificity. Simply stated, specificity is the recall of the negative class. It answers the question “Of all of the negative samples, what proportion did I predict correctly?”. This may be important in situations where examining the relative proportion of false positives is necessary.


Macro, Micro, and Weighted Scores


This is where things get a little complicated. Anyone who has delved into these metrics on sklearn may have noticed that we can refer to the recall-macro or f1-weighted score.


A macro-F1 score is the average of F1 scores across each class.


This is most useful if we have many classes and we are interested in the average F1 score for each class. If you only care about the F1 score for one class, you probably won’t need a macro-F1 score.


A micro-F1 score takes all of the true positives, false positives, and false negatives from all the classes and calculates the F1 score.


The micro-F1 score is pretty similar in utility to the macro-F1 score, as it gives an aggregate performance of a classifier over multiple classes. That being said, they will give different results, and understanding the underlying difference between these results may be informative for a given application.


A weighted-F1 score is the same as the macro-F1 score, but each of the class-specific F1 scores is scaled by the relative number of samples from that class.

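The accompanying formula image is also missing; as a sketch, for classes c:

weighted-F1 = Σ_c N_c × F1_c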

In this case, N refers to the proportion of samples in the dataset belonging to a single class. For class A, where class A is the majority class, this might be equal to 0.8 (80%). The values for B and C might be 0.15 and 0.05, respectively.


For a highly imbalanced dataset, a large weighted-F1 score might be somewhat misleading because it is overly influenced by the majority class.

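In sklearn these variants are selected via the average parameter — a sketch, again with hypothetical y_test and y_pred:

from sklearn.metrics import classification_report, f1_score

print(f1_score(y_test, y_pred, average='macro'))     # unweighted mean of per-class F1 scores
print(f1_score(y_test, y_pred, average='micro'))     # F1 from pooled TP/FP/FN counts
print(f1_score(y_test, y_pred, average='weighted'))  # per-class F1 weighted by class support
print(classification_report(y_test, y_pred))         # per-class scores plus all three averages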

Other Metrics


Some other metrics that you may see around, and that can be informative for binary classification (and multiclass classification to some extent), are:


Accuracy. If you are reading this, I would imagine you are already familiar with accuracy, but perhaps not so familiar with the others. Cast in terms of the confusion matrix, the accuracy can be described as the ratio of correct predictions (positive and negative) to the total number of positive and negative samples.


G-Mean. A less common metric that is somewhat analogous to the F1 score is the G-Mean. This is often cast in two different formulations, the first being the precision-recall g-mean, and the second being the sensitivity-specificity g-mean. They can be used in a similar manner to the F1 score in terms of analyzing algorithmic performance. The precision-recall g-mean can also be referred to as the Fowlkes-Mallows Index.

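For reference, the two formulations are:

precision-recall g-mean        = √(precision × recall)
sensitivity-specificity g-mean = √(sensitivity × specificity)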

There are many other metrics that can be used, but most have specialized use cases and offer little additional utility over the metrics described here. Other metrics the reader may be interested in viewing are balanced accuracy, Matthews correlation coefficient, markedness, and informedness.


Receiver Operating Characteristic (ROC) Curve


An ROC curve is a two-dimensional graph that depicts the trade-off between benefits (true positives) and costs (false positives). It displays the relation between sensitivity and specificity for a given classifier (a binary, parameterized, or score-based classifier).


Here is an example of an ROC curve.


There is a lot to unpack here. Firstly, the dotted line through the center corresponds to a classifier that acts as a ‘coin flip’. That is, it is correct roughly 50% of the time and is the worst possible classifier (we are just guessing). This acts as our baseline, against which we can compare all other classifiers — these classifiers should be closer to the top left corner of the plot since we want high true positive rates in all cases.


It should be noted that an ROC curve does not assess a group of classifiers. Rather, it examines a single classifier over a set of classification thresholds.


What does this mean? It means that for one point, I take my classifier and set the threshold to be 0.3 (30% propensity) and then assess the true positive and false positive rates.


True Positive Rate: Percentage of true positives (to the sum of true positives and false negatives) generated by the combination of a specific classifier and classification threshold.


False Positive Rate: Percentage of false positives (to the sum of false positives and true negatives) generated by the combination of a specific classifier and classification threshold.


This gives me two numbers, which I can then plot on the curve. I then take another threshold, say 0.4, and repeat this process. After doing this for every threshold of interest (perhaps in 0.1, 0.01, or 0.001 increments), we have constructed an ROC curve for this classifier.


An example ROC curve showing how an individual point is plotted. A classifier is selected along with a classification threshold. Following this, the true positive rate and false positive rate for this combination of classifier and threshold are calculated and subsequently plotted.
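sklearn performs this threshold sweep for us; a minimal sketch, assuming y_score holds hypothetical predicted probabilities for the positive class:

from sklearn.metrics import roc_auc_score, roc_curve

fpr, tpr, thresholds = roc_curve(y_test, y_score)  # one (FPR, TPR) point per threshold
print(roc_auc_score(y_test, y_score))              # area under the resulting curve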

What is the point of doing this? Depending on your application, you may be very averse to false positives, as they may be very costly (e.g. launches of nuclear missiles), and thus would like a classifier that has a very low false-positive rate. Conversely, you may not care so much about having a high false positive rate as long as you get a high true positive rate (stopping most fraud events may be worth it even if you have to check many more occurrences that the algorithm flags). For the optimal balance between these two rates (where false positives and false negatives are equally costly), we would take the classification threshold which results in the minimum diagonal distance from the top left corner.


Why does the top left corner correspond to the ideal classifier? The ideal point on the ROC curve would be (0, 100%) — that is, all positive examples are classified correctly and no negative examples are misclassified as positive. In a perfect classifier, there would be no misclassification!


Whilst a graph may not seem particularly useful in itself, it is helpful in comparing classifiers. One particular metric, the Area Under Curve (AUC) score, allows us to compare classifiers by comparing the total area underneath the line produced on the ROC curve. For an ideal classifier, the AUC equals 1, since the curve spans the full range (1.0) of the true positive rate across the full range (1.0) of the false-positive rate. If a particular classifier has an AUC of 0.6 and another has an AUC of 0.8, the latter is clearly a better classifier. The AUC has the benefit that it is independent of the decision criteria — the classification threshold — and thus makes it easier to compare these classifiers.


A question may have come to mind now — what if some classifiers are better at lower thresholds and some are better at higher thresholds? This is where the ROC convex hull comes in. The convex hull provides us with a method of identifying potentially optimal classifiers — even though we may not have directly observed them, we can infer their existence. Consider the following diagram:


Source: Quora

Given a family of ROC curves, the ROC convex hull can include points that are more towards the top left corner (perfect classifier) of the ROC space. If a line passes through a point on the convex hull, then there is no other line with the same slope passing through another point with a larger true positive intercept. Thus, the classifier at that point is optimal under any distribution assumptions in tandem with that slope. This is perhaps easier to understand after examining the image.


How does undersampling/oversampling influence the ROC curve? A famous paper on SMOTE (discussed previously) titled “SMOTE: Synthetic Minority Over-sampling Technique” outlines that by undersampling the majority class, we force the ROC curve to move up and to the right, and thus has the potential to increase the AUC of a given classifier (this is essentially just validation that SMOTE functions correctly, as expected). Similarly, oversampling the minority class has a similar impact.


Source: Researchgate

Precision-Recall (PR) Curves


An analogous diagram to an ROC curve can be recast from ROC space and reformulated into PR space. These diagrams are in many ways analogous to the ROC curve, but instead of plotting recall against fallout (true positive rate vs. false positive rate), we are instead plotting precision against recall. This produces a somewhat mirror-image (the curve itself will look somewhat different) of the ROC curve, in the sense that the top right corner of a PR curve designates the ideal classifier. This can often be more understandable than an ROC curve but provides very similar information. The area under a PR curve is often called the average precision (or mAP, when averaged over multiple classes) and is analogous to the AUC in ROC space.


Source: Researchgate — Ten quick tips for machine learning in computational biology
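The corresponding sklearn sketch, reusing the hypothetical y_test and y_score from the ROC example:

from sklearn.metrics import average_precision_score, precision_recall_curve

precision, recall, thresholds = precision_recall_curve(y_test, y_score)
print(average_precision_score(y_test, y_score))  # area under the PR curve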

Final Comments

Imbalanced datasets are underrepresented (no pun intended) in many data science programs contrary to their prevalence and importance in many industrial machine learning applications. It is the job of the data scientist to be able to recognize when a dataset is imbalanced and follow procedures and utilize metrics that allow this imbalance to be sufficiently understood and controlled.


I hope that in the course of reading this article you have learned something about dealing with imbalanced datasets and will in the future be comfortable in the face of such imbalanced problems. If you are a serious data scientist, it is only a matter of time before one of these applications pops up!


Newsletter

For updates on new blog posts and extra content, sign up for my newsletter.


Original article: https://towardsdatascience.com/guide-to-classification-on-imbalanced-datasets-d6653aa5fa23

色在线观看网站 | 天天做天天爱天天爽综合网 | 99热这里精品 | 激情喷水 | 99精品系列 | 免费网站看v片在线a | 人人舔人人舔 | 免费三级网 | 久久成人国产精品一区二区 | 丁香五月网久久综合 | 久久国精品 | 欧美在线观看视频一区二区三区 | 天天鲁一鲁摸一摸爽一爽 | 91超级碰| 婷婷av资源 | 国产亚洲精品久久久网站好莱 | 日韩成人在线一区二区 | 国产流白浆高潮在线观看 | 97香蕉久久超级碰碰高清版 | 久久99久久99精品免观看粉嫩 | 国产麻豆精品传媒av国产下载 | 国产精品毛片一区二区三区 | 在线免费观看一区二区三区 | 色99视频 | 久久黄色a级片 | 丁香导航 | 成人四虎 | www.色综合.com | 国产精品久久久久久一区二区三区 | 在线观看av网 | 久久久久久久久久久久国产精品 | 91精品婷婷国产综合久久蝌蚪 | 免费试看一区 | 一区二区三区动漫 | 国产97视频 | 欧美日韩伦理在线 | 日韩久久电影 | 日韩欧美在线观看 | 香蕉视频国产在线 | 日韩在线观看视频中文字幕 | 欧美精品视| 久久视频免费在线观看 | 日韩免费一区二区三区 | 国产麻豆精品95视频 | 91高清免费看 | 日本视频网 | 99热精品视| 精品999在线 | 国产无套视频 | 91精品国产福利在线观看 | a视频在线播放 | 国产精品1000 | 免费观看成人av | 成人免费视频网站在线观看 | 免费观看av网站 | 国产精品自在线 | 日韩精品在线视频免费观看 | 国产特级毛片aaaaaa | 天天操夜夜曰 | 五月天丁香 | 亚洲成aⅴ人在线观看 | 黄色国产区 | 久草在线高清视频 | 中文字幕在线网 | 久久免费观看少妇a级毛片 久久久久成人免费 | 欧美精品做受xxx性少妇 | 国产伦精品一区二区三区在线 | 中日韩欧美精彩视频 | 精品美女久久久久久免费 | 国产精品久一 | 亚洲激情在线观看 | 久久久久久久99 | 97夜夜澡人人双人人人喊 | 久久艹欧美 | 中文字幕精品一区二区三区电影 | 亚洲成av人片在线观看www | 97偷拍在线视频 | 五月天婷婷免费视频 | 夜色资源站wwwcom | 免费在线激情电影 | 九九九在线 | 国产精品久久久久久久99 | 波多野结衣视频一区二区三区 | av中文电影 | 在线播放91|