Supervised Learning: What Does It Entail?
When we think of a machine, we often think of it in the engineering sense — an actual physical device (with moving parts) that makes some work easier. In machine learning, we use the term machine much more liberally, as in the support vector machine or the restricted Boltzmann machine — do not worry about these for now. Luckily, none of these machines comes anywhere close to the kind we see in the Terminator movies or the Marvel cinematic universe.
Instead, what we refer to as a machine is often an unassuming computer programme that you may feed with some kind of data, and it would, in turn, be able to make some predictions about the future, derive some insights about the past, or make some optimal decisions. Such a computer programme may be stored on your PC or your smartphone, or in the brain of a robot — it really doesn’t matter where — and it’d still be a machine, regardless. The most basic ingredient, however, is data.
This data could come in many diverse forms: it could be data obtained from a survey or a poll, a physical or chemical experiment, medical records or diagnostics, images of food on the internet, or one’s Facebook posts, really. The data could as well be biometrics such as one’s fingerprints. For example, you may recall getting a new smartphone and having to set up fingerprint recognition. You provide the computer programme or machine residing inside the phone with your fingerprint data (including scans you rotate and deliberately distort); the machine then identifies a pattern in your fingerprint data that distinguishes it from everybody else’s; subsequently, it is able to predict whether any new fingerprint belongs to you or to an intruder. This is the stuff of a subfield of machine learning known as semi-supervised learning, which combines elements of supervised and unsupervised learning principles. In this post, we will focus only on supervised learning.
To think more broadly of supervised learning, it may be useful to imagine this dialogue with a much younger sibling who encounters a dog for the first time on TV.
“What is this?” your sibling asks you, pointing to a group of dogs on the programme.
“It’s what we call a dog,” you respond.
The innocent child is content, because this is the first time they’re seeing this animal; they can’t disagree with you, at least until one week later when you’re watching the same TV programme again, and they see some cats.
“Look! Here’s a group of small dogs,” they say.
“No, those are cats,” you say, smiling, seeing the confusion in their face. Yet, your sibling raises no objections, because they probably reason, not so incorrectly, that a dog is generally large, and a cat is generally small… until the following week when you watch this programme again, and they see a group of puppies.
“Hey look, here’s a group of brown cats,” your young sibling says.
You smile again. “Those are actually dogs, believe it or not,” you say.
Now they don’t know if you’re messing with them or not, so they lean in closer toward the TV, and then observe that the dogs have prominent snouts while the cats they saw the week before had more or less flat faces. That must be it, the child decides.
Several things stand out from this analogy: first, and in fact the main thing that distinguishes supervised learning from other fields of machine learning, is the simple fact that you actually tell your sibling what animal it is, whenever they come across one. This may seem a rather trivial distinction, but consider the contrasting scenario where your sibling didn’t have you around: they would probably end up assuming that the universe is populated with dogs, and that a cat is just a small dog. We refer to this paradigm of machine learning as supervised learning because you essentially act like some kind of teacher or a supervisor who puts a label or an annotation (i.e., “dog” or “cat”) on any new animal (i.e., data) your sibling (who’s acting as our machine) comes across. For this reason, we often refer to the data that is employed in supervised learning settings as labelled or annotated data. In the fingerprint recognition example, the label used to train the machine to detect an intruder’s prints may not be so obvious. But if one considers it critically: by giving the machine many examples of your fingerprints, you teach the machine to associate your prints with a label that is a binary indicator, i.e., 1 for your fingerprints, and 0 for all other fingerprints it did not see during the setup phase of the phone. This falls under yet another subfield known as anomaly detection, since an intruder’s prints are considered anomalies relative to what the machine has come to know.
The second thing that stands out from the analogy is that the child is never explicitly told precisely what defines a dog or a cat; if they were told, it wouldn’t really be learning, but more like memorising. Instead, they have to figure things out themselves by observing the characteristics of the two animals: they identify the size of the animal, as well as the presence of a prominent snout, as being indicative of the target, i.e., whether the animal is a dog or a cat. These things that help in identifying the animal, i.e., the size of the animal and the presence of a prominent snout, are often referred to as features in machine learning. As you may expect, the set of features that are indicative of the target, i.e., the animal being a dog or a cat, is not limited to just those two, but can possibly be quite large. For example, a meticulous child may also observe differences in features such as the lengths of the tails of the two animals, the sizes of their ears or the length of their paws. All these characteristics may constitute the feature set. In machine learning, we think of the set of all features as a vector, and the dimension or size of this vector (which is just the number of features) is referred to as the dimensionality.
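To make the notions of a feature vector and its dimensionality concrete, here is a minimal sketch in Python; the feature names and measurements are invented purely for illustration:

```python
# Each animal is described by a feature vector; the dimensionality
# is simply the number of features we record.
animal = {
    "height_cm": 18.0,        # size of the animal
    "snout_length_cm": 1.0,   # prominence of the snout
    "tail_length_cm": 9.0,
    "ear_diameter_cm": 2.5,
}

feature_vector = list(animal.values())
dimensionality = len(feature_vector)

print(feature_vector)   # [18.0, 1.0, 9.0, 2.5]
print(dimensionality)   # 4
```

Adding the meticulous child’s extra observations (paw length, eye colour, and so on) would simply grow this vector, increasing the dimensionality.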
Eventually, the child learns certain rules on their own — we will later see in subsequent posts exactly how this is done — about these features with which they are able to predict on their own whether a given animal is a dog or a cat. Such a rule might be: if the height of the animal is less than twenty centimetres, and it has no prominent snout, and its tail is at most ten centimetres long, and its ear is at most three centimetres in diameter, then it is a cat; otherwise, it’s a dog.
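The rule above translates almost directly into code. This is a hand-crafted sketch of such a rule, not a learned model, and the numeric threshold standing in for “no prominent snout” is an assumption:

```python
def classify(height_cm, snout_length_cm, tail_length_cm, ear_diameter_cm):
    """The child's hand-learned rule: a small, snoutless, short-tailed,
    small-eared animal is a cat; everything else is a dog."""
    is_cat = (
        height_cm < 20
        and snout_length_cm < 2      # "no prominent snout" (assumed threshold)
        and tail_length_cm <= 10
        and ear_diameter_cm <= 3
    )
    return "cat" if is_cat else "dog"

print(classify(18, 1, 9, 2.5))   # cat
print(classify(45, 8, 30, 6))    # dog
```

A real learner would, of course, discover the thresholds from data rather than have them dictated; this is exactly what the training phase described next is for.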
Learning such rules about the features is usually only the first of three main phases in machine learning, and is known as the training phase; the second phase involves validating how correct the rules learned in the training phase are, and is known as validation; in this validation stage, we test the learned rules on new or unseen data in the hopes of tweaking the rules, if those rules don’t really apply. For example, after your sibling saw the cat, they must have learned a rule like so: “a dog is generally large, while a cat is generally small”. However, upon coming across a small dog, they changed the rules and then included the presence of a prominent snout. This is what happens in the validation phase of machine learning — adjusting the learned rules usually via adjusting certain high-level parameters known as hyperparameters. The third and final phase is known as testing and is very similar to the validation phase, in that the rules learned in the training/validation phases are put to the test again on new or unseen data. However, unlike in the validation phase, there are usually no (or restricted) avenues to tweak the learned rules at this stage, because the rules (which now constitute the machine), are often deployed in a product such as your smartphone or computer system. There are, of course, systems or machines that are designed so that they are capable of constantly training themselves using the data they encounter even during the testing phase.
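In practice, the three phases correspond to a three-way partition of the labelled data. The 60/20/20 split below is a common convention, not something prescribed here:

```python
import random

random.seed(0)
# Toy labelled dataset: 100 (feature, label) pairs with made-up values.
data = [(random.random(), random.choice(["dog", "cat"])) for _ in range(100)]
random.shuffle(data)

n = len(data)
train = data[: int(0.6 * n)]                    # phase 1: learn the rules
validation = data[int(0.6 * n): int(0.8 * n)]   # phase 2: tweak hyperparameters
test = data[int(0.8 * n):]                      # phase 3: final, hands-off evaluation

print(len(train), len(validation), len(test))   # 60 20 20
```

The key discipline is that the test portion is held out entirely until the learned rules are frozen, mirroring how a deployed machine meets genuinely unseen data.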
So far, it may not be obvious what makes supervised machine learning challenging, if all it entails is learning rules from features about some targets. (Recall from the analogy that the targets are the labels on the animals, i.e., “dog” or “cat”, and the features are the characteristics of the animals by which we can decide that it is a dog or a cat, i.e., its size, the presence of a prominent snout, etc.) Yet, the peculiarities of many real-world problems for which we wish to employ machine learning are such that: (1) the rules we want to learn from the features about the targets may be rather too complex; (2) the targets and/or features may be noisy; (3) the features on which the rules ought to be based may not even be obvious to us. I will now describe these specific challenges in a little more detail.
First, recall the rule by which your sibling learned to distinguish a dog from a cat: if the height of the animal is less than twenty centimetres, and it has no prominent snout, and its tail is at most ten centimetres long, and its ear is at most three centimetres in diameter, then it is a cat; otherwise, it’s a dog. This rule uses only four features: height, snout, tail and ears. Now imagine you had a million features — yes, that’s a realistic number in some machine learning applications such as computer vision — with which to train a machine that can identify all the objects in your house; you can then very well imagine that the rules over those million features aren’t going to be as trivial as the one we have seen. Just for the avoidance of doubt, these rules are, in fact, mathematical relationships between the features and the targets. Except for simple machine learning problems, these mathematical relationships are rarely simple comparators like “if the height is greater than twenty centimetres, then it is a dog”, but often involve complex operations such as exponentiation of these features.
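To illustrate what such a mathematical relationship might look like, here is a sketch of a logistic-style rule: a weighted sum of the features passed through an exponential squashing function. The weights and bias here are invented for illustration, not learned from data:

```python
import math

def learned_rule(features, weights, bias):
    """A logistic-style rule: the exponential of a weighted sum of the
    features yields a probability that the animal is a dog."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 / (1 + math.exp(-z))

# Features are [height, snout length, tail length, ear diameter] in cm;
# the weights and bias are made-up numbers.
p_dog = learned_rule([45, 8, 30, 6], [0.1, 0.5, 0.05, 0.2], -5.0)
print(round(p_dog, 3))   # 0.998
```

Unlike the threshold rule, this relationship blends all the features continuously, which is part of why learned rules can be accurate yet hard to explain.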
It is often joked that the engineer thinks their equations are an approximation to reality, and the physicist that reality is an approximation to their equations, while the mathematician just doesn’t care. In coming up with these complex mathematical relationships between the features and the target for any given problem, our machine often balances a tradeoff between being an engineer and being a mathematician. If we didn’t care that our rules or mathematical relationships are close to our understanding of reality, then we may possibly come up with very accurate relationships. But if we insist on the rules being explainable based on our rough approximation of reality, then this may be at the expense of some loss in accuracy in our machine’s output. This is known as the accuracy-explainability tradeoff.
Furthermore, in the analogy we used, we have assumed that you always correctly tell your sibling what the right animal is, whenever they encounter it. Thus, your sibling always has the right label or target to reason about the features. In practice, this is hardly the case; the targets can be deliberately or inadvertently flipped. For example, if while watching the TV programme you were seated quite far from the TV when the puppy came on, your sibling might have shouted, “Is this another cat?” And because you’re probably myopic and couldn’t see the animal quite clearly, you might have simply responded “Yes.” Alternatively, you might have actually seen the puppy quite clearly, but when you shouted back to your sibling, “No, it’s a dog!”, this response got lost in some ongoing conversation in the room, and your sibling heard you as saying, “Yes, it’s a cat!”. Thus, your sibling ends up learning the wrong rules to distinguish a dog from a cat. In this case, we refer to the targets as being noisy, because they are no longer error-free. The features may also be noisy; for example, the images of the cats and dogs your sibling saw on TV might have been distorted or occluded around the snout of a dog. Due to such noisy observations we cannot learn rules that are absolutely correct; we can only be probably approximately correct (PAC), which is a mathematical framework for analysing machine learning methods. There are even worse scenarios where whole noisy inputs are introduced into the machine learning deliberately by adversaries with malicious intent. For example, in one reported demonstration, the machine in a self-driving car was fooled by adversarial inputs into driving 50 mph over the speed limit. This has led to research into what’s referred to as adversarial machine learning, dealing with how to simulate and detect adversarial examples.
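Label noise of the kind described, where a “dog” is occasionally recorded as a “cat”, can be simulated in a few lines; the 10% flip rate is purely illustrative:

```python
import random

random.seed(42)

true_labels = ["dog"] * 50 + ["cat"] * 50

def flip(label):
    return "cat" if label == "dog" else "dog"

# Each label is independently flipped with 10% probability, mimicking
# a mis-heard answer shouted across a noisy room.
noisy_labels = [flip(y) if random.random() < 0.1 else y for y in true_labels]

errors = sum(t != n for t, n in zip(true_labels, noisy_labels))
print(errors)
```

Any machine trained on `noisy_labels` instead of `true_labels` is reasoning from a corrupted supervisor, which is precisely the setting PAC-style analyses try to account for.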
Finally, in our analogy, we have made a very fundamental assumption that the child easily picks up on the relevant features by which to distinguish a cat from a dog: first, they consider the sizes of the animals — supposing that a dog is large and a cat is small — and when presented with a small dog, they adjusted the rules and considered features such as the presence of a prominent snout. While this astuteness may come easily to humans, this is not the case with machines. If we were to replace the child in our analogy with our machine, and then present it with pictures of dogs and cats, the machine would not easily know to focus on the sizes of the animals or the presence of prominent snouts or whiskers as features from the image pixels. It could, in fact, consider as features the number of legs of the animals — which obviously is irrelevant — if there was an object obstructing one of the dog’s legs in at least one of the images! In contrast, a human child might not be easily fooled by that.
Thus, one painstaking step in classical machine learning is what we refer to as feature engineering or feature extraction. Basically, we need to tell the machine what features it needs to look out for; the machine may then hopefully come up with relevant rules about these features. For example, in order to train a machine to distinguish between people who identify as males or females from pictures, we may need to specify certain distances in the face, such as the separation between the eyes, the width of the nose and the location of the centres and corners of the eyes as features to the machine. In other words, we have to extract these features from the images for the machine, and sometimes we have to engineer others; for example, we might take the ratio of the x- and y-coordinates of the centres of the eyes, or the logarithm of the separation between the eyes.
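The two engineered features mentioned, a ratio of coordinates and a logarithm of a separation, might be computed like this; the raw pixel measurements are invented:

```python
import math

# Raw (extracted) facial measurements, in pixel coordinates -- made-up values.
left_eye_centre = (120.0, 95.0)
right_eye_centre = (180.0, 96.0)

# Extracted feature: Euclidean distance between the eye centres.
eye_separation = math.dist(left_eye_centre, right_eye_centre)

# Engineered features derived from the raw ones:
centre_ratio = left_eye_centre[0] / left_eye_centre[1]  # ratio of x- and y-coordinates
log_separation = math.log(eye_separation)               # logarithm of the separation

print(round(eye_separation, 2))   # 60.01
print(round(centre_ratio, 3))     # 1.263
print(round(log_separation, 3))   # 4.094
```

Which transformations to apply, and which raw measurements to extract in the first place, is exactly the painstaking judgment call that feature engineering demands.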
Yet, even when we extract features, we do not know the optimal number of features to select. If they are too few, we may lose certain information necessary to build accurate rules about the problem, and if they are too many, certain problems could arise, among them the so-called curse of dimensionality and the ever-present issue of overfitting, to which we will certainly devote another post. For example, in our analogy, while having more features than “size” alone can arguably help us develop more accurate rules to distinguish a dog from a cat, when the features become too many, many of them — such as the colour of the eyes or the number of limbs — may be irrelevant, and we may face the risk of overfitting.
Rather than engineering or extracting features by hand, one of the appeals of the subfield of machine learning known as deep learning is to have the machine learn the features itself and then learn the rules about those features. While this promises to resolve the issue of feature engineering, we will later see the unique challenges deep learning itself presents.
Translated from: https://medium.com/ai-in-plain-english/supervised-learning-what-does-it-entail-e7e265ea7868