

Supervised vs. Unsupervised Learning

Published: 2023/12/15

If we don't know what the objective of a machine learning algorithm is, we may fail to build an accurate model. Knowing the types of machine learning algorithms is essential: it helps us see the bigger picture of machine learning and the goal behind the work being done in the field, and, above all, it puts us in a better position to break down a real problem and design a machine learning system.

The goal of most machine learning algorithms is to construct a model or a hypothesis. All machine learning models can be categorized as either supervised or unsupervised. In this note, we will discuss these two types, how they work, and how each is used in various fields.

The structure of this note:


  • Supervised learning: categories and applications.
  • Unsupervised learning: categories and applications.
  • Supervised learning vs. unsupervised learning.

Let's begin by taking a look at supervised learning.

What is Supervised Learning?

To supervise means to watch over, to provide direction for someone or something. Supervised learning is a process in which we teach or train a machine using data that is well labeled.

    The most important concept to remember:


    Supervised learning means learning by example.

    監(jiān)督學(xué)習(xí)意味著通過榜樣學(xué)習(xí)。

The objective of a supervised learning model is to predict the correct label for newly presented input data. When training a supervised learning algorithm, the computer learns by example: it learns from past data and applies that learning to present data in order to predict future events. Our training data consist of inputs paired with the correct outputs; that is, each input is labeled or tagged with the right answer. In short, the machine already knows the expected output before it starts working on or learning from the data.

During training, the algorithm searches for patterns in the data that correlate with the desired outputs. After training, a supervised learning algorithm takes in new, unseen inputs and assigns them labels based on what it learned from the training data.

That definition might be too academic, so let's think about a real-life example of the concept. Say we show a picture to a baby and tell her, "these are ice-creams." The baby here plays the role of the computer: the ice-cream photo is our input, and the annotation is our output data. The baby keeps in mind that if the object is red and has a cone shape, then it is an ice-cream. That is how she learns, and she will recognize an ice-cream picture the next time she sees one. Because we have already labeled the image, the baby knows what ice-cream is. That is how supervised learning works.

Supervised Machine Learning Categorization

Supervised learning is classified into two categories: classification and regression.

Classification

A classification problem is one where the output variable is a category, such as pass or fail, red or blue, and so on. We use classification algorithms to predict which group a piece of data belongs to.

During training, a classification algorithm is given data points with an assigned category. The job of the algorithm is to take an input value and assign it to the class or group it fits into, based on the training data provided. The most common example of classification is determining whether an email is spam or not; this is called a binary classification problem. The algorithm is given training data covering all emails (spam and not spam). The model finds the features within the data that correlate with either class and creates the mapping function from input to output: Y = f(x). Then, when presented with an unseen email, the model uses this function to predict whether the email is spam.

    Note:


    • We use binary or binomial classification when grouping data under two kinds of labels.
    • We use multi-class or multinomial classification when grouping data under more than two kinds of labels.

    Here are a few popular classification algorithms:


• Decision Trees are among the simplest and yet most useful machine learning algorithms. We split the data according to a certain parameter. The tree has two kinds of entities, namely decision nodes and leaves. The leaves are the decisions or outcomes, and the decision nodes are where we split the data.
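To make the split-selection step concrete, here is a minimal sketch in plain Python (not a full decision-tree learner; the `gini` and `best_split` helpers and the toy pass/fail data are ours for illustration). It picks the threshold on a single numeric feature that minimizes the weighted Gini impurity of the two resulting leaves:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Threshold on one feature that minimizes weighted Gini impurity."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Perfectly separable toy data: values <= 2 are "fail", > 2 are "pass"
xs = [1, 2, 3, 4]
ys = ["fail", "fail", "pass", "pass"]
print(best_split(xs, ys))  # -> (2, 0.0)
```

A real decision tree applies this search recursively at every decision node, over every feature.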

• Random Forest is a set of decision trees built on various subsets of the given dataset, combined to improve predictive accuracy. Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority vote of those predictions, predicts the final output.
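The majority-vote idea can be sketched without any tree-training machinery at all; in this illustrative snippet the "trees" are stand-in functions (hypothetical stumps), not trees fitted from data:

```python
from collections import Counter

def forest_predict(trees, x):
    """Majority vote over an ensemble of fitted classifiers (here,
    plain functions standing in for individual decision trees)."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Three hypothetical stumps, as if trained on different data subsets
trees = [lambda x: "spam" if x > 2 else "ham",
         lambda x: "spam" if x > 3 else "ham",
         lambda x: "spam" if x > 10 else "ham"]
print(forest_predict(trees, 5))  # -> "spam" (2 of 3 trees vote spam)
```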

• Support Vector Machines (SVM): The objective of the SVM algorithm is to find a hyperplane in N-dimensional space (where N is the number of features) that distinctly classifies the data points. That is, given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one group or the other, making it a non-probabilistic binary linear classifier.
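Once trained, a linear SVM classifies a point by which side of the hyperplane w·x + b = 0 it falls on. A minimal prediction-only sketch, with a hypothetical hand-picked hyperplane rather than one learned by an SVM solver:

```python
def svm_predict(x, w, b):
    """Sign of the score w.x + b: which side of the separating
    hyperplane the point x falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

w, b = [1.0, -1.0], 0.0   # hypothetical learned hyperplane: x1 = x2
print(svm_predict([3.0, 1.0], w, b), svm_predict([1.0, 3.0], w, b))  # -> 1 -1
```

Finding the w and b that maximize the margin between the two classes is the actual (quadratic) optimization an SVM trainer solves.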

• We can use K-Nearest Neighbors (KNN) for both classification and regression predictive problems, though KNN is more widely used for classification in industry. In the KNN algorithm, "k" is the number of nearest neighbors the model will consider. KNN classifies a data point based on the points that are most similar to it, using the training data to make an "educated guess" about how an unclassified point should be classified. If k = 1, the point is simply assigned to the class of its nearest neighbor.
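A bare-bones KNN classifier is short enough to sketch directly; the two-color toy dataset below is illustrative:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points (Euclidean distance)."""
    by_dist = sorted(train, key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

train = [((1, 1), "red"), ((1, 2), "red"), ((5, 5), "blue"), ((6, 5), "blue")]
print(knn_predict(train, (1.5, 1.5), k=3))  # -> "red"
```

Note that there is no training step at all: KNN simply stores the labeled points and defers all the work to prediction time.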

Regression

A regression problem is one where the output variable is a real value, such as weight, height, or dollars. Regression is most often used to predict numerical values based on previous data observations. The typical example is predicting the housing prices of future sales based on the prevailing market price.

Some of the more familiar regression algorithms include:

• Linear regression performs the task of predicting a target y (output) from given features x (input). The input variable is called the independent variable, and the output variable is called the dependent variable. This regression technique finds a linear relationship between the independent and dependent variables. Linear regression falls into two categories: simple linear regression, with only one x and one y variable, and multiple linear regression, with one y and two or more x variables.
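For the simple (one-feature) case, the least-squares slope and intercept have a closed form, sketched here in plain Python on a toy dataset that lies exactly on y = 2x + 1:

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = a*x + b with one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # exactly y = 2x + 1
a, b = fit_simple_linear(xs, ys)
print(a, b)  # -> 2.0 1.0
```

Multiple linear regression generalizes the same idea to several x variables, solved with linear algebra rather than these two sums.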

• Logistic regression performs the task of predicting discrete values for the set of independent variables passed to it. It predicts by mapping unseen data through the logit function fitted during training. The algorithm predicts the probability of the new data belonging to a class, so its output lies in the range between 0 and 1.
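A sketch of the prediction step only: the logistic (sigmoid) function squashes a linear score into the (0, 1) range. The weights here are hypothetical stand-ins for fitted parameters, not values learned from data:

```python
import math

def sigmoid(z):
    """Logistic function: maps any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    """Probability of the positive class for feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

w, b = [1.5, -2.0], 0.5   # hypothetical pre-trained weights
p = predict_proba([2.0, 1.0], w, b)
print(p, p > 0.5)  # a probability in (0, 1), then the class decision
```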

• Polynomial regression is a particular case of linear regression. This regression technique finds the curvilinear relationship between the independent variable x and the dependent variable y.

• Ridge regression is a technique for analyzing multiple-regression data that suffer from multicollinearity, a state of very high intercorrelation among the independent variables. When multicollinearity occurs, least-squares estimates are unbiased, but their variances are large, so they may be far from the true values.
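For a single centered feature, the ridge estimate has a simple closed form: the penalty λ is added to the denominator of the ordinary least-squares slope, shrinking it toward zero. A rough sketch (the helper name and toy data are ours):

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge slope for y ~ a*x on centered data:
    add the penalty `lam` to the OLS denominator."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / (sxx + lam)

xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]
print(fit_ridge_1d(xs, ys, lam=0.0))   # lam = 0 recovers the OLS slope, 2.0
print(fit_ridge_1d(xs, ys, lam=5.0))   # a positive penalty shrinks it toward 0
```

The shrinkage trades a little bias for a large reduction in variance, which is exactly the remedy ridge offers for multicollinear data.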

    Note:


    • If the label is categorical, the model is known as a “classification.”


    • If the label is numeric, the model is known as a “regression.”


    Some practical applications of supervised learning algorithms in real life:

    監(jiān)督學(xué)習(xí)算法在現(xiàn)實生活中的一些實際應(yīng)用:

• Bioinformatics: fingerprints, iris texture, earlobe, and so on.
• Face detection, spam detection.
• Signature recognition, speech recognition.
• Weather forecasting.
• Stock price predictions, among others.

What is Unsupervised Learning?

Now that we know the basics of supervised learning, it is pertinent to move on to unsupervised learning.

Unsupervised learning is a method that trains machines on data that is neither classified nor labeled. There is no labeled training set, so the machine learns by itself: the computer must be programmed to learn on its own, and it needs to understand and provide insights from both structured and unstructured data.

The idea is to expose the machines to large volumes of varying data and allow them to learn from that data, providing insights that were previously unknown and identifying hidden patterns. As such, there aren't necessarily defined outcomes from unsupervised learning algorithms; instead, the algorithm determines what is different or interesting in the given dataset.

During unsupervised learning, the system does not have labeled data sets, and the outcomes of most problems are largely unknown. In simple terms, the AI system goes into operation blind to the machine learning objective. The lack of predefined input-output pairs makes the process even more challenging.

Let's make the concept simpler with an example. We show a group of ice-cream and cupcake pictures to the baby. Assume the baby has not seen ice-creams or cupcakes before, so she does not know what the features of an ice-cream or a cupcake are, and she cannot categorize them as in the supervised learning example. There, the whole process was straightforward, because we taught the baby every detail in the figures.

However, in unsupervised learning, the whole process becomes a little trickier. The algorithm for an unsupervised learning system has the same input data as its supervised counterpart (in our case, ice-creams and cupcakes with different shapes and colors) but no specified outcomes; in a word, there is no label associated with the learning. Once the baby (the computer) has seen the pictures (our input data), she learns from the information at hand. With information related to the problem, the baby will recognize similar objects and group them together; in other words, the computer designs and labels the groups itself. Technically, there are bound to be some wrong answers, since a degree of probability is involved. However, just like humans, the strength of machine learning lies in its ability to recognize mistakes, learn from them, and make better estimations next time. That process is known as unsupervised learning.

Unsupervised Machine Learning Categorization

Unsupervised learning is classified into two categories of problems: clustering and association.

Clustering: A clustering problem involves organizing unlabeled data into similar groups, such as grouping customers by purchasing behavior. It is one of the most common unsupervised learning methods. We often use clustering in marketing campaigns: for example, clustering algorithms can group people with similar traits and likelihood to purchase. Once we have the groups, we can run tests on each group with different marketing copy, which helps us better target our messaging to them in the future.

• Hierarchical clustering is an algorithm that groups similar objects into clusters. Initially, each data point is considered an individual cluster. The algorithm goes over the features of the data points and looks for similarity between them; when it finds similar data points, it groups them. The process continues until the whole dataset has been grouped, which creates a hierarchy of clusters.
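A minimal single-linkage sketch of this agglomerative process: start with every point as its own cluster and repeatedly merge the two closest clusters. (Illustrative only; practical implementations maintain a distance matrix and return the full merge hierarchy, not just the final clusters.)

```python
import math

def agglomerate(points, n_clusters):
    """Single-linkage agglomerative clustering: start with each point
    in its own cluster and repeatedly merge the two closest clusters."""
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] += clusters.pop(j)
    return clusters

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(agglomerate(points, n_clusters=2))
# -> [[(0, 0), (0, 1)], [(10, 10), (10, 11)]]
```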

• K-Means clustering works step by step, with the main goal of producing clusters that can be labeled to identify them. K-means is a centroid-based (distance-based) algorithm, in which we calculate distances to assign each point to a cluster. The smallest distance between a data point and a centroid determines which group it belongs to, while making sure the clusters do not overlap each other. The centroid acts like the heart of the cluster. This ultimately gives us clusters that can be labeled as needed.
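The assign-then-update loop of k-means (Lloyd's algorithm) can be sketched in a few lines of plain Python; the starting centroids here are hand-picked for the toy data, whereas real implementations choose them randomly or with a seeding scheme:

```python
import math

def kmeans(points, centroids, iters=10):
    """Plain Lloyd's algorithm: assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
print(kmeans(points, centroids=[(0, 0), (10, 10)]))
# two centroids, one settling in each blob of points
```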

An association problem is one where you want to discover rules that describe large portions of your data; for example, if a person buys hamburger buns, she will likely buy hamburgers.

• The Apriori algorithm is used for mining frequent itemsets and the relevant association rules. Its support measure maps the dependency of one data item on another, which helps us understand which items influence the likelihood of something happening with other items; for example, buying bread influences the buyer to buy milk and eggs, and that mapping helps increase profits for the store. This mapping can be learned with the algorithm, which yields rules as its output.
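The support computation at the heart of Apriori can be sketched as follows. This simplified version brute-forces all 1- and 2-item candidates and skips Apriori's level-wise candidate pruning; the basket data is illustrative:

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Support counting for 1- and 2-item sets: keep those whose
    fraction of transactions meets the support threshold."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    freq = {}
    for size in (1, 2):
        for combo in combinations(items, size):
            support = sum(1 for t in transactions if set(combo) <= t) / n
            if support >= min_support:
                freq[combo] = support
    return freq

baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk"},
]
print(frequent_itemsets(baskets, min_support=0.5))
```

The full algorithm grows candidates level by level, only extending itemsets whose subsets are already frequent, and then derives rules such as bread → milk from the surviving sets.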

• The Frequent Pattern Growth algorithm (FP-Growth) finds frequent patterns without candidate generation. The algorithm finds the count of each repeated pattern, adds it to a table, then finds the most plausible item and sets it as the root of a tree. Other data items are then added to the tree and their support calculated; if a particular branch fails to meet the support threshold, it is pruned. Once all iterations are complete, a tree rooted at the item has been created, which is then used to derive the association rules. FP-Growth is faster than Apriori because support is calculated and checked over increasing iterations, rather than generating candidates and testing their support against the dataset.

Applications of Unsupervised Learning Algorithms

Some practical applications of unsupervised learning algorithms include:

• Credit-card fraud detection.
• Identification of human errors during data entry.
• Amazon uses unsupervised learning to learn a customer's purchases and recommend the products most frequently bought together (an example of association rule mining).

Supervised Learning vs. Unsupervised Learning

The most significant difference between supervised and unsupervised learning is that in supervised learning, each data point has a label. In contrast, in unsupervised learning there is no label for any input, meaning our data have not been classified.

Note:

• Supervised learning will always have input-output pairs.
• Unsupervised learning is just data, without labels or given meaning, that we try to make some sense of.
Quick summary

When Should You Choose Supervised Learning vs. Unsupervised Learning?

A good strategy for homing in on the right machine learning approach is to:

• Evaluate the data: Is our data labeled or unlabeled? Is expert knowledge available to support additional labeling? The answers will help determine whether we should use a supervised or an unsupervised approach.

• Review available algorithms that may suit the problem with regard to dimensionality (the number of features, attributes, or characteristics). Candidate algorithms should be tailored to the overall volume of data and its structure.

    In general, we use unsupervised machine learning when we do not have data on desired outcomes, such as determining a target market for a new product that a business has never sold before. However, if we are trying to get a better understanding of our existing consumer base, then supervised learning is the optimal technique.


End Notes

Supervised learning and unsupervised learning are critical concepts in the field of machine learning. A proper understanding of the basics is crucial before you jump into the pool of different machine learning algorithms.

    Learn on!


Resources:

There are many machine learning books you can read. I certainly didn't cover enough information here to fill a chapter, but that doesn't mean you can't keep learning! Fill your mind with more awesomeness, starting with the excellent links below.

  • Supervised and unsupervised learning
  • Machine learning course by Andrew Ng
  • 5 Beginner Friendly Steps to Learn Machine Learning and Data Science with Python
  • Machine Learning (ML) vs. AI and their Important Differences
  • Data Science Dojo: https://datasciencedojo.com/

Translated from: https://medium.com/nothingaholic/supervised-vs-unsupervised-learning-eb4edc1c803b

    無監(jiān)督學(xué)習(xí)與監(jiān)督學(xué)習(xí)

    總結(jié)

    以上是生活随笔為你收集整理的无监督学习与监督学习_有监督与无监督学习的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。

    如果覺得生活随笔網(wǎng)站內(nèi)容還不錯,歡迎將生活随笔推薦給好友。