Machine Learning Week 9: ex8 Review

Published: 2023/12/15

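This week covers anomaly detection. The first part of the exercise detects failing servers in a network. The second part uses collaborative filtering to build a movie recommender system.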


1 Anomaly Detection

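Two metrics describe the state of each server: throughput and latency.
We have an unlabeled dataset $\{x^{(1)}, \dots, x^{(m)}\}$. The assumption is that the vast majority of the examples are servers operating normally and only a few are anomalous.
We start with a scatter plot to get an intuitive picture of the data.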

1.1 Gaussian distribution

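First, choose a model for the distribution of the data.
The Gaussian distribution is given by:

$$p(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

where $\mu$ is the mean and $\sigma$ is the standard deviation (so $\sigma^2$ is the variance).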
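As a small illustration of the density formula, the Octave sketch below evaluates it through an anonymous function. The name `gaussian_pdf` and all sample values are made up for this example, and multiplying the per-feature densities assumes the features are treated as independent.

```octave
% Univariate Gaussian density p(x; mu, sigma2), evaluated elementwise.
% gaussian_pdf is a hypothetical helper name, not part of the exercise files.
gaussian_pdf = @(x, mu, sigma2) exp(-(x - mu) .^ 2 ./ (2 * sigma2)) ./ sqrt(2 * pi * sigma2);

% Density of a standard normal at its mean: 1 / sqrt(2 * pi).
p0 = gaussian_pdf(0, 0, 1);

% Assuming independent features, the density of one example is the
% product of its per-feature densities.
x = [14 15];          % one example with two features (made-up values)
mu = [14 15];         % assumed per-feature means
sigma2 = [2 0.5];     % assumed per-feature variances
p = prod(gaussian_pdf(x, mu, sigma2));

fprintf('p0 = %.4f, p = %.4f\n', p0, p);
```

Because the example sits exactly at the mean of both features, `p` is the largest density this model can assign.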

1.2 Estimating parameters for Gaussian distribution

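The parameters of the Gaussian distribution are estimated for each feature $i$ with the following formulas:

$$\mu_i = \frac{1}{m} \sum_{j=1}^{m} x_i^{(j)}, \qquad \sigma_i^2 = \frac{1}{m} \sum_{j=1}^{m} \left(x_i^{(j)} - \mu_i\right)^2$$

The completed estimateGaussian.m is as follows: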

function [mu sigma2] = estimateGaussian(X)
%ESTIMATEGAUSSIAN This function estimates the parameters of a
%Gaussian distribution using the data in X
%   [mu sigma2] = estimateGaussian(X),
%   The input X is the dataset with each n-dimensional data point in one row
%   The output is an n-dimensional vector mu, the mean of the data set
%   and the variances sigma^2, an n x 1 vector
%

% Useful variables
[m, n] = size(X);

% You should return these values correctly
mu = zeros(n, 1);
sigma2 = zeros(n, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the mean of the data and the variances
%               In particular, mu(i) should contain the mean of
%               the data for the i-th feature and sigma2(i)
%               should contain variance of the i-th feature.
%

mu = mean(X, 1)';        % per-feature mean, returned as an n x 1 column
sigma2 = var(X, 1, 1)';  % opt = 1 divides by m rather than m - 1

% =============================================================

end
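To make the choice of normalization concrete, here is a small Octave sketch on a made-up 3 x 2 dataset. It calls `mean` and `var` directly, so it does not depend on estimateGaussian.m, and it compares dividing by m with the default division by m - 1.

```octave
% Toy dataset (made up): m = 3 examples in rows, n = 2 features in columns.
X = [1 2; 3 4; 5 6];

mu = mean(X, 1)';             % per-feature mean
sigma2_m = var(X, 1, 1)';     % normalized by m, as in the formula for sigma_i^2
sigma2_m1 = var(X, 0, 1)';    % normalized by m - 1, the default of var

fprintf('mu                    = [%g %g]\n', mu);
fprintf('variance with 1/m     = [%g %g]\n', sigma2_m);
fprintf('variance with 1/(m-1) = [%g %g]\n', sigma2_m1);
```

The two variance estimates differ noticeably here only because m is tiny; on a dataset with hundreds of examples the gap is small.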

Once this is done, the exercise script plots the contours of the fitted Gaussian, which gives the following figure:

1.3 Selecting the threshold

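With $\varepsilon$ as the threshold, an example with $p(x) < \varepsilon$ is flagged as an anomaly.
The threshold $\varepsilon$ is selected using a cross-validation set.
The cross-validation examples are labeled, so each candidate $\varepsilon$ can be evaluated with the $F_1$ score introduced earlier in the course:

$$F_1 = \frac{2 \cdot prec \cdot rec}{prec + rec}, \qquad prec = \frac{tp}{tp + fp}, \qquad rec = \frac{tp}{tp + fn}$$

where $tp$, $fp$ and $fn$ denote the numbers of true positives, false positives and false negatives, respectively.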

function [bestEpsilon bestF1] = selectThreshold(yval, pval)
%SELECTTHRESHOLD Find the best threshold (epsilon) to use for selecting
%outliers
%   [bestEpsilon bestF1] = SELECTTHRESHOLD(yval, pval) finds the best
%   threshold to use for selecting outliers based on the results from a
%   validation set (pval) and the ground truth (yval).
%

bestEpsilon = 0;
bestF1 = 0;
F1 = 0;

stepsize = (max(pval) - min(pval)) / 1000;
for epsilon = min(pval):stepsize:max(pval)

    % ====================== YOUR CODE HERE ======================
    % Instructions: Compute the F1 score of choosing epsilon as the
    %               threshold and place the value in F1. The code at the
    %               end of the loop will compare the F1 score for this
    %               choice of epsilon and set it to be the best epsilon if
    %               it is better than the current choice of epsilon.
    %
    % Note: You can use predictions = (pval < epsilon) to get a binary vector
    %       of 0's and 1's of the outlier predictions

    prediction = (pval < epsilon);
    tp = sum((prediction == 1) & (yval == 1)); % true positive
    fp = sum((prediction == 1) & (yval == 0)); % false positive
    fn = sum((prediction == 0) & (yval == 1)); % false negative
    prec = tp / (tp + fp);                     % precision
    rec = tp / (tp + fn);                      % recall
    F1 = 2 * prec * rec / (prec + rec);        % F1

    % =============================================================

    if F1 > bestF1
        bestF1 = F1;
        bestEpsilon = epsilon;
    end
end

end
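The F1 computation inside the loop can be checked by hand on a tiny example. The sketch below uses made-up validation values and three hand-picked candidate thresholds; it is only an illustration, not the grid search performed by selectThreshold.m.

```octave
% Hypothetical validation data: densities p(x) and ground-truth labels
% (1 = anomaly, 0 = normal).
pval = [0.90; 0.80; 0.05; 0.70; 0.01; 0.02];
yval = [0; 0; 1; 0; 1; 0];

candidates = [0.03 0.06 0];   % the last threshold flags nothing at all
F1 = zeros(size(candidates));
for k = 1:numel(candidates)
    prediction = (pval < candidates(k));
    tp = sum((prediction == 1) & (yval == 1));
    fp = sum((prediction == 1) & (yval == 0));
    fn = sum((prediction == 0) & (yval == 1));
    prec = tp / (tp + fp);    % 0 / 0 gives NaN when nothing is flagged
    rec = tp / (tp + fn);
    F1(k) = 2 * prec * rec / (prec + rec);
end

disp(F1);
```

When a threshold flags no example, precision is 0 / 0, so F1 becomes NaN. The comparison `F1 > bestF1` is false for NaN, which means such a threshold is never selected and no explicit guard is needed.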

With the selected $\varepsilon$, the detected anomalies are shown in the following figure:

1.4 High dimensional Dataset

The same functions are then applied to a higher-dimensional dataset (11 features).
Nothing changes compared with the 2-dimensional case.


2 Recommender system

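Here the collaborative filtering algorithm is applied to a dataset of movie ratings to build a recommender system.
Dataset source: the MovieLens 100k dataset.
Visualizing the ratings matrix:

For comparison, a 4 x 4 identity matrix visualized the same way looks like this: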

2.1 Movie rating dataset

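Matrix $Y$ (of size num_movies x num_users) holds the ratings $y^{(i,j)}$;
matrix $R$ is a binary indicator matrix ($R(i,j) = 1$ means that movie $i$ was rated by user $j$).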
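The relationship between the two matrices can be illustrated with a toy example. The ratings below are made up, and deriving `R` from the nonzero entries of `Y` is an assumption of this sketch (it treats 0 as "not rated").

```octave
% Hypothetical ratings: 3 movies (rows) x 2 users (columns); 0 = not rated.
Y = [5 0; 0 3; 4 1];
R = (Y ~= 0);                       % R(i, j) = 1 if user j rated movie i

% Average rating of movie 1, counting only the users who rated it.
avg_movie1 = mean(Y(1, R(1, :)));

% Number of movies rated by each user.
ratings_per_user = sum(R, 1);

fprintf('Average rating of movie 1: %g\n', avg_movie1);
```

Indexing with `R` keeps unrated entries out of the average, which is the same role `R` plays in the cost function.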

2.2 Collaborative filtering learning algorithm

All of section 2.2 deals with cofiCostFunc.m.
The code provided in the original file is:

function [J, grad] = cofiCostFunc(params, Y, R, num_users, num_movies, ...
                                  num_features, lambda)
%COFICOSTFUNC Collaborative filtering cost function
%   [J, grad] = COFICOSTFUNC(params, Y, R, num_users, num_movies, ...
%   num_features, lambda) returns the cost and gradient for the
%   collaborative filtering problem.
%

% Unfold the U and W matrices from params
X = reshape(params(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(params(num_movies*num_features+1:end), ...
                num_users, num_features);

% You need to return the following values correctly
J = 0;
X_grad = zeros(size(X));
Theta_grad = zeros(size(Theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost function and gradient for collaborative
%               filtering. Concretely, you should first implement the cost
%               function (without regularization) and make sure it is
%               matches our costs. After that, you should implement the
%               gradient and use the checkCostFunction routine to check
%               that the gradient is correct. Finally, you should implement
%               regularization.
%
% Notes: X - num_movies x num_features matrix of movie features
%        Theta - num_users x num_features matrix of user features
%        Y - num_movies x num_users matrix of user ratings of movies
%        R - num_movies x num_users matrix, where R(i, j) = 1 if the
%            i-th movie was rated by the j-th user
%
% You should set the following variables correctly:
%
%        X_grad - num_movies x num_features matrix, containing the
%                 partial derivatives w.r.t. to each element of X
%        Theta_grad - num_users x num_features matrix, containing the
%                     partial derivatives w.r.t. to each element of Theta
%

% =============================================================

grad = [X_grad(:); Theta_grad(:)];

end

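2.2.1 Collaborative filtering cost function

The cost function without regularization is:

$$J = \frac{1}{2} \sum_{(i,j):\, r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2$$

So the following code is added: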

diff = (X * Theta' - Y);
vari = diff.^2;
J = 1/2 * sum(vari(R == 1));
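A hand-checkable instance of this cost, as a sketch with made-up matrices (2 movies, 2 users, 2 features). The variable `err` is used here instead of `diff` only to avoid shadowing Octave's built-in `diff` function.

```octave
% Toy problem (all values made up for illustration).
X = [1 0; 0 1];        % movie features, num_movies x num_features
Theta = [1 1; 2 0];    % user parameters, num_users x num_features
Y = [2 0; 1 3];        % ratings, num_movies x num_users
R = [1 0; 1 1];        % R(i, j) = 1 if movie i was rated by user j

err = X * Theta' - Y;              % prediction errors for every (i, j) pair
sq = err .^ 2;
J = 1/2 * sum(sq(R == 1));         % only rated entries contribute

fprintf('J = %g\n', J);
```

The predictions are `[1 2; 1 0]`, so the errors on the rated entries are -1, 0 and -3, and the cost is (1 + 0 + 9) / 2 = 5. The large error on the unrated entry (1, 2) is ignored.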

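2.2.2 Collaborative filtering gradient

The formulas are:

$$\frac{\partial J}{\partial x_k^{(i)}} = \sum_{j:\, r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right) \theta_k^{(j)}$$

$$\frac{\partial J}{\partial \theta_k^{(j)}} = \sum_{i:\, r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right) x_k^{(i)}$$

Following the vectorization tips in the exercise handout, the following code is added: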

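for i = 1:num_movies
    % sum over users (dimension 1), also correct when there is a single user
    X_grad(i,:) = sum((diff(i,:) .* R(i,:))' .* Theta, 1);
end
for j = 1:num_users
    % sum over movies (dimension 1), also correct when there is a single movie
    Theta_grad(j,:) = sum((diff(:,j) .* R(:,j)) .* X, 1);
end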

After thinking about it for a while, I realized it can be vectorized even further:

X_grad = (diff .* R) * Theta;
Theta_grad = (diff .* R)' * X;
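One way to gain confidence in the fully vectorized form is to compare it with a per-movie loop and with a numerical gradient. The sketch below does this on made-up matrices; the central-difference check is a simplified stand-in written for this example, not the exercise's own gradient-checking routine.

```octave
% Toy problem (all values made up for illustration).
X = [1 0; 0 1];  Theta = [1 1; 2 0];
Y = [2 0; 1 3];  R = [1 0; 1 1];

% Unregularized cost as a function of X and Theta (Y and R are captured).
cost = @(X, Theta) 0.5 * sum(sum(((X * Theta' - Y) .* R) .^ 2));

% Fully vectorized gradients.
err = (X * Theta' - Y) .* R;       % errors, zeroed where no rating exists
X_grad = err * Theta;
Theta_grad = err' * X;

% Per-movie loop, for comparison with the vectorized X_grad.
X_loop = zeros(size(X));
for i = 1:size(X, 1)
    X_loop(i, :) = sum(err(i, :)' .* Theta, 1);
end

% Central-difference approximation of dJ/dX, one element at a time.
h = 1e-4;
num_grad = zeros(size(X));
for k = 1:numel(X)
    E = zeros(size(X));
    E(k) = h;
    num_grad(k) = (cost(X + E, Theta) - cost(X - E, Theta)) / (2 * h);
end

disp(max(abs(num_grad(:) - X_grad(:))));   % should be close to zero
```

Multiplying the error matrix by `R` before the matrix product is what restricts each sum to rated entries, so no explicit loop over users or movies is required.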

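2.2.3 Regularized cost function

$$J = \frac{1}{2} \sum_{(i,j):\, r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right)^2 + \frac{\lambda}{2} \sum_{j=1}^{n_u} \sum_{k=1}^{n} \left( \theta_k^{(j)} \right)^2 + \frac{\lambda}{2} \sum_{i=1}^{n_m} \sum_{k=1}^{n} \left( x_k^{(i)} \right)^2$$

2.2.4 Regularized gradient

$$\frac{\partial J}{\partial x_k^{(i)}} = \sum_{j:\, r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right) \theta_k^{(j)} + \lambda x_k^{(i)}$$

$$\frac{\partial J}{\partial \theta_k^{(j)}} = \sum_{i:\, r(i,j) = 1} \left( (\theta^{(j)})^T x^{(i)} - y^{(i,j)} \right) x_k^{(i)} + \lambda \theta_k^{(j)}$$

Only the regularization terms need to be added to the code above.
The result is as follows: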

% Theta(:).^2 is used instead of (Theta.^2)(:), which only Octave accepts
J = 1/2 * sum(vari(R == 1)) + lambda/2 * (sum(Theta(:).^2) + sum(X(:).^2));
X_grad = (diff .* R) * Theta + lambda * X;
Theta_grad = (diff .* R)' * X + lambda * Theta;
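The regularized versions can also be verified by hand on a small case. This sketch uses made-up matrices and an arbitrary `lambda = 2`, and also shows how the two gradients are unrolled into the single `grad` vector that cofiCostFunc.m returns.

```octave
% Toy problem (all values made up for illustration).
X = [1 0; 0 1];  Theta = [1 1; 2 0];
Y = [2 0; 1 3];  R = [1 0; 1 1];
lambda = 2;      % arbitrary regularization strength for this example

err = (X * Theta' - Y) .* R;       % errors on rated entries only

% Regularized cost: data term plus L2 penalties on Theta and X.
J = 1/2 * sum(err(:) .^ 2) + lambda / 2 * (sum(Theta(:) .^ 2) + sum(X(:) .^ 2));

% Regularized gradients.
X_grad = err * Theta + lambda * X;
Theta_grad = err' * X + lambda * Theta;

% Unroll both gradients into one column vector.
grad = [X_grad(:); Theta_grad(:)];

fprintf('J = %g\n', J);
```

Here the data term is 5 and the squared entries of `Theta` and `X` sum to 6 and 2, so J = 5 + (2 / 2) * 8 = 13.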

2.3 Learning movie recommendations

2.3.1 Recommendations

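In the script, I filled in my own ratings for some of the movies in movie_list.txt.
The movies provided all seem to have been released before 2000, so I have not seen many of them. I picked the following ones to rate:

The recommender system recommended the following movies to me:

I cannot judge how accurate this is, because I have not seen a single one of them. I looked up a few at random, though, and I suspect I would not like them.
Perhaps the sample of ratings I provided was too small, or perhaps this recommender system is too simple.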
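The recommendation step itself is small once the parameters are learned. The sketch below assumes learned movie features `X`, one user's parameter vector `theta`, and per-movie mean ratings `Ymean` that were subtracted before training; all values and variable names are hypothetical and do not come from the exercise script.

```octave
% Hypothetical learned quantities for 3 movies and 2 features.
X = [1 0; 0 1; 1 1];         % movie features
theta = [0.5 -1];            % parameters learned for one user
Ymean = [3; 4; 2];           % per-movie mean rating (assumed mean normalization)
rated = [true; false; false];% movies this user has already rated

% Predicted rating for every movie, with the mean added back.
pred = X * theta' + Ymean;

% Exclude movies the user already rated, then rank the rest.
pred(rated) = -Inf;
[~, idx] = sort(pred, 'descend');
top = idx(1:2);              % indices of the two best unrated movies

disp(top');
```

With only a handful of ratings from the user, `theta` is determined mostly by the regularization term, so the ranking leans heavily on the mean ratings. That is consistent with the suspicion that a small sample limits the quality of the recommendations.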


Reposted from: https://www.cnblogs.com/EtoDemerzel/p/7919953.html
