Andrew Ng's Coursera ML, Lesson 2: Summary + Assignment Solutions


Preface

Learn in order to apply, and apply in order to learn: these notes summarize the material to consolidate what was learned and to review the newly introduced concepts.

Table of Contents

  • Preface
  • Table of Contents
  • Main Content
    • The Linear Model
    • Model Selection Criterion
    • Understanding the Cost Function
    • Understanding the Cost Function, Part 2
    • Gradient Descent
    • Understanding Gradient Descent
    • Linear Regression with Gradient Descent
    • Terminology
  • Programming Assignment
    • ex1.m
    • computeCost.m
    • featureNormalize.m
    • gradientDescent.m
    • computeCostMulti.m
    • gradientDescentMulti.m

Main Content

This week's topic: the linear model.

The Linear Model

The first example of supervised learning: predicting house prices from house size.

This problem can be modeled as y = f(x), where y is the house price and x is the size. With f(x) = wx, the weight w is exactly what we want to learn from the data; training produces the model hypothesis h.
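
As a minimal sketch of making a prediction with a trained hypothesis: note that the assignment's hypothesis also includes an intercept term, which is why ex1.m below prepends a column of ones to X. The numbers here are taken from ex1.m (which predicts a profit from a city population of 35,000 rather than a house price), so treat them only as an illustration:

theta = [-3.6303; 1.1664];       % the parameter values ex1.m expects after training
x = 3.5;                         % population of 35,000, in units of 10,000
h = [1, x] * theta;              % hypothesis h(x) = theta(1) + theta(2)*x
fprintf('Predicted profit: %f\n', h * 10000);   % roughly 4520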

Model Selection Criterion

w can be any number, so what makes a particular w a good one?

We obviously want the trained model to predict prices as accurately as possible, so the criterion is to pick the w that minimizes the prediction error.
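
Formally, the course measures this error with the squared-error cost function, the same quantity that computeCost.m below returns. A minimal Octave sketch on made-up toy data:

% Squared-error cost: J(theta) = 1/(2m) * sum((X*theta - y).^2)
X = [1 1; 1 2; 1 3];                 % toy design matrix: intercept column + one feature
y = [1; 2; 3];                       % toy targets
theta = [0; 1];                      % candidate parameters
m = length(y);                       % number of training examples
J = sum((X*theta - y).^2) / (2*m);   % 0 here, since this theta fits the toy data exactly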

Understanding the Cost Function

The cost-function formula on the lecture slide looks fairly abstract and hard to digest at first.

The next slides make it concrete: they show the relationship between the model parameter and the value of the cost function, and how, by plotting the cost as a function of the parameter, we can easily read off the optimal parameter.
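
A sketch of that single-parameter picture on toy data (the intercept is held at 0 so the cost depends on one parameter only; the resulting curve is the bowl shape from the lecture):

X = [1 1; 1 2; 1 3]; y = [1; 2; 3]; m = length(y);
theta1_vals = linspace(-1, 3, 100);
J_vals = zeros(size(theta1_vals));
for k = 1:length(theta1_vals)
    t = [0; theta1_vals(k)];                  % intercept fixed at 0
    J_vals(k) = sum((X*t - y).^2) / (2*m);
end
plot(theta1_vals, J_vals);
xlabel('\theta_1'); ylabel('J(\theta_1)');    % bowl-shaped curve, minimum at theta1 = 1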

Understanding the Cost Function, Part 2

With two parameters, the cost function becomes a surface; the slides show this surface, along with the fitted line and the corresponding contour plot once the optimal parameters are found.

Gradient Descent

Now that we have a selection criterion, how do we actually search for the optimal parameters?

The slides illustrate the search graphically: by repeatedly stepping downhill along the gradient, the model eventually arrives at the optimal parameters.

The gradient descent update rule.
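
In symbols, every step updates all parameters simultaneously: theta_j := theta_j - alpha * dJ(theta)/dtheta_j. A minimal Octave sketch of one simultaneous update on toy data (this is the vectorized form used later in gradientDescentMulti.m):

X = [1 1; 1 2; 1 3]; y = [1; 2; 3]; m = length(y);
theta = [0; 0];
alpha = 0.1;
% One simultaneous update of all parameters:
% theta_j := theta_j - alpha * (1/m) * sum((X*theta - y) .* X(:,j))
grad  = (X' * (X*theta - y)) / m;   % gradient of the squared-error cost
theta = theta - alpha * grad;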

Understanding Gradient Descent

The gradient descent algorithm can be summarized as on the slide.
The intuitive picture explains why, wherever the current point happens to be, a gradient step moves it toward the optimum.
The slides also show the influence of the key hyperparameter, the learning rate alpha: too small and convergence is slow, too large and the updates can overshoot or even diverge.
Finally, as gradient descent approaches a minimum the gradient itself shrinks, so the steps automatically become smaller and alpha does not need to be reduced over time.
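
One way to see alpha's effect is to plot the cost per iteration for different learning rates, which is what the multi-variable part of the assignment asks for. A self-contained toy sketch (the data and the two alpha values are made up for illustration):

X = [1 1; 1 2; 1 3]; y = [1; 2; 3]; m = length(y);
num_iters = 50;
styles = {'-b', '-r'};
alphas = [0.01, 0.1];
for k = 1:numel(alphas)
    alpha = alphas(k);
    theta = [0; 0];
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        theta = theta - alpha * (X' * (X*theta - y)) / m;   % one gradient step
        J_history(iter) = sum((X*theta - y).^2) / (2*m);    % record the cost
    end
    plot(1:num_iters, J_history, styles{k}); hold on;
end
xlabel('iteration'); ylabel('cost J');
legend('\alpha = 0.01 (converges slowly)', '\alpha = 0.1 (converges quickly)');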

Linear Regression with Gradient Descent


To wrap up the theory for this chapter: the gradient descent algorithm combined with the linear regression model, and the slide showing what the gradient descent updates look like when applied specifically to the linear model.
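
Concretely, plugging the linear hypothesis h(x) = theta_0 + theta_1 * x into the update rule gives (repeat until convergence, updating both parameters simultaneously):

theta_0 := theta_0 - alpha * (1/m) * sum_i (h(x^(i)) - y^(i))
theta_1 := theta_1 - alpha * (1/m) * sum_i (h(x^(i)) - y^(i)) * x^(i)

This is exactly what the inner loop of gradientDescent.m below accumulates in error_0 and error_1.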

Terminology

"Batch" means that every gradient descent step uses all of the training examples.

Programming Assignment

ex1.m

%% Machine Learning Online Class - Exercise 1: Linear Regression

%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  linear exercise. You will need to complete the following functions
%  in this exericse:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%
%  x refers to the population size in 10,000s
%  y refers to the profit in $10,000s
%

%% Initialization
clear ; close all; clc

%% ==================== Part 1: Basic Function ====================
% Complete warmUpExercise.m
fprintf('Running warmUpExercise ... \n');
fprintf('5x5 Identity Matrix: \n');
warmUpExercise()

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: Cost and Gradient descent ===================

X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1); % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

fprintf('\nTesting the cost function ...\n')
% compute and display initial cost
J = computeCost(X, y, theta);
fprintf('With theta = [0 ; 0]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 32.07\n');

% further testing of the cost function
J = computeCost(X, y, [-1 ; 2]);
fprintf('\nWith theta = [-1 ; 2]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 54.24\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

fprintf('\nRunning Gradient Descent ...\n')
% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('Theta found by gradient descent:\n');
fprintf('%f\n', theta);
fprintf('Expected theta values (approx)\n');
fprintf(' -3.6303\n  1.1664\n\n');

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 4: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost(X, y, t);
    end
end

% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 15 contours spaced logarithmically between 0.01 and 100
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1'); zlabel('J value')
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);

computeCost.m

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
predict = X * theta;        % predictions for all training examples
error = predict - y;        % residuals
J = sum(error.^2) / (2*m);  % squared-error cost

% =========================================================================

end
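
A quick sanity check, using the expected values that ex1.m prints (assumes ex1data1.txt from the assignment is on the path):

data = load('ex1data1.txt');
m = size(data, 1);
X = [ones(m, 1), data(:, 1)];
y = data(:, 2);
computeCost(X, y, [0; 0])    % should be approximately 32.07
computeCost(X, y, [-1; 2])   % should be approximately 54.24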

featureNormalize.m

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%
for i = 1:size(X, 2)
    mu(i) = mean(X(:, i));                   % mean of feature i
    sigma(i) = std(X(:, i));                 % standard deviation of feature i
    X_norm(:, i) = X_norm(:, i) - mu(i);     % center the feature
    X_norm(:, i) = X_norm(:, i) / sigma(i);  % scale to unit standard deviation
end

% ============================================================

end
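
An equivalent, loop-free version, relying on the fact that mean and std operate column-wise on matrices; a sketch, not required by the assignment (the elementwise form needs Octave or MATLAB R2016b+ for implicit broadcasting; older MATLAB would need bsxfun):

mu = mean(X);                 % 1 x n row vector of column means
sigma = std(X);               % 1 x n row vector of column standard deviations
X_norm = (X - mu) ./ sigma;   % broadcast across rows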

gradientDescent.m

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    error_0 = 0;
    error_1 = 0;
    for i = 1:m
        % accumulate the gradient contribution of each training example
        error_0 = error_0 + (X(i,:)*theta - y(i)) * X(i,1);
        error_1 = error_1 + (X(i,:)*theta - y(i)) * X(i,2);
    end
    % simultaneous update of both parameters
    theta(1) = theta(1) - alpha * error_0 / m;
    theta(2) = theta(2) - alpha * error_1 / m;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
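
The per-parameter loop above is written for the single-feature case. A vectorized sketch that performs the same simultaneous update and generalizes to any number of features (essentially what gradientDescentMulti.m below does):

for iter = 1:num_iters
    % simultaneous update of all parameters in one line
    theta = theta - (alpha/m) * (X' * (X*theta - y));
    J_history(iter) = computeCost(X, y, theta);
end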

computeCostMulti.m

function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
J = 1/(2*m) * (X*theta - y)' * (X*theta - y);   % fully vectorized squared-error cost

% =========================================================================

end

gradientDescentMulti.m

function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %
    error = zeros(size(X, 2), 1);
    for i = 1:m
        % accumulate the gradient over all training examples
        error = error + (X(i,:)*theta - y(i)) * X(i,:)';
    end
    theta = theta - alpha * error / m;   % simultaneous update of all parameters

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);

end

end
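
For context, a sketch of how the assignment's multi-variable driver script (ex1_multi.m, not reproduced here) wires these functions together; the file name ex1data2.txt, the column layout, alpha, and the iteration count follow my recollection of the assignment defaults, so treat them as assumptions:

data = load('ex1data2.txt');             % house size, number of bedrooms, price
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

[X, mu, sigma] = featureNormalize(X);    % scale features before gradient descent
X = [ones(m, 1), X];                     % add intercept term

alpha = 0.01;
num_iters = 400;
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

plot(1:numel(J_history), J_history);     % convergence curve
xlabel('Number of iterations'); ylabel('Cost J');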
