

Cost Function


First, a question that bothered me for a long time: in Andrew Ng's machine learning course, logistic regression uses the cross-entropy loss function, yet when deriving its gradient, every slide and tutorial I googled just differentiates the log directly. This confused me until I learned that in foreign (English-language) materials, log is taken to mean ln by default. So whenever you see log in machine learning, mentally convert it to ln and read it that way.

How do we fit the parameters θ for logistic regression? In particular, I'd like to define the optimization objective, or the cost function, that we'll use to fit the parameters.

Here's the supervised learning problem of fitting a logistic regression model.

x is an (n+1)-dimensional feature vector, h_θ(x) is the hypothesis, and the parameters of the hypothesis are the θ over here.
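Concretely, the hypothesis is the sigmoid function applied to the linear score θᵀx:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$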

Because this is a classification problem, our training set has the property that every label y is either 0 or 1.

Back when we were developing the linear regression model, we used the following cost function.
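That is, the average of the squared errors:

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$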

Now, this cost function worked fine for linear regression, but here we're interested in logistic regression.

For logistic regression, however, the hypothesis is highly nonlinear, and this squared-error cost would be a non-convex function of the parameters θ. Here is what I mean by non-convex. We have some cost function J(θ), and for logistic regression the function h in it has a nonlinearity, right? It is the sigmoid function, so it's a pretty complicated nonlinear function. And if you take the sigmoid function and plug it in here, J(θ) looks like:

J(θ) can look like a function with many local optima, and the formal term for this is that it is a non-convex function. If you were to run gradient descent on this sort of function, it is not guaranteed to converge to the global minimum. Whereas, in contrast, what we would like is a cost function J(θ) that is convex, a single bow-shaped function, so that if you run gradient descent, you are guaranteed to converge to the global minimum. The problem with using the squared cost function is that, because of the very nonlinear sigmoid function that appears in the middle, J(θ) ends up being non-convex if you define it as the squared cost function. So what we would like to do instead is come up with a different cost function that is convex, so that we can apply an algorithm like gradient descent and be guaranteed to find the global minimum.
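As a minimal sketch (the 1-D toy data here is hypothetical), evaluating the squared-error cost with a sigmoid hypothesis over a grid of θ values makes this shape visible; the resulting curve has saturated flat regions and is not bowl-shaped, so gradient descent carries no global-minimum guarantee:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy 1-D data: a single parameter theta, labels in {0, 1}.
x = np.array([-3.0, -1.0, 0.5, 2.0, 4.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 1.0])

def squared_error_cost(theta):
    """Squared-error cost with a sigmoid hypothesis: non-convex in theta."""
    h = sigmoid(theta * x)
    return np.mean(0.5 * (h - y) ** 2)

thetas = np.linspace(-10, 10, 401)
costs = [squared_error_cost(t) for t in thetas]
# Plotting (thetas, costs) shows saturated plateaus and non-convex curvature,
# not the single bow shape that gradient descent needs.
```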

------------------------------ Logistic Regression ------------------

Here is the cost function that we're going to use for logistic regression:

$$\mathrm{Cost}(h_\theta(x),\, y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$$

This is the cost, or the penalty, that the algorithm pays.

When y = 1, the cost is −log(h_θ(x)): the curve starts at +∞ as h_θ(x) → 0 and falls to 0 at h_θ(x) = 1.

Now, this cost function has a few interesting and desirable properties. First, you notice that:

If the prediction is exactly right, the cost is 0; if the prediction is confidently wrong, the cost blows up toward infinity.

First, notice that if y = 1 and h_θ(x) = 1, in other words, if the hypothesis exactly predicts 1 and y is exactly equal to what it predicted, then the cost is 0 (in the case that y equals 1 here).

If h_θ(x) is equal to 1, the cost down here is equal to 0, and that's what we'd like it to be, because if we correctly predict the output y, then the cost is 0.

But now notice also that as h_θ(x) approaches 0, that is, as the output of the hypothesis approaches 0, the cost blows up and goes to infinity. What this does is capture the intuition: if the hypothesis outputs 0, it is saying that the probability that y = 1 is 0. That is like telling a patient, "the probability that your tumor is malignant, that y = 1, is 0; it is completely impossible that your tumor is malignant." If the patient's tumor then turns out to be malignant after all, that is, y = 1, even though we told him with absolute certainty that it could not happen, then we penalize the learning algorithm with a very, very large cost. That is what this curve expresses: the cost goes to infinity when y = 1 but h_θ(x) = 0.

------------------ The above covers the case y = 1.

What does the cost function look like when y = 0?

If y turns out to be equal to 0, but we predicted that y equals 1 with almost complete certainty, with probability 1, then we end up paying a very large cost.

Conversely, if h_θ(x) = 0 and y = 0, then the hypothesis nailed it: the predicted y is 0 and it turns out y is indeed 0, so at this point the cost is 0 (the origin in the figure above).
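A minimal sketch of this per-example cost; evaluating it at a few points reproduces both behaviors, cost near 0 when the prediction matches and cost exploding when a confident prediction is wrong:

```python
import numpy as np

def cost(h, y):
    """Per-example logistic cost: -log(h) if y == 1, -log(1 - h) if y == 0."""
    eps = 1e-15                     # clip so log(0) never occurs
    h = np.clip(h, eps, 1 - eps)
    return -np.log(h) if y == 1 else -np.log(1 - h)

print(cost(0.999, 1))  # ~0: confident and correct
print(cost(0.001, 1))  # ~6.9 and growing toward infinity: confident and wrong
print(cost(0.001, 0))  # ~0: confident and correct
```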

------------------------- The above defines the cost function for a single training example. The cost function we have chosen gives us a convex optimization problem: the overall cost function J(θ) will be convex and free of local optima.

We now take this cost function for a single training example and develop it further to define the cost function for the entire training set.
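Averaged over all m examples, and collapsed into a single expression (which works because y is always either 0 or 1), the cost is:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[\,y^{(i)}\log h_\theta(x^{(i)}) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]$$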

Conjugate gradient, BFGS, and L-BFGS are examples of more sophisticated optimization algorithms. They need a way to compute J(θ) and a way to compute the derivatives, and can then use more sophisticated strategies than gradient descent to minimize the cost function.
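As a sketch of that interface in Python (the toy X and y here are hypothetical; X carries a leading column of ones for the intercept term), you supply a function that returns J(θ) and its gradient, and the optimizer handles the minimization strategy itself:

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(theta, X, y):
    """Return J(theta) and its gradient for logistic regression."""
    h = np.clip(sigmoid(X @ theta), 1e-15, 1 - 1e-15)
    m = len(y)
    J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    grad = X.T @ (h - y) / m
    return J, grad

# Hypothetical toy data: first column of ones is the intercept feature.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 3.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

result = minimize(cost_and_grad, x0=np.zeros(X.shape[1]),
                  args=(X, y), jac=True, method="L-BFGS-B")
print(result.x)  # fitted theta
```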

The details of exactly what these three algorithms do are well beyond the scope of this course.

-------- How to get logistic regression to work for multi-class classification problems --- an algorithm called one-versus-all classification.

What is a multi-class classification problem? One where the label y can take on more than two values, for example y ∈ {1, 2, 3, 4} when tagging email into folders such as work, friends, family, and hobby. The one-versus-all approach is sketched below.
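A minimal one-versus-all sketch (function names are my own; `cost_and_grad` is repeated from the previous sketch): train one binary classifier per class, then predict the class whose classifier is most confident:

```python
import numpy as np
from scipy.optimize import minimize

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grad(theta, X, y):
    """J(theta) and gradient for binary logistic regression (as above)."""
    h = np.clip(sigmoid(X @ theta), 1e-15, 1 - 1e-15)
    J = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    return J, X.T @ (h - y) / len(y)

def fit_one_vs_all(X, y, num_classes):
    """Train one classifier per class: class k relabeled to 1, all others to 0."""
    thetas = []
    for k in range(num_classes):
        yk = (y == k).astype(float)
        res = minimize(cost_and_grad, np.zeros(X.shape[1]),
                       args=(X, yk), jac=True, method="L-BFGS-B")
        thetas.append(res.x)
    return np.array(thetas)  # shape: (num_classes, n + 1)

def predict_one_vs_all(thetas, X):
    """Predict the class whose classifier outputs the highest h_theta(x)."""
    return np.argmax(sigmoid(X @ thetas.T), axis=1)
```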


For the derivation of the logistic regression loss function, and of the derivative of the loss function, see:

Remember a few formulas:

$$\sigma(z) = \frac{1}{1+e^{-z}}, \qquad \sigma'(z) = \sigma(z)\left(1-\sigma(z)\right), \qquad \frac{d}{dx}\ln x = \frac{1}{x}$$

With these formulas and their properties understood, the derivative can be worked out.

As noted at the beginning, log means ln by default, so there is nothing puzzling about the derivation that follows.
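For reference, here is where that derivation lands. Using σ'(z) = σ(z)(1 − σ(z)) and h_θ(x) = σ(θᵀx), the partial derivatives of J(θ) come out to:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$

which is formally identical to the linear regression gradient, just with the sigmoid hypothesis substituted in.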
