Derivative of the Cross-Entropy Loss

Published: 2025/4/5


1. The input is the vector $z=[z_{1},z_{2},...,z_{n}]$ with dimension (1,n). The output is $s=[\frac{e^{z_{1}}}{\sum_{k=1}^{n}e^{z_{k}}},\frac{e^{z_{2}}}{\sum_{k=1}^{n}e^{z_{k}}},...,\frac{e^{z_{n}}}{\sum_{k=1}^{n}e^{z_{k}}}]$, also with dimension (1,n).
2. Each component of s comes from the softmax function: $s_{i}=\frac{e^{z_{i}}}{\sum_{k=1}^{n}e^{z_{k}}}$
3. The softmax loss L is defined as $L=-\sum_{i=1}^{n}y_{i}\ln \left ( s_{i}\right )$. L is a scalar with dimension (1,1).
Here the vector y is the model's label. It also has dimension (1,n), is a known quantity, and is usually in one-hot form.
Assume class j is the correct one. Then y=[0,0,…,1,…,0], where only $y_{j}=1$ and $y_{i}=0$ for every $i\neq j$, so the sum collapses to a single term:
$L=-y_{j}\ln \left ( s_{j}\right )=-\ln \left ( s_{j}\right )$
The goal is the derivative of the scalar L with respect to the vector z, $\frac{\partial L}{\partial z}$.
By the chain rule, $\frac{\partial L}{\partial z}=\frac{\partial L}{\partial s}\cdot\frac{\partial s}{\partial z}$
where s and z are both vectors of dimension (1,n).
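The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original derivation: the function names `softmax` and `cross_entropy` and the example values are assumptions, and subtracting `max(z)` before exponentiating is a common numerical-stability trick that leaves s unchanged.

```python
import math

def softmax(z):
    """Return s with s_i = exp(z_i) / sum_k exp(z_k)."""
    # Assumption: shift by max(z) to avoid overflow; the ratio is unchanged.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(s, y):
    """Return L = -sum_i y_i * ln(s_i), skipping terms where y_i is 0."""
    return -sum(yi * math.log(si) for si, yi in zip(s, y) if yi != 0)

z = [1.0, 2.0, 3.0]   # illustrative input vector
y = [0.0, 0.0, 1.0]   # one-hot label: the last class is the correct one
s = softmax(z)
L = cross_entropy(s, y)
```

With a one-hot label, `L` equals `-math.log(s[2])` here, which matches the collapsed form $L=-\ln(s_{j})$.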

$\frac{\partial L}{\partial s}=[0,0,...,-\frac{1}{s_{j}},0,...,0]$, dim=[1*n]

$\frac{\partial s}{\partial z}$ is the matrix below, dim=[n*n]. The entry in row i, column k is $\frac{\partial s_{i}}{\partial z_{k}}$, which equals $s_{i}(1-s_{i})$ on the diagonal and $-s_{i}s_{k}$ elsewhere.

$$\frac{\partial s}{\partial z}=\begin{bmatrix}
s_{1}(1-s_{1}) & -s_{1}s_{2} & -s_{1}s_{3} & \cdots & -s_{1}s_{j} & \cdots & -s_{1}s_{n} \\
-s_{2}s_{1} & s_{2}(1-s_{2}) & -s_{2}s_{3} & \cdots & -s_{2}s_{j} & \cdots & -s_{2}s_{n} \\
-s_{3}s_{1} & -s_{3}s_{2} & s_{3}(1-s_{3}) & \cdots & -s_{3}s_{j} & \cdots & -s_{3}s_{n} \\
\vdots & \vdots & \vdots & \ddots & \vdots & & \vdots \\
-s_{j}s_{1} & -s_{j}s_{2} & -s_{j}s_{3} & \cdots & s_{j}(1-s_{j}) & \cdots & -s_{j}s_{n} \\
\vdots & \vdots & \vdots & & \vdots & \ddots & \vdots \\
-s_{n}s_{1} & -s_{n}s_{2} & -s_{n}s_{3} & \cdots & -s_{n}s_{j} & \cdots & s_{n}(1-s_{n})
\end{bmatrix}$$
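As a sanity check on this matrix, the sketch below builds it from the entry formula $s_{i}(\delta_{ik}-s_{k})$ and compares it against central finite differences. The helper names `softmax_jacobian` and `numeric_jacobian`, the step size, and the test vector are all illustrative assumptions.

```python
import math

def softmax(z):
    m = max(z)  # shift for numerical stability; result is unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(s):
    """Analytic Jacobian: J[i][k] = s_i * (1 - s_i) if i == k else -s_i * s_k."""
    n = len(s)
    return [[s[i] * ((1.0 if i == k else 0.0) - s[k]) for k in range(n)]
            for i in range(n)]

def numeric_jacobian(z, eps=1e-6):
    """Central-difference estimate of J[i][k] = d s_i / d z_k."""
    n = len(z)
    J = [[0.0] * n for _ in range(n)]
    for k in range(n):
        z_plus, z_minus = list(z), list(z)
        z_plus[k] += eps
        z_minus[k] -= eps
        s_plus, s_minus = softmax(z_plus), softmax(z_minus)
        for i in range(n):
            J[i][k] = (s_plus[i] - s_minus[i]) / (2 * eps)
    return J

z = [0.5, -1.0, 2.0]  # illustrative input
s = softmax(z)
J = softmax_jacobian(s)
J_num = numeric_jacobian(z)
max_err = max(abs(J[i][k] - J_num[i][k]) for i in range(3) for k in range(3))
```

Each row of `J` sums to zero up to rounding, since the components of s always sum to 1, and `max_err` should be tiny if the analytic entries are right.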

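Multiply the [1*n] row vector $\frac{\partial L}{\partial s}$ on the left of the [n*n] matrix $\frac{\partial s}{\partial z}$. Because only the j-th entry of $\frac{\partial L}{\partial s}$ is nonzero, the product is row j of the matrix scaled by $-\frac{1}{s_{j}}$: for $i\neq j$ the entry is $-\frac{1}{s_{j}}\cdot(-s_{j}s_{i})=s_{i}$, and for $i=j$ it is $-\frac{1}{s_{j}}\cdot s_{j}(1-s_{j})=s_{j}-1$.

$$\frac{\partial L}{\partial z}=\frac{\partial L}{\partial s}\cdot\frac{\partial s}{\partial z}=[s_{1},s_{2},...,s_{j}-1,...,s_{n}]=s-y$$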
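The result $s-y$ can be checked numerically. The sketch below computes the gradient three ways: through the explicit vector-matrix product, through the closed form, and through central finite differences on the loss. The variable names and test values are assumptions, and `j` is a zero-based index here, unlike the one-based index in the text.

```python
import math

def softmax(z):
    m = max(z)  # shift for numerical stability; result is unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def loss(z, j):
    """L = -ln(s_j) for a one-hot label whose correct class is j."""
    return -math.log(softmax(z)[j])

z = [0.2, 1.5, -0.3, 0.9]  # illustrative input
j = 1                      # zero-based index of the correct class
n = len(z)
s = softmax(z)
y = [1.0 if i == j else 0.0 for i in range(n)]

# Chain rule: row vector dL/ds times the n-by-n softmax Jacobian.
dL_ds = [-(y[i] / s[i]) for i in range(n)]
jac = [[s[i] * ((1.0 if i == k else 0.0) - s[k]) for k in range(n)]
       for i in range(n)]
grad_chain = [sum(dL_ds[i] * jac[i][k] for i in range(n)) for k in range(n)]

# Closed form from the derivation.
grad_closed = [s[i] - y[i] for i in range(n)]

# Central finite differences on the scalar loss.
eps = 1e-6
grad_num = []
for k in range(n):
    z_plus, z_minus = list(z), list(z)
    z_plus[k] += eps
    z_minus[k] -= eps
    grad_num.append((loss(z_plus, j) - loss(z_minus, j)) / (2 * eps))
```

In practice the closed form is what one would implement, since it avoids building the full Jacobian; the other two computations are only there to confirm it.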

