Knowledge Distillation: Notes on Related Papers

Published: 2025/3/8

Knowledge distillation papers covered

1. FitNets: Hints for Thin Deep Nets (ICLR 2015)
2. A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning (CVPR 2017)
3. Matching Guided Distillation (ECCV 2020)
4. A Comprehensive Overhaul of Feature Distillation (ICCV 2019)
5. Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons (AAAI 2019)
6. Compressing GANs using Knowledge Distillation
7. GhostNet: More Features from Cheap Operations (CVPR 2020)
8. Data-Free Adversarial Distillation
9. Data-Free Learning of Student Networks (ICCV 2019)

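1. FitNets: Hints for Thin Deep Nets (ICLR 2015)

Goal of the paper:
To train deeper (and thinner) student networks with distillation, a hint is placed at an intermediate layer of the student and compared against the hint from the corresponding layer of the teacher network. This makes training both faster and better.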
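
The hint comparison described above can be sketched as an L2 loss between the teacher's hint feature and the student's feature after a small regressor has mapped it to the teacher's channel width. This is a minimal NumPy sketch, not the paper's implementation: the names `hint_loss` and `regressor_w`, the 1x1-convolution form of the regressor, and all shapes are assumptions made for illustration.

```python
import numpy as np

def hint_loss(teacher_hint, student_guided, regressor_w):
    """L2 hint loss between a teacher feature map and a regressed student feature map.

    teacher_hint:   (C_t, H, W) output of the teacher's hint layer
    student_guided: (C_s, H, W) output of the student's guided layer
    regressor_w:    (C_t, C_s) weights of a 1x1-conv regressor (assumed form)
    """
    # A 1x1 convolution is a linear map over channels applied at every pixel.
    regressed = np.einsum("ts,shw->thw", regressor_w, student_guided)
    return 0.5 * float(np.sum((teacher_hint - regressed) ** 2))

# Toy data: the teacher feature is built so that the regressor fits it exactly.
rng = np.random.default_rng(0)
student_feat = rng.normal(size=(4, 8, 8))
regressor = rng.normal(size=(16, 4))
teacher_feat = np.einsum("ts,shw->thw", regressor, student_feat)
print(hint_loss(teacher_feat, student_feat, regressor))
```

In training, the regressor weights and the student layers up to the guided layer would be optimized together to reduce this loss; the gradient step is left out here.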

Experiments:
Experiments were run on CIFAR-10, CIFAR-100, SVHN, MNIST, and AFLW.

2. A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning (CVPR 2017)

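Goal of the paper:
The paper finds that distillation can be used for:

Fast optimization: the student reaches the target performance with less training time.

Initialization: the distilled weights serve as the initial weights of the model.

Transfer learning: for example, the teacher network is trained to classify cats vs. dogs, while the student network is used to classify horses vs. zebras.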

Main contribution:
1. Proposes a distillation method based on the view that teaching the student the relationship between the features output by different layers works better than teaching it the teacher's outputs directly.
"The student DNN does not necessarily have to learn the intermediate output when the specific question is input but can learn the solution method when a specific type of question is encountered."

Paper content:

1. Defines the FSP matrix to describe the flow (relationship) between two layers.
"The FSP matrix is generated by the features from two layers."

[Figure: network model]

2. Training procedure
First train the student with the FSP loss, then fine-tune the student network on the dataset.
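
The FSP matrix and the stage-one FSP loss described above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions: the function names `fsp_matrix` and `fsp_loss` are hypothetical, each FSP entry is taken as the spatial average of the product of one channel from each layer, and the teacher and student are assumed to have matching channel counts at the chosen layer pairs so that their FSP matrices have the same shape.

```python
import numpy as np

def fsp_matrix(feat_a, feat_b):
    """FSP matrix between two feature maps with the same spatial size.

    feat_a: (C_a, H, W), feat_b: (C_b, H, W)  ->  (C_a, C_b)
    Entry (i, j) is the spatial average of feat_a[i] * feat_b[j].
    """
    _, h, w = feat_a.shape
    return np.einsum("ihw,jhw->ij", feat_a, feat_b) / (h * w)

def fsp_loss(teacher_pairs, student_pairs):
    """Average squared distance between teacher and student FSP matrices.

    Each argument is a list of (feat_a, feat_b) tuples, one per layer pair.
    """
    losses = [
        np.sum((fsp_matrix(*t_pair) - fsp_matrix(*s_pair)) ** 2)
        for t_pair, s_pair in zip(teacher_pairs, student_pairs)
    ]
    return float(np.mean(losses))

# Toy feature maps: 2 channels in, 3 channels out, 3x3 spatial size.
feat_in = np.ones((2, 3, 3))
feat_out = 2.0 * np.ones((3, 3, 3))
print(fsp_matrix(feat_in, feat_out).shape)
print(fsp_loss([(feat_in, feat_out)], [(feat_in, feat_out)]))
```

Stage two (fine-tuning on the dataset) is an ordinary supervised training run starting from the weights produced by minimizing this loss.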

3. Matching Guided Distillation (ECCV 2020)

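Goal of the paper:
Proposes a new method for the mismatch between the dimensions of the teacher's and the student's output features, a mismatch that introduces error when the two are compared. Earlier methods add an extra convolution or an attention module to match the dimensions.

This paper proposes three ways to reduce the number of channels in the teacher's features so that they match the student's, with no need for an added bridge (a 1×1 convolution) to resolve the feature mismatch.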

Paper content:
1. Channel matching
Find a matrix M that relates the features S and T, where
S is the feature output by the pretrained student network and
T is the feature output by the pretrained teacher network:
$S = M T$
$S\in \mathcal{R}^{S \times N},\ M\in \mathcal{R}^{S \times C},\ T\in \mathcal{R}^{C\times N}$

M must also satisfy the following constraints: [formula missing]

2. Channel reduction
Once M has been found, the teacher channels are reduced. There are three reduction methods:
(1) sparse matching

(2) random drop
(3) max pooling
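
Two of the reduction methods listed above can be sketched in NumPy, assuming the matching step has already assigned every teacher channel to one student channel. Everything here is illustrative rather than the paper's code: the `assign` array, the function names, and the reading of "max pooling" as picking the value with the largest absolute value within each group are assumptions. Sparse matching is left out because it depends on how the one-to-one assignment is solved.

```python
import numpy as np

def random_drop(teacher, assign, num_student, rng):
    """Keep one randomly chosen teacher channel per student channel.

    teacher: (C, N) teacher features, assign: (C,) student index per teacher channel.
    """
    reduced = np.empty((num_student, teacher.shape[1]))
    for s in range(num_student):
        group = np.flatnonzero(assign == s)   # teacher channels matched to s
        reduced[s] = teacher[rng.choice(group)]
    return reduced

def abs_max_pool(teacher, assign, num_student):
    """Per position, keep the matched teacher value with the largest magnitude."""
    reduced = np.empty((num_student, teacher.shape[1]))
    for s in range(num_student):
        group = teacher[assign == s]          # (k, N)
        idx = np.abs(group).argmax(axis=0)    # winning row for each position
        reduced[s] = group[idx, np.arange(group.shape[1])]
    return reduced

# Toy example: C = 4 teacher channels, S = 2 student channels, N = 2 positions.
teacher = np.array([[1.0, -5.0], [-3.0, 2.0], [4.0, 0.0], [0.0, -1.0]])
assign = np.array([0, 0, 1, 1])
print(abs_max_pool(teacher, assign, num_student=2))
print(random_drop(teacher, assign, num_student=2, rng=np.random.default_rng(0)))
```

Either function returns an S×N array that can be compared with the student feature directly, which is the point of reducing the teacher instead of adding a bridge layer to the student.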

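Shortcomings: the method starts from a pretrained student model and then fine-tunes it with the teacher. Since M only measures how strongly the two sets of features are related, one could instead operate directly on the teacher's features to find the representative channels.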

4. A Comprehensive Overhaul of Feature Distillation (ICCV 2019)

Goal of the paper:
Designs a distillation method by revisiting four components: the teacher transform, the student transform, the distillation feature position, and the distance function.

Paper content:
Teacher transform: adds a new ReLU activation.
Student transform: adds a 1×1 convolution.
Distillation feature position: before the ReLU (pre-ReLU).
Distance function: proposes a new partial L2 distance.
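
The teacher transform and the distance function listed above can be sketched as follows. This is a sketch from my reading of the paper, not its reference code: the new activation is taken to be a ReLU variant that clamps values at a negative margin instead of zero, and the partial L2 distance is taken to skip positions where the teacher value is non-positive and the student value is already at or below it. The function names and the margin value are hypothetical.

```python
import numpy as np

def margin_relu(teacher_feat, margin):
    """Teacher transform: clamp values below a (negative) margin up to the margin."""
    return np.maximum(teacher_feat, margin)

def partial_l2(teacher_feat, student_feat):
    """Squared L2 distance that ignores positions where the teacher is
    non-positive and the student is already at or below the teacher."""
    skip = (student_feat <= teacher_feat) & (teacher_feat <= 0)
    squared = (teacher_feat - student_feat) ** 2
    return float(np.sum(np.where(skip, 0.0, squared)))

teacher = np.array([-1.0, 2.0, -1.0])
student = np.array([-3.0, 1.0, 0.5])
# Position 0 is skipped; positions 1 and 2 contribute 1.0 and 2.25.
print(partial_l2(teacher, student))
print(margin_relu(np.array([-2.0, 0.5]), margin=-1.0))
```

The skipped case reflects that both features are taken before the ReLU: a student value that is even more negative than a negative teacher value gives the same output after the ReLU, so it is not penalized.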

5. Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons (AAAI 2019)

Offers a new idea: make the activation boundaries of the neurons in the student's layers match those of the teacher's layers as closely as possible.
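
One way to turn this idea into a trainable loss is a hinge-style penalty on the student's pre-activations: where the teacher neuron is active, the student is pushed above a positive margin, and where it is inactive, the student is pushed below the negative margin. The sketch below follows my recollection of the paper's alternative loss; the function name `activation_boundary_loss` and the default margin are assumptions.

```python
import numpy as np

def activation_boundary_loss(teacher_pre, student_pre, margin=1.0):
    """Hinge-style loss on whether student neurons are on/off like the teacher's.

    teacher_pre, student_pre: pre-activation values with the same shape.
    """
    teacher_on = teacher_pre > 0
    # Teacher active: penalize the student for being below +margin.
    push_up = np.maximum(margin - student_pre, 0.0)
    # Teacher inactive: penalize the student for being above -margin.
    push_down = np.maximum(margin + student_pre, 0.0)
    per_neuron = np.where(teacher_on, push_up, push_down)
    return float(np.sum(per_neuron ** 2))

teacher = np.array([1.0, -1.0])
print(activation_boundary_loss(teacher, np.array([2.0, -2.0])))   # same side, beyond margin
print(activation_boundary_loss(teacher, np.array([-1.0, 1.0])))   # opposite side
```

Only the sign of the teacher's pre-activation is used, so the loss transfers where each neuron switches on and off rather than the magnitude of its response.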

6. Compressing GANs using Knowledge Distillation

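The papers above all compress ordinary convolutional networks; this one compresses GANs.

Contributions:
1. Argues that distilling from an over-parameterized teacher network gives better results.
2. Shows throughout that the more parameters the student network has, the better its results.

Paper content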

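7. GhostNet: More Features from Cheap Operations (CVPR 2020)

Main content:
Uses cheap linear operations to generate additional feature maps from existing ones, replacing part of the convolution operations and thereby simplifying the model.

The idea is very simple.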
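
The idea can be sketched in NumPy: start from a small set of intrinsic feature maps (which an ordinary convolution would produce), derive extra "ghost" maps from them with a cheap per-channel 3×3 filter, and concatenate the two sets. This is an illustrative sketch only; the function names, the choice of a 3×3 depthwise filter as the linear operation, and the one-ghost-per-intrinsic ratio are assumptions.

```python
import numpy as np

def depthwise_filter3x3(x, kernels):
    """Apply one 3x3 kernel per channel with zero padding (output size unchanged).

    x: (C, H, W), kernels: (C, 3, 3)
    """
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros(x.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernels[:, dy, dx, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out

def ghost_module(intrinsic, kernels):
    """Concatenate intrinsic maps with ghost maps derived from them."""
    ghosts = depthwise_filter3x3(intrinsic, kernels)
    return np.concatenate([intrinsic, ghosts], axis=0)

rng = np.random.default_rng(0)
intrinsic = rng.normal(size=(4, 5, 5))   # stands in for the output of a small conv
kernels = rng.normal(size=(4, 3, 3))     # one cheap filter per intrinsic map
print(ghost_module(intrinsic, kernels).shape)
```

Each ghost map here costs 9 multiplications per pixel, whereas producing the same map with a full 3×3 convolution over 4 input channels would cost 36, which is where the saving comes from.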

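8. Data-Free Adversarial Distillation

Motivation:

1. The original training data is not available.
2. When training S, use data with representative features (hard samples).
Method:
1. Use G to generate data from random noise, pushing the outputs of S and T as far apart as possible.
2. Train S so that the distance between S and T becomes smaller.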
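
The two steps above optimize the same discrepancy in opposite directions, which can be sketched as a pair of loss functions. This is a minimal sketch under stated assumptions: the mean absolute difference is used as the discrepancy, the teacher and student are stand-in linear maps, and all names (`discrepancy`, `generator_loss`, `student_loss`) are hypothetical. The gradient updates themselves are omitted.

```python
import numpy as np

def discrepancy(teacher_out, student_out):
    """Mean absolute difference between teacher and student outputs."""
    return float(np.mean(np.abs(teacher_out - student_out)))

def generator_loss(teacher_out, student_out):
    """G is trained to maximize the discrepancy, so it minimizes the negative."""
    return -discrepancy(teacher_out, student_out)

def student_loss(teacher_out, student_out):
    """S is trained to minimize the same discrepancy."""
    return discrepancy(teacher_out, student_out)

# Stand-in models: linear maps from a 3-dim sample to 2 output logits.
rng = np.random.default_rng(0)
teacher_w = rng.normal(size=(3, 2))
student_w = rng.normal(size=(3, 2))
samples = rng.normal(size=(8, 3))        # stands in for G(z) with random noise z

t_out, s_out = samples @ teacher_w, samples @ student_w
print(student_loss(t_out, s_out), generator_loss(t_out, s_out))
```

In an actual training loop, the two losses would be minimized in alternation: one or more student updates on freshly generated samples, then a generator update.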

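9. Data-Free Learning of Student Networks (ICCV 2019)

Motivation:
1. Train the student without using the original dataset.

Content:
1. Train G and S iteratively.
2. Learn the distribution of the original dataset so that images can be generated faster.
3. Define three loss functions that constrain the generator so that it produces better images.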
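
The three generator losses can be sketched in NumPy from the teacher's responses to a batch of generated images. The sketch follows my reading of the paper and is not its reference code: a one-hot loss (the teacher should be confident on generated images), an activation loss (the teacher's features should respond strongly), and an information-entropy loss (generated classes should be balanced). The function names and the weights `alpha` and `beta` are hypothetical.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def one_hot_loss(teacher_logits):
    """Cross-entropy between teacher predictions and their own argmax labels."""
    probs = softmax(teacher_logits)
    pseudo = probs.argmax(axis=1)
    return float(-np.mean(np.log(probs[np.arange(len(pseudo)), pseudo])))

def activation_loss(teacher_features):
    """Negative mean L1 norm of teacher features: larger responses give lower loss."""
    return float(-np.mean(np.abs(teacher_features).sum(axis=1)))

def information_entropy_loss(teacher_logits):
    """Negative entropy of the batch-averaged class distribution."""
    mean_probs = softmax(teacher_logits).mean(axis=0)
    return float(np.sum(mean_probs * np.log(mean_probs + 1e-12)))

def generator_loss(teacher_logits, teacher_features, alpha=0.1, beta=5.0):
    """Weighted sum of the three terms (alpha and beta are placeholder weights)."""
    return (one_hot_loss(teacher_logits)
            + alpha * activation_loss(teacher_features)
            + beta * information_entropy_loss(teacher_logits))

logits = np.array([[10.0, 0.0], [0.0, 10.0]])   # confident and class-balanced
features = np.array([[1.0, -2.0], [3.0, 0.0]])
print(one_hot_loss(logits), activation_loss(features), information_entropy_loss(logits))
```

With G trained on this combined loss, S is then trained on the generated images to match the teacher's outputs, and the two updates alternate as in item 1 above.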
