Deep Learning Notes: Image Style Transfer implemented in PyTorch (an approach not based on autoencoders or domain-adversarial methods)
Contents
- Paper link:
- Main idea:
- PyTorch implementation:
- Computing the content loss:
- Computing the style loss:
- Computing the total loss:
- Training process:
Paper link:
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf
The author tackles this problem with what is essentially a traditional optimization method wearing a deep-learning skin. The technique itself is not one to emulate today, but the idea behind it is brilliant.
Main idea:
1. Find a way to separate style features from content features.
2. How to get the content features? The author takes a shortcut and simply uses the conv4_2 feature map from VGG19 as the content features.
3. How to get the style features? Here the author's shortcut is ingenious: style is treated as the correlations between feature maps. He takes the feature maps of the five layers conv1_1, conv2_1, conv3_1, conv4_1, and conv5_1 in VGG19 to compute style correlations, and, having observed that different layers contribute differently to style (earlier layers capture fine textures, later layers coarser structure), assigns each of these five layers' losses its own weight (see the feature-extraction sketch in the implementation section below).
4. So what is the objective? The new image should match the content image in content and the style image in style: $L_{total} = L_{content} + L_{style}$. Resemblance comes first and style second, so $L_{content}$ should carry the larger weight.
PyTorch implementation:
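First, a minimal sketch of the feature-extraction setup. The helper name `get_features`, the layer indices into `vgg19.features`, and the per-layer style weights are my assumptions to make the later training loop self-contained; the weight values are illustrative, not prescribed by the paper.

```python
import torch
from torchvision import models

# load a pretrained VGG19 and freeze it: only the generated image is optimized
vgg = models.vgg19(pretrained=True).features
for param in vgg.parameters():
    param.requires_grad_(False)

def get_features(image, model):
    """Collect the feature maps used for content (conv4_2)
    and style (conv1_1 ... conv5_1)."""
    # indices of the relevant conv layers inside vgg19.features
    layers = {'0': 'conv1_1', '5': 'conv2_1', '10': 'conv3_1',
              '19': 'conv4_1', '21': 'conv4_2', '28': 'conv5_1'}
    features = {}
    x = image
    for name, layer in model._modules.items():
        x = layer(x)
        if name in layers:
            features[layers[name]] = x
    return features

# per-layer style weights: earlier layers (finer textures) weighted more
# heavily; these particular values are illustrative
style_weights = {'conv1_1': 1.0, 'conv2_1': 0.75, 'conv3_1': 0.2,
                 'conv4_1': 0.2, 'conv5_1': 0.2}
```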
Computing the content loss:
As mentioned above, the author simply uses the conv4_2 features of VGG19 as the content features, so computing the content loss only requires comparing the two feature maps.
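A minimal sketch, assuming `target_features` and `content_features` are dictionaries returned by a helper like `get_features` above:

```python
# mean squared error between the conv4_2 feature maps of the
# generated image and the content image
content_loss = torch.mean((target_features['conv4_2'] -
                           content_features['conv4_2'])**2)
```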
Computing the style loss:
```python
def gram_matrix(tensor):
    """Calculate the Gram matrix of a given tensor.
    Gram matrix: https://en.wikipedia.org/wiki/Gramian_matrix
    """
    # get the batch size, depth, height, and width of the tensor
    _, d, h, w = tensor.size()
    # reshape so we're multiplying the features for each channel
    tensor = tensor.view(d, h * w)
    # calculate the gram matrix
    gram = torch.mm(tensor, tensor.t())
    return gram
```
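The Gram matrices of the style image only need to be computed once and can be cached. The names `style`, `style_features`, and `style_grams` below are assumptions kept consistent with the training loop later:

```python
# run the style image through VGG19 once and cache its Gram matrices
style_features = get_features(style, vgg)
style_grams = {layer: gram_matrix(style_features[layer])
               for layer in style_features}
```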
Computing the total loss:
```python
content_weight = 1    # alpha
style_weight = 1e6    # beta

# calculate the *total* loss
total_loss = content_weight * content_loss + style_weight * style_loss
```

Training process:
Some may ask: if the VGG parameters are frozen, what parameters are we optimizing? The parameters are the new image itself. The loss above is used to adjust the image's pixel values directly, which is a bit different from the training we usually see.
The new image evolves gradually from the original: copy the content image and make the copy trainable:
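A minimal sketch, assuming `content` is the preprocessed content-image tensor:

```python
# start from a copy of the content image and let its pixels be optimized
target = content.clone().requires_grad_(True)
```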
The training code looks roughly like this:
```python
from torch import optim

# for displaying the target image, intermittently
show_every = 400

# iteration hyperparameters
optimizer = optim.Adam([target], lr=0.003)
steps = 2000  # decide how many iterations to update your image (5000)

for ii in range(1, steps+1):
    # get the features from your target image
    target_features = get_features(target, vgg)

    # the content loss
    content_loss = torch.mean((target_features['conv4_2'] -
                               content_features['conv4_2'])**2)

    # the style loss
    # initialize the style loss to 0
    style_loss = 0
    # then add to it for each layer's gram matrix loss
    for layer in style_weights:
        # get the "target" style representation for the layer
        target_feature = target_features[layer]
        target_gram = gram_matrix(target_feature)
        _, d, h, w = target_feature.shape
        # get the "style" style representation
        style_gram = style_grams[layer]
        # the style loss for one layer, weighted appropriately
        layer_style_loss = style_weights[layer] * torch.mean(
            (target_gram - style_gram)**2)
        # add to the style loss
        style_loss += layer_style_loss / (d * h * w)

    # calculate the *total* loss
    total_loss = content_weight * content_loss + style_weight * style_loss

    # update your target image
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
```