Transposed Convolution
The layers we introduced so far for convolutional neural networks, including convolutional layers and pooling layers, typically reduce the input width and height, or keep them unchanged. However, applications such as semantic segmentation and generative adversarial networks require predicting values for each pixel, and therefore need to increase the input width and height. Transposed convolution, also named fractionally-strided convolution or deconvolution, serves this purpose.
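As a quick sanity check of this size relationship, the output length of a transposed convolution along one dimension can be sketched as follows (a minimal helper; the name trans_conv_out_size is ours, but the formula is the one frameworks such as Gluon use):

```python
def trans_conv_out_size(n_in, kernel_size, padding=0, stride=1):
    # Inverse of the usual convolution size formula:
    # a convolution maps n -> (n + 2*padding - kernel_size) // stride + 1,
    # a transposed convolution maps n -> (n - 1) * stride - 2*padding + kernel_size.
    return (n_in - 1) * stride - 2 * padding + kernel_size

print(trans_conv_out_size(2, 2))                       # 2 -> 3
print(trans_conv_out_size(6, 5, padding=2, stride=3))  # 6 -> 16
```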
from mxnet import np, npx, init
from mxnet.gluon import nn
from d2l import mxnet as d2l
npx.set_np()
- Basic 2D Transposed Convolution
Let us consider a basic case: both input and output channels are 1, with padding 0 and stride 1. Fig. 1 illustrates how a transposed convolution with a 2×2 kernel is computed on a 2×2 input matrix.
Fig. 1. Transposed convolution layer with a 2×2 kernel.
We can implement this operation by giving the matrix kernel K and the matrix input X.
def trans_conv(X, K):
    h, w = K.shape
    Y = np.zeros((X.shape[0] + h - 1, X.shape[1] + w - 1))
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Y[i: i + h, j: j + w] += X[i, j] * K
    return Y
A convolution computes its result by Y[i, j] = (X[i: i + h, j: j + w] * K).sum(), which summarizes input values through the kernel. A transposed convolution instead broadcasts input values through the kernel, which results in a larger output.
X = np.array([[0, 1], [2, 3]])
K = np.array([[0, 1], [2, 3]])
trans_conv(X, K)
array([[ 0.,  0.,  1.],
       [ 0.,  4.,  6.],
       [ 4., 12.,  9.]])
Alternatively, we can obtain the same result with nn.Conv2DTranspose. As with nn.Conv2D, both input and kernel should be four-dimensional tensors.
X, K = X.reshape(1, 1, 2, 2), K.reshape(1, 1, 2, 2)
tconv = nn.Conv2DTranspose(1, kernel_size=2)
tconv.initialize(init.Constant(K))
tconv(X)
array([[[[ 0.,  0.,  1.],
         [ 0.,  4.,  6.],
         [ 4., 12.,  9.]]]])
- Padding, Strides, and Channels
In a convolution, we apply padding elements to the input, while in a transposed convolution they are applied to the output. A 1×1 padding means we first compute the output as normal, then remove the first/last rows and columns.
tconv = nn.Conv2DTranspose(1, kernel_size=2, padding=1)
tconv.initialize(init.Constant(K))
tconv(X)
array([[[[4.]]]])
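To see that padding indeed trims the full output, we can reproduce the check in plain NumPy (a sketch that restates the trans_conv definition above so the snippet runs standalone):

```python
import numpy as np

def trans_conv(X, K):
    h, w = K.shape
    Y = np.zeros((X.shape[0] + h - 1, X.shape[1] + w - 1))
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Y[i:i + h, j:j + w] += X[i, j] * K
    return Y

X = np.array([[0.0, 1.0], [2.0, 3.0]])
K = np.array([[0.0, 1.0], [2.0, 3.0]])
full = trans_conv(X, K)      # the 3x3 output without padding
cropped = full[1:-1, 1:-1]   # padding=1 removes the first/last rows and columns
print(cropped)               # [[4.]]
```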
Similarly, strides are applied to the output as well.
tconv = nn.Conv2DTranspose(1, kernel_size=2, strides=2)
tconv.initialize(init.Constant(K))
tconv(X)
array([[[[0., 0., 0., 1.],
         [0., 0., 2., 3.],
         [0., 2., 0., 3.],
         [4., 6., 6., 9.]]]])
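The strided case can be mimicked in plain NumPy as well: each input element is scattered to an output window whose offset is scaled by the stride (a hypothetical trans_conv_stride helper, not part of the book's code):

```python
import numpy as np

def trans_conv_stride(X, K, stride=1):
    h, w = K.shape
    Y = np.zeros(((X.shape[0] - 1) * stride + h,
                  (X.shape[1] - 1) * stride + w))
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            # Scatter X[i, j] * K into a window offset by the stride
            Y[i * stride:i * stride + h, j * stride:j * stride + w] += X[i, j] * K
    return Y

X = np.array([[0.0, 1.0], [2.0, 3.0]])
K = np.array([[0.0, 1.0], [2.0, 3.0]])
Y = trans_conv_stride(X, K, stride=2)
print(Y)
```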
X = np.random.uniform(size=(1, 10, 16, 16))
conv = nn.Conv2D(20, kernel_size=5, padding=2, strides=3)
tconv = nn.Conv2DTranspose(10, kernel_size=5, padding=2, strides=3)
conv.initialize()
tconv.initialize()
tconv(conv(X)).shape == X.shape
True
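Why the shapes round-trip can be seen from the size formulas; the round-trip is exact when (n + 2p - k) is divisible by s, as it is here:

```python
# Convolution and transposed convolution size formulas for one dimension,
# with kernel k, padding p, stride s (the shapes used in the example above).
n, k, p, s = 16, 5, 2, 3
conv_out = (n + 2 * p - k) // s + 1         # convolution: 16 -> 6
tconv_out = (conv_out - 1) * s - 2 * p + k  # transposed convolution: 6 -> 16
print(conv_out, tconv_out)                  # 6 16
```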
- Analogy to Matrix Transposition
Transposed convolution takes its name from matrix transposition. In fact, convolution can also be implemented through matrix multiplication. In the example below, we define a 3×3 input X and a 2×2 kernel K, then use corr2d to compute the convolution output.
X = np.arange(9).reshape(3, 3)
K = np.array([[0, 1], [2, 3]])
Y = d2l.corr2d(X, K)
Y
array([[19., 25.],
       [37., 43.]])
Next, we rewrite the convolution kernel K as a matrix W. Its shape will be (4, 9), where the ith row corresponds to applying the kernel to the input to generate the ith output element.
def kernel2matrix(K):
    k, W = np.zeros(5), np.zeros((4, 9))
    k[:2], k[3:5] = K[0, :], K[1, :]
    W[0, :5], W[1, 1:6], W[2, 3:8], W[3, 4:] = k, k, k, k
    return W
W = kernel2matrix(K)
W
array([[0., 1., 0., 2., 3., 0., 0., 0., 0.],
       [0., 0., 1., 0., 2., 3., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 2., 3., 0.],
       [0., 0., 0., 0., 0., 1., 0., 2., 3.]])
然后通過適當的整理,用矩陣乘法實現卷積算子。
Y == np.dot(W, X.reshape(-1)).reshape(2, 2)
array([[ True,  True],
       [ True,  True]])
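kernel2matrix above hard-codes the 2×2 kernel and the 3×3 input. For illustration, a general version might look like the following (a hypothetical kernel2matrix_general helper in plain NumPy, not from the book):

```python
import numpy as np

def kernel2matrix_general(K, in_h, in_w):
    """Build the (out_h*out_w, in_h*in_w) matrix W such that
    W @ X.ravel() equals the 2-D cross-correlation of X with K."""
    h, w = K.shape
    out_h, out_w = in_h - h + 1, in_w - w + 1
    W = np.zeros((out_h * out_w, in_h * in_w))
    for i in range(out_h):
        for j in range(out_w):
            row = i * out_w + j
            for a in range(h):
                for b in range(w):
                    # Output element (i, j) reads input element (i+a, j+b)
                    W[row, (i + a) * in_w + (j + b)] = K[a, b]
    return W

K = np.array([[0.0, 1.0], [2.0, 3.0]])
W = kernel2matrix_general(K, 3, 3)
X = np.arange(9.0).reshape(3, 3)
print((W @ X.ravel()).reshape(2, 2))  # matches corr2d(X, K)
```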
We can implement transposed convolution as a matrix multiplication as well, by reusing kernel2matrix. To reuse the generated W, we construct a 2×2 input, so the corresponding weight matrix will have a shape of (9, 4), which is Wᵀ. Let us verify the results.
X = np.array([[0, 1], [2, 3]])
Y = trans_conv(X, K)
Y == np.dot(W.T, X.reshape(-1)).reshape(3, 3)
array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])
- Summary
· Compared to convolutions that reduce inputs through kernels, transposed convolutions broadcast inputs.
· If a convolution layer reduces the input width and height by nw and nh times, respectively, then a transposed convolution layer with the same kernel sizes, padding, and strides will increase the input width and height by nw and nh times, respectively.
· We can implement convolution operations by matrix multiplication; the corresponding transposed convolution can then be done by multiplication with the transposed matrix.