
einops and einsum: Powerful Tools for Manipulating Tensors Directly

Published: 2025/3/8


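einops and einsum both show up in Vision Transformer implementations: einops manipulates tensor dimensions, and einsum specifies tensor computations. Neither is common in convolutional network code. This article briefly introduces both tools to make Vision Transformer code easier to follow.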

einops: manipulating tensor dimensions directly

GitHub: https://github.com/arogozhnikov/einops

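einops provides flexible and powerful tensor operations for readable and reliable code. It supports numpy, pytorch, tensorflow, and others.

With it, tensor dimensions can be rearranged freely, which makes it quick to implement and test an idea. It is especially useful in code that reshapes tensors frequently, such as Vision Transformer implementations.

The most commonly used functions are introduced briefly below.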

Installation

einops is easy to install with pip:

pip install einops

rearrange

    import torch
    from einops import rearrange

    i_tensor = torch.randn(16, 3, 224, 224)  # a 4-D tensor common in CV: (N, C, H, W)
    print(i_tensor.shape)
    o_tensor = rearrange(i_tensor, 'n c h w -> n h w c')
    print(o_tensor.shape)

Output:

    torch.Size([16, 3, 224, 224])
    torch.Size([16, 224, 224, 3])

The four-dimensional tensor (N, C, H, W), meaning (batch size, channels, image height, image width), is very common in CV. Vision Transformer code often needs to permute or reshape these dimensions, and rearrange expresses such operations conveniently and readably.

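Beyond that, rearrange supports slightly more advanced usage:

    import torch
    from einops import rearrange

    i_tensor = torch.randn(16, 3, 224, 224)
    o_tensor = rearrange(i_tensor, 'n c h w -> n c (h w)')
    print(o_tensor.shape)
    o_tensor = rearrange(i_tensor, 'n c (m1 p1) (m2 p2) -> n c m1 p1 m2 p2', p1=16, p2=16)
    print(o_tensor.shape)

Output:

    torch.Size([16, 3, 50176])
    torch.Size([16, 3, 14, 16, 14, 16])

Chosen dimensions can be merged or split. When splitting, the sizes of the new axes must be passed as keyword arguments after the pattern.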

repeat

    import torch
    from einops import repeat

    i_tensor = torch.randn(3, 224, 224)
    print(i_tensor.shape)
    o_tensor = repeat(i_tensor, 'c h w -> n c h w', n=16)
    print(o_tensor.shape)

When using repeat, remember to specify the size of the new axis that appears on the right-hand side (here n=16).

Output:

    torch.Size([3, 224, 224])
    torch.Size([16, 3, 224, 224])

reduce

    import torch
    from einops import reduce

    i_tensor = torch.randn((16, 3, 224, 224))
    o_tensor = reduce(i_tensor, 'n c h w -> c h w', 'mean')
    print(o_tensor.shape)
    o_tensor_ = reduce(i_tensor, 'b c (m1 p1) (m2 p2) -> b c m1 m2', 'mean', p1=16, p2=16)
    print(o_tensor_.shape)

Output:

    torch.Size([3, 224, 224])
    torch.Size([16, 3, 14, 14])

When using reduce, remember to specify the sizes of the left-hand axes being reduced if they come from splitting a dimension (here p1=16, p2=16).

Rearrange

    import torch
    from torch.nn import Sequential, Conv2d, MaxPool2d, Linear, ReLU
    from einops.layers.torch import Rearrange

    model = Sequential(
        Conv2d(3, 64, kernel_size=3),
        MaxPool2d(kernel_size=2),
        Rearrange('b c h w -> b (c h w)'),  # acts like a flatten layer
        Linear(64 * 15 * 15, 120),
        ReLU(),
        Linear(120, 10),
    )

    i_tensor = torch.randn(16, 3, 32, 32)
    o_tensor = model(i_tensor)
    print(o_tensor.shape)

Output:

    torch.Size([16, 10])

einops.layers.torch.Rearrange is a subclass of nn.Module, so it can be placed inside a network and used directly as a layer.

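torch.einsum: the Einstein summation convention

The Einstein summation convention is a shorthand, introduced by Einstein, for writing summations $\sum$ over vectors, matrices, and tensors.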

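The convention omits two things:

the summation sign $\sum$

the index under the summation sign, e.g. $i$

The rule is that an index appearing twice in a term (such as $i$ in example 1 and $k$ in example 2 below) is treated as a summation index, and its summation sign is left out.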

1) $x_i y_i$ is shorthand for the inner product $\langle \mathbf{x}, \mathbf{y} \rangle$:
$x_i y_i := \sum_i x_i y_i = o$

where $o$ is the output.

2) $X_{ik}Y_{kj}$ is shorthand for the matrix product $\mathbf{X}\mathbf{Y}$:
$X_{ik}Y_{kj} := \sum_k X_{ik}Y_{kj} = \mathbf{O}_{ij}$

where $\mathbf{O}_{ij}$ is the $(i, j)$-th element of the output matrix.

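This shorthand expresses a wide range of tensor operations in one uniform way (inner product, outer product, transpose, element-wise product, matrix trace, and other custom operations), giving their implementations a single common model.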
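A few of these operations can be written as einsum strings, as in the minimal sketch below. It uses `numpy.einsum` rather than `torch.einsum` and checks each result against the usual NumPy call. The array sizes and variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)
X, Y = rng.normal(size=(2, 3)), rng.normal(size=(3, 4))
S = rng.normal(size=(3, 3))

inner = np.einsum('i,i->', x, y)        # x_i y_i: i is repeated, so it is summed
matmul = np.einsum('ik,kj->ij', X, Y)   # X_ik Y_kj: k is summed, i and j remain
outer = np.einsum('i,j->ij', x, y)      # no repeated index, so nothing is summed
transposed = np.einsum('ij->ji', X)     # only the order of the indices changes
trace = np.einsum('ii->', S)            # repeated index on a single operand

print(np.allclose(inner, x @ y))             # True
print(np.allclose(matmul, X @ Y))            # True
print(np.allclose(outer, np.outer(x, y)))    # True
print(np.array_equal(transposed, X.T))       # True
print(np.allclose(trace, np.trace(S)))       # True
```

In the explicit `->` form used here, any index that is absent from the output is summed over, which matches the repeated-index rule in the two examples above.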

einsum is implemented in both numpy and pytorch. Taking torch as the example, here is the simplest usage:

    import torch

    i_a = torch.randn(16, 32, 4, 8)
    i_b = torch.randn(16, 32, 8, 16)

    out = torch.einsum('b h i j, b h j d -> b h i d', i_a, i_b)
    print(out.shape)

Output:

    torch.Size([16, 32, 4, 16])

As shown, torch.einsum makes it easy to specify a tensor operation. The two input tensors have dimensions $b\ h\ i\ j$ and $b\ h\ j\ d$. The shared index $j$ is summed over, so the resulting tensor has dimensions $b\ h\ i\ d$. The output matches what we expect.
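The pattern `b h i j, b h j d -> b h i d` is a batched matrix multiplication over the last two axes. The sketch below checks this with `numpy.einsum` (used in place of `torch.einsum`), comparing against both the `@` operator and an explicit sum over $j$. The smaller sizes are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3, 4, 8))    # b h i j
b = rng.normal(size=(2, 3, 8, 16))   # b h j d

out = np.einsum('bhij,bhjd->bhid', a, b)

# Explicit version of the omitted summation: add up one outer product per j.
ref = np.zeros((2, 3, 4, 16))
for j in range(a.shape[-1]):
    ref += a[..., :, j, None] * b[..., j, None, :]

print(out.shape)                 # (2, 3, 4, 16)
print(np.allclose(out, a @ b))   # True
print(np.allclose(out, ref))     # True
```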
