Review: Dynamic Routing Between Capsules
Link to paper: https://arxiv.org/pdf/1710.09829.pdf
The paper introduces an implementation of Capsule Networks that uses an iterative routing-by-agreement mechanism: a lower-level capsule prefers to send its output to higher-level capsules whose activity vectors have a large scalar product with the prediction coming from the lower-level capsule.
Motivation:
The human visual system uses a sequence of fixation points to ensure that only a tiny fraction of the optic array is processed at the highest resolution. For a single fixation, a parse tree is carved out of small fixed groups of neurons called “capsules”, and each node in the parse tree corresponds to an active capsule. Through an iterative process, each capsule chooses a higher-level capsule to be its parent. This process solves the problem of assigning parts to wholes.
For the activity vector of each active capsule:
- Its length is the probability that an entity exists in the image.
- Its orientation represents the entity’s estimated instantiation parameters, such as pose (position, size, orientation), deformation, velocity, albedo, hue, texture, etc.
Idea:
Since the output of a capsule is a vector, it is possible to use a powerful dynamic routing mechanism to ensure that the output is sent to an appropriate parent. For each possible parent, the capsule computes a “prediction vector” by multiplying its own output by a weight matrix. If this prediction vector has a large scalar product with the output of a possible parent, the coupling coefficient for that parent is increased and the coefficients for the other parents are decreased. This increases the contribution the capsule makes to that parent, which in turn increases the scalar product of the capsule’s prediction with the parent’s output. This is much more effective than max-pooling, which lets neurons in one layer attend only to the most active feature detector in the previous layer. Also, unlike max-pooling, capsules do not throw away information about the precise location of the entity or its pose.
Calculating vector inputs and outputs of a capsule:
Because the length of the activity vector represents the probability that an entity exists in the image, it has to be between 0 and 1. The squashing function ensures that short vectors are shrunk to almost zero length and long vectors are shrunk to a length slightly below 1.
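As a minimal illustration, the squashing function from the paper, v_j = (||s_j||² / (1 + ||s_j||²)) · (s_j / ||s_j||), can be sketched in NumPy (the small epsilon is an implementation detail added here for numerical safety, not part of the paper):

```python
import numpy as np

def squash(s, eps=1e-8):
    """Shrink a vector's length into [0, 1) while keeping its direction."""
    norm_sq = np.sum(s ** 2)
    norm = np.sqrt(norm_sq + eps)  # eps avoids division by zero for the zero vector
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)

short = squash(np.array([0.01, 0.0]))   # length stays close to 0
long_ = squash(np.array([100.0, 0.0]))  # length approaches but never reaches 1
```

Short inputs are attenuated almost to zero, while long inputs saturate just below unit length, so the length can be read as a probability.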
Except for the first layer of capsules, the total input to a capsule is a weighted sum over all prediction vectors from the capsules in the previous layer.
Total input to a capsule: s_j = Σ_i c_ij û_j|i, where the c_ij are coupling coefficients.

These prediction vectors are produced by multiplying the output of a capsule in the layer below by a weight matrix.
Prediction vectors: û_j|i = W_ij u_i.

The coupling coefficients c_ij are determined by the iterative dynamic routing process. Between a capsule and all the capsules in the layer above, they sum to 1 and are determined by a softmax function whose initial logits b_ij are the log prior probabilities that capsule i should be coupled to capsule j.
Coupling coefficients: c_ij = exp(b_ij) / Σ_k exp(b_ik).

The initial logits b_ij are later updated by adding the scalar product of the prediction vector and the output of the candidate parent: b_ij ← b_ij + û_j|i · v_j.
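Putting these pieces together, the routing-by-agreement procedure can be sketched in NumPy. The capsule counts and dimensions below are illustrative, not the paper's architecture:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, n_iters=3):
    """Dynamic routing between capsules.
    u_hat: prediction vectors, shape (num_lower, num_upper, dim).
    Returns the output vectors v, shape (num_upper, dim)."""
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))           # routing logits, zero priors
    for _ in range(n_iters):
        c = softmax(b, axis=1)                     # couplings sum to 1 over parents
        s = np.einsum('ij,ijd->jd', c, u_hat)      # weighted sum of predictions
        v = squash(s)                              # lengths squashed into [0, 1)
        b = b + np.einsum('ijd,jd->ij', u_hat, v)  # agreement update
    return v

u_hat = np.random.randn(8, 3, 4)  # 8 lower capsules, 3 upper capsules, 4-D outputs
v = route(u_hat)
```

Each iteration reallocates coupling toward parents whose current output agrees with a capsule's prediction, which is the "routing by agreement" the text describes.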
Routing algorithm for CapsNet.

Margin loss for digit existence:
The top-level capsule for an object class should have a long instantiation vector if that object is present in the image. To allow for multiple classes, the authors use a separate margin loss for each capsule:
Margin loss for each capsule k: L_k = T_k max(0, m+ − ||v_k||)² + λ (1 − T_k) max(0, ||v_k|| − m−)², with T_k = 1 if an object of class k is present, m+ = 0.9, m− = 0.1, and λ = 0.5.

This ensures that if an object of class k is present, the length of that capsule's output vector should be no less than 0.9, and if it is absent, the length should be no more than 0.1.
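A minimal NumPy sketch of this margin loss, summed over all classes (the class count and vector lengths are made up for illustration):

```python
import numpy as np

def margin_loss(v_lengths, targets, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Total margin loss over all classes.
    v_lengths: lengths of the top-level capsule outputs, shape (num_classes,).
    targets: presence indicators T_k, shape (num_classes,)."""
    present = targets * np.maximum(0.0, m_pos - v_lengths) ** 2
    absent = lam * (1.0 - targets) * np.maximum(0.0, v_lengths - m_neg) ** 2
    return np.sum(present + absent)  # sum over all object capsules

lengths = np.array([0.95, 0.05, 0.6])  # capsule output lengths
targets = np.array([1.0, 0.0, 0.0])    # only class 0 is present
loss = margin_loss(lengths, targets)   # ≈ 0.125: only the absent class with length 0.6 incurs loss
```

Class 0 is present with length above 0.9 and class 1 is absent with length below 0.1, so neither is penalized; only class 2's over-long vector contributes.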
The total loss is the sum of the losses of all object capsules.
CapsNet architecture for MNIST
A simple CapsNet architecture consists of 2 convolutional layers and one fully connected layer.

CapsNet achieved state-of-the-art performance on MNIST after just a few training epochs. After training for about 6-7 epochs with this implementation, CapsNet was able to achieve about 99% accuracy on the test set; further training brought only negligible improvement.
Regularization by reconstruction
Decoder structure to reconstruct a digit from the DigitCaps layer.

The authors used a reconstruction loss to encourage the digit capsules to encode the instantiation parameters of the input digit. The decoder learns to reconstruct the image by minimizing the squared difference between the reconstructed image and the input image. The total loss is the sum of the margin loss and the reconstruction loss. However, to prevent the reconstruction loss from dominating, it is scaled down by a factor of 0.0005.
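A sketch of how the two losses are combined; the 0.0005 weight is from the paper, while the 28×28 image and the toy "reconstruction" are illustrative:

```python
import numpy as np

RECON_WEIGHT = 0.0005  # scales down the reconstruction loss, as in the paper

def total_loss(margin, image, reconstruction):
    """Combine the margin loss with the down-weighted reconstruction loss."""
    recon_loss = np.sum((image - reconstruction) ** 2)  # sum of squared differences
    return margin + RECON_WEIGHT * recon_loss

image = np.random.rand(28, 28)  # MNIST-sized input
recon = image + 0.1             # a hypothetical imperfect reconstruction
loss = total_loss(0.125, image, recon)
```

With a uniform error of 0.1 per pixel, the reconstruction term contributes only 0.0005 × 784 × 0.01 ≈ 0.0039, so the margin loss still dominates training, as intended.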
An example of image reconstruction.

Drawbacks:
When dealing with datasets whose backgrounds are much more varied (such as CIFAR-10), CapsNet performs poorly compared to other state-of-the-art architectures.
Youtube: https://www.youtube.com/watch?v=pPN8d0E3900
Translated from: https://medium.com/xulab/review-dynamic-routing-between-capsules-ea9c2fb64765