The DeepFM Algorithm
Published: 2023/12/8
Author: 豆豆
1: Background and Features
To learn low-order and high-order feature interactions at the same time, the Wide&Deep model was proposed earlier. It combines a linear model (the Wide part) with a deep neural network (the Deep part). The two parts require different inputs, and the input to the Wide part still depends on manual feature engineering. However, such models generally suffer from two problems:
DeepFM emerged to address exactly these two problems, and it brings further improvements. Its advantages are as follows:
2: Model Structure
The main approach of DeepFM is as follows:
Two main parts need to be trained:
The overall architecture is shown below:
As the network diagram shows, DeepFM consists of an FM part and a DNN part, so the model's final output is also composed of these two components:
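The equation image appears to have been lost in extraction. Assuming the standard formulation from the DeepFM paper, the final prediction combines the two parts under a sigmoid:

```latex
\hat{y} = \operatorname{sigmoid}\left(y_{FM} + y_{DNN}\right), \qquad \hat{y} \in (0, 1)
```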
Below, the diagram is split apart to look at each of the two parts separately.
2.1: FM Component
The output of the FM part is as follows:
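The formula image is missing here; in the DeepFM paper's notation (first-order weights w, latent vectors V, d raw features), the FM output is:

```latex
y_{FM} = \langle w, x \rangle + \sum_{j_1=1}^{d} \sum_{j_2=j_1+1}^{d} \langle V_{j_1}, V_{j_2} \rangle \, x_{j_1} \cdot x_{j_2}
```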
Two points deserve attention here:
Summary of the FM Component:
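The second-order term in the code below relies on the classic FM reformulation, 0.5 * ((Σᵢ vᵢxᵢ)² − Σᵢ (vᵢxᵢ)²), which reduces the pairwise interaction sum from O(kd²) to O(kd). A minimal NumPy sketch (variable names are illustrative, not from the post's code) verifying that the trick matches the naive pairwise sum:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 5, 4                        # d features, k-dimensional latent vectors
V = rng.normal(size=(d, k))        # one latent vector per feature
x = rng.normal(size=d)             # feature values

# Naive O(d^2 * k) sum over all feature pairs
naive = sum(V[i] @ V[j] * x[i] * x[j]
            for i in range(d) for j in range(i + 1, d))

# FM trick: 0.5 * ((sum_i v_i x_i)^2 - sum_i (v_i x_i)^2), summed over k dims
Vx = V * x[:, None]                # scale each latent vector by its feature value
sum_square = Vx.sum(axis=0) ** 2   # (sum_i v_i x_i)^2, shape (k,)
square_sum = (Vx ** 2).sum(axis=0) # sum_i (v_i x_i)^2, shape (k,)
trick = 0.5 * (sum_square - square_sum).sum()

assert np.isclose(naive, trick)
```

This identity is exactly what the `summed_features_emb_square` / `squared_sum_features_emb` lines in the code section compute, batched over samples.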
2.2: Deep Component
The role of the DNN here is to construct high-order feature interactions. The black lines in the diagram are fully connected layers, whose parameters are learned by the network. One notable point: the input to the DNN is also the embedding vectors. This is what is meant by weight sharing between the FM and Deep parts.
Let a^(0) = (e1, e2, ..., em) denote the output of the embedding layer. Then a^(0) is fed as input to the first DNN hidden layer, and the feed-forward process proceeds as follows:
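The formula image is missing; in the paper's notation, with σ the activation function and |H| the number of hidden layers, each layer and the final Deep output are:

```latex
a^{(l+1)} = \sigma\!\left(W^{(l)} a^{(l)} + b^{(l)}\right),
\qquad
y_{DNN} = W^{|H|+1} \cdot a^{|H|} + b^{|H|+1}
```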
3: Summary
Advantages of DeepFM:
The most essential of these:
4: Core Code
# ---------- first order term ----------
self.y_first_order = tf.nn.embedding_lookup(self.weights["feature_bias"], self.feat_index)  # None * F * 1
self.y_first_order = tf.reduce_sum(tf.multiply(self.y_first_order, feat_value), 2)  # None * F
self.y_first_order = tf.nn.dropout(self.y_first_order, self.dropout_keep_fm[0])  # None * F

# ---------- second order term ----------
# sum_square part
self.summed_features_emb = tf.reduce_sum(self.embeddings, 1)  # None * K
self.summed_features_emb_square = tf.square(self.summed_features_emb)  # None * K

# square_sum part
self.squared_features_emb = tf.square(self.embeddings)
self.squared_sum_features_emb = tf.reduce_sum(self.squared_features_emb, 1)  # None * K

# second order
self.y_second_order = 0.5 * tf.subtract(self.summed_features_emb_square,
                                        self.squared_sum_features_emb)  # None * K
self.y_second_order = tf.nn.dropout(self.y_second_order, self.dropout_keep_fm[1])  # None * K

# ---------- Deep component ----------
self.y_deep = tf.reshape(self.embeddings, shape=[-1, self.field_size * self.embedding_size])  # None * (F*K)
self.y_deep = tf.nn.dropout(self.y_deep, self.dropout_keep_deep[0])
for i in range(0, len(self.deep_layers)):
    self.y_deep = tf.add(tf.matmul(self.y_deep, self.weights["layer_%d" % i]),
                         self.weights["bias_%d" % i])  # None * layer[i]
    if self.batch_norm:
        self.y_deep = self.batch_norm_layer(self.y_deep, train_phase=self.train_phase,
                                            scope_bn="bn_%d" % i)  # None * layer[i]
    self.y_deep = self.deep_layers_activation(self.y_deep)
    self.y_deep = tf.nn.dropout(self.y_deep, self.dropout_keep_deep[1 + i])  # dropout at each Deep layer

# ---------- DeepFM ----------
if self.use_fm and self.use_deep:
    concat_input = tf.concat([self.y_first_order, self.y_second_order, self.y_deep], axis=1)
elif self.use_fm:
    concat_input = tf.concat([self.y_first_order, self.y_second_order], axis=1)
elif self.use_deep:
    concat_input = self.y_deep
self.out = tf.add(tf.matmul(concat_input, self.weights["concat_projection"]),
                  self.weights["concat_bias"], name="output")
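The TensorFlow 1.x graph above can be hard to trace without the surrounding class. As a sketch only (shapes and names are illustrative, not the author's class), here is the same forward pass in plain NumPy, mirroring the three parts and the final concatenation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
N, F, K = 3, 4, 5                        # batch size, num fields, embedding size

feat_bias = rng.normal(size=(F, 1))      # first-order weight per field
embeddings = rng.normal(size=(N, F, K))  # looked-up embeddings per sample/field
feat_value = np.ones((N, F, 1))          # feature values (1.0 for one-hot fields)

# ---- first order term: None * F
y_first_order = np.sum(feat_bias[None, :, :] * feat_value, axis=2)

# ---- second order term via sum-square minus square-sum: None * K
summed = np.sum(embeddings, axis=1)
y_second_order = 0.5 * (summed ** 2 - np.sum(embeddings ** 2, axis=1))

# ---- Deep component: one hidden layer on the flattened embeddings
W0 = rng.normal(size=(F * K, 8))
b0 = rng.normal(size=8)
y_deep = np.maximum(embeddings.reshape(N, F * K) @ W0 + b0, 0.0)  # ReLU

# ---- DeepFM: project the concatenation to a single logit
concat = np.concatenate([y_first_order, y_second_order, y_deep], axis=1)
Wp = rng.normal(size=(concat.shape[1], 1))
bp = 0.1
out = sigmoid(concat @ Wp + bp)          # None * 1, probabilities in (0, 1)
```

Dropout and batch normalization from the original snippet are omitted here to keep the data flow visible; the concatenated width is F + K + (last hidden size), matching `concat_projection`'s input dimension.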