Node2Vec in Practice
Data structure

The input is an edge list: each line holds a pair of connected nodes.

    1 2
    2 3
    4 5

Main program
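Before the model is built, the edge list shown earlier has to become a directed graph. As a rough illustration of that step only, the sketch below parses the same three edges with the standard library instead of NetworkX. The helper name `parse_edgelist` and the default weight of 1.0 for a missing third column are assumptions made for this example, not something the original post specifies.

```python
from collections import defaultdict

# The three sample edges from the post, one "source target" pair per line.
SAMPLE = """\
1 2
2 3
4 5
"""

def parse_edgelist(text, default_weight=1.0):
    """Parse 'u v [weight]' lines into a directed adjacency dict (hypothetical helper)."""
    adj = defaultdict(dict)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        u, v = parts[0], parts[1]
        # Assumption: a missing third column means weight 1.0.
        w = float(parts[2]) if len(parts) > 2 else default_weight
        adj[u][v] = w
        adj.setdefault(v, {})  # make sure nodes without outgoing edges appear too
    return dict(adj)

adj = parse_edgelist(SAMPLE)
for node in sorted(adj):
    print(node, adj[node])
```

Node labels stay strings here, which mirrors `nodetype=None` in the loader call. Note that the two-column sample carries no weight, so routines that read `['weight']` directly presumably need either a third column in the file or a default weight.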
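    import networkx as nx
    # Assumption: Node2Vec comes from the GraphEmbedding (`ge`) package;
    # the original post does not show its imports.
    from ge import Node2Vec

    G = nx.read_edgelist('../data/text.txt', create_using=nx.DiGraph(),
                         nodetype=None, data=[('weight', int)])

    # Build the model
    model = Node2Vec(G, walk_length=10, num_walks=80, p=0.25, q=4,
                     workers=1, use_rejection_sampling=0)

    # Train
    model.train(embed_size=4, window_size=5, iter=3)
    embeddings = model.get_embeddings()
    print(embeddings)

Generating the initial node-to-node transition probabilities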
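The first step of a walk has no previous node, so the probability of moving to a neighbor depends only on edge weight: each weight is divided by the sum of the weights leaving the node. A minimal sketch of that normalization on a plain dict follows; the function name `first_order_probs` and the small weighted graph are hypothetical and exist only for illustration.

```python
def first_order_probs(adj, node):
    """Return {neighbor: probability}, proportional to edge weight."""
    nbrs = sorted(adj[node])
    weights = [adj[node][nbr] for nbr in nbrs]
    norm_const = sum(weights)  # the denominator of the normalization
    return {nbr: w / norm_const for nbr, w in zip(nbrs, weights)}

# Hypothetical weighted graph, not the sample file from the post.
adj = {"a": {"b": 1.0, "c": 3.0}, "b": {"a": 1.0}, "c": {"a": 3.0}}
probs = first_order_probs(adj, "a")
print(probs)
```

From `a`, the edge with weight 3 is three times as likely to be taken as the edge with weight 1.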
    def preprocess_transition_probs(self):
        '''
        Preprocessing of transition probabilities for guiding the random walks.
        '''
        # get_alias_edge gives each edge its second-order random-walk probabilities.
        # This function builds the probabilities for every node and every edge;
        # alias_setup then converts them into alias tables so that later
        # sampling is cheap.
        G = self.G
        is_directed = self.is_directed

        alias_nodes = {}
        for node in G.nodes():
            # Read the weight of each neighbor
            unnormalized_probs = [G[node][nbr]['weight'] for nbr in sorted(G.neighbors(node))]
            # Sum the weights: this is the normalizing constant (the denominator)
            norm_const = sum(unnormalized_probs)
            # Divide by the denominator
            normalized_probs = [float(u_prob)/norm_const for u_prob in unnormalized_probs]
            alias_nodes[node] = alias_setup(normalized_probs)

        alias_edges = {}
        triads = {}

        if is_directed:
            for edge in G.edges():
                alias_edges[edge] = self.get_alias_edge(edge[0], edge[1])
        else:
            for edge in G.edges():
                alias_edges[edge] = self.get_alias_edge(edge[0], edge[1])
                alias_edges[(edge[1], edge[0])] = self.get_alias_edge(edge[1], edge[0])

        self.alias_nodes = alias_nodes
        self.alias_edges = alias_edges

        return

get_alias_edge: computing the node-to-node probabilities
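For every step after the first, the walk remembers where it came from. The weight toward each neighbor of the current node is divided by p if the neighbor is the previous node, left unchanged if the neighbor is also linked to the previous node, and divided by q otherwise. The sketch below works through those three cases on a small hypothetical graph, using plain dicts rather than the class from the post; `second_order_probs` is an illustrative name.

```python
def second_order_probs(adj, prev, cur, p, q):
    """Return {neighbor: probability} for the step after prev -> cur."""
    nbrs = sorted(adj[cur])
    unnormalized = []
    for nbr in nbrs:
        w = adj[cur][nbr]
        if nbr == prev:
            unnormalized.append(w / p)   # step back to the previous node
        elif prev in adj.get(nbr, {}):
            unnormalized.append(w)       # neighbor has an edge to the previous node
        else:
            unnormalized.append(w / q)   # neighbor moves away from the previous node
    total = sum(unnormalized)
    return {nbr: u / total for nbr, u in zip(nbrs, unnormalized)}

# Hypothetical symmetric graph: a-b, a-c, b-c, b-d, all with weight 1.
adj = {
    "a": {"b": 1.0, "c": 1.0},
    "b": {"a": 1.0, "c": 1.0, "d": 1.0},
    "c": {"a": 1.0, "b": 1.0},
    "d": {"b": 1.0},
}
probs = second_order_probs(adj, prev="a", cur="b", p=0.25, q=4)
print(probs)
```

With p=0.25 and q=4, the values used in the main program, the unnormalized weights are 4, 1, and 0.25, so the walk most often returns to `a` and rarely moves out to `d`. That keeps walks close to their starting neighborhood.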
    def get_alias_edge(self, src, dst):
        '''
        Get the alias edge setup lists for a given edge.
        '''
        # Second-order random walk: src is the previous node in the walk,
        # dst is the current node.
        G = self.G
        p = self.p
        q = self.q

        unnormalized_probs = []
        for dst_nbr in sorted(G.neighbors(dst)):
            if dst_nbr == src:
                unnormalized_probs.append(G[dst][dst_nbr]['weight']/p)
            elif G.has_edge(dst_nbr, src):
                unnormalized_probs.append(G[dst][dst_nbr]['weight'])
            else:
                unnormalized_probs.append(G[dst][dst_nbr]['weight']/q)
        norm_const = sum(unnormalized_probs)
        normalized_probs = [float(u_prob)/norm_const for u_prob in unnormalized_probs]

        return alias_setup(normalized_probs)

alias_setup: takes a list of probabilities and returns two arrays that make later sampling fast
    import numpy as np

    def alias_setup(probs):
        '''
        Turn the probabilities produced by the second-order random walk into
        two numbers per outcome, which alias_draw later uses for sampling.
        '''
        K = len(probs)
        q = np.zeros(K)
        J = np.zeros(K, dtype=int)  # np.int was removed from recent NumPy releases

        smaller = []
        larger = []
        for kk, prob in enumerate(probs):
            q[kk] = K*prob
            # kk is the index; record which indices fall below 1.0
            if q[kk] < 1.0:
                smaller.append(kk)
            else:
                larger.append(kk)

        while len(smaller) > 0 and len(larger) > 0:
            small = smaller.pop()  # pop() also removes the last element of smaller
            large = larger.pop()

            J[small] = large
            q[large] = q[large] + q[small] - 1.0
            if q[large] < 1.0:
                smaller.append(large)
            else:
                larger.append(large)

        return J, q

alias_draw: the sampling function
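To see why drawing from the two arrays reproduces the input distribution, it can help to rebuild the probabilities from the table. Each of the K slots is picked with probability 1/K; slot kk returns kk with probability q[kk] and its alias J[kk] otherwise. The sketch below is a pure-Python restatement of alias_setup (lists instead of NumPy arrays, which is a simplification for this example) plus a hypothetical helper, `implied_distribution`, that adds up those contributions.

```python
def alias_setup(probs):
    """Pure-Python variant of the alias table construction."""
    K = len(probs)
    q = [0.0] * K
    J = [0] * K
    smaller, larger = [], []
    for kk, prob in enumerate(probs):
        q[kk] = K * prob
        (smaller if q[kk] < 1.0 else larger).append(kk)
    while smaller and larger:
        small = smaller.pop()
        large = larger.pop()
        J[small] = large  # the unused part of slot `small` is given to `large`
        q[large] = q[large] + q[small] - 1.0
        (smaller if q[large] < 1.0 else larger).append(large)
    return J, q

def implied_distribution(J, q):
    """Rebuild the outcome probabilities that the table (J, q) encodes."""
    K = len(J)
    probs = [0.0] * K
    for kk in range(K):
        keep = min(q[kk], 1.0)             # chance that slot kk returns kk itself
        probs[kk] += keep / K
        probs[J[kk]] += (1.0 - keep) / K   # otherwise the alias is returned
    return probs

J, q = alias_setup([0.5, 0.3, 0.2])
rebuilt = implied_distribution(J, q)
print(J, q, rebuilt)
```

Up to floating-point rounding, the rebuilt values match the input 0.5, 0.3, 0.2, which is the property that lets each draw run in constant time.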
    def alias_draw(J, q):
        '''
        Draw sample from a non-uniform discrete distribution using alias sampling.
        '''
        K = len(J)

        kk = int(np.floor(np.random.rand()*K))
        if np.random.rand() < q[kk]:
            return kk
        else:
            return J[kk]

node2vec_walk simulates a path of the given length from a start node; every function it relies on is covered above.
    def node2vec_walk(self, walk_length, start_node):
        '''
        Simulate a random walk starting from start node.
        '''
        G = self.G
        alias_nodes = self.alias_nodes
        alias_edges = self.alias_edges

        walk = [start_node]

        # alias_draw picks the next node according to the second-order
        # random-walk probabilities.
        while len(walk) < walk_length:
            cur = walk[-1]
            # G.neighbors(cur) yields the nodes directly connected to cur
            cur_nbrs = sorted(G.neighbors(cur))
            if len(cur_nbrs) > 0:
                if len(walk) == 1:
                    # First step: no previous node, so use the per-node table
                    walk.append(cur_nbrs[alias_draw(alias_nodes[cur][0], alias_nodes[cur][1])])
                else:
                    prev = walk[-2]
                    next_node = cur_nbrs[alias_draw(alias_edges[(prev, cur)][0],
                                                    alias_edges[(prev, cur)][1])]
                    walk.append(next_node)
            else:
                break

        return walk
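The main program passes num_walks=80 and walk_length=10, but the post does not show the loop that combines them. One plausible reading, sketched below as an assumption, is that each of the num_walks passes shuffles the nodes and starts one walk from every node. The names `simulate_walks` and `uniform_walk` are hypothetical, and `uniform_walk` is a uniform stand-in for node2vec_walk so that the example runs with the standard library alone.

```python
import random

def uniform_walk(adj, walk_length, start_node, rng):
    """Stand-in for node2vec_walk: picks the next neighbor uniformly at random."""
    walk = [start_node]
    while len(walk) < walk_length:
        nbrs = sorted(adj[walk[-1]])
        if not nbrs:
            break  # dead end: the walk stops early, as in node2vec_walk
        walk.append(rng.choice(nbrs))
    return walk

def simulate_walks(adj, num_walks, walk_length, seed=0):
    """Assumed outer loop: num_walks passes, one walk per node in each pass."""
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a fresh order on every pass
        for node in nodes:
            walks.append(uniform_walk(adj, walk_length, node, rng))
    return walks

# The directed sample graph from the post: 1 -> 2, 2 -> 3, 4 -> 5.
adj = {"1": {"2": 1.0}, "2": {"3": 1.0}, "3": {}, "4": {"5": 1.0}, "5": {}}
walks = simulate_walks(adj, num_walks=2, walk_length=10)
print(len(walks), max(len(w) for w in walks))
```

On this small directed graph every walk hits a dead end within three nodes, so walk_length acts as an upper bound rather than a guaranteed length. The resulting lists of nodes are what gets treated as sentences when the embeddings are trained.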