

Hybrid Computing Based on a Neural Network (DNC) - Hybrid computing using a NN with dynamic external memory


Preface:

The DNC can be regarded as a further development of the NTM; it is recommended to first read the companion translation of the NTM paper: Neural Turing Machine (NTM).


Hybrid computing using a neural network with dynamic external memory

Source: Nature, doi: 10.1038/nature20101

Glossary: memory matrix: the memory is encoded as a matrix, also referred to as the memory matrix.

the neural Turing machine: neural Turing machine [16]; can be regarded as an early version of the DNC.

differentiable attention mechanisms: attention mechanisms that are differentiable.

The read vector: the vector returned by a read operation, a weighted sum over memory locations.


1. Abstract

  • Artificial neural networks are remarkably good at perceptual processing, sequence learning and reinforcement learning, but because they lack an external memory, their ability to represent variables and data structures, and to store data over long timescales, is limited.
  • Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory, analogous to the random-access memory in a conventional computer. Like a conventional computer, it can use its memory to represent and manipulate complex data structures, and, like a neural network, it can still learn to do so from data.
  • When trained with supervised learning, we show that the DNC can successfully answer synthetic questions designed to emulate reasoning and inference in natural language. We also show that it can learn tasks such as finding the shortest path between specified points and inferring missing links in randomly generated graphs, and then generalize these abilities to specific graphs such as transport networks and family trees.
  • When trained with reinforcement learning, the DNC can complete a moving-blocks puzzle in which the goal is specified by sequences of symbols.
  • Taken together, our results demonstrate that DNCs have the capacity to solve complex, structured tasks that are difficult for neural networks without external read-write memory.

2. Introduction

  • Modern computers generally adopt an architecture that separates computation from data, and computation from input/output. This brings conveniences: a hierarchical memory structure offers a trade-off between cost and capacity. However, reading and creating variables requires the processor to operate on addresses, and the downside is that a network whose memory must grow dynamically cannot, by itself, perform such random, dynamic storage operations.
  • There have recently been major advances in signal processing, sequence learning, reinforcement learning, cognitive science and neuroscience, but these systems remain limited in their ability to represent variables and data structures. This paper aims to combine the advantages of neural networks and algorithmic computation by providing an architecture that couples a neural network to an external memory, focusing on a minimal interface between memoranda (working memory) and long-term storage. The whole system is differentiable, so it can be trained end to end with stochastic gradient descent, allowing the network to learn how to operate on and organize memory in goal-directed behaviour (a minimal sketch of this gradient flow is given below).
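Because every memory access in this design is a smooth, weighted operation rather than a hard index lookup, gradients can propagate from the output loss back into both the memory contents and the addressing weights. The following is a minimal, illustrative sketch of that gradient flow through a soft memory read (my own illustration, assuming PyTorch; it is not the paper's implementation):

```python
# Minimal gradient-flow sketch (illustrative only, not the paper's code).
# A "soft" read is a weighted sum over memory rows, so the loss gradient
# reaches both the memory contents and the addressing weights.
import torch

N, W = 4, 3                                  # N memory locations of width W
M = torch.randn(N, W, requires_grad=True)    # external memory matrix
logits = torch.randn(N, requires_grad=True)  # unnormalized addressing scores
w_read = torch.softmax(logits, dim=0)        # read weighting over locations

r = M.t() @ w_read                           # read vector: weighted sum of rows
loss = r.sum()                               # stand-in for any downstream loss
loss.backward()                              # backprop through the soft read

print(M.grad.shape, logits.grad.shape)       # both receive gradients
```

A hard argmax lookup at the same point would block these gradients, which is why the DNC keeps its addressing soft and differentiable.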


3. System Overview

  • A DNC is a neural network coupled to an external memory matrix. (As long as the memory is not completely used up, the network's behaviour is independent of the memory size [translator's note: presumably because distributed representations decouple behaviour from particular locations], which is why we regard the memory as "external".) If the memory can be thought of as the DNC's RAM, then the network, which we call the controller, plays the role of a differentiable CPU whose operations are learned directly by gradient descent. An earlier form of the DNC, the neural Turing machine, had a similar structure but used more restricted methods of memory access.
  • The DNC architecture differs from recently proposed neural memory frameworks such as memory networks and pointer networks in that its memory can be selectively written to as well as read, allowing iterative modification of memory content.
  • Whereas a conventional computer uses unique addresses to access memory, a DNC uses differentiable attention mechanisms [2,16-18] to define distributions over the N rows, or "locations", of the N × W memory matrix M [translator's note: defining the memory this directly seems problematic]. These distributions, which we call weightings, represent the degree to which each location is involved in a read or write operation. The read vector r returned by a read weighting w^r over memory M is the weighted sum over memory locations: r = Σ_i M[i, ·] w^r[i].
  • Analogously, the write operation uses a write weighting w^w to first erase with an erase vector e, then add a write vector v:
  •     M[i, j] ← M[i, j] (1 − w^w[i] e[j]) + w^w[i] v[j]
  • The units that determine and apply the weightings are called read and write heads. The operation of the heads is illustrated in Figure 1, and a minimal code sketch of the read and write operations follows below.
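As a concrete illustration of the two head operations above, here is a minimal NumPy sketch (my own, not the paper's released code; array sizes and names are arbitrary):

```python
# Minimal NumPy sketch of the read and erase/add write operations described
# above (illustrative only; names and sizes are mine, not the paper's code).
import numpy as np

N, W = 5, 4                      # N locations, word width W
M = np.zeros((N, W))             # the N x W memory matrix

def read(M, w_r):
    """Read vector r = sum_i M[i, :] * w_r[i] (weighted sum over locations)."""
    return M.T @ w_r

def write(M, w_w, e, v):
    """Erase then add: M[i, j] <- M[i, j](1 - w_w[i] e[j]) + w_w[i] v[j]."""
    M = M * (1.0 - np.outer(w_w, e))   # erase, scaled by the write weighting
    return M + np.outer(w_w, v)        # add the write vector

w_w = np.array([0.0, 0.9, 0.1, 0.0, 0.0])             # write mostly to location 1
M = write(M, w_w, e=np.ones(W), v=np.array([1.0, 2.0, 3.0, 4.0]))
r = read(M, w_r=np.array([0.0, 1.0, 0.0, 0.0, 0.0]))  # read location 1 exactly
print(np.round(M, 2))
print(np.round(r, 2))                                 # ~[0.9, 1.8, 2.7, 3.6]
```

Because both operations are just weighted sums and element-wise products, they remain differentiable with respect to the memory, the weightings and the erase/write vectors.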


Figure 1 | DNC architecture

a, A recurrent controller network receives input from an external data source and produces output.

b, c, The controller also outputs vectors that parameterize one write head (green) and multiple read heads (two in this case, blue and pink); a reduced selection of parameters is shown. The write head defines a write vector and an erase vector that are used to edit the N × W memory matrix, whose elements' magnitudes and signs are indicated by box area and shading, respectively. Additionally, a write key is used for content lookup to find previously written locations to edit. The write key can contribute to defining a weighting that selectively focuses the write operation over the rows, or locations, of the memory matrix. The read heads can use gates called read modes to switch between content lookup using a read key ('C') and reading out locations either forwards ('F') or backwards ('B') in the order they were written.

d, The usage vector records which locations have been used so far, and a temporal link matrix records the order in which locations were written; here, the order in which locations were written is indicated by directed arrows.
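The content-lookup ('C') read mode described above can be pictured as a cosine-similarity match between a key emitted by the controller and each row of memory, sharpened by a strength parameter and normalized into a weighting. The sketch below is my own simplification, assuming NumPy; the forward ('F') and backward ('B') modes, which use the temporal link matrix, are omitted:

```python
# Minimal sketch of content lookup: cosine similarity between a key and each
# memory row, sharpened by a strength beta and normalized with a softmax.
import numpy as np

def content_weighting(M, key, beta):
    """Weighting over rows of M by cosine similarity to `key`, sharpened by beta."""
    eps = 1e-8
    sims = (M @ key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + eps)
    scores = np.exp(beta * sims)
    return scores / scores.sum()

M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.9, 0.1, 0.0]])
w = content_weighting(M, key=np.array([1.0, 0.0, 0.0]), beta=5.0)
print(np.round(w, 3))   # most weight on rows 0 and 2, which best match the key
```

In the full model, this content weighting is blended with the forward and backward temporal weightings according to the read-mode gates.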



(Translator's note: the translation stops here; translating the whole paper might inadvertently attract Nature's attention.)


References

1. Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems Vol. 25 (eds Pereira, F. et al.) 1097–1105 (Curran Associates, 2012).
2. Graves, A. Generating sequences with recurrent neural networks. Preprint at http://arxiv.org/abs/1308.0850 (2013).
3. Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems Vol. 27 (eds Ghahramani, Z. et al.) 3104–3112 (Curran Associates, 2014).
4. Mnih, V. et al. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
5. Gallistel, C. R. & King, A. P. Memory and the Computational Brain: Why Cognitive Science Will Transform Neuroscience (John Wiley & Sons, 2011).
6. Marcus, G. F. The Algebraic Mind: Integrating Connectionism and Cognitive Science (MIT Press, 2001).
7. Kriete, T., Noelle, D. C., Cohen, J. D. & O’Reilly, R. C. Indirection and symbol-like processing in the prefrontal cortex and basal ganglia. Proc. Natl Acad. Sci. USA110, 16390–16395 (2013).
8. Hinton, G. E. Learning distributed representations of concepts. In Proc. Eighth Annual Conference of the Cognitive Science Society Vol. 1, 1–12 (Lawrence Erlbaum Associates, 1986).
9. Bottou, L. From machine learning to machine reasoning. Mach. Learn. 94, 133–149 (2014).
10. Fusi, S., Drew, P. J. & Abbott, L. F. Cascade models of synaptically stored memories. Neuron 45, 599–611 (2005).
11. Ganguli, S., Huh, D. & Sompolinsky, H. Memory traces in dynamical systems. Proc. Natl Acad. Sci. USA 105, 18970–18975 (2008).
12. Kanerva, P. Sparse Distributed Memory (MIT press, 1988).
13. Amari, S.-i. Characteristics of sparsely encoded associative memory. Neural Netw. 2, 451–457 (1989).
14. Weston, J., Chopra, S. & Bordes, A. Memory networks. Preprint at http://arxiv.org/abs/1410.3916 (2014).
15. Vinyals, O., Fortunato, M. & Jaitly, N. Pointer networks. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) 2692–2700 (Curran Associates, 2015).
16. Graves, A., Wayne, G. & Danihelka, I. Neural Turing machines. Preprint at http://arxiv.org/abs/1410.5401 (2014).
17. Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. Preprint at http://arxiv.org/abs/1409.0473 (2014).
18. Gregor, K., Danihelka, I., Graves, A., Rezende, D. J. & Wierstra, D. DRAW: a recurrent neural network for image generation. In Proc. 32nd International Conference on Machine Learning (eds Bach, F. & Blei, D.) 1462–1471 (JMLR, 2015).
19. Hintzman, D. L. MINERVA 2: a simulation model of human memory. Behav. Res. Methods Instrum. Comput. 16, 96–101 (1984).
20. Kumar, A. et al. Ask me anything: dynamic memory networks for natural language processing. Preprint at http://arxiv.org/abs/1506.07285 (2015).
21. Sukhbaatar, S. et al. End-to-end memory networks. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) 2431–2439 (Curran Associates, 2015).
22. Magee, J. C. & Johnston, D. A synaptically controlled, associative signal for Hebbian plasticity in hippocampal neurons. Science 275, 209–213 (1997).
23. Johnston, S. T., Shtrahman, M., Parylak, S., Gonçalves, J. T. & Gage, F. H. Paradox of pattern separation and adult neurogenesis: a dual role for new neurons balancing memory resolution and robustness. Neurobiol. Learn. Mem. 129, 60–68 (2016).
24. O’Reilly, R. C. & McClelland, J. L. Hippocampal conjunctive encoding, storage, and recall: avoiding a trade-off. Hippocampus 4, 661–682 (1994).
25. Howard, M. W. & Kahana, M. J. A distributed representation of temporal context. J. Math. Psychol. 46, 269–299 (2002).
26. Weston, J., Bordes, A., Chopra, S. & Mikolov, T. Towards AI-complete question answering: a set of prerequisite toy tasks. Preprint at http://arxiv.org/abs/1502.05698 (2015).
27. Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
28. Bengio, Y., Louradour, J., Collobert, R. & Weston, J. Curriculum learning. In Proc. 26th International Conference on Machine Learning (eds Bottou, L. & Littman, M.) 41–48 (ACM, 2009).
29. Zaremba, W. & Sutskever, I. Learning to execute. Preprint at http://arxiv.org/abs/1410.4615 (2014).
30. Winograd, T. Procedures as a Representation for Data in a Computer Program for Understanding Natural Language. Report No. MAC-TR-84 (DTIC, MIT Project MAC, 1971).

31. Epstein, R., Lanza, R. P. & Skinner, B. F. Symbolic communication between two pigeons (Columba livia domestica). Science 207, 543–545 (1980).
32. McClelland, J. L., McNaughton, B. L. & O’Reilly, R. C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419–457 (1995).
33. Kumaran, D., Hassabis, D. & McClelland, J. L. What learning systems do intelligent agents need? Complementary learning systems theory updated. Trends Cogn. Sci. 20, 512–534 (2016).
34. McClelland, J. L. & Goddard, N. H. Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus 6, 654–665 (1996).
35. Lake, B. M., Salakhutdinov, R. & Tenenbaum, J. B. Human-level concept learning through probabilistic program induction. Science 350, 1332–1338 (2015).
36. Rezende, D. J., Mohamed, S., Danihelka, I., Gregor, K. & Wierstra, D. One-shot generalization in deep generative models. In Proc. 33rd International Conference on Machine Learning (eds Balcan, M. F. & Weinberger, K. Q.) 1521–1529 (JMLR, 2016).
37. Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D. & Lillicrap, T. Meta-learning with memory-augmented neural networks. In Proc. 33rd International Conference on Machine Learning (eds Balcan, M. F. & Weinberger, K. Q.) 1842–1850 (JMLR, 2016).
38. Oliva, A. & Torralba, A. The role of context in object recognition. Trends Cogn. Sci. 11, 520–527 (2007).
39. Hermann, K. M. et al. Teaching machines to read and comprehend. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) 1693–1701 (Curran Associates, 2015).
40. O’Keefe, J. & Nadel, L. The Hippocampus as a Cognitive Map (Oxford Univ. Press, 1978).
