Paper Notes 03
Contents
1. Scene Graph Generation with External Knowledge and Image Reconstruction
2. Knowledge Acquisition for Visual Question Answering via Iterative Querying
Author: Li Fei-Fei (李飛飛)
publish: CVPR 2017
3. Towards VQA Models That Can Read
4. ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
1.1 Applies BERT to image RoIs and text, then uses the resulting text word embeddings and image RoI embeddings for other tasks, yielding gains of 2-10 percentage points.
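To make the note above more concrete, here is a minimal sketch of one cross-modal attention step, in which word embeddings attend over RoI embeddings and vice versa. It is an illustration only, not the paper's implementation: it assumes the two-stream co-attention design described in the ViLBERT paper, leaves out learned projections, multiple heads, and feed-forward layers, and the names `cross_attend`, `text`, and `rois` as well as all numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Each query vector attends over the vectors of the other modality.

    Simplification: the same vectors serve as keys and values, and no
    learned projection matrices are applied.
    """
    dim = len(queries[0])
    out = []
    for q in queries:
        # Scaled dot-product score between the query and every key.
        scores = [
            sum(a * b for a, b in zip(q, kv)) / math.sqrt(dim)
            for kv in keys_values
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([
            sum(w * kv[i] for w, kv in zip(weights, keys_values))
            for i in range(dim)
        ])
    return out


# Toy inputs (hypothetical): 3 word embeddings and 2 RoI embeddings, dim 4.
text = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
rois = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]

text_ctx = cross_attend(text, rois)  # text conditioned on image regions
roi_ctx = cross_attend(rois, text)   # image regions conditioned on text
```

In this sketch, `text_ctx` and `roi_ctx` stand in for the text and RoI embeddings that a downstream task head would consume.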
5. VL-BERT: PRE-TRAINING OF GENERIC VISUAL LINGUISTIC REPRESENTATIONS
6. VISUALBERT: A SIMPLE AND PERFORMANT BASELINE FOR VISION AND LANGUAGE
7. Dynamic Memory Networks for Visual and Textual Question Answering
publish: ICML 2016
Reposted from: https://www.cnblogs.com/yeran/p/11447810.html