Introduction to NEON

Published: 2023/12/8
  • This post introduces the basics of NEON and gives a simple, working example.

NEON

  • Arm NEON technology is an advanced SIMD (Single Instruction, Multiple Data) architecture extension for the Arm Cortex-A series and Cortex-R52 processors.
  • NEON technology was introduced in the Armv7-A and Armv7-R profiles. It is now also an extension to the Armv8-A and Armv8-R profiles.
  • NEON technology is intended to improve the multimedia user experience by accelerating audio and video encoding/decoding, user interfaces, 2D/3D graphics, and gaming. NEON can also accelerate signal processing algorithms and functions, speeding up applications such as audio and video processing, voice and facial recognition, computer vision, and deep learning.
    [Figure: SIMD architecture]


Overview

  • NEON is a packed SIMD architecture. NEON registers are treated as vectors of elements of the same data type, and multiple data types are supported. The following table lists the data types supported by each architecture version.

    Types   Armv7-A/R      Armv8-A/R (AArch32)   Armv8-A (AArch64)
    float   32-bit         16/32-bit             16/32/64-bit
    int     8/16/32-bit    8/16/32/64-bit        8/16/32/64-bit
  • NEON instructions perform the same operation in all lanes of a vector. The number of operations performed at once depends on the data type. NEON instructions allow up to:

    • 16x8-bit, 8x16-bit, 4x32-bit, 2x64-bit integer operations
    • 8x16-bit, 4x32-bit, 2x64-bit floating-point operations
  • A NEON implementation can also issue multiple instructions in parallel.
  • Availability notes for the operations listed above:
    • 8x16-bit floating-point operations: only in Armv8.2-A
    • 2x64-bit operations: only in Armv8-A/R
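
  • The lane counts listed above follow from dividing the register width by the element width. The sketch below assumes the 128-bit NEON Q register; the names kNeonRegisterBits and lanes are made up for this illustration and are not part of any NEON API.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption for this sketch: one NEON Q register is 128 bits wide.
constexpr std::size_t kNeonRegisterBits = 128;

// Number of elements ("lanes") of type T that fit in one 128-bit register.
// Hypothetical helper, used only to illustrate the arithmetic.
template <typename T>
constexpr std::size_t lanes() {
    return kNeonRegisterBits / (8 * sizeof(T));
}

// Integer lane counts from the list above.
static_assert(lanes<std::uint8_t>() == 16, "16 x 8-bit");
static_assert(lanes<std::uint16_t>() == 8, "8 x 16-bit");
static_assert(lanes<std::uint32_t>() == 4, "4 x 32-bit");
static_assert(lanes<std::uint64_t>() == 2, "2 x 64-bit");

// Floating-point lane counts (float is 32-bit and double is 64-bit
// on the usual Arm targets).
static_assert(lanes<float>() == 4, "4 x 32-bit float");
static_assert(lanes<double>() == 2, "2 x 64-bit float");
```

  • The checks run at compile time, so the file only needs to compile; nothing here requires an Arm target.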

How to use NEON?

  • NEON can be used in several ways: NEON-enabled libraries, the compiler's auto-vectorization feature, NEON intrinsics, and NEON assembly code. Detailed information on NEON programming can be found in the NEON Programmer's Guide, Version 1.0.

Libraries

  • compute-library
  • Ne10
  • libyuv
  • skia

Autovectorization

  • With auto-vectorization, the compiler exploits NEON automatically, without NEON-specific source code.
  • This feature is supported by:
    • Arm Compiler 5
    • Arm LLVM-based Compiler 6
    • GCC
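
  • As a rough illustration of what auto-vectorization works with, the sketch below shows a loop written so that a vectorizing compiler has a good chance of mapping it to NEON: unit-stride accesses, no dependency between iterations, and no aliasing between the arrays. The function name scale_add is made up for this example. Whether the loop is actually vectorized depends on the compiler, its version, and the flags used.

```cpp
#include <cstddef>

// out[i] = a[i] * k + b[i]
// __restrict is a compiler extension (GCC, Clang, Arm Compiler) that
// promises the arrays do not overlap, which removes one common obstacle
// to vectorization.
void scale_add(float* __restrict out,
               const float* __restrict a,
               const float* __restrict b,
               float k,
               std::size_t n) {
    // Simple counted loop, one independent element per iteration.
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] * k + b[i];
    }
}
```

  • With GCC, -O3 turns on the loop vectorizer and -fopt-info-vec reports which loops were vectorized. On 32-bit Arm targets, the GCC documentation notes that floating-point loops are not auto-vectorized for NEON unless -funsafe-math-optimizations is also given, so an integer loop may be the easier first experiment there.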

Compiler Intrinsics

  • NEON intrinsics are function calls that the compiler replaces with an appropriate NEON instruction or sequence of NEON instructions. Intrinsics provide almost as much control as writing assembly language but leave register allocation to the compiler, so that developers can focus on the algorithms. The compiler can also schedule instructions to remove pipeline stalls for the specified target processor. The result is more maintainable source code than assembly language. NEON intrinsics are supported by the Arm Compilers, GCC, and LLVM.
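
  • A minimal sketch of the intrinsics style described above, using saturating addition of bytes. The intrinsics vld1q_u8, vqaddq_u8, and vst1q_u8 come from arm_neon.h; the function name add_saturate_u8 and the scalar fallback for non-NEON builds are additions made for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

// out[i] = min(255, a[i] + b[i]) for i in [0, n).
void add_saturate_u8(const std::uint8_t* a, const std::uint8_t* b,
                     std::uint8_t* out, std::size_t n) {
    std::size_t i = 0;
#if defined(__ARM_NEON)
    // Process 16 bytes per iteration: one 128-bit register holds 16 lanes.
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);       // load 16 bytes from a
        uint8x16_t vb = vld1q_u8(b + i);       // load 16 bytes from b
        vst1q_u8(out + i, vqaddq_u8(va, vb));  // saturating add, then store
    }
#endif
    // Scalar loop: handles the tail, or everything when NEON is unavailable.
    for (; i < n; ++i) {
        unsigned sum = static_cast<unsigned>(a[i]) + b[i];
        out[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
}
```

  • Note that the code never names a register: va and vb are ordinary variables, and the compiler decides which NEON registers hold them.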

Assembly code

  • For the highest performance, hand-coded NEON assembly is the best approach for experienced programmers. Both the GNU assembler (gas) and the Arm Compiler toolchain assembler (armasm) can assemble NEON instructions.

Example

  • The example is a vector addition written with NEON intrinsics, that is, the compiler intrinsics described above. The code is in neon_vecadd.cpp; compile it with: g++ neon_vecadd.cpp -mfpu=neon
