Intro OpenCL Tutorial

Published: 2023/12/13

Benedict R. Gaster, AMD Architect, OpenCL™

OpenCL™ is a young technology, and, while a specification has been published (www.khronos.org/registry/cl/), there are currently few documents that provide a basic introduction with examples. This article helps make OpenCL™ easier to understand and implement.

Note that:

  • I work at AMD, and, as such, I will test all example code on our implementation for both Windows® and Linux®; however, my intention is to illustrate the use of OpenCL™ regardless of platform. All examples are written in pure OpenCL™ and should run equally well on any implementation.
  • I have done my best to provide examples that work out-of-the-box on non-AMD implementations of OpenCL™, but I will not be testing them on non-AMD implementations; therefore, it is possible that an example might not work as expected on such systems. If this is the case, please let me know via our OpenCL™ forum, and I will do my best to rectify the code and publish an update.

The following “Hello World” tutorial provides a simple introduction to OpenCL™. I hope to follow up this first tutorial with additional ones covering topics such as:

  • Using platform and device layers to build robust OpenCL™
  • Program compilation and kernel objects
  • Managing buffers
  • Kernel execution
  • Kernel programming – basics
  • Kernel programming – synchronization
  • Matrix multiply – a case study
  • Kernel programming – built-ins

The “Hello World” program in OpenCL™

Here are some notes of caution on how the OpenCL™ samples are written:

  • OpenCL™ specifies a host API that is defined to be compatible with C89 and does not make any mention of C++ or other programming language bindings. Currently, there are several efforts to develop bindings for other languages (see the links at the end of this article), and, specifically, there has been a strong push to develop C++ bindings. In this and subsequent tutorials, I use the C++ bindings exclusively and describe OpenCL™ in these terms. See the OpenCL™ 1.0 specification for the corresponding C API. Alternatively, you can view the source for the C++ bindings to see which underlying OpenCL™ function a particular C++ binding calls, and with what arguments.
  • OpenCL™ defines a C-like language for programming compute device programs. These programs are passed to the OpenCL™ runtime via API calls expecting values of type char *. Often, it is convenient to keep these programs in separate source files. For this and subsequent tutorials, I assume the device programs are stored in files with names of the form name_kernels.cl, where name varies, depending on the context, but the suffix _kernels.cl does not. The corresponding device programs are loaded at runtime and passed to the OpenCL™ API. There are many alternative approaches to this; this one is chosen for readability.

For this first OpenCL™ program, we start with the source for the host application.

Header files

As with any other external API used in C++, you must include a header file when using the OpenCL™ API. Usually, this is in the directory CL within the primary include directory. For the C++ bindings we have the following (for the plain C API, include CL/cl.h instead):

    #include <utility>
    #define __NO_STD_VECTOR // Use cl::vector instead of STL version
    #include <CL/cl.hpp>

For our program, we use a small number of additional C++ headers, which are agnostic to OpenCL™.

    #include <cstdio>
    #include <cstdlib>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <iterator>

As we will dynamically request an OpenCL™ device to return the “Hello World\n” string, we define it as a constant to use in calculations.

    const std::string hw("Hello World\n");

Errors

A common property of most OpenCL™ API calls is that they either return an error code (type cl_int) as the result of the function itself, or they store the error code at a location passed by the user as a parameter to the call. As with any API call that can fail, it is important for the application to check the result and behave correctly in the case of error. For the most part we will not concern ourselves with recovering from an error; for simplicity, we define a function, checkErr, that checks whether a certain call has completed successfully, in which case OpenCL™ returns the value CL_SUCCESS. If the value is anything else, checkErr outputs a user message and exits; otherwise, it simply returns.

    inline void
    checkErr(cl_int err, const char * name)
    {
        if (err != CL_SUCCESS) {
            std::cerr << "ERROR: " << name
                      << " (" << err << ")" << std::endl;
            exit(EXIT_FAILURE);
        }
    }
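
The numeric code that checkErr prints is easier to act on when it is translated to the symbolic name used in the OpenCL™ headers. The sketch below is one way to do that. `errName` is a hypothetical helper, not part of OpenCL™ or its C++ bindings, and it covers only a handful of codes whose values are fixed in CL/cl.h. It takes a plain `int` so that it compiles without the OpenCL™ headers; in the host program the parameter would be a `cl_int`.

```cpp
#include <string>

// Hypothetical helper (not part of OpenCL or cl.hpp): translate a few
// well-known OpenCL status codes into their symbolic names.
// The numeric values are those defined in CL/cl.h.
inline std::string errName(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -2:  return "CL_DEVICE_NOT_AVAILABLE";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    default:  return "unrecognized OpenCL error code";
    }
}
```

Inside checkErr, `errName(err)` could be printed next to the number. Note that the host code in this tutorial also passes a literal -1 to checkErr for failures that do not come from OpenCL™ (an empty platform list, a file that cannot be opened); a helper like this one would label those as CL_DEVICE_NOT_FOUND.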
A common paradigm for error handling in C++ is through the use of exceptions, and the OpenCL™ C++ bindings provide just such an interface. A later tutorial will cover the use of exceptions and other optional features provided by the C++ bindings. For now, let’s look at the one remaining function, “main,” necessary for our first OpenCL™ application.

OpenCL™ Contexts

The first step to initializing and using OpenCL™ is to create a context. The rest of the OpenCL™ work (creating devices and memory, compiling and running programs) is performed within this context. A context can have a number of associated devices (for example, CPU or GPU devices), and, within a context, OpenCL™ guarantees a relaxed memory consistency between devices. We will look at this in detail in a later tutorial; for now, we use a single device type, CL_DEVICE_TYPE_CPU, for the CPU device. We could have used CL_DEVICE_TYPE_GPU or some other supported device type, assuming that the OpenCL™ implementation supports that device. But before we can create a context we must first query the OpenCL™ runtime to determine which platforms, i.e. different vendors’ OpenCL™ implementations, are present. The class cl::Platform provides the static method cl::Platform::get for this, which returns a list of platforms. For now we select the first platform and use it to create a context. The constructor cl::Context should be successful and, in this case, the value of err is CL_SUCCESS.

    int
    main(void)
    {
        cl_int err;
        cl::vector< cl::Platform > platformList;
        cl::Platform::get(&platformList);
        checkErr(platformList.size()!=0 ? CL_SUCCESS : -1, "cl::Platform::get");
        std::cerr << "Platform number is: " << platformList.size() << std::endl;

        std::string platformVendor;
        platformList[0].getInfo((cl_platform_info)CL_PLATFORM_VENDOR, &platformVendor);
        std::cerr << "Platform is by: " << platformVendor << "\n";
        cl_context_properties cprops[3] =
            {CL_CONTEXT_PLATFORM, (cl_context_properties)(platformList[0])(), 0};

        cl::Context context(
            CL_DEVICE_TYPE_CPU,
            cprops,
            NULL,
            NULL,
            &err);
        checkErr(err, "Context::Context()");
Before delving into compute devices, where the ‘real’ work happens, we first allocate an OpenCL™ buffer to hold the result of the kernel that will be run on the device, i.e. the string “Hello World\n”. For now we simply allocate some memory on the host and request that OpenCL™ use this memory directly, passing the flag CL_MEM_USE_HOST_PTR when creating the buffer.

        char * outH = new char[hw.length()+1];
        cl::Buffer outCL(
            context,
            CL_MEM_WRITE_ONLY | CL_MEM_USE_HOST_PTR,
            hw.length()+1,
            outH,
            &err);
        checkErr(err, "Buffer::Buffer()");

OpenCL™ Devices

In OpenCL™ many operations are performed with respect to a given context. For example, allocating buffers (1D regions of memory) and images (2D and 3D regions of memory) are both context operations. But there are also device-specific operations. For example, program compilation and kernel execution are on a per-device basis, and for these a specific device handle is required. So how do we obtain a handle for a device? We simply query a context for it. OpenCL™ provides the ability to query information about particular objects, and in the C++ API this takes the form object.getInfo<CL_OBJECT_QUERY>(). In the specific case of getting the devices from a context:

        cl::vector<cl::Device> devices;
        devices = context.getInfo<CL_CONTEXT_DEVICES>();
        checkErr(
            devices.size() > 0 ? CL_SUCCESS : -1, "devices.size() > 0");
Now that we have the list of associated devices for a context, in this case a single CPU device, we need to load and build the compute program (the program we intend to run on the device or, more generally, devices). The first few lines of the following code simply load the OpenCL™ device program from disk, convert it to a string, and create a cl::Program::Sources object using the helper constructor. Given an object of type cl::Program::Sources, a cl::Program object is created and associated with a context, then built for a particular set of devices.

        std::ifstream file("lesson1_kernels.cl");
        checkErr(file.is_open() ? CL_SUCCESS:-1, "lesson1_kernels.cl");

        std::string prog(
            std::istreambuf_iterator<char>(file),
            (std::istreambuf_iterator<char>()));

        cl::Program::Sources source(1,
            std::make_pair(prog.c_str(), prog.length()+1));

        cl::Program program(context, source);
        err = program.build(devices,"");
        checkErr(err, "Program::build()");
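
The file-loading step can be factored out and exercised on its own, without any OpenCL™ calls. The following is a minimal sketch under that assumption: `loadSource` is a hypothetical helper name, and it reports failure through a bool return value rather than through checkErr.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: read an entire text file (for example
// lesson1_kernels.cl) into `out`.
// Returns false, leaving `out` untouched, if the file cannot be opened.
inline bool loadSource(const std::string& path, std::string& out)
{
    std::ifstream file(path.c_str());
    if (!file.is_open()) {
        return false;
    }
    // Same iterator-pair idiom as the host code uses to fill `prog`.
    out.assign(std::istreambuf_iterator<char>(file),
               std::istreambuf_iterator<char>());
    return true;
}
```

The resulting string can be handed to cl::Program::Sources in the same way as prog in the host code.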
A given program can have many entry points, called kernels, and to call one we must build a kernel object. There is assumed to exist a straightforward mapping from kernel names, represented as strings, to functions defined with the __kernel attribute in the compute program. In this case we can build a cl::Kernel object, kernel. Kernel arguments are set using the C++ API with kernel.setArg(), which takes the index and value for the particular argument.

        cl::Kernel kernel(program, "hello", &err);
        checkErr(err, "Kernel::Kernel()");

        err = kernel.setArg(0, outCL);
        checkErr(err, "Kernel::setArg()");
Now that the boilerplate code is done, it is time to compute the result (the output buffer with the string “Hello World\n”). All device computations are done using a command queue, which is a virtual interface for the device in question. Each command queue has a one-to-one mapping with a given device; it is created with the associated context using a call to the constructor for the class cl::CommandQueue. Given a cl::CommandQueue queue, kernels can be queued using queue.enqueueNDRangeKernel. This queues a kernel for execution on the associated device.

The kernel can be executed on a 1D, 2D, or 3D domain of indexes that execute in parallel, given enough resources. The total number of elements (indexes) in the launch domain is called the global work size; individual elements are known as work-items. Work-items can be grouped into work-groups when communication between work-items is required. Work-groups are defined with a sub-index function (called the local work size), describing the size in each dimension corresponding to the dimensions specified for the global launch domain. There is a lot to consider with respect to kernel launches, and we will cover this in more detail in future tutorials. For now, it is enough to note that for Hello World, each work-item computes a letter in the resulting string, and it is enough to launch hw.length()+1 work-items, where hw is the const std::string we defined at the beginning of the program. We need the extra work-item to account for the NULL terminator.

        cl::CommandQueue queue(context, devices[0], 0, &err);
        checkErr(err, "CommandQueue::CommandQueue()");

        cl::Event event;
        err = queue.enqueueNDRangeKernel(
            kernel,
            cl::NullRange,
            cl::NDRange(hw.length()+1),
            cl::NDRange(1),
            NULL,
            &event);
        checkErr(err, "CommandQueue::enqueueNDRangeKernel()");
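
The relationship between the global and local work sizes can be checked with ordinary integer arithmetic. In OpenCL™ 1.x the global size in each dimension must be a multiple of the local size, so host code commonly rounds the global size up and lets the kernel ignore the surplus work-items. The sketch below illustrates that arithmetic for one dimension; `roundUpGlobal` and `numWorkGroups` are hypothetical helper names, not part of the C++ bindings.

```cpp
#include <cstddef>

// Hypothetical helper: smallest multiple of `local` that is >= `global`.
// `local` must be non-zero.
inline std::size_t roundUpGlobal(std::size_t global, std::size_t local)
{
    std::size_t remainder = global % local;
    return remainder == 0 ? global : global + (local - remainder);
}

// Hypothetical helper: number of work-groups in one dimension, assuming
// `global` is already a multiple of `local`.
inline std::size_t numWorkGroups(std::size_t global, std::size_t local)
{
    return global / local;
}
```

For Hello World the global size is hw.length()+1 = 13 and the local size is 1, so the launch consists of 13 work-groups of one work-item each and no rounding is needed. With a local size of 4, the global size would have to be rounded up to 16, giving 4 work-groups.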
The final argument to the enqueueNDRangeKernel call above was a cl::Event object, which can be used to query the status of the command with which it is associated (for example, whether it has completed). It supports the method wait(), which blocks until the command has completed. This is required to ensure the kernel has finished execution before reading the result back into host memory with queue.enqueueReadBuffer(). With the compute result back in host memory, it is simply a matter of outputting the result to stdout and exiting the program.

        event.wait();
        err = queue.enqueueReadBuffer(
            outCL,
            CL_TRUE,
            0,
            hw.length()+1,
            outH);
        checkErr(err, "CommandQueue::enqueueReadBuffer()");
        std::cout << outH;
        return EXIT_SUCCESS;
    }
Finally, to make the program complete, we need an implementation of the device program (lesson1_kernels.cl), which requires defining the external entry point, hello. The kernel implementation is straightforward: it calculates a unique index as a function of the launch domain using get_global_id(), uses it as an index into the string hw, and writes the character it finds there to the output array, out.

    #pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable
    __constant char hw[] = "Hello World\n";
    __kernel void hello(__global char * out)
    {
        size_t tid = get_global_id(0);
        out[tid] = hw[tid];
    }
For robustness, it would make sense to check that the thread id (tid) is not out of range of hw; for now, we assume that the corresponding call to queue.enqueueNDRangeKernel() is correct.
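
One way to add that check is to pass the buffer length to the kernel and have each work-item compare its index against it. The sketch below emulates the idea in plain C++ on the host rather than in OpenCL™ C: `helloItem` stands in for the kernel body, the loop in `emulateLaunch` stands in for a 1D launch, and `tid` plays the role of get_global_id(0). The function names, `hwText`, and the extra length parameter are assumptions made for illustration; they are not part of the original kernel.

```cpp
#include <cstddef>
#include <string>

static const char hwText[] = "Hello World\n";  // 12 characters plus the terminator

// Stand-in for the kernel body: one call corresponds to one work-item.
// The guard skips any work-item whose index lies outside the source
// string or outside the output buffer.
inline void helloItem(char* out, std::size_t outSize, std::size_t tid)
{
    if (tid < sizeof(hwText) && tid < outSize) {
        out[tid] = hwText[tid];
    }
}

// Stand-in for a 1D launch: run the work-items one after another.
// A real device may run them in parallel and in any order.
inline std::string emulateLaunch(std::size_t globalSize)
{
    std::string out(sizeof(hwText), '#');  // '#' marks bytes no work-item wrote
    for (std::size_t tid = 0; tid < globalSize; ++tid) {
        helloItem(&out[0], out.size(), tid);
    }
    return out;
}
```

With the guard in place, a launch of 16 work-items (for example, after rounding the global size up to a multiple of a work-group size) writes the same 13 bytes as a launch of 13, and a launch of 5 writes only the first five characters.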

Building and running

On Linux, it should be enough to use a single command to build the OpenCL™ program; for example:

    g++ -o hello_world -Ipath-OpenCL-include -Lpath-OpenCL-libdir lesson1.cpp -lOpenCL

To run:

    LD_LIBRARY_PATH=path-OpenCL-libdir ./hello_world

On Windows, with a Visual Studio command window, an example is:

    cl /Fehello_world.exe /Ipath-OpenCL-include lesson1.cpp path-OpenCL-libdir/OpenCL.lib

Assuming that OpenCL.dll is on the path, running

    .\hello_world

outputs the following string on stdout:

    Hello World

This completes our introductory tutorial to OpenCL™. Your feedback, comments, and questions are requested. Please visit our OpenCL™ forum.

Useful Links

The following list provides links to some specific programming bindings, other than C, for OpenCL™. I have not tested these and cannot vouch for their correctness, but hope they will be useful:

  • OpenCL™ specification and headers:
    http://www.khronos.org/registry/cl/
  • OpenCL™ technical forum:
    http://www.khronos.org/message_boards/viewforum.php?f=28
  • The C++ bindings used in this tutorial can be found on the OpenCL™ web page at Khronos, along with complete documentation:
    http://www.khronos.org/registry/cl/
  • Python bindings can be found here:
    http://wiki.tiker.net/PyOpenCL
  • C# bindings can be found here:
    http://www.khronos.org/message_boards/viewtopic.php?f=28&t=1932

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.
