Using Runtime for Inference (C++)
Overview
After a model has been converted by MindSpore Lite, the inference flow is executed in the Runtime. This tutorial describes how to write inference code with the C++ API.
The overall use of the Runtime involves the following components and functions:
- Model: the model used by MindSpore Lite, which instantiates the list of operator prototypes through user graph construction or by directly loading a network.
- Lite Session: provides graph compilation and invokes the graph executor for inference.
- Scheduler: the heterogeneous operator scheduler. According to the heterogeneous scheduling policy, it selects a suitable kernel for each operator, builds the kernel list, and partitions the graph into subgraphs.
- Executor: the graph executor, which runs the kernel list and dynamically allocates and releases tensors.
- Operator: the operator prototype, containing the operator's attributes and the methods for inferring its shape, data type, and format.
- Kernel: the concrete operator implementation provided by the operator library, supplying the operator's forward capability.
- Tensor: the tensor used by MindSpore Lite, providing functions and interfaces for tensor memory operations.
For more details about the C++ API, see the API documentation.
Reading the Model
In MindSpore Lite, the model file is the .ms file produced by the model conversion tool. For model inference, the model must be loaded from the file system and parsed; these operations are mainly implemented in Model. Model holds model data such as the weights and operator attributes.
A Model is created from in-memory data through the static Import method of the Model class. The Model instance returned by the function is a pointer created with new; when it is no longer needed, the user must release it with delete.
If runtime memory is tightly constrained, the Free interface can be used to reduce memory usage after the Model has been graph-compiled. However, once Free has been called on a Model, that Model can no longer be graph-compiled.
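The following is a minimal sketch of this flow. It assumes the ReadFile helper from src/common/file_utils.h (also used in the multi-session example later in this tutorial), a hypothetical model file name model.ms, and <iostream> for std::cerr:
// Read the .ms file into a memory buffer; ReadFile returns the buffer and
// writes its byte size through the second parameter.
size_t size = 0;
char *model_buf = mindspore::lite::ReadFile("model.ms", &size);
if (model_buf == nullptr) {
std::cerr << "Read model file failed" << std::endl;
return -1;
}
// Import parses the buffer into a Model instance; the buffer is no longer
// needed afterwards and can be released immediately.
auto model = mindspore::lite::Model::Import(model_buf, size);
delete[] model_buf;
if (model == nullptr) {
std::cerr << "Import model failed" << std::endl;
return -1;
}
// ... graph compilation and execution go here; after CompileGraph,
// model->Free() may be called to reduce memory usage ...
// The Model instance is created with new and must be released with delete.
delete model;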
Creating a Session
When using MindSpore Lite for inference, LiteSession is the main entry point: through LiteSession we can compile and then execute the graph.
Creating a Context
The context stores the basic configuration parameters required by the session, which guide graph compilation and graph execution.
MindSpore Lite supports heterogeneous inference. The backend configuration for inference is specified by device_list_ in Context, which holds the CPU DeviceContext by default. During graph compilation, operator selection and scheduling are performed according to the backend configurations in device_list_. Currently only two heterogeneous combinations are supported: CPU + GPU or CPU + NPU. When a GPU DeviceContext is configured, GPU inference is preferred; when an NPU DeviceContext is configured, NPU inference is preferred.
device_list_[0] must be the CPU DeviceContext, and device_list_[1] is either the GPU DeviceContext or the NPU DeviceContext. Setting all three DeviceContexts (CPU, GPU, and NPU) at the same time is not yet supported.
MindSpore Lite has a built-in, process-shared thread pool. The maximum number of threads in the pool is specified by thread_num_ during inference; the default is 2 threads. No more than 4 threads are recommended, otherwise performance may suffer.
MindSpore Lite supports dynamic memory allocation and release. If no allocator is specified, a default allocator is created during inference; the memory allocator can also be shared across multiple Contexts through the Context interface.
If the user creates the Context with new, it must be released with delete when no longer needed. In general, the Context can be released once the LiteSession has been created.
Creating a Session
There are two ways to create a session:
- The first way is to use the Context created in the previous step and call the static method static LiteSession *CreateSession(const lite::Context *context) of LiteSession to create a LiteSession. The LiteSession instance returned by the function is a pointer created with new; when it is no longer needed, the user must release it with delete.
- The second way is to use the Context created in the previous step, plus a model buffer already read from a file and the buffer's size, and call the static method static LiteSession *CreateSession(const char *model_buf, size_t size, const lite::Context *context) of LiteSession to create a LiteSession. The LiteSession instance returned by the function is a pointer created with new; when it is no longer needed, the user must release it with delete.
The CreateSession interface used in the second way is a simplified interface that shortens the calling sequence: it combines the functionality of three interfaces, namely the single-parameter CreateSession, Import, and CompileGraph.
Usage Example
The following sample code demonstrates creating a Context and sharing the memory pool between two LiteSessions:
auto context = new (std::nothrow) lite::Context;
if (context == nullptr) {
MS_LOG(ERROR) << "New context failed while running " << modelName;
return RET_ERROR;
}
// CPU device context has default values.
auto &cpu_device_info = context->device_list_[0].device_info_.cpu_device_info_;
// Big cores take priority in thread and core binding. This parameter takes effect in the BindThread interface; for the specific binding effect, see the "Core Binding" section.
cpu_device_info.cpu_bind_mode_ = HIGHER_CPU;
// If a GPU device context is set, the preferred backend is GPU: if an operator has a GPU implementation, it runs on the GPU first; otherwise it runs on the CPU.
DeviceContext gpu_device_ctx{DT_GPU, {false}};
// The GPU device context must be pushed into device_list_ to take effect.
context->device_list_.push_back(gpu_device_ctx);
// Configure the number of worker threads in the thread pool to 2, including the main thread.
context->thread_num_ = 2;
// Allocators can be shared across multiple Contexts.
auto *context2 = new (std::nothrow) lite::Context;
context2->thread_num_ = context->thread_num_;
context2->allocator = context->allocator;
auto &cpu_device_info2 = context2->device_list_[0].device_info_.cpu_device_info_;
cpu_device_info2.cpu_bind_mode_ = cpu_device_info.cpu_bind_mode_;
// Use Context to create Session.
auto session1 = session::LiteSession::CreateSession(context);
// After the LiteSession is created, the Context can be released.
delete (context);
if (session1 == nullptr) {
MS_LOG(ERROR) << "CreateSession failed while running " << modelName;
return RET_ERROR;
}
// session1 and session2 can share one memory pool.
// Assume we have read a buffer from a model file, named model_buf, whose byte size is model_buf_size.
// Use context2, model_buf and model_buf_size to create the second session.
auto session2 = session::LiteSession::CreateSession(model_buf, model_buf_size, context2);
// After the LiteSession is created, the Context can be released.
delete (context2);
if (session2 == nullptr) {
MS_LOG(ERROR) << "CreateSession failed while running " << modelName;
return RET_ERROR;
}
Graph Compilation
Variable Dimensions
When using MindSpore Lite for inference, if the shape of an input needs to be resized after session creation and graph compilation have completed, you can reset the shape of the input tensor and then call the Resize interface of LiteSession.
Some networks do not support variable dimensions and will report an error message and exit abnormally. For example, if the model contains a MatMul operator, one of whose input tensors is a weight while the other is a model input, calling the variable-dimension interface makes the shapes of the input tensor and the weight tensor mismatch, and inference ultimately fails.
Usage Example
The following code demonstrates how to resize the inputs of MindSpore Lite:
// Assume we have created a LiteSession instance named session.
auto inputs = session->GetInputs();
std::vector<int> resize_shape = {1, 128, 128, 3};
// Assume the model has only one input; resize the input shape to [1, 128, 128, 3].
std::vector<std::vector<int>> new_shapes;
new_shapes.push_back(resize_shape);
session->Resize(inputs, new_shapes);
Compiling the Graph
Before graph execution, the CompileGraph interface of LiteSession must be called to compile the graph and further parse the Model instance loaded from the file, mainly performing subgraph partition and operator selection and scheduling. This step takes considerable time, so it is recommended to create a LiteSession once, compile once, and execute many times.
/// \brief Compile MindSpore Lite model.
///
/// \note CompileGraph should be called before RunGraph.
///
/// \param[in] model Define the model to be compiled.
///
/// \return STATUS as an error code of compiling graph, STATUS is defined in errorcode.h.
virtual int CompileGraph(lite::Model *model) = 0;
Usage Example
The following code demonstrates how to compile the graph:
// Assume we have created a LiteSession instance named session and a Model instance named model before.
// The methods of creating the model and session are described in the "Reading the Model" and "Creating a Session" sections.
auto ret = session->CompileGraph(model);
if (ret != RET_OK) {
std::cerr << "CompileGraph failed" << std::endl;
// session and model need to be released by users manually.
delete (session);
delete (model);
return ret;
}
model->Free();
Input Data
Obtaining Input Tensors
Before graph execution, the input data must be copied into the model's input tensors.
MindSpore Lite provides two methods to obtain the model's input tensors:
- Use the GetInputsByTensorName method to obtain the tensor connected to the model input node, identified by the name of the model input tensor (a usage sketch follows this list).
/// \brief Get input MindSpore Lite MSTensors of model by tensor name.
///
/// \param[in] tensor_name Define tensor name.
///
/// \return MindSpore Lite MSTensor.
virtual mindspore::tensor::MSTensor *GetInputsByTensorName(const std::string &tensor_name) const = 0;
- Use the GetInputs method to directly obtain the vector of all model input tensors.
/// \brief Get input MindSpore Lite MSTensors of model.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetInputs() const = 0;
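As a sketch of the first method, assuming a session that has already been created and compiled, and a hypothetical input tensor name graph_input-0 (the real names depend on the converted model):
// Look up a single input tensor by name. "graph_input-0" is a hypothetical name.
auto in_tensor = session->GetInputsByTensorName("graph_input-0");
if (in_tensor == nullptr) {
std::cerr << "No input tensor with the given name" << std::endl;
return -1;
}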
Copying Data
After the model inputs have been obtained, data must be filled into the tensors. Use the Size method of MSTensor to get the amount of data the tensor expects, the data_type method to get the tensor's data type, and the MutableData method of MSTensor to get a writable pointer.
/// \brief Get byte size of data in MSTensor.
///
/// \return Byte size of data in MSTensor.
virtual size_t Size() const = 0;
/// \brief Get the pointer of data in MSTensor.
///
/// \note The data pointer can be used to both write and read data in MSTensor.
///
/// \return The pointer points to data in MSTensor.
virtual void *MutableData() const = 0;
Usage Example
The following sample code demonstrates obtaining the whole-graph input MSTensors from the LiteSession and filling them with model input data:
// Assume we have created a LiteSession instance named session.
auto inputs = session->GetInputs();
// Assume that the model has only one input tensor.
auto in_tensor = inputs.front();
if (in_tensor == nullptr) {
std::cerr << "Input tensor is nullptr" << std::endl;
return -1;
}
// It is omitted that users have read the model input file and generated a section of memory buffer: input_buf, as well as the byte size of input_buf: data_size.
if (in_tensor->Size() != data_size) {
std::cerr << "Input data size does not match the model input" << std::endl;
return -1;
}
auto *in_data = in_tensor->MutableData();
if (in_data == nullptr) {
std::cerr << "Data of in_tensor is nullptr" << std::endl;
return -1;
}
memcpy(in_data, input_buf, data_size);
// Users need to free input_buf.
// The elements in the inputs are managed by MindSpore Lite so that users do not need to free inputs.
Note the following:
- The data layout in MindSpore Lite's model input tensors must be NHWC.
- The model input input_buf is read from disk by the user; after it has been copied into the model input tensor, the user must release input_buf themselves.
- The vectors returned by the GetInputs and GetInputsByTensorName methods do not need to be released by the user.
Graph Execution
Running the Session
After a MindSpore Lite session has compiled the graph, RunGraph of LiteSession can be used for model inference.
virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBack &after = nullptr) = 0;
Core Binding
The built-in thread pool of MindSpore Lite supports core binding and unbinding. By calling the BindThread interface, the worker threads in the pool can be bound to specified CPU cores, which is useful for performance analysis. Core binding is tied to the context the user specified when creating the LiteSession: the binding sets thread-to-CPU affinity according to the core-binding policy in that context.
/// \brief Attempt to bind or unbind threads in the thread pool to or from the specified cpu core.
///
/// \param[in] if_bind Define whether to bind or unbind threads.
virtual void BindThread(bool if_bind) = 0;
Note that core binding is an affinity operation, so binding to the specified CPU core is not guaranteed and is subject to system scheduling. Also, after binding, the cores must be unbound once the code has finished executing. An example follows:
// Assume we have created a LiteSession instance named session.
session->BindThread(true);
auto ret = session->RunGraph();
if (ret != mindspore::lite::RET_OK) {
std::cerr << "RunGraph failed" << std::endl;
delete session;
return -1;
}
session->BindThread(false);
There are two core-binding options: big-core first and mid-core first.
Big and mid cores are determined by CPU core frequency rather than CPU architecture; under this rule, big and mid cores can be distinguished even on architectures without an explicit big/mid/little split.
Big-core first means the threads in the pool are bound starting from the highest-frequency core: the first thread is bound to the highest-frequency core, the second thread to the second-highest-frequency core, and so on.
For mid-core first, mid cores are defined empirically, by default as the cores with the third- and fourth-highest frequencies. Under the mid-core-first policy, threads are bound to mid cores first; when there are not enough mid cores, binding falls back to the little cores.
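For example, the following minimal sketch selects the mid-core-first policy on the context before the session is created; MID_CPU is the mid-core-first value of the CpuBindMode enum (HIGHER_CPU selects big-core first, NO_BIND disables binding):
// Bind worker threads to mid-frequency cores first; the fallback to little
// cores described above is handled by the runtime.
context->device_list_[0].device_info_.cpu_device_info_.cpu_bind_mode_ = MID_CPU;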
Running with Callbacks
MindSpore Lite can pass two KernelCallBack function pointers when calling RunGraph to run inference with callbacks. Compared with plain graph execution, running with callbacks provides extra information during execution to help developers with performance analysis, bug debugging, and so on. The extra information includes:
- the name of the node currently being run
- the input and output tensors before the current node is inferred
- the input and output tensors after the current node is inferred
/// \brief CallBackParam defines input arguments for the callback function.
struct CallBackParam {
std::string name_callback_param; /**< node name argument */
std::string type_callback_param; /**< node type argument */
};
/// \brief KernelCallBack defines the function pointer for callback.
using KernelCallBack = std::function<bool(std::vector<tensor::MSTensor *> inputs, std::vector<tensor::MSTensor *> outputs, const CallBackParam &opInfo)>;
Usage Example
The following sample code demonstrates compiling the graph with a LiteSession, defining two callback functions as the pre- and post-callback pointers and passing them into the RunGraph interface for callback inference, and it also demonstrates the compile-once, run-many-times usage scenario:
// Assume we have created a LiteSession instance named session and a Model instance named model before.
// The methods of creating the model and session are described in the "Reading the Model" and "Creating a Session" sections.
auto ret = session->CompileGraph(model);
if (ret != RET_OK) {
std::cerr << "CompileGraph failed" << std::endl;
// session and model need to be released by users manually.
delete (session);
delete (model);
return ret;
}
// Copy input data into the input tensors. Users can refer to the "Input Data" section. We use random data here.
auto inputs = session->GetInputs();
for (auto in_tensor : inputs) {
if (in_tensor == nullptr) {
std::cerr << "Input tensor is nullptr" << std::endl;
return -1;
}
// When calling the MutableData method, if the data in MSTensor is not allocated, it will be malloced. After allocation, the data in MSTensor can be considered as random data.
(void) in_tensor->MutableData();
}
// Definition of callback function before forwarding operator.
auto before_call_back_ = [&](const std::vector<mindspore::tensor::MSTensor *> &before_inputs,
const std::vector<mindspore::tensor::MSTensor *> &before_outputs,
const session::CallBackParam &call_param) {
std::cout << "Before forwarding " << call_param.name_callback_param << std::endl;
return true;
};
// Definition of callback function after forwarding operator.
auto after_call_back_ = [&](const std::vector<mindspore::tensor::MSTensor *> &after_inputs,
const std::vector<mindspore::tensor::MSTensor *> &after_outputs,
const session::CallBackParam &call_param) {
std::cout << "After forwarding " << call_param.name_callback_param << std::endl;
return true;
};
// Call the callback function when performing the model inference process.
ret = session->RunGraph(before_call_back_, after_call_back_);
if (ret != RET_OK) {
MS_LOG(ERROR) << "Run graph failed.";
return RET_ERROR;
}
// CompileGraph costs much time, so a better solution is to call CompileGraph only once and RunGraph many times.
for (size_t i = 0; i < 10; i++) {
auto ret = session->RunGraph();
if (ret != RET_OK) {
MS_LOG(ERROR) << "Run graph failed.";
return RET_ERROR;
}
}
// The session and model need to be released by users manually.
delete (session);
delete (model);
Obtaining Outputs
Obtaining Output Tensors
After inference has completed, MindSpore Lite can obtain the model's inference results.
MindSpore Lite provides the following methods to obtain the model's output MSTensors:
- Use the GetOutputsByNodeName method to obtain the vector of tensors connected to a model output node, identified by the node's name.
/// \brief Get output MindSpore Lite MSTensors of model by node name.
///
/// \param[in] node_name Define node name.
///
/// \return The vector of MindSpore Lite MSTensor.
virtual std::vector<tensor::MSTensor *> GetOutputsByNodeName(const std::string &node_name) const = 0;
- Use the GetOutputByTensorName method to obtain the model output MSTensor corresponding to the name of a model output tensor.
/// \brief Get output MindSpore Lite MSTensors of model by tensor name.
///
/// \param[in] tensor_name Define tensor name.
///
/// \return Pointer of MindSpore Lite MSTensor.
virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const = 0;
- Use the GetOutputs method to directly obtain a map from the names of all model output MSTensors to their MSTensor pointers.
/// \brief Get output MindSpore Lite MSTensors of model mapped by tensor name.
///
/// \return The map of output tensor name and MindSpore Lite MSTensor.
virtual std::unordered_map<std::string, mindspore::tensor::MSTensor *> GetOutputs() const = 0;
After the model's output tensors have been obtained, the inference results can be read from them. Use the Size method of MSTensor to get the byte size of the data in a tensor, the data_type method to get the tensor's data type, and the MutableData method of MSTensor to get a readable and writable pointer to the data.
/// \brief Get byte size of data in MSTensor.
///
/// \return Byte size of data in MSTensor.
virtual size_t Size() const = 0;
/// \brief Get data type of the MindSpore Lite MSTensor.
///
/// \note TypeId is defined in mindspore/mindspore/core/ir/dtype/type_id.h. Only number types in TypeId enum are
/// suitable for MSTensor.
///
/// \return MindSpore Lite TypeId of the MindSpore Lite MSTensor.
virtual TypeId data_type() const = 0;
/// \brief Get the pointer of data in MSTensor.
///
/// \note The data pointer can be used to both write and read data in MSTensor.
///
/// \return The pointer points to data in MSTensor.
virtual void *MutableData() const = 0;
Usage Example
The following sample code demonstrates obtaining the output MSTensors through the GetOutputs interface and printing the first ten values, or all values, of each output MSTensor:
// Assume we have created a LiteSession instance named session before.
auto output_map = session->GetOutputs();
// Assume that the model has only one output node.
auto out_node_iter = output_map.begin();
std::string name = out_node_iter->first;
// Assume that the unique output node has only one output tensor.
auto out_tensor = out_node_iter->second;
if (out_tensor == nullptr) {
std::cerr << "Output tensor is nullptr" << std::endl;
return -1;
}
// Assume that the data format of the output data is float32.
if (out_tensor->data_type() != mindspore::TypeId::kNumberTypeFloat32) {
std::cerr << "Output of lenet should be in float32" << std::endl;
return -1;
}
auto *out_data = reinterpret_cast<float *>(out_tensor->MutableData());
if (out_data == nullptr) {
std::cerr << "Data of out_tensor is nullptr" << std::endl;
return -1;
}
// Print the first 10 float data or all output data of the output tensor.
std::cout << "Output data: ";
for (size_t i = 0; i < 10 && i < static_cast<size_t>(out_tensor->ElementsNum()); i++) {
std::cout << " " << out_data[i];
}
std::cout << std::endl;
// The elements in outputs do not need to be freed by users, because outputs are managed by MindSpore Lite.
Note that the vectors or map returned by the GetOutputsByNodeName, GetOutputByTensorName, and GetOutputs methods do not need to be released by the user.
The following sample code demonstrates obtaining the output MSTensors through the GetOutputsByNodeName interface:
// Assume we have created a LiteSession instance named session before.
// Assume that the model has an output node named output_node_name_0.
auto output_vec = session->GetOutputsByNodeName("output_node_name_0");
// Assume that the output node named output_node_name_0 has only one output tensor.
auto out_tensor = output_vec.front();
if (out_tensor == nullptr) {
std::cerr << "Output tensor is nullptr" << std::endl;
return -1;
}
The following sample code demonstrates obtaining the output MSTensors through the GetOutputByTensorName interface:
// Assume we have created a LiteSession instance named session.
// The GetOutputTensorNames method returns the names of all the model's output tensors, in order.
auto tensor_names = session->GetOutputTensorNames();
// Use the output tensor names returned by GetOutputTensorNames as keys.
for (const auto &tensor_name : tensor_names) {
auto out_tensor = session->GetOutputByTensorName(tensor_name);
if (out_tensor == nullptr) {
std::cerr << "Output tensor is nullptr" << std::endl;
return -1;
}
}
Obtaining the Version Number
MindSpore Lite provides the Version method to obtain the version number. It is declared in the include/version.h header file; calling this method returns the version string.
Usage Example
The following code demonstrates how to obtain the version number of MindSpore Lite:
#include "include/version.h"
std::string version = mindspore::lite::Version();
Session Parallelism
MindSpore Lite supports running inference with multiple LiteSessions in parallel, but it does not support multiple threads calling the RunGraph interface of a single LiteSession at the same time.
Single-Session Parallelism
MindSpore Lite does not support multi-threaded parallel execution of inference on a single LiteSession; otherwise the following error is reported:
ERROR [mindspore/lite/src/lite_session.cc:297] RunGraph] 10 Not support multi-threading
Multi-Session Parallelism
MindSpore Lite supports multiple LiteSessions performing inference in parallel; each LiteSession's thread pool and memory pool are independent.
Usage Example
The following code demonstrates how to create multiple LiteSessions and run inference with them in parallel:
#include <thread>
#include <iostream>
#include "src/common/file_utils.h"
#include "include/model.h"
#include "include/version.h"
#include "include/context.h"
#include "include/lite_session.h"
mindspore::session::LiteSession *GenerateSession(mindspore::lite::Model *model) {
if (model == nullptr) {
std::cerr << "Read model file failed while running" << std::endl;
return nullptr;
}
auto context = new (std::nothrow) mindspore::lite::Context;
if (context == nullptr) {
std::cerr << "New context failed while running" << std::endl;
return nullptr;
}
auto session = mindspore::session::LiteSession::CreateSession(context);
delete (context);
if (session == nullptr) {
std::cerr << "CreateSession failed while running" << std::endl;
return nullptr;
}
auto ret = session->CompileGraph(model);
if (ret != mindspore::lite::RET_OK) {
std::cout << "CompileGraph failed while running" << std::endl;
delete (session);
return nullptr;
}
auto msInputs = session->GetInputs();
for (auto msInput : msInputs) {
(void)msInput->MutableData();
}
return session;
}
int main(int argc, const char **argv) {
size_t size = 0;
char *graphBuf = mindspore::lite::ReadFile("test.ms", &size);
if (graphBuf == nullptr) {
std::cerr << "Read model file failed while running" << std::endl;
return -1;
}
auto model = mindspore::lite::Model::Import(graphBuf, size);
if (model == nullptr) {
std::cerr << "Import model file failed while running" << std::endl;
delete[] graphBuf;
return -1;
}
delete[] graphBuf;
auto session1 = GenerateSession(model);
if (session1 == nullptr) {
std::cerr << "Generate session 1 failed" << std::endl;
delete(model);
return -1;
}
auto session2 = GenerateSession(model);
if (session2 == nullptr) {
std::cerr << "Generate session 2 failed" << std::endl;
delete(model);
return -1;
}
model->Free();
std::thread thread1([&]() {
auto status = session1->RunGraph();
if (status != 0) {
std::cerr << "Inference error " << status << std::endl;
return;
}
std::cout << "Session1 inference success" << std::endl;
});
std::thread thread2([&]() {
auto status = session2->RunGraph();
if (status != 0) {
std::cerr << "Inference error " << status << std::endl;
return;
}
std::cout << "Session2 inference success" << std::endl;
});
thread1.join();
thread2.join();
delete (session1);
delete (session2);
delete (model);
return 0;
}