TensorRT (2) - Basic Usage: MNIST Handwritten Digit Recognition

Published: 2024/9/27

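This post uses one of the official TensorRT samples to show how TensorRT is used.

The sample is MNIST handwritten digit recognition. It is located in the directory /usr/src/tensorrt/samples/sampleMNIST

File structure: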

tensorrt/samples/sampleMNIST

-common.cpp

-common.h

-Makefile

-sampleMNIST.cpp

The main file is sampleMNIST.cpp. The common files (common.cpp and common.h) mainly provide the file-reading functions and the Logger object.

main

#include <algorithm>
#include <assert.h>
#include <cmath>
#include <cuda_runtime_api.h>
#include <fstream>
#include <iostream>
#include <sstream>
#include <sys/stat.h>
#include <time.h>

#include "NvCaffeParser.h"
#include "NvInfer.h"
#include "common.h"

using namespace nvinfer1;
using namespace nvcaffeparser1;

// Define the input/output sizes and create the Logger object.
// Logger is a logging class defined in common.h.
static Logger gLogger;

// Attributes of MNIST Caffe model
static const int INPUT_H = 28;
static const int INPUT_W = 28;
static const int OUTPUT_SIZE = 10;

// Name the input and output blobs and list the data directories to search.
const char* INPUT_BLOB_NAME = "data";
const char* OUTPUT_BLOB_NAME = "prob";
const std::vector<std::string> directories{"data/samples/mnist/", "data/mnist/"};

// Locate a file in the data directories.
std::string locateFile(const std::string& input)
{
    return locateFile(input, directories);
}

// Read an image.
// Simple PGM (portable greyscale map) reader
void readPGMFile(const std::string& fileName, uint8_t buffer[INPUT_H * INPUT_W])
{
    readPGMFile(fileName, buffer, INPUT_H, INPUT_W);
}

………………

int main(int argc, char** argv)
{
    if (argc > 1)
    {
        std::cout << "This sample builds a TensorRT engine by importing a trained MNIST Caffe model.\n";
        std::cout << "It uses the engine to run inference on an input image of a digit.\n";
        return EXIT_SUCCESS;
    }

    // Create TRT model from caffe model and serialize it to a stream
    // trtModelStream will hold the serialized plan. Despite its name it is a block of
    // host memory (IHostMemory), not a file-style stream such as std::ifstream.
    IHostMemory* trtModelStream{nullptr};

    // 1. Build phase: call caffeToTRTModel with the Caffe model file and the weights file.
    // It creates an IBuilder, runs the Caffe parser, and stores the resulting plan
    // in trtModelStream.
    caffeToTRTModel("mnist.prototxt", "mnist.caffemodel", std::vector<std::string>{OUTPUT_BLOB_NAME}, 1, trtModelStream);
    assert(trtModelStream != nullptr);

    // Read a random digit file
    srand(unsigned(time(nullptr)));
    uint8_t fileData[INPUT_H * INPUT_W];
    const int num = rand() % 10;
    readPGMFile(locateFile(std::to_string(num) + ".pgm", directories), fileData);

    // Print ASCII representation of digit:
    // each pixel is mapped to one of the characters " .:-=+*#%@"
    std::cout << "\nInput:\n" << std::endl;
    for (int i = 0; i < INPUT_H * INPUT_W; i++)
        std::cout << (" .:-=+*#%@"[fileData[i] / 26]) << (((i + 1) % INPUT_W) ? "" : "\n");

    // Parse mean file (the mean is subtracted from the image below)
    ICaffeParser* parser = createCaffeParser();
    IBinaryProtoBlob* meanBlob = parser->parseBinaryProto(locateFile("mnist_mean.binaryproto", directories).c_str());
    parser->destroy();

    // Subtract mean from image
    const float* meanData = reinterpret_cast<const float*>(meanBlob->getData());
    float data[INPUT_H * INPUT_W];
    for (int i = 0; i < INPUT_H * INPUT_W; i++)
        data[i] = float(fileData[i]) - meanData[i];
    meanBlob->destroy();

    // Deserialize engine we serialized earlier
    // Create the runtime (IRuntime); gLogger is passed in so that TensorRT can print messages.
    IRuntime* runtime = createInferRuntime(gLogger);
    assert(runtime != nullptr);
    ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream->data(), trtModelStream->size(), nullptr);
    assert(engine != nullptr);
    trtModelStream->destroy();

    // Create the execution context, which doInference uses to launch the CUDA kernels.
    IExecutionContext* context = engine->createExecutionContext();
    assert(context != nullptr);

    // 2. Deploy phase: call doInference to run inference on input data
    float prob[OUTPUT_SIZE];
    doInference(*context, data, prob, 1);

    // Destroy the objects that are no longer needed
    context->destroy();
    engine->destroy();
    runtime->destroy();

    // Print histogram of the output distribution (the classification result)
    std::cout << "\nOutput:\n\n";
    float val{0.0f};
    int idx{0};
    for (unsigned int i = 0; i < 10; i++)
    {
        val = std::max(val, prob[i]);
        if (val == prob[i]) idx = i;
        std::cout << i << ": " << std::string(int(std::floor(prob[i] * 10 + 0.5f)), '*') << "\n";
    }
    std::cout << std::endl;

    return (idx == num && val > 0.9f) ? EXIT_SUCCESS : EXIT_FAILURE;
}

Strictly speaking, everything from the creation of the IRuntime object onward can already be regarded as part of the deploy phase.

In the final output, an image of a 9 is read in and the program prints the classification result:

[Figure: sample program output]
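
The input rendering, mean subtraction, and histogram steps of main can be tried without TensorRT or a GPU. The following is a minimal sketch that lifts those expressions into small functions; the function names (pixelToChar, subtractMean, histogramBar, argMax) are made up for illustration and do not exist in the sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <string>

// Map one 8-bit grey level to one of ten ASCII shades, as the sample does:
// 0..25 -> ' ', 26..51 -> '.', ..., 234..255 -> '@'.
char pixelToChar(std::uint8_t pixel)
{
    static const char shades[] = " .:-=+*#%@";
    return shades[pixel / 26];
}

// Subtract the per-pixel mean from the raw image (the sample's preprocessing).
void subtractMean(const std::uint8_t* image, const float* mean, float* out, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        out[i] = static_cast<float>(image[i]) - mean[i];
}

// One histogram row: a probability p, assumed to lie in [0, 1],
// becomes floor(p * 10 + 0.5) stars.
std::string histogramBar(float p)
{
    return std::string(static_cast<std::size_t>(std::floor(p * 10 + 0.5f)), '*');
}

// Index of the largest probability, using the same loop as the sample
// (on a tie, the later index wins).
int argMax(const float* prob, int n)
{
    float best = 0.0f;
    int idx = 0;
    for (int i = 0; i < n; ++i)
    {
        best = std::max(best, prob[i]);
        if (best == prob[i]) idx = i;
    }
    return idx;
}
```

With these helpers, a probability of 0.94 for class 9 prints nine stars on row 9, and the sample's exit code is EXIT_SUCCESS only when argMax matches the digit that was loaded and that probability exceeds 0.9.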

The two most important functions, caffeToTRTModel() and doInference(), implement the build phase and the deploy phase respectively.

Build Phase

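The build phase converts the Caffe model into a TensorRT object. A model is first trained in another deep learning framework and then handed to the TensorRT optimizer. The optimizer produces an optimized model (the engine), which can then be serialized and stored on disk. The file stored on disk is called a plan file.

This phase requires two files:

the network definition file (for example, Caffe's deploy.prototxt)
the trained weights file (for example, Caffe's net.caffemodel)

In addition, the batch size must be specified and the output layers must be named.

The Caffe model parsing code from the MNIST sample follows. The hallmark of the build phase is the creation of an IBuilder object.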

// Parse the Caffe model into a TensorRT engine
void caffeToTRTModel(const std::string& deployFile,           // Path of Caffe prototxt file
                     const std::string& modelFile,            // Path of Caffe model file
                     const std::vector<std::string>& outputs, // Names of network outputs
                     unsigned int maxBatchSize,               // Note: Must be at least as large as the batch we want to run with
                     IHostMemory*& trtModelStream)            // Output buffer for the TRT model
{
    // 1. Create builder
    // Create an IBuilder; gLogger is passed in so that TensorRT can print messages.
    // The builder appears to follow the Builder design pattern.
    IBuilder* builder = createInferBuilder(gLogger);

    // Parse caffe model to populate network, then set the outputs
    const std::string deployFpath = locateFile(deployFile, directories);
    const std::string modelFpath = locateFile(modelFile, directories);
    std::cout << "Reading Caffe prototxt: " << deployFpath << "\n";
    std::cout << "Reading Caffe model: " << modelFpath << "\n";

    // Create a network object. At this point it is only an empty shell;
    // none of its contents have been filled in yet.
    INetworkDefinition* network = builder->createNetwork();

    // Create the Caffe parser and call parse() to populate the network.
    // The blobs of the Caffe model are mapped to TensorRT tensors, and the
    // mapping is returned in blobNameToTensor.
    // Both the model (prototxt) file and the weights file are used here.
    ICaffeParser* parser = createCaffeParser();
    const IBlobNameToTensor* blobNameToTensor = parser->parse(deployFpath.c_str(),
                                                              modelFpath.c_str(),
                                                              *network,
                                                              DataType::kFLOAT);

    // Specify output tensors of network (mark the output blobs)
    for (auto& s : outputs)
        network->markOutput(*blobNameToTensor->find(s.c_str()));

    // Set the maximum batch size and the workspace size.
    builder->setMaxBatchSize(maxBatchSize);
    builder->setMaxWorkspaceSize(1 << 20);

    // 2. Build engine
    // Build the CUDA engine from the network; the optimizations are performed here.
    // After this call the Caffe model has been converted into a TensorRT object.
    ICudaEngine* engine = builder->buildCudaEngine(*network);
    assert(engine);

    // Destroy parser and network, which are no longer needed
    network->destroy();
    parser->destroy();

    // Serialize engine and destroy it
    // The engine is serialized into memory here (trtModelStream is a block of memory);
    // it could also be written to disk.
    trtModelStream = engine->serialize();

    // Destroy the objects that are no longer needed
    engine->destroy();
    builder->destroy();

    // Shut down the protobuf library
    shutdownProtobufLibrary();
}
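
The sample keeps the serialized plan in memory, but the comments note that it could also be written to disk. Below is a minimal sketch of that step using only the C++ standard library. savePlan and loadPlan are hypothetical helper names, not TensorRT functions; with TensorRT, the bytes to save would come from trtModelStream->data() and trtModelStream->size(), and the loaded bytes would be passed to deserializeCudaEngine.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Write `size` bytes starting at `data` to `path` in binary mode.
// (Hypothetical helper: the caller supplies the serialized plan bytes.)
void savePlan(const std::string& path, const void* data, std::size_t size)
{
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("Cannot open " + path + " for writing");
    out.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
    if (!out) throw std::runtime_error("Failed while writing " + path);
}

// Read the whole file back into memory so it can be deserialized later.
std::vector<char> loadPlan(const std::string& path)
{
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) throw std::runtime_error("Cannot open " + path + " for reading");
    const std::streamsize size = in.tellg();
    in.seekg(0, std::ios::beg);
    std::vector<char> bytes(static_cast<std::size_t>(size));
    if (size > 0 && !in.read(bytes.data(), size))
        throw std::runtime_error("Failed while reading " + path);
    return bytes;
}
```

Storing the plan this way lets later runs skip the build phase. A plan should generally be treated as specific to the TensorRT version and GPU it was built with, so a cached file may need to be rebuilt after an upgrade.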

Deploy Phase

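The deploy phase needs the following file:

A label file, which maps the numeric class indices produced by the model to their real names. It is not needed in this example, because the true MNIST classes are the digits themselves.

The deploy phase can be considered to begin inside main. Its hallmark is the creation of the IRuntime object.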

int main(int argc, char** argv)
{
    ………………

    // Deserialize engine we serialized earlier
    // Create the runtime (IRuntime); gLogger is passed in so that TensorRT can print messages.
    IRuntime* runtime = createInferRuntime(gLogger);
    assert(runtime != nullptr);
    ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream->data(), trtModelStream->size(), nullptr);
    assert(engine != nullptr);
    trtModelStream->destroy();

    // Create the execution context, which doInference uses to launch the CUDA kernels.
    IExecutionContext* context = engine->createExecutionContext();
    assert(context != nullptr);

    // 2. Deploy phase: call doInference to run inference on input data
    float prob[OUTPUT_SIZE];
    doInference(*context, data, prob, 1);

    ………………
}
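
The sample pairs every create call with a manual destroy(), which is easy to miss on an early return. One common alternative is a std::unique_ptr with a deleter that calls destroy(). The sketch below assumes only that the managed type has a destroy() member, as the objects in this sample do. DestroyDeleter, UniqueTrt, and FakeTrtObject are illustrative names, and FakeTrtObject is a stand-in so the code compiles without the TensorRT SDK.

```cpp
#include <cassert>
#include <memory>

// Deleter for any type that exposes a destroy() member function.
struct DestroyDeleter
{
    template <typename T>
    void operator()(T* obj) const
    {
        if (obj) obj->destroy();
    }
};

// Owning pointer that calls destroy() instead of delete when it goes out of scope.
template <typename T>
using UniqueTrt = std::unique_ptr<T, DestroyDeleter>;

// Stand-in for a TensorRT interface (assumption: not a real TensorRT class).
// It counts how many times destroy() has been called.
struct FakeTrtObject
{
    static int& destroyed()
    {
        static int count = 0;
        return count;
    }
    void destroy()
    {
        ++destroyed();
        delete this;
    }
};
```

With the real SDK, the same alias could wrap the results of createInferRuntime, deserializeCudaEngine, and createExecutionContext. Newer TensorRT releases have moved away from destroy(), so this is worth checking against the headers you build with.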

The doInference function in detail:

void doInference(IExecutionContext& context, float* input, float* output, int batchSize)
{
    // Get the engine back from the context that was passed in.
    const ICudaEngine& engine = context.getEngine();

    // Pointers to input and output device buffers to pass to engine.
    // Engine requires exactly IEngine::getNbBindings() number of buffers.
    // getNbBindings() returns the number of input and output tensors bound to the engine.
    // Here there is one input and one output, so the total must be 2.
    assert(engine.getNbBindings() == 2);

    // Array of void* that will hold the device (GPU) buffer pointers.
    void* buffers[2];

    // In order to bind the buffers, we need to know the names of the input and output tensors.
    // Note that indices are guaranteed to be less than IEngine::getNbBindings()
    const int inputIndex = engine.getBindingIndex(INPUT_BLOB_NAME);
    const int outputIndex = engine.getBindingIndex(OUTPUT_BLOB_NAME);

    // Create GPU buffers on device for the input and output tensors
    CHECK(cudaMalloc(&buffers[inputIndex], batchSize * INPUT_H * INPUT_W * sizeof(float)));
    CHECK(cudaMalloc(&buffers[outputIndex], batchSize * OUTPUT_SIZE * sizeof(float)));

    // Create stream; the copies and the inference below are queued on it in order
    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    // DMA input batch data to device, infer on the batch asynchronously, and DMA output back to host
    // Host to device: input is the data in host memory;
    // buffers[inputIndex] is the device buffer that receives it.
    CHECK(cudaMemcpyAsync(buffers[inputIndex], input, batchSize * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, stream));

    // Launch the CUDA kernels; inference runs asynchronously on the stream.
    context.enqueue(batchSize, buffers, stream, nullptr);

    // Device to host: buffers[outputIndex] is the device buffer holding the model output;
    // output is the host memory that receives it.
    CHECK(cudaMemcpyAsync(output, buffers[outputIndex], batchSize * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, stream));

    // Block the host until all work queued on this stream has finished.
    cudaStreamSynchronize(stream);

    // Release stream and buffers
    cudaStreamDestroy(stream);
    CHECK(cudaFree(buffers[inputIndex]));
    CHECK(cudaFree(buffers[outputIndex]));
}
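
The buffer sizes passed to cudaMalloc and cudaMemcpyAsync above follow directly from the tensor shapes. As a small worked check (a sketch, assuming a 4-byte float as on common platforms; inputBytes and outputBytes are illustrative helper names), the arithmetic for a batch of one is 1 x 28 x 28 x 4 = 3136 bytes of input and 1 x 10 x 4 = 40 bytes of output:

```cpp
#include <cassert>
#include <cstddef>

constexpr int INPUT_H = 28;
constexpr int INPUT_W = 28;
constexpr int OUTPUT_SIZE = 10;

// Bytes needed for a batch of input images (one float per pixel).
constexpr std::size_t inputBytes(int batchSize)
{
    return static_cast<std::size_t>(batchSize) * INPUT_H * INPUT_W * sizeof(float);
}

// Bytes needed for a batch of output probability vectors.
constexpr std::size_t outputBytes(int batchSize)
{
    return static_cast<std::size_t>(batchSize) * OUTPUT_SIZE * sizeof(float);
}

// Compile-time checks of the figures quoted in the text.
static_assert(sizeof(float) == 4, "the figures below assume a 4-byte float");
static_assert(inputBytes(1) == 3136, "1 x 28 x 28 x 4 bytes");
static_assert(outputBytes(1) == 40, "1 x 10 x 4 bytes");
```

Both sizes scale linearly with batchSize, which is why the maxBatchSize given at build time must be at least as large as the batch used here.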

Helper functions

The sample uses two helper functions from common.cpp: locateFile() and readPGMFile()

#include "common.h"

// Locate path to file, given its filename or filepath suffix and possible dirs it might lie in
// Function will also walk back MAX_DEPTH dirs from CWD to check for such a file path
inline std::string locateFile(const std::string& filepathSuffix, const std::vector<std::string>& directories)
{
    const int MAX_DEPTH{10};
    bool found{false};
    std::string filepath;

    for (auto& dir : directories)
    {
        filepath = dir + filepathSuffix;

        for (int i = 0; i < MAX_DEPTH && !found; i++)
        {
            std::ifstream checkFile(filepath);
            found = checkFile.is_open();
            if (found) break;
            filepath = "../" + filepath; // Try again in parent dir
        }

        if (found)
        {
            break;
        }

        filepath.clear();
    }

    if (filepath.empty()) {
        std::string directoryList = std::accumulate(directories.begin() + 1, directories.end(), directories.front(),
            [](const std::string& a, const std::string& b) { return a + "\n\t" + b; });
        throw std::runtime_error("Could not find " + filepathSuffix + " in data directories:\n\t" + directoryList);
    }
    return filepath;
}

// Read an image: skip the PGM text header, then read inH * inW raw pixel bytes
inline void readPGMFile(const std::string& fileName, uint8_t* buffer, int inH, int inW)
{
    std::ifstream infile(fileName, std::ifstream::binary);
    assert(infile.is_open() && "Attempting to read from a file that is not open.");
    std::string magic, h, w, max;
    infile >> magic >> h >> w >> max;
    infile.seekg(1, infile.cur);
    infile.read(reinterpret_cast<char*>(buffer), inH * inW);
}
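
readPGMFile reads the header fields as strings and discards them, trusting the caller's inH and inW. To make the file layout concrete, here is a hedged sketch that writes a tiny binary (P5) PGM and reads it back while validating the header. writePGM and readPGMChecked are illustrative names that are not part of the sample, and the reader does not handle '#' comment lines, which the PGM format allows in the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Write a binary PGM: text header "P5\n<width> <height>\n<maxval>\n", then raw bytes.
void writePGM(const std::string& path, int width, int height, const std::vector<std::uint8_t>& pixels)
{
    assert(pixels.size() == static_cast<std::size_t>(width) * height);
    std::ofstream out(path, std::ios::binary);
    out << "P5\n" << width << " " << height << "\n255\n";
    out.write(reinterpret_cast<const char*>(pixels.data()), static_cast<std::streamsize>(pixels.size()));
}

// Read a P5 PGM and check the header against the expected size.
std::vector<std::uint8_t> readPGMChecked(const std::string& path, int expectedW, int expectedH)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("Cannot open " + path);

    std::string magic;
    int width = 0, height = 0, maxVal = 0;
    in >> magic >> width >> height >> maxVal;
    if (!in || magic != "P5" || width != expectedW || height != expectedH || maxVal > 255)
        throw std::runtime_error("Unexpected PGM header in " + path);

    in.seekg(1, std::ios::cur); // skip the single whitespace byte after maxval

    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(width) * height);
    in.read(reinterpret_cast<char*>(pixels.data()), static_cast<std::streamsize>(pixels.size()));
    if (in.gcount() != static_cast<std::streamsize>(pixels.size()))
        throw std::runtime_error("Truncated pixel data in " + path);
    return pixels;
}
```

For the MNIST images the expected size would be 28 x 28. Checking the header turns a wrong or truncated file into an exception instead of silently feeding garbage to the network.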

Logger class

common.h contains a logging class, class Logger : public nvinfer1::ILogger, which derives from nvinfer1::ILogger:

// Logger for TensorRT info/warning/errors
class Logger : public nvinfer1::ILogger
{
public:
    Logger(): Logger(Severity::kWARNING) {}

    Logger(Severity severity): reportableSeverity(severity) {}

    void log(Severity severity, const char* msg) override
    {
        // suppress messages with severity enum value greater than the reportable
        if (severity > reportableSeverity) return;

        switch (severity)
        {
        case Severity::kINTERNAL_ERROR: std::cerr << "INTERNAL_ERROR: "; break;
        case Severity::kERROR: std::cerr << "ERROR: "; break;
        case Severity::kWARNING: std::cerr << "WARNING: "; break;
        case Severity::kINFO: std::cerr << "INFO: "; break;
        default: std::cerr << "UNKNOWN: "; break;
        }
        std::cerr << msg << std::endl;
    }

    Severity reportableSeverity{Severity::kWARNING};
};

nvinfer1::ILogger is declared in the TensorRT header NvInfer.h, located at /usr/include/x86_64-linux-gnu/NvInfer.h

The ILogger class, extracted from that header:

class ILogger
{
public:
    //!
    //! \enum Severity
    //!
    //! The severity corresponding to a log message.
    //!
    enum class Severity
    {
        kINTERNAL_ERROR = 0, //!< An internal error has occurred. Execution is unrecoverable.
        kERROR = 1,          //!< An application error has occurred.
        kWARNING = 2,        //!< An application error has been discovered, but TensorRT has recovered or fallen back to a default.
        kINFO = 3            //!< Informational messages.
    };

    //!
    //! A callback implemented by the application to handle logging messages;
    //!
    //! \param severity The severity of the message.
    //! \param msg The log message, null terminated.
    //!
    virtual void log(Severity severity, const char* msg) = 0;

protected:
    virtual ~ILogger() {}
};

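This class is the logging interface for the builder, engine, and runtime, and it should be used as a singleton: even when there are multiple IRuntime and/or IBuilder objects, they must all use the same ILogger instance.

The interface contains a scoped enumeration, enum class Severity, which defines the log levels kINTERNAL_ERROR, kERROR, kWARNING, and kINFO. It also declares a pure virtual function log(), which users implement to control how messages are printed.

For example, the log() function of the Logger class in common.h writes each message to the standard error stream with a prefix that depends on its severity. You can define this differently, for instance by writing the messages to a file stream so that they are saved in a text file.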
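
As a sketch of that kind of customization, the logger below writes to any std::ostream, so passing a std::ofstream saves the messages to a text file. To keep the example buildable without the SDK, it declares a stand-in ILogger that mirrors the excerpt above; with TensorRT installed you would derive from nvinfer1::ILogger instead. StreamLogger is an illustrative name, and newer TensorRT headers may declare log() slightly differently, so match the signature in your own NvInfer.h.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in mirroring the ILogger excerpt (assumption: replaces nvinfer1::ILogger here).
class ILogger
{
public:
    enum class Severity { kINTERNAL_ERROR = 0, kERROR = 1, kWARNING = 2, kINFO = 3 };
    virtual void log(Severity severity, const char* msg) = 0;

protected:
    virtual ~ILogger() {}
};

// Logger that writes to a caller-supplied stream: a std::ofstream for a text
// file, std::cerr, or a std::ostringstream in tests.
class StreamLogger : public ILogger
{
public:
    explicit StreamLogger(std::ostream& sink, Severity reportable = Severity::kWARNING)
        : mSink(sink), mReportable(reportable) {}

    void log(Severity severity, const char* msg) override
    {
        // A larger enum value means a less severe message, so skip those.
        if (severity > mReportable) return;
        mSink << prefix(severity) << msg << '\n';
    }

private:
    static const char* prefix(Severity s)
    {
        switch (s)
        {
        case Severity::kINTERNAL_ERROR: return "INTERNAL_ERROR: ";
        case Severity::kERROR: return "ERROR: ";
        case Severity::kWARNING: return "WARNING: ";
        case Severity::kINFO: return "INFO: ";
        }
        return "UNKNOWN: ";
    }

    std::ostream& mSink;
    Severity mReportable;
};
```

Because the logger holds a reference to the stream, the stream has to outlive the logger and every TensorRT object that was given it, which fits the advice to share a single logger instance across the application.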

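That is a simple example of using TensorRT to optimize LeNet on MNIST. For MNIST itself, accelerating with TensorRT brings little benefit because the model is already small. The example is used here mainly to learn how TensorRT is used.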
