
Loading Pretrained Weights and Running Prediction with PyTorch's C++ Frontend (libtorch)

Published: 2024/9/27

This article uses Ubuntu. For Windows, see "Using PyTorch's C++ Frontend (libtorch) on Windows".

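Preface

It has been more than two months since the PyTorch-1.0-Preview release. The most notable feature of PyTorch-1.0 is its strong support for production: it introduces a C++ ecosystem (FB had already experimented with it in Detectron), including a C++ frontend and C++ model compilation tools.

For us, this means that to deploy a deep learning application we only need to train with PyTorch in Python, export the trained model with torch.jit, and then load it with PyTorch on the C++ side for prediction. The C++ side of PyTorch can of course also be used for training.

The C++ version of PyTorch that we use is really just a set of prebuilt shared libraries and header files, and prebuilt packages can be downloaded from the official site.

From here on we call it libtorch. There is a short official tutorial for it: Loading a TorchScript Model in C++ (PyTorch Tutorials 1.10.1+cu102 documentation)

This short tutorial covers the basic usage of the library.

The demo for this article uses libtorch + OpenCV-4.0.0 to run prediction on the GPU (simple gesture recognition). It is written in C++, and prediction is about 10% faster than with the Python version.

Enough talk. Let's look at how to use it.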

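Getting Started

PyTorch-1.0 has been out for two months, so why am I only trying it now? The reason is simple: I was worried that its interfaces were still unstable, so I waited a little longer. Even after waiting, material is still scarce. Official tutorials are few, and the only references are a handful of blog posts and GitHub issues.

The good news is that, compared with before, trying libtorch now works almost without problems, and it has matured in every respect. If you are interested in libtorch, this article is for you.

One more piece of news: the stable release of PyTorch-1.0 is due this Friday, that is, tomorrow.

So the libtorch interfaces are now basically stable, and we can go ahead and try it out.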

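Obtaining libtorch

There are two ways to obtain libtorch:

1. Download the latest prebuilt files from the official site: Installing C++ Distributions of PyTorch (PyTorch master documentation)
2. Build it from source yourself

I recommend the second option. For compatibility, the official prebuilt version uses the old C++ ABI (see https://github.com/pytorch/pytorch/issues/13541 and "Issues linking with libtorch (C++11 ABI?)" on the PyTorch Forums). If your gcc version is 5 or later and you link libtorch together with other libraries that were built with gcc 5 or later, conflicts are quite likely. To avoid environment problems, I suggest building from source. You can of course also try the official build.

One more note: if you use the libtorch library on its own (downloaded from the official site, without linking other libraries such as OpenCV), building against it works without any problems. You can download the official prebuilt package for a quick try.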
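To see which ABI a given compiler setup produces, a small check along the following lines can help. This is only a sketch: the function names cxx11_abi_flag and same_abi are made up for illustration and are not part of libtorch. The macro _GLIBCXX_USE_CXX11_ABI is the one libstdc++ itself defines; with other standard libraries it is absent, which the sketch reports as -1.

```cpp
#include <cassert>
#include <string>  // including any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Hypothetical helper: returns 1 for the new C++11 ABI, 0 for the old ABI,
// and -1 when the macro is not defined (for example, not libstdc++).
int cxx11_abi_flag() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
  return _GLIBCXX_USE_CXX11_ABI;
#else
  return -1;
#endif
}

// Hypothetical helper: two libraries that exchange std::string or std::list
// across their interfaces should have been built with the same flag value.
bool same_abi(int flag_a, int flag_b) {
  assert(flag_a >= -1 && flag_a <= 1);
  assert(flag_b >= -1 && flag_b <= 1);
  return flag_a == flag_b;
}
```

Compiling this once as-is and once with -D_GLIBCXX_USE_CXX11_ABI=0 should show both values on most gcc 5 or later installations. A library built with 0 that is linked against one built with 1 is the situation that leads to the undefined-reference errors discussed later in this article.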

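Building from Source

For the prerequisites, see the official instructions at https://github.com/pytorch/pytorch and the "Pytorch-0.4.1-cuda9.1-linux source installation guide".

After installing all dependencies, download the official source code, enter the PyTorch source directory, and run:

    git submodule update --init --recursive  # update the third-party submodules so the build succeeds
    mkdir build
    cd build
    python ../tools/build_libtorch.py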

One issue claims that you must add -D_GLIBCXX_USE_CXX11_ABI=1 around line 127 of tools/build_pytorch_libs.sh in the source tree before building:

    THIRD_PARTY_DIR="$BASE_DIR/third_party"

    C_FLAGS=""  # add -D_GLIBCXX_USE_CXX11_ABI=1 here

    # Workaround OpenMPI build failure
    # ImportError: /build/pytorch-0.2.0/.pybuild/pythonX.Y_3.6/build/torch/_C.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN3MPI8Datatype4FreeEv
    # https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=686926
    C_FLAGS="${C_FLAGS} -DOMPI_SKIP_MPICXX=1"
    LDFLAGS=""

This is actually not necessary. We can build directly.

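This step is similar to building PyTorch itself from source. I won't go into the details (CUDA and cuDNN versions) here. See the related pages on this site or follow a tutorial online.

Related: "Installing and switching between multiple versions of the CUDA and cuDNN toolkits"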

If the build finishes without errors, you will see output like this:

    -- Install configuration: "Release"
    -- Set runtime path of "/home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libmkldnn.so.0.14.0" to "$ORIGIN:/home/prototype/anaconda3/envs/fastai/lib"
    -- Set runtime path of "/home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libc10.so" to "$ORIGIN"
    -- Set runtime path of "/home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libc10_cuda.so" to "$ORIGIN"
    -- Set runtime path of "/home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libcaffe2.so" to "$ORIGIN:/usr/lib/openmpi/lib:/usr/local/cuda/lib64:/home/prototype/anaconda3/envs/fastai/lib"
    -- Set runtime path of "/home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libcaffe2_gpu.so" to "$ORIGIN:/usr/local/cuda/lib64:/home/prototype/anaconda3/envs/fastai/lib:/usr/lib/openmpi/lib"
    -- Set runtime path of "/home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libtorch.so.1" to "$ORIGIN:/usr/local/cuda/lib64:/home/prototype/anaconda3/envs/fastai/lib"
    -- Set runtime path of "/home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libcaffe2_module_test_dynamic.so" to "$ORIGIN:/home/prototype/anaconda3/envs/fastai/lib"

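The built libtorch ends up in path/to/pytorch/torch/lib/. Note, however, that the path CMake actually uses to locate the library is /pytorch/torch/share/cmake.

Later, when running cmake, we add -DCMAKE_PREFIX_PATH=/path/to/pytorch/torch/lib/tmp_install to point it at libtorch.

Note: in the latest PyTorch-1.0.1 (tested to also apply to 1.0-1.3), the default location of the built libtorch files has changed, and we should instead pass -DCMAKE_PREFIX_PATH=path/to/pytorch/torch/share/cmake

If you are not familiar with CMake, see "gcc, clang, make, and cmake: telling them apart".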

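A Quick Test That libtorch Works

Here we run a simple test to check that the exported model gives the same results in Python and in C++. The model takes a tensor of shape (n,3,224,224) as input and outputs a tensor of shape (3), predicting three classes. First, export the model weights on the Python side:

    import torch
    from Models.MobileNetv2 import mobilenetv2  # the author's own model module

    model = mobilenetv2(pretrained)  # `pretrained` comes from the author's project and is not shown here
    example = torch.rand(1, 3, 224, 224).cuda()  # note: I export the CUDA version of the model, because it was trained on the GPU
    model = model.eval()

    traced_script_module = torch.jit.trace(model, example)
    output = traced_script_module(torch.ones(1, 3, 224, 224).cuda())
    traced_script_module.save('mobilenetv2-trace.pt')
    print(output)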

This prints:

tensor([[ -1.2374, -96.6268, 19.2590]], device='cuda:0',grad_fn=<AddBackward0>)

Download link for the exported 'mobilenetv2-trace.pt': Baidu Netdisk, extraction code: sym8

Next, download the official libtorch or use the one you built, and note where it lives: path/to/libtorch (this is only an example; the actual location differs from machine to machine). Then write the CMakeLists file. find_package uses the path we provide to locate libtorch's TorchConfig.cmake and thereby adds the whole libtorch library to our project:

    cmake_minimum_required(VERSION 3.0.0 FATAL_ERROR)
    project(simnet)

    find_package(Torch REQUIRED)

    message(STATUS "Pytorch status:")
    message(STATUS " libraries: ${TORCH_LIBRARIES}")

    add_executable(simnet test.cpp)
    target_link_libraries(simnet ${TORCH_LIBRARIES})
    set_property(TARGET simnet PROPERTY CXX_STANDARD 11)

Then write the C++ side: load the weights, create a tensor, feed it to the model, and print the result:

    #include "torch/script.h"
    #include "torch/torch.h"

    #include <iostream>
    #include <memory>
    #include <vector>

    using namespace std;

    int main(int argc, const char* argv[]) {
      if (argc != 2) {
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
      }

      // Load the exported model.
      // Version 1.1 and earlier: torch::jit::load returns a pointer, so use
      //   std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(argv[1]);
      //   assert(module != nullptr);
      // and call module->to(...) and module->forward(...).
      // Version 1.2 and later: the module is returned by value.
      torch::jit::script::Module module;
      try {
        module = torch::jit::load(argv[1]);
      } catch (const c10::Error& e) {
        std::cerr << "error loading the model\n";
        return -1;
      }
      module.to(at::kCUDA);
      std::cout << "ok\n";

      // Create an input of shape (1,3,224,224) and move it to CUDA.
      std::vector<torch::jit::IValue> inputs;
      inputs.push_back(torch::ones({1, 3, 224, 224}).to(at::kCUDA));

      // Execute the model and turn its output into a tensor.
      at::Tensor output = module.forward(inputs).toTensor();
      std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
    }

Build this code and load the model exported earlier. The output is:

    ok
    -1.2374 -96.6271 19.2592
    [ Variable[CUDAFloatType]{1,3} ]

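Comparing this with the earlier tensor([[ -1.2374, -96.6268, 19.2590]], device='cuda:0',grad_fn=<AddBackward0>), the values differ slightly from the third or fourth decimal place on (-96.6268 versus -96.6271, 19.2590 versus 19.2592), but overall the difference is small.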
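Rather than comparing the two printouts by eye, the check can be written as a tolerance test. The sketch below is plain C++ with no libtorch dependency. The helper name all_close and the chosen tolerances are assumptions for illustration. The inequality has the same form that torch.allclose uses, |a - b| <= atol + rtol * |b|.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when every pair satisfies
// |a[i] - b[i]| <= atol + rtol * |b[i]|.
bool all_close(const std::vector<double>& a, const std::vector<double>& b,
               double rtol, double atol) {
  assert(rtol >= 0.0 && atol >= 0.0);
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (std::fabs(a[i] - b[i]) > atol + rtol * std::fabs(b[i])) return false;
  }
  return true;
}
```

With the two outputs from this article, all_close({-1.2374, -96.6268, 19.2590}, {-1.2374, -96.6271, 19.2592}, 0.0, 1e-3) holds, while an absolute tolerance of 1e-4 is already too strict for the second value.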

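Note that both runs were on the GPU. Keep in mind that a model trained on the CPU and one trained on the GPU are not the same: if you export a GPU-trained model (moving it to the CPU with model.cpu() before exporting) and then load it on the CPU, the result is not correct. The device used for export and the device used for loading must match.

If the libtorch version does not match the version that exported the model (this typically happens when the libtorch we built and the PyTorch that exported the model are different versions), the following error appears (this may be resolved once the API stabilizes):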

    (simnet:7105): GStreamer-CRITICAL **: gst_element_get_state: assertion 'GST_IS_ELEMENT (element)' failed
    terminate called after throwing an instance of 'c10::Error'
      what(): memcmp("PYTORCH1", buf, kMagicValueLength) != 0 ASSERT FAILED at /home/prototype/Downloads/pytorch/caffe2/serialize/inline_container.cc:75, please report a bug to PyTorch. File is an unsupported archive format from the preview release. (PyTorchStreamReader at /home/prototype/Downloads/pytorch/caffe2/serialize/inline_container.cc:75)
    frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x6c (0x7f92b7e7cf1c in /home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libc10.so)
    frame #1: torch::jit::PyTorchStreamReader::PyTorchStreamReader(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::istream*) + 0x6fc (0x7f92ca49a88c in /home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libcaffe2.so)
    frame #2: torch::jit::load(std::istream&) + 0x2c5 (0x7f92cd9619f5 in /home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libtorch.so.1)
    frame #3: torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x55 (0x7f92cd961c15 in /home/prototype/Downloads/pytorch/torch/lib/tmp_install/lib/libtorch.so.1)
    frame #4: /home/prototype/CLionProjects/simnet/cmake-build-release/simnet() [0x404f60]
    frame #5: __libc_start_main + 0xf0 (0x7f92b4701830 in /lib/x86_64-linux-gnu/libc.so.6)
    frame #6: /home/prototype/CLionProjects/simnet/cmake-build-release/simnet() [0x407739]
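The message shows the loader comparing the first bytes of the file with "PYTORCH1" and rejecting the file as a preview-release archive. A quick pre-check of the file header can therefore tell the two cases apart before calling torch::jit::load. The sketch below is standard C++ only. The names ArchiveKind, classify_header and read_head are made up for illustration, and it rests on the assumption that models saved by stable releases are ordinary zip archives, which begin with the bytes "PK\x03\x04".

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

enum class ArchiveKind { Zip, PreviewFormat, Unknown };

// True when s begins with prefix.
bool starts_with(const std::string& s, const std::string& prefix) {
  assert(!prefix.empty());
  return s.size() >= prefix.size() && s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: classifies a model file by its leading bytes.
ArchiveKind classify_header(const std::string& head) {
  if (starts_with(head, std::string("PK\x03\x04", 4))) return ArchiveKind::Zip;
  if (starts_with(head, "PYTORCH1")) return ArchiveKind::PreviewFormat;
  return ArchiveKind::Unknown;
}

// Reads up to the first 8 bytes of a file; returns an empty string when the
// file cannot be opened.
std::string read_head(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  std::string head(8, '\0');
  in.read(&head[0], static_cast<std::streamsize>(head.size()));
  head.resize(static_cast<std::size_t>(in.gcount()));
  return head;
}
```

Calling classify_header(read_head("mobilenetv2-trace.pt")) and getting PreviewFormat would suggest re-exporting the model with the same PyTorch version that libtorch was built from.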

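Reading Images with OpenCV and Passing Them to libtorch for Prediction

We have now done a first test with libtorch. In practice, though, we need an image library to read images or video, convert them to tensors, and feed them to the model, which means building libtorch together with other libraries.

Here we build OpenCV and libtorch together, use OpenCV to open the camera, convert each frame to a tensor for real-time prediction, and identify the current gesture.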
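The core of the frame-to-tensor conversion is a change of memory layout. OpenCV stores pixels interleaved (height x width x channels), while the model expects planar data (channels x height x width) scaled to [0, 1]. The program later in this article does this with permute({0,3,1,2}) and div(255). The sketch below shows the same reordering in plain C++ without libtorch; the function name hwc_to_chw is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts an interleaved 8-bit image (H x W x C) into a
// planar float buffer (C x H x W) with values scaled to [0, 1].
std::vector<float> hwc_to_chw(const std::vector<std::uint8_t>& hwc,
                              std::size_t height, std::size_t width,
                              std::size_t channels) {
  assert(hwc.size() == height * width * channels);
  std::vector<float> chw(hwc.size());
  for (std::size_t y = 0; y < height; ++y) {
    for (std::size_t x = 0; x < width; ++x) {
      for (std::size_t c = 0; c < channels; ++c) {
        const std::size_t src = (y * width + x) * channels + c;  // HWC index
        const std::size_t dst = (c * height + y) * width + x;    // CHW index
        chw[dst] = static_cast<float>(hwc[src]) / 255.0f;
      }
    }
  }
  return chw;
}
```

For a 1x2 image with pixels (0,255,255) and (255,0,0), the result is the three planes [0,1], [1,0], [1,0]. In libtorch the permute call only changes strides, so no copy like this loop happens until the tensor is made contiguous or moved to another device.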

Building OpenCV

I again recommend building your own OpenCV in the current environment (with the cmake, make, and gcc versions fixed). If you have already built it, skip this step.

For how to build OpenCV, see "A complete guide to installing OpenCV from source on Ubuntu".

Building Together with OpenCV

As long as OpenCV is present in your environment, CMake's find_package command can find it as well. Change the CMakeLists file to:

    cmake_minimum_required(VERSION 3.12 FATAL_ERROR)
    project(simnet)

    find_package(Torch REQUIRED)   # find libtorch
    find_package(OpenCV REQUIRED)  # find OpenCV

    if(NOT Torch_FOUND)
        message(FATAL_ERROR "Pytorch Not Found!")
    endif(NOT Torch_FOUND)

    message(STATUS "Pytorch status:")
    message(STATUS " libraries: ${TORCH_LIBRARIES}")

    message(STATUS "OpenCV library status:")
    message(STATUS " version: ${OpenCV_VERSION}")
    message(STATUS " libraries: ${OpenCV_LIBS}")
    message(STATUS " include path: ${OpenCV_INCLUDE_DIRS}")

    add_executable(simnet test.cpp)
    target_link_libraries(simnet ${TORCH_LIBRARIES} ${OpenCV_LIBS})
    set_property(TARGET simnet PROPERTY CXX_STANDARD 11)

If CMake finds everything during configuration, it prints the following:

    -- Caffe2: CUDA detected: 9.2
    -- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
    -- Caffe2: CUDA toolkit directory: /usr/local/cuda
    -- Caffe2: Header version is: 9.2
    -- Found cuDNN: v7.4.1 (include: /usr/local/cuda/include, library: /usr/local/cuda/lib64/libcudnn.so)
    -- Autodetected CUDA architecture(s): 6.1;6.1
    -- Added CUDA NVCC flags for: -gencode;arch=compute_61,code=sm_61
    -- Pytorch status:
    -- libraries: torch;caffe2_library;caffe2_gpu_library;/usr/lib/x86_64-linux-gnu/libcuda.so;/usr/local/cuda/lib64/libnvrtc.so;/usr/local/cuda/lib64/libnvToolsExt.so;/usr/local/cuda/lib64/libcudart_static.a;-lpthread;dl;/usr/lib/x86_64-linux-gnu/librt.so
    -- OpenCV library status:
    -- version: 4.0.0
    -- libraries: opencv_calib3d;opencv_core;opencv_dnn;opencv_features2d;opencv_flann;opencv_gapi;opencv_highgui;opencv_imgcodecs;opencv_imgproc;opencv_ml;opencv_objdetect;opencv_photo;opencv_stitching;opencv_video;opencv_videoio
    -- include path: /usr/local/include/opencv4
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/prototype/CLionProjects/simnet/cmake-build-release

Our C++ code is then:

    #include <opencv2/opencv.hpp>
    #include "torch/script.h"
    #include "torch/torch.h"

    #include <algorithm>
    #include <iostream>
    #include <memory>

    using namespace std;

    // Resize while keeping the aspect ratio; the rest of the canvas stays gray.
    cv::Mat resize_with_ratio(cv::Mat& img) {
      cv::Mat temImage;
      int w = img.cols;
      int h = img.rows;
      float t = 1.;
      float len = t * std::max(w, h);
      int dst_w = 224, dst_h = 224;
      cv::Mat image = cv::Mat(cv::Size(dst_w, dst_h), CV_8UC3, cv::Scalar(128, 128, 128));
      cv::Mat imageROI;
      if (len == w) {
        // Width is the longer side: fill the width and center vertically.
        float ratio = (float)h / (float)w;
        cv::resize(img, temImage, cv::Size(224, 224 * ratio), 0, 0, cv::INTER_LINEAR);
        imageROI = image(cv::Rect(0, ((dst_h - 224 * ratio) / 2), temImage.cols, temImage.rows));
        temImage.copyTo(imageROI);
      } else {
        // Height is the longer side: fill the height and center horizontally.
        float ratio = (float)w / (float)h;
        cv::resize(img, temImage, cv::Size(224 * ratio, 224), 0, 0, cv::INTER_LINEAR);
        imageROI = image(cv::Rect(((dst_w - 224 * ratio) / 2), 0, temImage.cols, temImage.rows));
        temImage.copyTo(imageROI);
      }
      return image;
    }

    int main(int argc, const char* argv[]) {
      if (argc != 2) {
        std::cerr << "usage: example-app <path-to-exported-script-module>\n";
        return -1;
      }

      cv::VideoCapture stream(0);
      cv::namedWindow("Gesture Detect", cv::WINDOW_AUTOSIZE);

      // API of version 1.1 and earlier. From 1.2 on, declare
      // torch::jit::script::Module module and use module.to(...) / module.forward(...).
      std::shared_ptr<torch::jit::script::Module> module = torch::jit::load(argv[1]);
      module->to(at::kCUDA);

      cv::Mat frame;
      cv::Mat image;
      cv::Mat input;

      while (1) {
        stream >> frame;
        image = resize_with_ratio(frame);

        cv::imshow("resized image", image);  // show the resized frame
        cv::cvtColor(image, input, cv::COLOR_BGR2RGB);

        // Convert the image to a Tensor, then feed it to the model.
        torch::Tensor tensor_image = torch::from_blob(input.data, {1, input.rows, input.cols, 3}, torch::kByte);
        tensor_image = tensor_image.permute({0, 3, 1, 2});
        tensor_image = tensor_image.toType(torch::kFloat);
        tensor_image = tensor_image.div(255);
        tensor_image = tensor_image.to(torch::kCUDA);
        torch::Tensor result = module->forward({tensor_image}).toTensor();

        // The index of the largest score is the predicted class.
        auto max_result = result.max(1, true);
        auto max_index = std::get<1>(max_result).item<int64_t>();

        if (max_index == 0)
          cv::putText(frame, "paper", {40, 50}, cv::FONT_HERSHEY_PLAIN, 2.0, cv::Scalar(0, 255, 0), 2);
        else if (max_index == 1)
          cv::putText(frame, "scissors", {40, 50}, cv::FONT_HERSHEY_PLAIN, 2.0, cv::Scalar(0, 255, 0), 2);
        else
          cv::putText(frame, "stone", {40, 50}, cv::FONT_HERSHEY_PLAIN, 2.0, cv::Scalar(0, 255, 0), 2);

        cv::imshow("Gesture Detect", frame);  // show the camera frame with the label
        cv::waitKey(30);
      }
    }
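The geometry inside resize_with_ratio can be looked at on its own: scale the frame so that its longer side equals the target size, then center it on the square canvas. The sketch below computes only the sizes and offsets, without OpenCV. It is not the author's function: the names Letterbox and letterbox are made up for illustration, and it rounds to the nearest pixel where the code above truncates.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Letterbox {
  int width;     // width of the resized image
  int height;    // height of the resized image
  int offset_x;  // left padding inside the square canvas
  int offset_y;  // top padding inside the square canvas
};

// Hypothetical helper: where a src_w x src_h image lands inside a dst x dst
// canvas when its longer side is scaled to dst and the result is centered.
Letterbox letterbox(int src_w, int src_h, int dst) {
  assert(src_w > 0 && src_h > 0 && dst > 0);
  const double scale = static_cast<double>(dst) / std::max(src_w, src_h);
  Letterbox box;
  box.width = static_cast<int>(std::lround(src_w * scale));
  box.height = static_cast<int>(std::lround(src_h * scale));
  // Clamp so that the region always fits and is at least one pixel wide.
  box.width = std::min(std::max(box.width, 1), dst);
  box.height = std::min(std::max(box.height, 1), dst);
  box.offset_x = (dst - box.width) / 2;
  box.offset_y = (dst - box.height) / 2;
  return box;
}
```

For a 640x480 camera frame and a 224x224 canvas this gives a 224x168 image placed 28 pixels from the top, so 28 rows of padding remain above and below it.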

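Then, when running cmake, add -DCMAKE_PREFIX_PATH=/path/to/pytorch/torch/lib/tmp_install to bring in the libtorch path.

With that, our program runs.

A detailed walkthrough of the libtorch C++ API is left out here for reasons of length and will be covered in a later article.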

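Problems Encountered

The build above may fail with the following error, or with a long list of functions that are declared but reported as undefined: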

error: undefined reference to `cv::imread(std::string const&, int)'

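If your OpenCV works fine when built and used on its own but fails when built together with libtorch, the two libraries are in conflict. The likely cause is that OpenCV was built with a different C++ ABI than libtorch, so it is best to build OpenCV and libtorch in the same environment.

There are of course many other odd causes. PyTorch's C++ documentation is currently neither detailed nor plentiful, but you can search for related problems or ask questions on the PyTorch forums and in the GitHub project.

The C++ side of PyTorch is approaching maturity. Prediction in C++ is slightly faster than in Python, and it also removes the burden of installing the PyTorch package. Once the C++ API stabilizes, we can export a trained model directly with torch.jit, and on the deployment device a single library is enough to run prediction on the GPU, which should greatly improve productivity.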
