YoloV5 acceleration with TensorRT (Windows) (C++) (webcam)


Contents

  • 1. Software installation
  • 1.1 OpenCV installation
  • 1.2 CMake installation
  • 1.3 TensorRT installation
  • 1.4 tensorrtx configuration
  • 1.5 yolov5
  • 2. Modify CMakeLists.txt
  • 3. Build tensorrtx/yolov5
  • 4. Test example
  • 5. Webcam version

1. Software installation

  • CUDA 11.1
  • the matching version of cuDNN
  • opencv-3.4.0
  • VS2017
  • TensorRT-7.2.3.4
  • CMake
  • tensorrtx (yolov5-4.0 version)
  • yolov5 (yolov5-4.0 version)

Installing CUDA, cuDNN, and Visual Studio is not covered here.

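1.1 OpenCV installation

Download OpenCV 3.4 from https://opencv.org/opencv-3-4.html. Note: do not download the latest release; stay below version 4.0.
After that, only two steps remain: extract the package and set the environment variable.
Run the downloaded exe (it is really a self-extracting archive) and extract it to a directory such as C:\Program Files (x86)\opencv
Append the following to the system variable Path: C:\Program Files (x86)\opencv\build\x64\vc15\bin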

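Configuring OpenCV in VS2017

Open VS2017 and create a new empty project. In the menu bar click [View] > [Property Manager]; the Property Manager pane then appears on the right.

Right-click the project and choose Properties.

Set the include directories, the library directories, and the linker input. The settings below were added to the Debug configuration; the Release configuration can be set up the same way. The paths assume OpenCV was extracted to D:\opencv; substitute your own location.

  • Include directories:
    VC++ Directories -> Include Directories:
    D:\opencv\build\include;
    D:\opencv\build\include\opencv;
    D:\opencv\build\include\opencv2
  • Library directories:
    VC++ Directories -> Library Directories:
    D:\opencv\build\x64\vc15\lib
    Notes: (1) x64 here is for 64-bit systems; choose x86 for 32-bit.
    (2) vc10 corresponds to VS2010, vc11 to VS2012, vc12 to VS2013, vc14 to VS2015, and vc15 to VS2017.
  • Linker:
    Linker -> Input -> Additional Dependencies
    opencv_world341.lib for the Release configuration
    opencv_world341d.lib for the Debug configuration. The two must not be mixed, otherwise the program crashes at startup.
    The numeric suffix follows the installed OpenCV version, so with opencv-3.4.0 the files are opencv_world340.lib and opencv_world340d.lib.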

1.2 CMake installation

Download CMake from https://cmake.org/ and extract it.

1.3 TensorRT installation

  1. Download TensorRT from https://developer.nvidia.com/nvidia-tensorrt-7x-download, extract it, and add its lib directory to the Path environment variable.
  2. Copy the dll files from the \lib folder of the extracted TensorRT into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin, the lib files from \lib into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib, and the header files from \include into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include.
  3. Test the sample code.
    Build and run the sample under D:\MyWorkSpace\Lib\TensorRT-7.2.3.4\samples\sampleMNIST
    Configure the VS2017 project properties (the paths below use E:\tensorrt_tar\TensorRT-7.2.3.4 as the extraction directory; substitute your own):
    a. Add E:\tensorrt_tar\TensorRT-7.2.3.4\lib to Project -> Properties -> VC++ Directories -> Executable Directories
    b. Add E:\tensorrt_tar\TensorRT-7.2.3.4\lib to VC++ Directories -> Library Directories
    c. Add E:\tensorrt_tar\TensorRT-7.2.3.4\include to C/C++ -> General -> Additional Include Directories
    d. Add nvinfer.lib, nvinfer_plugin.lib, nvonnxparser.lib, and nvparsers.lib to Linker -> Input -> Additional Dependencies
  4. From an Anaconda prompt, go to TensorRT-xxxx\data\mnist and run python download_pgms.py
  5. Go to TensorRT-xxxx\bin and, from cmd, run sample_mnist.exe --datadir=d:\path\to\TensorRT-xxxxx\data\mnist

1.4 tensorrtx configuration

Get tensorrtx from https://github.com/wang-xinyu/tensorrtx
Download dirent.h from https://github.com/tronkko/dirent and place it in the tensorrtx/include folder; this folder has to be created first.

1.5 yolov5

Get yolov5 from https://github.com/ultralytics/yolov5, then:

  • download the weight file yolov5s.pt
  • copy tensorrtx/yolov5/gen_wts.py into ultralytics/yolov5
  • make sure the file names used in gen_wts.py are yolov5s.pt and yolov5s.wts
  • go to ultralytics/yolov5 and run:

python gen_wts.py

2. Modify CMakeLists.txt

The modified version:

cmake_minimum_required(VERSION 3.2)

project(yolov5)
set(OpenCV_DIR "D:\\MyWorkSpace\\Lib\\opencv\\build") #1
set(TRT_DIR "D:\\MyWorkSpace\\Lib\\TensorRT-7.2.3.4") #2
set(OpenCV_INCLUDE_DIRS "D:\\MyWorkSpace\\Lib\\opencv\\build\\include") #3
set(OpenCV_LIBS "D:\\MyWorkSpace\\Lib\\opencv\\build\\x64\\vc14\\lib\\opencv_world340.lib") #4

add_definitions(-std=c++11)
option(CUDA_USE_STATIC_CUDA_RUNTIME OFF)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Debug)

set(THREADS_PREFER_PTHREAD_FLAG ON)
find_package(Threads)

# setup CUDA
find_package(CUDA REQUIRED)
message(STATUS " libraries: ${CUDA_LIBRARIES}")
message(STATUS " include path: ${CUDA_INCLUDE_DIRS}")

include_directories(${CUDA_INCLUDE_DIRS})

####
enable_language(CUDA) # add this line, then no need to setup cuda path in vs
####
include_directories(${PROJECT_SOURCE_DIR}/include)
include_directories(${TRT_DIR}\\include)
include_directories(D:\\MyWorkSpace\\git\\tensorrtx-master\\include) # 5


##### find package(opencv)
include_directories(${OpenCV_INCLUDE_DIRS})
include_directories(${OpenCV_INCLUDE_DIRS}\\opencv2) #6


# -D_MWAITXINTRIN_H_INCLUDED for solving error: identifier "__builtin_ia32_mwaitx" is undefined
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -Wall -Ofast -D_MWAITXINTRIN_H_INCLUDED")

# setup opencv
find_package(OpenCV QUIET
NO_MODULE
NO_DEFAULT_PATH
NO_CMAKE_PATH
NO_CMAKE_ENVIRONMENT_PATH
NO_SYSTEM_ENVIRONMENT_PATH
NO_CMAKE_PACKAGE_REGISTRY
NO_CMAKE_BUILDS_PATH
NO_CMAKE_SYSTEM_PATH
NO_CMAKE_SYSTEM_PACKAGE_REGISTRY
)

message(STATUS "OpenCV library status:")
message(STATUS " version: ${OpenCV_VERSION}")
message(STATUS " libraries: ${OpenCV_LIBS}")
message(STATUS " include path: ${OpenCV_INCLUDE_DIRS}")

include_directories(${OpenCV_INCLUDE_DIRS})
link_directories(${TRT_DIR}\\lib) #7
link_directories(${OpenCV_DIR}\\x64\\vc14\\lib) #8

add_executable(yolov5 ${PROJECT_SOURCE_DIR}/calibrator.cpp ${PROJECT_SOURCE_DIR}/yolov5.cpp ${PROJECT_SOURCE_DIR}/yololayer.cu ${PROJECT_SOURCE_DIR}/yololayer.h)

target_link_libraries(yolov5 "nvinfer" "nvinfer_plugin") #5
target_link_libraries(yolov5 ${OpenCV_LIBS}) #6
target_link_libraries(yolov5 ${CUDA_LIBRARIES}) #7
target_link_libraries(yolov5 Threads::Threads) #8

3. Build tensorrtx/yolov5

  • Run cmake-gui to configure the project
  • Click Configure and set up the environment
  • Click Finish and wait for "Configuring done"
  • Click Generate and wait for "Generating done"
  • Click Open Project
  • Build the project in VS2017

PS: use Release mode.

After a successful build, the exe appears in the Release folder.

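4. Test example

Generate the engine (-s serializes the .wts weights into a TensorRT engine; the trailing s selects the yolov5s model):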

yolov5.exe -s yolov5s.wts yolov5s.engine s

Run inference on images (-d loads the engine and runs detection on the images in ./samples):

yolov5.exe -d yolov5s.engine ./samples
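
The two commands above follow the patterns -s <wts> <engine> <size> and -d <engine> <image directory>. As a rough illustration of how a two-mode command line like this can be validated, here is a minimal sketch. It is not the parsing code from yolov5.cpp; the CliArgs struct and parse_cli function are hypothetical names introduced only for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical result type, for illustration only.
struct CliArgs {
    bool serialize = false;  // true for -s, false for -d
    std::string wts;         // weights file (-s only)
    std::string engine;      // engine file (both modes)
    std::string extra;       // model size for -s, image directory for -d
};

// Returns true and fills `out` when `args` (without the program name)
// matches one of the two forms:
//   -s <wts> <engine> <size>
//   -d <engine> <image_dir>
bool parse_cli(const std::vector<std::string>& args, CliArgs& out) {
    if (args.size() == 4 && args[0] == "-s") {
        out = CliArgs{};
        out.serialize = true;
        out.wts = args[1];
        out.engine = args[2];
        out.extra = args[3];
        return true;
    }
    if (args.size() == 3 && args[0] == "-d") {
        out = CliArgs{};
        out.engine = args[1];
        out.extra = args[2];
        return true;
    }
    return false;  // unknown mode or wrong number of arguments
}
```

In a main function the vector would typically be built as std::vector<std::string>(argv + 1, argv + argc), with a usage message printed when parse_cli returns false.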

Compared with the running time of the Python version, the speedup is clearly noticeable.
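
A single timing is noisy, so a comparison like this is easier to reproduce when the latency is averaged over many runs after a few warm-up runs. The helper below is a minimal sketch of that idea; average_ms is a hypothetical name, and the default of 3 warm-up runs is an arbitrary assumption.

```cpp
#include <chrono>

// Average wall-clock time of `fn` in milliseconds over `iterations` runs,
// measured after `warmup` untimed runs. steady_clock is used because it is
// monotonic and unaffected by system clock adjustments.
template <typename Fn>
double average_ms(Fn&& fn, int iterations, int warmup = 3) {
    for (int i = 0; i < warmup; ++i) {
        fn();
    }
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto end = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = end - start;
    return iterations > 0 ? elapsed.count() / iterations : 0.0;
}
```

Inside yolov5.cpp, the call to doInference(*context, stream, buffers, data, prob, BATCH_SIZE) could be wrapped in a lambda and passed to this helper, for example with 100 iterations.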

5. Webcam version

Modify yolov5.cpp (lines 270-323):

int fcount = 0;
cv::VideoCapture capture(0); // open the default camera
if (!capture.isOpened()) {
    std::cerr << "Failed to open camera 0" << std::endl;
    return -1;
}
cv::Mat frame;
bool quit = false;
while (!quit)
{
    fcount++;
    if (fcount < BATCH_SIZE) continue;
    // A single frame is read and copied into every batch slot,
    // so this loop is intended for BATCH_SIZE = 1.
    if (!capture.read(frame) || frame.empty()) break;
    for (int b = 0; b < fcount; b++) {
        cv::Mat img = frame;
        cv::Mat pr_img = preprocess_img(img, INPUT_W, INPUT_H); // letterbox BGR to RGB
        int i = 0;
        for (int row = 0; row < INPUT_H; ++row) {
            uchar* uc_pixel = pr_img.data + row * pr_img.step;
            for (int col = 0; col < INPUT_W; ++col) {
                // Interleaved BGR bytes -> planar RGB floats in [0, 1]
                data[b * 3 * INPUT_H * INPUT_W + i] = (float)uc_pixel[2] / 255.0;
                data[b * 3 * INPUT_H * INPUT_W + i + INPUT_H * INPUT_W] = (float)uc_pixel[1] / 255.0;
                data[b * 3 * INPUT_H * INPUT_W + i + 2 * INPUT_H * INPUT_W] = (float)uc_pixel[0] / 255.0;
                uc_pixel += 3;
                ++i;
            }
        }
    }

    // Run inference and print the latency
    auto start = std::chrono::system_clock::now();
    doInference(*context, stream, buffers, data, prob, BATCH_SIZE);
    auto end = std::chrono::system_clock::now();
    std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;

    // Non-maximum suppression per batch entry
    std::vector<std::vector<Yolo::Detection>> batch_res(fcount);
    for (int b = 0; b < fcount; b++) {
        auto& res = batch_res[b];
        nms(res, &prob[b * OUTPUT_SIZE], CONF_THRESH, NMS_THRESH);
    }

    // Draw the detections and show the frame
    for (int b = 0; b < fcount; b++) {
        auto& res = batch_res[b];
        cv::Mat img = frame;
        for (size_t j = 0; j < res.size(); j++) {
            cv::Rect r = get_rect(img, res[j].bbox);
            cv::rectangle(img, r, cv::Scalar(0x27, 0xC1, 0x36), 2);
            cv::putText(img, std::to_string((int)res[j].class_id), cv::Point(r.x, r.y - 1), cv::FONT_HERSHEY_PLAIN, 1.2, cv::Scalar(0xFF, 0xFF, 0xFF), 2);
        }
        cv::imshow("frame", img);

        // ESC leaves both the display loop and the capture loop
        if (cv::waitKey(10) == 27) {
            quit = true;
            break;
        }
    }
    fcount = 0;
}
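
The nested row/column loop in the code above converts the camera frame from interleaved 8-bit BGR pixels to planar RGB floats in [0, 1], which is the layout the loop writes into the network's input buffer. The same conversion, isolated from OpenCV and TensorRT so it can be tried on a raw buffer, can be sketched as follows. The function name bgr_hwc_to_rgb_chw is hypothetical, and step is assumed to be the number of bytes per image row (the role played by pr_img.step above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an 8-bit interleaved BGR image (rows `step` bytes apart) into
// planar RGB floats in [0, 1]: all R values first, then all G, then all B.
std::vector<float> bgr_hwc_to_rgb_chw(const std::uint8_t* bgr, int height,
                                      int width, std::size_t step) {
    assert(bgr != nullptr && height > 0 && width > 0);
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    std::vector<float> out(3 * plane);
    std::size_t i = 0;  // index of the current pixel within one plane
    for (int row = 0; row < height; ++row) {
        const std::uint8_t* px = bgr + row * step;
        for (int col = 0; col < width; ++col, px += 3, ++i) {
            out[i]             = px[2] / 255.0f;  // R plane
            out[i + plane]     = px[1] / 255.0f;  // G plane
            out[i + 2 * plane] = px[0] / 255.0f;  // B plane
        }
    }
    return out;
}
```

For a 1x2 image whose first pixel is pure red and whose second is pure blue, the result holds the red plane (1, 0), then the green plane (0, 0), then the blue plane (0, 1).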
