TensorRT开源软件

此存储库包含NVIDIA TensorRT的开源软件(OSS)组件。其中包括TensorRT插件和解析器(Caffe和ONNX)的源代码,以及演示TensorRT平台使用和功能的示例应用程序。这些开源软件组件是TensorRT General Availability(GA)发行版的一个子集,其中包含一些扩展和错误修复。

对于TensorRT OSS的代码贡献,请参阅我们的贡献指南和编码指南。

有关TensorRT OSS发行版附带的新添加和更新的摘要,请参阅变更日志。

Build

Prerequistites

要构建TensorRT OSS组件,首先需要以下软件包。

参考链接:https://github.com/NVIDIA/TensorRT

TensorRT GA build

System Packages

Optional Packages

NOTE: onnx-tensorrt, cub, and protobuf packages are downloaded along with TensorRT OSS, and not required to be installed.

Downloading TensorRT Build

1.   Download TensorRT OSS

On Linux: Bash

git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
export TRT_SOURCE=`pwd`

On Windows: Powershell

git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
$Env:TRT_SOURCE = $(Get-Location)

2.   Download TensorRT GA

To build TensorRT OSS, obtain the corresponding TensorRT GA build from NVIDIA Developer Zone.

Example: Ubuntu 18.04 on x86-64 with cuda-11.1

Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.1

cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.1.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6

Example: Ubuntu 18.04 on PowerPC with cuda-11.0

Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.0

cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.powerpc64le-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6

Example: CentOS/RedHat 7 on x86-64 with cuda-11.0

Download and extract the TensorRT 7.2.1 GA for CentOS/RedHat 7 and CUDA 11.0 tar package

cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.CentOS-7.6.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6

Example: Ubuntu18.04 Cross-Compile for QNX with cuda-10.2

Download and extract the TensorRT 7.2.1 GA for QNX and CUDA 10.2 tar package

cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.aarch64-qnx.cuda-10.2.cudnn7.6.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
export QNX_HOST=/<path-to-qnx-toolchain>/host/linux/x86_64
export QNX_TARGET=/<path-to-qnx-toolchain>/target/qnx7

Example: Windows on x86-64 with cuda-11.0

Download and extract the TensorRT 7.2.1 GA for Windows and CUDA 11.0 zip package and add msbuild to PATH

cd ~\Downloads
Expand-Archive .\TensorRT-7.2.1.6.Windows10.x86_64.cuda-11.0.cudnn8.0.zip
$Env:TRT_RELEASE = '$(Get-Location)\TensorRT-7.2.1.6'
$Env:PATH += 'C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin\'

3.   (Optional) JetPack SDK for Jetson builds

Using the JetPack SDK manager, download the host components. Steps:

i.         Download and launch the SDK manager. Login with your developer account.

ii.         Select the platform and target OS (example: Jetson AGX Xavier, Linux Jetpack 4.4), and click Continue.

iii.         Under Download & Install Options change the download folder and select Download now, Install later. Agree to the license terms and click Continue.

iv.         Move the extracted files into the $TRT_SOURCE/docker/jetpack_files folder.

Setting Up The Build Environment

For native builds, install the prerequisite System Packages. Alternatively (recommended for non-Windows builds), install Docker and generate a build container as described below:

1.    Generate the TensorRT-OSS build container.

The TensorRT-OSS build container can be generated using the Dockerfiles and build script included with TensorRT-OSS. The build container is bundled with packages and environment required for building TensorRT OSS.

Example: Ubuntu 18.04 on x86-64 with cuda-11.1

./docker/build.sh --file docker/ubuntu.Dockerfile --tag tensorrt-ubuntu --os 18.04 --cuda 11.1

Example: Ubuntu 18.04 on PowerPC with cuda-11.0

./docker/build.sh --file docker/ubuntu-cross-ppc64le.Dockerfile --tag tensorrt-ubuntu-ppc --os 18.04 --cuda 11.0

Example: CentOS/RedHat 7 on x86-64 with cuda-11.0

./docker/build.sh --file docker/centos.Dockerfile --tag tensorrt-centos --os 7 --cuda 11.0

Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)

./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-cross-jetpack --os 18.04 --cuda 10.2

2.   Launch the TensorRT-OSS build container.

Example: Ubuntu 18.04 build container

./docker/launch.sh --tag tensorrt-ubuntu --gpus all --release $TRT_RELEASE --source $TRT_SOURCE

NOTE:

  1. i.         Use the tag corresponding to the build container you generated in
  2. ii.         To run TensorRT/CUDA programs in the build container, install NVIDIA Container Toolkit. Docker versions < 19.03 require nvidia-docker2 and --runtime=nvidia flag for docker run commands. On versions >= 19.03, you need the nvidia-container-toolkit package and --gpus all flag.

Building TensorRT-OSS

  • Generate Makefiles or VS project (Windows) and build.

Example: Linux (x86-64) build with default cuda-11.1

 cd $TRT_SOURCE
 mkdir -p build && cd build
 cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out
 make -j$(nproc)

Example: Native build on Jetson (arm64) with cuda-10.2

cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=10.2
make -j$(nproc)

Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)

 cd $TRT_SOURCE
 mkdir -p build && cd build
 cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_aarch64.toolchain -DCUDA_VERSION=10.2
 make -j$(nproc)

Example: Cross-Compile for QNX with cuda-10.2

 cd $TRT_SOURCE
 mkdir -p build && cd build
 cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_qnx.toolchain -DCUDA_VERSION=10.2
 make -j$(nproc)

Example: Windows (x86-64) build in Powershell

 cd $Env:TRT_SOURCE
 mkdir -p build ; cd build
 cmake .. -DTRT_LIB_DIR=$Env:TRT_RELEASE\lib -DTRT_OUT_DIR='$(Get-Location)\out' -DCMAKE_TOOLCHAIN_FILE=..\cmake\toolchains\cmake_x64_win.toolchain
 msbuild ALL_BUILD.vcxproj

NOTE:

    1. The default CUDA version used by CMake is 11.1. To override this, for example to 10.2, append -DCUDA_VERSION=10.2 to the cmake command.
    2. If samples fail to link on CentOS7, create this symbolic link: ln -s $TRT_OUT_DIR/libnvinfer_plugin.so $TRT_OUT_DIR/libnvinfer_plugin.so.7
  • Required CMake build arguments are:
  • Optional CMake build arguments:
  • The TensorRT python API bindings must be installed for running TensorRT python applications
    • TRT_LIB_DIR: Path to the TensorRT installation directory containing libraries.
    • TRT_OUT_DIR: Output directory where generated build artifacts will be copied.
    • CMAKE_BUILD_TYPE: Specify if binaries generated are for release or debug (contain debug symbols). Values consists of [Release] | Debug
    • CUDA_VERISON: The version of CUDA to target, for example [11.1].
    • CUDNN_VERSION: The version of cuDNN to target, for example [8.0].
    • NVCR_SUFFIX: Optional nvcr/cuda image suffix. Set to "-rc" for CUDA11 RC builds until general availability. Blank by default.
    • PROTOBUF_VERSION: The version of Protobuf to use, for example [3.0.0]. Note: Changing this will not configure CMake to use a system version of Protobuf, it will configure CMake to download and try building that version.
    • CMAKE_TOOLCHAIN_FILE: The path to a toolchain file for cross compilation.
    • BUILD_PARSERS: Specify if the parsers should be built, for example [ON] | OFF. If turned OFF, CMake will try to find precompiled versions of the parser libraries to use in compiling samples. First in ${TRT_LIB_DIR}, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries before release versions if available.
    • BUILD_PLUGINS: Specify if the plugins should be built, for example [ON] | OFF. If turned OFF, CMake will try to find a precompiled version of the plugin library to use in compiling samples. First in ${TRT_LIB_DIR}, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries before release versions if available.
    • BUILD_SAMPLES: Specify if the samples should be built, for example [ON] | OFF.
    • CUB_VERSION: The version of CUB to use, for example [1.8.0].
    • GPU_ARCHS: GPU (SM) architectures to target. By default we generate CUDA code for all major SMs. Specific SM versions can be specified here as a quoted space-separated list to reduce compilation time and binary size. Table of compute capabilities of NVIDIA GPUs can be found here. Examples:
      • NVidia A100: -DGPU_ARCHS="80"
      • Tesla T4, GeForce RTX 2080: -DGPU_ARCHS="75"
      • Titan V, Tesla V100: -DGPU_ARCHS="70"
      • Multiple SMs: -DGPU_ARCHS="80 75"
    • TRT_PLATFORM_ID: Bare-metal build (unlike containerized cross-compilation) on non Linux/x86 platforms must explicitly specify the target platform. Currently supported options: x86_64 (default), aarch64

(Optional) Install TensorRT python bindings

Example: install TensorRT wheel for python 3.6

pip3 install $TRT_RELEASE/python/tensorrt-7.2.1.6-cp36-none-linux_x86_64.whl

References

TensorRT Resources

Known Issues

TensorRT 7.2.1

  • None

Nvidia TensorRT开源软件的更多相关文章

  1. NVIDIA TensorRT:可编程推理加速器

    NVIDIA TensorRT:可编程推理加速器 一.概述 NVIDIA TensorRT是一个用于高性能深度学习推理的SDK.它包括一个深度学习推理优化器和运行时间,为深度学习推理应用程序提供低延迟 ...

  2. NVIDIA TensorRT高性能深度学习推理

    NVIDIA TensorRT高性能深度学习推理 NVIDIA TensorRT 是用于高性能深度学习推理的 SDK.此 SDK 包含深度学习推理优化器和运行时环境,可为深度学习推理应用提供低延迟和高 ...

  3. spring boot 实战:我们的第一款开源软件

    在信息爆炸时代,如何避免持续性信息过剩,使自己变得专注而不是被纷繁的信息所累?每天会看到各种各样的新闻,各种新潮的技术层出不穷,如何筛选出自己所关心的? 各位看官会想,我们是来看开源软件的,你给我扯什 ...

  4. 2014 年最热门的国人开发开源软件 TOP 100 - 开源中国社区

    不知道从什么时候开始,很多一说起国产好像就非常愤慨,其实大可不必.做开源中国六年有余,这六年时间国内的开源蓬勃发展,从一开始的使用到贡献,到推出自己很多的开源软件,而且还有很多软件被国外的认可.中国是 ...

  5. 2014 年最热门的国人开发开源软件TOP 100

    不知道从什么时候开始,很多一说起国产好像就非常愤慨,其实大可不必.做开源中国六年有余,这六年时间国内的开源蓬勃发展,从一开始的使用到贡献,到推出自己很多的开源软件,而且还有很多软件被国外认可.中国是开 ...

  6. 号外:MS被开源软件打败了!

    [编辑推荐]微软宣布.NET将开源 支持Mac OS X和Linux (149/16525) » [最多推荐]Visual Studio Contact(); 直播笔记(44/2744) » [最多评 ...

  7. 利用开源软件strongSwan实现支持IKEv2的企业级IPsec VPN,并结合FreeRadius实现AAA协议(下篇)

    续篇—— 利用开源软件strongSwan实现支持IKEv2的企业级IPsec VPN,并结合FreeRadius实现AAA协议(上篇) 上篇文章写了如何构建一个支持IKEv2的VPN,本篇记录的是如 ...

  8. 2014年国人开发的最热门的开源软件TOP 100

    不知道从什么时候开始,很多一说起国产好像就非常愤慨,其实大可不必.做开源中国六年有余,这六年时间国内的开源蓬勃发展,从一开始的使用到贡献,到推出自己很多的开源软件,而且还有很多软件被国外的认可.中国是 ...

  9. GIS开源软件大全

    3 - F 3map:行星地球项目由3map驱动,这是一个自由软件,由Telstra宽带基金会创建并支持,提供客户端与服务器的能力以在线再现虚拟地球. Amein!:其界面介于ArcMap和UMN M ...

随机推荐

  1. hdu1358 最小循环节,最大循环次数 KMP

    题意:       给你一个字符串,让你找到一些字符串,这个字符串是从第一个字母开始的,并且他可以分成1个一上循环子结构够成的,比如 abcabcabc  那么当前的这个串就是三个abc构成的,他的A ...

  2. Python 爬虫 BeautifulSoup4 库的使用

    BeautifulSoup4库 和 lxml 一样,Beautiful Soup 也是一个HTML/XML的解析器,主要的功能也是如何解析和提取 HTML/XML 数据.lxml 只会局部遍历,而Be ...

  3. HackingLab基础关

    目录 1:Key在哪里? 2:再加密一次你就得到key啦~ 3:猜猜这是经过了多少次加密? 4:据说MD5加密很安全,真的是么? 5:种族歧视 6:HAHA浏览器 7:key究竟在哪里呢? 8:key ...

  4. 学习Python一年,这次终于弄懂了浅拷贝和深拷贝

    官方文档:copy主题 源代码: Lib/copy.py 话说,网上已经有很多关于Python浅拷贝和深拷贝的文章了,不过好多文章看起来还是决定似懂非懂,所以决定用自己的理解来写出这样一篇文章. 当别 ...

  5. .NET并发编程-TPL Dataflow并行工作流

    本系列学习在.NET中的并发并行编程模式,实战技巧 本小节了解TPL Dataflow并行工作流,在工作中如何利用现成的类库处理数据.旨在通过TDF实现数据流的并行处理. TDF Block 数据流由 ...

  6. Day002 Java开发环境搭建

    Java开发环境搭建 JDK.JRE.JVM JDK: Java Development Kit(包涵JRE) JRE: Java Runtime Environment(包涵JVM) JVM: Ja ...

  7. spring mvc @Repository 注入不成功 的原因?

    这样的代码会影响 @Repository 注入

  8. 【Matlab】BASK的调试与解调仿真

    索引 一.BASK的调制 1.1 曼彻斯特码 1.2 增益控制 1.3 常量求和 1.4 与载波相乘 1.5 波形预览 1.6 参数设置(参考) 二.BASK的解调 2.1 滤波 2.2 信号比较 2 ...

  9. RTTI之dynamic_cast运算符

    #include <iostream> #include <cstdlib> #include <ctime> using std::cout; class Gra ...

  10. OO第一单元总结-多项式求导

    OO第一单元总结-多项式求导 一.第一.第二次作业总结 因为前两次作业设计复杂度差别不大,因而放在这里统一总结. 基于度量分析程序结构: 前两次作业确实存在缺乏可拓展设计的构想,基本还是面向过程的思维 ...