[onnxruntime] ONNX Runtime and CUDA version compatibility table
admin
2024-01-25 12:08:16

Official reference:

NVIDIA - CUDA - onnxruntime

CUDA Execution Provider

The CUDA Execution Provider enables hardware-accelerated computation on NVIDIA CUDA-enabled GPUs.

Contents

  • Install
  • Requirements
  • Build
  • Configuration Options
  • Samples

Install

Pre-built binaries of ONNX Runtime with CUDA EP are published for most language bindings. Please reference Install ORT.

Requirements

See the table below for the official GPU package dependencies of the ONNX Runtime inferencing package. Note that ONNX Runtime Training is aligned with PyTorch CUDA versions; refer to the Training tab on onnxruntime.ai for supported versions.

Note: Because of CUDA Minor Version Compatibility, ONNX Runtime built with CUDA 11.4 should be compatible with any CUDA 11.x version. Please reference NVIDIA CUDA Minor Version Compatibility.

| ONNX Runtime | CUDA | cuDNN | Notes |
|---|---|---|---|
| 1.13 | 11.6 | 8.2.4 (Linux), 8.5.0.96 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.5.2, libcublas 11.6.5.2, libcudnn 8.2.4 |
| 1.12, 1.11 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.5.2, libcublas 11.6.5.2, libcudnn 8.2.4 |
| 1.10 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.1.51, libcublas 11.6.1.51, libcudnn 8.2.4 |
| 1.9 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.1.51, libcublas 11.6.1.51, libcudnn 8.2.4 |
| 1.8 | 11.0.3 | 8.0.4 (Linux), 8.0.2.39 (Windows) | libcudart 11.0.221, libcufft 10.2.1.245, libcurand 10.2.1.245, libcublasLt 11.2.0.252, libcublas 11.2.0.252, libcudnn 8.0.4 |
| 1.7 | 11.0.3 | 8.0.4 (Linux), 8.0.2.39 (Windows) | libcudart 11.0.221, libcufft 10.2.1.245, libcurand 10.2.1.245, libcublasLt 11.2.0.252, libcublas 11.2.0.252, libcudnn 8.0.4 |
| 1.5-1.6 | 10.2 | 8.0.3 | CUDA 11 can be built from source |
| 1.2-1.4 | 10.1 | 7.6.5 | Requires cublas10-10.2.1.243; cublas 10.1.x will not work |
| 1.0-1.1 | 10.0 | 7.6.4 | CUDA versions from 9.1 up to 10.1, and cuDNN versions from 7.1 up to 7.4 should also work with Visual Studio 2017 |

For older versions, please reference the readme and build pages on the release branch.
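As a quick way to check an installed release against the table above, the rows can be transcribed into a lookup. The following is a minimal sketch, not part of ONNX Runtime: the names `ORT_CUDA_COMPAT` and `required_cuda` are hypothetical, and only the CUDA version and the Linux cuDNN version from each row are recorded.

```python
# Compatibility data transcribed from the table above (inference packages only).
# Keys are ONNX Runtime "major.minor" versions; values are (CUDA, cuDNN on Linux).
# ORT_CUDA_COMPAT and required_cuda are illustrative names, not an ONNX Runtime API.
ORT_CUDA_COMPAT = {
    "1.13": ("11.6", "8.2.4"),
    "1.12": ("11.4", "8.2.4"),
    "1.11": ("11.4", "8.2.4"),
    "1.10": ("11.4", "8.2.4"),
    "1.9": ("11.4", "8.2.4"),
    "1.8": ("11.0.3", "8.0.4"),
    "1.7": ("11.0.3", "8.0.4"),
    "1.6": ("10.2", "8.0.3"),
    "1.5": ("10.2", "8.0.3"),
    "1.4": ("10.1", "7.6.5"),
    "1.3": ("10.1", "7.6.5"),
    "1.2": ("10.1", "7.6.5"),
    "1.1": ("10.0", "7.6.4"),
    "1.0": ("10.0", "7.6.4"),
}


def required_cuda(ort_version: str):
    """Return (cuda, cudnn) for an ONNX Runtime version string, or None if unlisted."""
    # Keep only "major.minor" so that patch releases such as "1.13.1" still match.
    major_minor = ".".join(ort_version.split(".")[:2])
    return ORT_CUDA_COMPAT.get(major_minor)
```

With onnxruntime installed, `required_cuda(onnxruntime.__version__)` would look up the row for the running release; versions that are not in the table return `None` rather than a guess.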

Build

For build instructions, please see the BUILD page.

Configuration Options

The CUDA Execution Provider supports the following configuration options.

device_id

The device ID.

Default value: 0

gpu_mem_limit

The size limit of the device memory arena in bytes. This size limit is only for the execution provider’s arena. The total device memory usage may be higher.

Default value: max value of C++ size_t type (effectively unlimited)
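Because gpu_mem_limit is expressed in bytes, it can help to derive the value from a human-readable size. The helper below is a small sketch; `gib_to_bytes` is a hypothetical name used only for illustration.

```python
GIB = 1024 ** 3  # number of bytes in one gibibyte


def gib_to_bytes(gib: float) -> int:
    """Convert a size in GiB to a whole number of bytes for gpu_mem_limit."""
    if gib <= 0:
        raise ValueError("gpu_mem_limit must be positive")
    return int(gib * GIB)


# Options dictionary in the shape used by the Python sample in this document.
cuda_options = {"device_id": 0, "gpu_mem_limit": gib_to_bytes(2)}
print(cuda_options["gpu_mem_limit"])  # 2147483648
```

The result for 2 GiB matches the string "2147483648" used in the V2 and C# samples.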

arena_extend_strategy

The strategy for extending the device memory arena.

| Value | Description |
|---|---|
| kNextPowerOfTwo (0) | subsequent extensions extend by larger amounts (multiplied by powers of two) |
| kSameAsRequested (1) | extend by the requested amount |

Default value: kNextPowerOfTwo
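The difference between the two strategies can be illustrated with a toy model. This is only a sketch of a generic doubling growth policy, not ONNX Runtime's allocator: the function name `extension_sizes`, the 1024-byte starting size, and the assumption that every request triggers a fresh extension are all made up for illustration.

```python
def extension_sizes(requests, strategy, initial=1024):
    """Toy model: return the size of each arena extension for a list of requests.

    Every request is assumed to need a fresh extension (nothing is reused),
    which is a simplification made only for illustration.
    """
    sizes = []
    next_size = initial  # size the next power-of-two extension would use
    for req in requests:
        if strategy == "kSameAsRequested":
            # Extend by exactly what was asked for.
            sizes.append(req)
        elif strategy == "kNextPowerOfTwo":
            # Keep doubling until the extension can hold the request.
            while next_size < req:
                next_size *= 2
            sizes.append(next_size)
            next_size *= 2  # later extensions grow by larger amounts
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return sizes
```

In this model, three 1000-byte requests produce extensions of 1024, 2048, and 4096 bytes under kNextPowerOfTwo, but 1000 bytes each under kSameAsRequested, which trades fewer extensions for tighter memory use.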

cudnn_conv_algo_search

The type of search done for cuDNN convolution algorithms.

| Value | Description |
|---|---|
| EXHAUSTIVE (0) | expensive exhaustive benchmarking using cudnnFindConvolutionForwardAlgorithmEx |
| HEURISTIC (1) | lightweight heuristic based search using cudnnGetConvolutionForwardAlgorithm_v7 |
| DEFAULT (2) | default algorithm using CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM |

Default value: EXHAUSTIVE

do_copy_in_default_stream

Whether to do copies in the default stream or use separate streams. The recommended setting is true. Setting it to false may improve performance, but it introduces the possibility of race conditions.

Default value: true

cudnn_conv_use_max_workspace

See "Tuning performance for convolution-heavy models" for details on what this flag does. When using the C API, this flag is only supported by the V2 version of the provider options struct. The V2 struct can be created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions. See the V2 sample below for an example.

Default value: 0

cudnn_conv1d_pad_to_nc1d

See "Convolution input padding in the CUDA EP" for details on what this flag does. When using the C API, this flag is only supported by the V2 version of the provider options struct. The V2 struct can be created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions. See the V2 sample below for an example.

Default value: 0

enable_cuda_graph

See "Using CUDA Graphs in the CUDA EP" for details on what this flag does. When using the C API, this flag is only supported by the V2 version of the provider options struct. The V2 struct can be created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions.

Default value: 0
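The V2 and C# samples in this document pass every option as a string key/value pair. As a rough sketch of how typed settings could be turned into that form, the helper below converts a dictionary into parallel lists of strings; `to_string_options` is a hypothetical name, and the "1"/"0" encoding for booleans is an assumption based on the values shown in those samples.

```python
def to_string_options(options: dict) -> tuple:
    """Convert typed settings into parallel key/value string lists.

    Booleans become "1"/"0"; everything else is passed through str().
    """
    keys, values = [], []
    for key, value in options.items():
        keys.append(key)
        if isinstance(value, bool):  # check bool before int: bool is an int subclass
            values.append("1" if value else "0")
        else:
            values.append(str(value))
    return keys, values


keys, values = to_string_options({
    "device_id": 0,
    "gpu_mem_limit": 2 * 1024 ** 3,
    "cudnn_conv_use_max_workspace": True,
    "cudnn_conv1d_pad_to_nc1d": True,
    "enable_cuda_graph": False,
})
```

Keeping the settings typed until the last step avoids hand-writing byte counts such as "2147483648" as string literals.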

Samples

Python

import onnxruntime as ort

model_path = ''  # path to the ONNX model file

providers = [
    ('CUDAExecutionProvider', {
        'device_id': 0,
        'arena_extend_strategy': 'kNextPowerOfTwo',
        'gpu_mem_limit': 2 * 1024 * 1024 * 1024,
        'cudnn_conv_algo_search': 'EXHAUSTIVE',
        'do_copy_in_default_stream': True,
    }),
    'CPUExecutionProvider',
]

session = ort.InferenceSession(model_path, providers=providers)
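On a machine where the CUDA provider is not available, it can be convenient to build the providers list conditionally. The sketch below assumes the caller supplies the list of available provider names; `choose_providers` is a hypothetical helper, not an ONNX Runtime function.

```python
def choose_providers(available, cuda_options=None):
    """Build a providers list that prefers CUDA and always falls back to CPU."""
    providers = []
    if "CUDAExecutionProvider" in available:
        # Copy the options so the caller's dictionary is never modified.
        providers.append(("CUDAExecutionProvider", dict(cuda_options or {})))
    providers.append("CPUExecutionProvider")
    return providers
```

With onnxruntime installed, `available` can come from `ort.get_available_providers()`, and `session.get_providers()` reports which providers a created session actually uses.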

C/C++

USING LEGACY PROVIDER OPTIONS STRUCT

OrtSessionOptions* session_options = /* ... */;

OrtCUDAProviderOptions options;
options.device_id = 0;
options.arena_extend_strategy = 0;
// Use an unsigned 64-bit literal: 2 * 1024 * 1024 * 1024 overflows a 32-bit int.
options.gpu_mem_limit = 2ULL * 1024 * 1024 * 1024;
options.cudnn_conv_algo_search = OrtCudnnConvAlgoSearchExhaustive;
options.do_copy_in_default_stream = 1;

SessionOptionsAppendExecutionProvider_CUDA(session_options, &options);

USING V2 PROVIDER OPTIONS STRUCT

OrtCUDAProviderOptionsV2* cuda_options = nullptr;
CreateCUDAProviderOptions(&cuda_options);

// Keys and values are parallel arrays of C strings.
std::vector<const char*> keys{"device_id", "gpu_mem_limit", "arena_extend_strategy", "cudnn_conv_algo_search", "do_copy_in_default_stream", "cudnn_conv_use_max_workspace", "cudnn_conv1d_pad_to_nc1d"};
std::vector<const char*> values{"0", "2147483648", "kSameAsRequested", "DEFAULT", "1", "1", "1"};

UpdateCUDAProviderOptions(cuda_options, keys.data(), values.data(), keys.size());

OrtSessionOptions* session_options = /* ... */;
SessionOptionsAppendExecutionProvider_CUDA_V2(session_options, cuda_options);

// Finally, don't forget to release the provider options
ReleaseCUDAProviderOptions(cuda_options);

C#

var cudaProviderOptions = new OrtCUDAProviderOptions(); // Dispose this finally

var providerOptionsDict = new Dictionary<string, string>();
providerOptionsDict["device_id"] = "0";
providerOptionsDict["gpu_mem_limit"] = "2147483648";
providerOptionsDict["arena_extend_strategy"] = "kSameAsRequested";
providerOptionsDict["cudnn_conv_algo_search"] = "DEFAULT";
providerOptionsDict["do_copy_in_default_stream"] = "1";
providerOptionsDict["cudnn_conv_use_max_workspace"] = "1";
providerOptionsDict["cudnn_conv1d_pad_to_nc1d"] = "1";

cudaProviderOptions.UpdateOptions(providerOptionsDict);

SessionOptions options = SessionOptions.MakeSessionOptionWithCudaProvider(cudaProviderOptions);  // Dispose this finally
