[onnxruntime] ONNX Runtime and CUDA version compatibility table
admin
2024-01-25 12:08:16

Official documentation reference:

NVIDIA - CUDA - onnxruntime

CUDA Execution Provider

The CUDA Execution Provider enables hardware accelerated computation on Nvidia CUDA-enabled GPUs.

Contents

  • Install
  • Requirements
  • Build
  • Configuration Options
  • Samples

Install

Pre-built binaries of ONNX Runtime with CUDA EP are published for most language bindings. Please reference Install ORT.

Requirements

Please reference the table below for the official GPU package dependencies of the ONNX Runtime inferencing package. Note that ONNX Runtime Training is aligned with PyTorch CUDA versions; refer to the Training tab on onnxruntime.ai for supported versions.

Note: Because of CUDA Minor Version Compatibility, ONNX Runtime built with CUDA 11.4 should be compatible with any CUDA 11.x version. Please reference Nvidia CUDA Minor Version Compatibility.
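The minor-version rule can be illustrated with a toy check (a deliberate simplification in plain Python; the version strings are hypothetical inputs, not values queried from a real driver or runtime):

```python
def cuda_minor_compatible(built_with: str, installed: str) -> bool:
    """Per CUDA Minor Version Compatibility, a library built against
    CUDA 11.x should run on any other CUDA 11.y runtime: only the
    major version needs to match in this simplified model."""
    built_major = int(built_with.split(".")[0])
    installed_major = int(installed.split(".")[0])
    return built_major == installed_major

# ONNX Runtime built with CUDA 11.4 on a CUDA 11.8 system: compatible
print(cuda_minor_compatible("11.4", "11.8"))  # True
# ...but not on a CUDA 12.x system
print(cuda_minor_compatible("11.4", "12.1"))  # False
```

Real compatibility also depends on the driver version, so treat this only as a mnemonic for the major-version rule.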

ONNX Runtime | CUDA | cuDNN | Notes
1.13 | 11.6 | 8.2.4 (Linux), 8.5.0.96 (Windows) | libcudart 11.4.43; libcufft 10.5.2.100; libcurand 10.2.5.120; libcublasLt 11.6.5.2; libcublas 11.6.5.2; libcudnn 8.2.4
1.12, 1.11 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43; libcufft 10.5.2.100; libcurand 10.2.5.120; libcublasLt 11.6.5.2; libcublas 11.6.5.2; libcudnn 8.2.4
1.10 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43; libcufft 10.5.2.100; libcurand 10.2.5.120; libcublasLt 11.6.1.51; libcublas 11.6.1.51; libcudnn 8.2.4
1.9 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43; libcufft 10.5.2.100; libcurand 10.2.5.120; libcublasLt 11.6.1.51; libcublas 11.6.1.51; libcudnn 8.2.4
1.8 | 11.0.3 | 8.0.4 (Linux), 8.0.2.39 (Windows) | libcudart 11.0.221; libcufft 10.2.1.245; libcurand 10.2.1.245; libcublasLt 11.2.0.252; libcublas 11.2.0.252; libcudnn 8.0.4
1.7 | 11.0.3 | 8.0.4 (Linux), 8.0.2.39 (Windows) | libcudart 11.0.221; libcufft 10.2.1.245; libcurand 10.2.1.245; libcublasLt 11.2.0.252; libcublas 11.2.0.252; libcudnn 8.0.4
1.5-1.6 | 10.2 | 8.0.3 | CUDA 11 can be built from source
1.2-1.4 | 10.1 | 7.6.5 | Requires cublas10-10.2.1.243; cublas 10.1.x will not work
1.0-1.1 | 10.0 | 7.6.4 | CUDA versions from 9.1 up to 10.1, and cuDNN versions from 7.1 up to 7.4, should also work with Visual Studio 2017

For older versions, please reference the readme and build pages on the release branch.

Build

For build instructions, please see the BUILD page.

Configuration Options

The CUDA Execution Provider supports the following configuration options.

device_id

The device ID.

Default value: 0

gpu_mem_limit

The size limit of the device memory arena in bytes. This limit applies only to the execution provider's arena; total device memory usage may be higher.

Default value: max value of C++ size_t type (effectively unlimited)
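The limit is given in bytes, which is why a 2 GiB cap appears as a large integer in the samples in this document (and as the string "2147483648" in the V2 C API sample):

```python
# gpu_mem_limit is specified in bytes; a 2 GiB arena cap:
gpu_mem_limit = 2 * 1024 * 1024 * 1024
print(gpu_mem_limit)  # 2147483648
```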

arena_extend_strategy

The strategy for extending the device memory arena.

Value | Description
kNextPowerOfTwo (0) | subsequent extensions extend by larger amounts (multiplied by powers of two)
kSameAsRequested (1) | extend by the requested amount

Default value: kNextPowerOfTwo
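The difference between the two strategies can be sketched in a few lines (a simplified model of arena growth for intuition, not the actual allocator implementation):

```python
def next_power_of_two_extension(current_total: int, requested: int) -> int:
    """kNextPowerOfTwo (simplified): grow the arena to the next
    power-of-two total that covers the request, so repeated
    extensions get progressively larger."""
    size = max(current_total, 1)
    while size < current_total + requested:
        size *= 2
    return size - current_total

def same_as_requested_extension(current_total: int, requested: int) -> int:
    """kSameAsRequested: extend by exactly the requested amount."""
    return requested

# With 256 MiB already in the arena and a 100 MiB request:
MiB = 1024 * 1024
print(next_power_of_two_extension(256 * MiB, 100 * MiB) // MiB)   # 256
print(same_as_requested_extension(256 * MiB, 100 * MiB) // MiB)   # 100
```

kNextPowerOfTwo trades some over-allocation for fewer device allocations; kSameAsRequested keeps the footprint tight at the cost of more frequent extensions.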

cudnn_conv_algo_search

The type of search done for cuDNN convolution algorithms.

Value | Description
EXHAUSTIVE (0) | expensive exhaustive benchmarking using cudnnFindConvolutionForwardAlgorithmEx
HEURISTIC (1) | lightweight heuristic-based search using cudnnGetConvolutionForwardAlgorithm_v7
DEFAULT (2) | default algorithm using CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM

Default value: EXHAUSTIVE

do_copy_in_default_stream

Whether to do copies in the default stream or use separate streams. The recommended setting is true. If false, performance may improve, but race conditions are possible.

Default value: true

cudnn_conv_use_max_workspace

Check "tuning performance for convolution-heavy models" for details on what this flag does. This flag is only supported from the V2 version of the provider options struct when using the C API. The V2 provider options struct can be created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions. Please take a look at the samples below for an example.

Default value: 0

cudnn_conv1d_pad_to_nc1d

Check "convolution input padding in the CUDA EP" for details on what this flag does. This flag is only supported from the V2 version of the provider options struct when using the C API. The V2 provider options struct can be created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions. Please take a look at the samples below for an example.

Default value: 0

enable_cuda_graph

Check "using CUDA Graphs in the CUDA EP" for details on what this flag does. This flag is only supported from the V2 version of the provider options struct when using the C API. The V2 provider options struct can be created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions.

Default value: 0

Samples

Python

import onnxruntime as ort

model_path = ''
providers = [
    ('CUDAExecutionProvider', {
        'device_id': 0,
        'arena_extend_strategy': 'kNextPowerOfTwo',
        'gpu_mem_limit': 2 * 1024 * 1024 * 1024,
        'cudnn_conv_algo_search': 'EXHAUSTIVE',
        'do_copy_in_default_stream': True,
    }),
    'CPUExecutionProvider',
]
session = ort.InferenceSession(model_path, providers=providers)

C/C++

USING LEGACY PROVIDER OPTIONS STRUCT

OrtSessionOptions* session_options = /* ... */;

OrtCUDAProviderOptions options;
options.device_id = 0;
options.arena_extend_strategy = 0;
options.gpu_mem_limit = 2 * 1024 * 1024 * 1024;
options.cudnn_conv_algo_search = OrtCudnnConvAlgoSearchExhaustive;
options.do_copy_in_default_stream = 1;

SessionOptionsAppendExecutionProvider_CUDA(session_options, &options);

USING V2 PROVIDER OPTIONS STRUCT

OrtCUDAProviderOptionsV2* cuda_options = nullptr;
CreateCUDAProviderOptions(&cuda_options);

std::vector<const char*> keys{"device_id", "gpu_mem_limit", "arena_extend_strategy", "cudnn_conv_algo_search", "do_copy_in_default_stream", "cudnn_conv_use_max_workspace", "cudnn_conv1d_pad_to_nc1d"};
std::vector<const char*> values{"0", "2147483648", "kSameAsRequested", "DEFAULT", "1", "1", "1"};
UpdateCUDAProviderOptions(cuda_options, keys.data(), values.data(), keys.size());

OrtSessionOptions* session_options = /* ... */;
SessionOptionsAppendExecutionProvider_CUDA_V2(session_options, cuda_options);

// Finally, don't forget to release the provider options
ReleaseCUDAProviderOptions(cuda_options);

C#

var cudaProviderOptions = new OrtCUDAProviderOptions(); // Dispose this finally

var providerOptionsDict = new Dictionary<string, string>();
providerOptionsDict["device_id"] = "0";
providerOptionsDict["gpu_mem_limit"] = "2147483648";
providerOptionsDict["arena_extend_strategy"] = "kSameAsRequested";
providerOptionsDict["cudnn_conv_algo_search"] = "DEFAULT";
providerOptionsDict["do_copy_in_default_stream"] = "1";
providerOptionsDict["cudnn_conv_use_max_workspace"] = "1";
providerOptionsDict["cudnn_conv1d_pad_to_nc1d"] = "1";

cudaProviderOptions.UpdateOptions(providerOptionsDict);

SessionOptions options = SessionOptions.MakeSessionOptionWithCudaProvider(cudaProviderOptions);  // Dispose this finally
