[onnxruntime] ONNX Runtime and CUDA version compatibility table
admin
2024-01-25 12:08:16

Official website reference:

NVIDIA - CUDA - onnxruntime

CUDA Execution Provider

The CUDA Execution Provider enables hardware accelerated computation on Nvidia CUDA-enabled GPUs.

Contents

  • Install
  • Requirements
  • Build
  • Configuration Options
  • Samples

Install

Pre-built binaries of ONNX Runtime with CUDA EP are published for most language bindings. Please reference Install ORT.

Requirements

Refer to the table below for the official GPU package dependencies of the ONNX Runtime inferencing package. Note that ONNX Runtime Training is aligned with PyTorch CUDA versions; refer to the Training tab on onnxruntime.ai for supported versions.

Note: Because of CUDA Minor Version Compatibility, ONNX Runtime built with CUDA 11.4 should be compatible with any CUDA 11.x version. See Nvidia CUDA Minor Version Compatibility.

ONNX Runtime | CUDA | cuDNN | Notes
1.13 | 11.6 | 8.2.4 (Linux), 8.5.0.96 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.5.2, libcublas 11.6.5.2, libcudnn 8.2.4
1.12, 1.11 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.5.2, libcublas 11.6.5.2, libcudnn 8.2.4
1.10 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.1.51, libcublas 11.6.1.51, libcudnn 8.2.4
1.9 | 11.4 | 8.2.4 (Linux), 8.2.2.26 (Windows) | libcudart 11.4.43, libcufft 10.5.2.100, libcurand 10.2.5.120, libcublasLt 11.6.1.51, libcublas 11.6.1.51, libcudnn 8.2.4
1.8 | 11.0.3 | 8.0.4 (Linux), 8.0.2.39 (Windows) | libcudart 11.0.221, libcufft 10.2.1.245, libcurand 10.2.1.245, libcublasLt 11.2.0.252, libcublas 11.2.0.252, libcudnn 8.0.4
1.7 | 11.0.3 | 8.0.4 (Linux), 8.0.2.39 (Windows) | libcudart 11.0.221, libcufft 10.2.1.245, libcurand 10.2.1.245, libcublasLt 11.2.0.252, libcublas 11.2.0.252, libcudnn 8.0.4
1.5-1.6 | 10.2 | 8.0.3 | CUDA 11 can be built from source
1.2-1.4 | 10.1 | 7.6.5 | Requires cublas10-10.2.1.243; cublas 10.1.x will not work
1.0-1.1 | 10.0 | 7.6.4 | CUDA versions from 9.1 up to 10.1, and cuDNN versions from 7.1 up to 7.4 should also work with Visual Studio 2017

For older versions, please reference the readme and build pages on the release branch.
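
For scripted environments, the table can be kept as a small lookup that reports which CUDA and cuDNN versions a given ONNX Runtime release was built against. The following is only an illustrative sketch: `ORT_CUDA_TABLE` and `expected_cuda` are hypothetical names, not part of onnxruntime, and the data covers just the 1.7 to 1.13 rows of the table above, using the Linux cuDNN versions.

```python
# Hypothetical lookup copied from the table above; not an onnxruntime API.
# Values are (CUDA version, cuDNN version on Linux).
ORT_CUDA_TABLE = {
    "1.13": ("11.6", "8.2.4"),
    "1.12": ("11.4", "8.2.4"),
    "1.11": ("11.4", "8.2.4"),
    "1.10": ("11.4", "8.2.4"),
    "1.9": ("11.4", "8.2.4"),
    "1.8": ("11.0.3", "8.0.4"),
    "1.7": ("11.0.3", "8.0.4"),
}


def expected_cuda(ort_version):
    """Return (CUDA, cuDNN) for an ONNX Runtime version string.

    Only the major.minor part is used, so "1.13.1" maps to the "1.13" row.
    Returns None for versions this sketch does not list.
    """
    major_minor = ".".join(ort_version.split(".")[:2])
    return ORT_CUDA_TABLE.get(major_minor)


print(expected_cuda("1.13.1"))  # ('11.6', '8.2.4')
print(expected_cuda("1.10.0"))  # ('11.4', '8.2.4')
```

With onnxruntime installed, the installed version string is available as `onnxruntime.__version__` and could be passed to the helper directly.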

Build

For build instructions, please see the BUILD page.

Configuration Options

The CUDA Execution Provider supports the following configuration options.

device_id

The device ID.

Default value: 0

gpu_mem_limit

The size limit of the device memory arena in bytes. This size limit is only for the execution provider’s arena. The total device memory usage may be higher.

Default value: max value of C++ size_t type (effectively unlimited)
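
Because gpu_mem_limit is expressed in bytes, it can be convenient to compute it from a size in GiB. A minimal sketch; the helper name `gib_to_bytes` is made up for illustration and is not part of onnxruntime.

```python
def gib_to_bytes(gib):
    """Convert a size in GiB to bytes (1 GiB = 1024**3 bytes)."""
    return int(gib * 1024 ** 3)


limit = gib_to_bytes(2)
print(limit)  # 2147483648

# Hypothetical provider-options fragment using the computed limit.
cuda_options = {"gpu_mem_limit": limit}
```

The result for 2 GiB, 2147483648, is the same value that the samples on this page pass as gpu_mem_limit.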

arena_extend_strategy

The strategy for extending the device memory arena.

Value | Description
kNextPowerOfTwo (0) | subsequent extensions extend by larger amounts (multiplied by powers of two)
kSameAsRequested (1) | extend by the requested amount

Default value: kNextPowerOfTwo

cudnn_conv_algo_search

The type of search done for cuDNN convolution algorithms.

Value | Description
EXHAUSTIVE (0) | expensive exhaustive benchmarking using cudnnFindConvolutionForwardAlgorithmEx
HEURISTIC (1) | lightweight heuristic based search using cudnnGetConvolutionForwardAlgorithm_v7
DEFAULT (2) | default algorithm using CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM

Default value: EXHAUSTIVE
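
Both arena_extend_strategy and cudnn_conv_algo_search accept only the named values listed above, so a misspelled value is easy to catch before the options are passed on. The check below is a sketch under that assumption; `ALLOWED_VALUES` and `check_cuda_options` are hypothetical helpers, not onnxruntime functions.

```python
# Allowed names copied from the value tables above; the helper is hypothetical.
ALLOWED_VALUES = {
    "arena_extend_strategy": {"kNextPowerOfTwo", "kSameAsRequested"},
    "cudnn_conv_algo_search": {"EXHAUSTIVE", "HEURISTIC", "DEFAULT"},
}


def check_cuda_options(options):
    """Raise ValueError if an enumerated option has an undocumented value.

    Options that are absent or not enumerated (e.g. device_id) are ignored.
    """
    for key, allowed in ALLOWED_VALUES.items():
        if key in options and options[key] not in allowed:
            raise ValueError(
                f"{key}={options[key]!r} is not one of {sorted(allowed)}"
            )
    return options


checked = check_cuda_options({
    "device_id": 0,
    "arena_extend_strategy": "kSameAsRequested",
    "cudnn_conv_algo_search": "HEURISTIC",
})
```

The returned dictionary is unchanged, so it can be used directly as the options part of a ('CUDAExecutionProvider', options) entry.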

do_copy_in_default_stream

Whether to do copies in the default stream or use separate streams. The recommended setting is true. Setting it to false may improve performance, but it introduces race conditions.

Default value: true

cudnn_conv_use_max_workspace

See "Tuning performance for convolution heavy models" for details on what this flag does. When using the C API, this flag is only supported in the V2 version of the provider options struct. The V2 struct is created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions; the V2 sample in the Samples section shows both calls.

Default value: 0

cudnn_conv1d_pad_to_nc1d

See "Convolution input padding in the CUDA EP" for details on what this flag does. When using the C API, this flag is only supported in the V2 version of the provider options struct. The V2 struct is created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions; the V2 sample in the Samples section shows both calls.

Default value: 0

enable_cuda_graph

See "Using CUDA Graphs in the CUDA EP" for details on what this flag does. When using the C API, this flag is only supported in the V2 version of the provider options struct. The V2 struct is created with CreateCUDAProviderOptions and updated with UpdateCUDAProviderOptions.

Default value: 0

Samples

Python

import onnxruntime as ort

model_path = ''  # path to the ONNX model (left empty in the source)

providers = [
    ('CUDAExecutionProvider', {
        'device_id': 0,
        'arena_extend_strategy': 'kNextPowerOfTwo',
        'gpu_mem_limit': 2 * 1024 * 1024 * 1024,
        'cudnn_conv_algo_search': 'EXHAUSTIVE',
        'do_copy_in_default_stream': True,
    }),
    'CPUExecutionProvider',
]

session = ort.InferenceSession(model_path, providers=providers)
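
Whether the CUDA provider can actually be used depends on the installed package, so it may help to filter the requested providers against what the installation reports before creating the session. The sketch below assumes that approach; `select_providers` is a hypothetical helper, and the `available` list is hard-coded so the example runs without onnxruntime installed.

```python
def select_providers(requested, available):
    """Keep only the requested providers that the installation reports.

    Entries may be plain names or (name, options) tuples, as in the
    Python sample. Order is preserved, so CUDA stays ahead of CPU.
    """
    selected = []
    for entry in requested:
        name = entry[0] if isinstance(entry, tuple) else entry
        if name in available:
            selected.append(entry)
    return selected


requested = [
    ("CUDAExecutionProvider", {"device_id": 0}),
    "CPUExecutionProvider",
]

# With onnxruntime installed, this list would come from
# onnxruntime.get_available_providers(); a fixed list is used here.
available = ["CPUExecutionProvider"]
print(select_providers(requested, available))  # ['CPUExecutionProvider']
```

After the session is created, `session.get_providers()` lists the providers it is using, which is a quick way to confirm whether the CUDA provider was picked up.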

C/C++

USING LEGACY PROVIDER OPTIONS STRUCT

OrtSessionOptions* session_options = /* ... */;

OrtCUDAProviderOptions options;
options.device_id = 0;
options.arena_extend_strategy = 0;  // kNextPowerOfTwo
// 2 GiB; the ULL suffix keeps the product from overflowing a 32-bit int
options.gpu_mem_limit = 2ULL * 1024 * 1024 * 1024;
options.cudnn_conv_algo_search = OrtCudnnConvAlgoSearchExhaustive;
options.do_copy_in_default_stream = 1;

SessionOptionsAppendExecutionProvider_CUDA(session_options, &options);

USING V2 PROVIDER OPTIONS STRUCT

OrtCUDAProviderOptionsV2* cuda_options = nullptr;
CreateCUDAProviderOptions(&cuda_options);

// Every option is passed as a string key/value pair
std::vector<const char*> keys{"device_id", "gpu_mem_limit", "arena_extend_strategy", "cudnn_conv_algo_search", "do_copy_in_default_stream", "cudnn_conv_use_max_workspace", "cudnn_conv1d_pad_to_nc1d"};
std::vector<const char*> values{"0", "2147483648", "kSameAsRequested", "DEFAULT", "1", "1", "1"};

UpdateCUDAProviderOptions(cuda_options, keys.data(), values.data(), keys.size());

OrtSessionOptions* session_options = /* ... */;
SessionOptionsAppendExecutionProvider_CUDA_V2(session_options, cuda_options);

// Finally, don't forget to release the provider options
ReleaseCUDAProviderOptions(cuda_options);

C#

var cudaProviderOptions = new OrtCUDAProviderOptions(); // Dispose this finally

var providerOptionsDict = new Dictionary<string, string>();
providerOptionsDict["device_id"] = "0";
providerOptionsDict["gpu_mem_limit"] = "2147483648";
providerOptionsDict["arena_extend_strategy"] = "kSameAsRequested";
providerOptionsDict["cudnn_conv_algo_search"] = "DEFAULT";
providerOptionsDict["do_copy_in_default_stream"] = "1";
providerOptionsDict["cudnn_conv_use_max_workspace"] = "1";
providerOptionsDict["cudnn_conv1d_pad_to_nc1d"] = "1";

cudaProviderOptions.UpdateOptions(providerOptionsDict);

SessionOptions options = SessionOptions.MakeSessionOptionWithCudaProvider(cudaProviderOptions);  // Dispose this finally
