# NVIDIA

## Overview
For NVIDIA GPUs, OAAX provides an implementation of both a conversion toolchain and a runtime library. The latter is based on ONNX Runtime with the CUDA Execution Provider.
## Requirements
Most NVIDIA GPUs are supported, including:
- Data Center / Desktop GPUs with compute capability 3.5 or higher (e.g., T4, A10, A100, RTX 3000 series, RTX 4000 series), running Ubuntu 20.04 or higher, or Windows 10/11.
- NVIDIA Jetson series (e.g., Xavier NX, AGX Orin) running JetPack 5.0 or higher.
## Installation
To use NVIDIA GPU acceleration, please ensure that you have the latest NVIDIA drivers and CUDA toolkit installed:
- For Data Center / Desktop GPUs, please follow the instructions here.
- For the NVIDIA Jetson series, please follow the instructions here.
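Once installed, you can sanity-check the setup from a terminal. These are standard NVIDIA commands, not part of OAAX itself:

```bash
# Verify that the NVIDIA driver is loaded and the GPU is visible
nvidia-smi

# Verify the installed CUDA toolkit version (requires the toolkit on your PATH)
nvcc --version
```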
## Usage

### Runtime Library
The runtime library implements the OAAX runtime interface for initializing the runtime, loading a model, running inference, and destroying the runtime.
Initialization can be done without providing a configuration by calling `int runtime_initialization();` directly, or by providing these parameters to `int runtime_initialization_with_args(int length, char **keys, void **values);`:
- `log_level` (char *, default is `"2"`): The minimum log level for the runtime. This can be set to `0` for trace, `1` for debug, `2` for info, `3` for warnings, `4` for errors, `5` for critical, and `6` to disable logging.
- `log_file` (char *, default is `"runtime.log"`): The file to which the runtime logs will be written. If not specified, logs will be written to stdout.
- `num_threads` (char *, default is `"4"`): The maximum number of threads that can be used by the runtime. The higher the number, the more CPU resources will be used, but the better the throughput.
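To make the argument-passing convention concrete, here is a minimal C sketch that initializes the runtime with custom arguments. It relies only on the signature quoted above; the zero-on-success return convention is an assumption, and the prototype would normally come from the runtime library's header:

```c
#include <stdio.h>

/* Signature as quoted above; normally provided by the runtime's header. */
int runtime_initialization_with_args(int length, char **keys, void **values);

int main(void) {
    /* All values are passed as strings, per the parameter list above. */
    char *keys[]   = {"log_level", "log_file", "num_threads"};
    void *values[] = {"1", "runtime.log", "8"};

    /* Assumption: a zero return value indicates success. */
    if (runtime_initialization_with_args(3, keys, values) != 0) {
        fprintf(stderr, "runtime initialization failed\n");
        return 1;
    }

    /* ... load a model, run inference, and destroy the runtime ... */
    return 0;
}
```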
You can check out the examples repository for more details on how to use the runtime library: OAAX Examples.
### Conversion Toolchain
The conversion toolchain is used to validate, optimize, and simplify ONNX models. At the end of the process, it produces a simplified ONNX model.
It can be used as follows:
```bash
docker run -v ./model:/model oaax-nvidia-toolchain:1.1.1 /model/model.onnx /model/output
```
The above command assumes that the model is located at ./model/model.onnx.
After a successful conversion, the generated model will be saved in the ./model/output directory.
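Note that relative host paths in `-v` are only accepted by newer Docker CLI versions; the following sketch uses an absolute path instead, assuming the `model` directory sits in the current working directory:

```bash
# Create the expected input/output layout
mkdir -p model/output

# Mount the directory by absolute path and run the conversion
docker run -v "$(pwd)/model:/model" oaax-nvidia-toolchain:1.1.1 /model/model.onnx /model/output
```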
## Download links and compatibility matrix
| OAAX version | OS | OS version | CPU architecture | Runtime library | Conversion toolchain |
|---|---|---|---|---|---|
| 1.1.1 | Ubuntu | 20.04 or higher | x86_64 | CUDA 11, CUDA 12 | Download |
| 1.1.1 | NVIDIA JetPack | - | ARM64 | JetPack 5, JetPack 6 | ⬆️ |
| 1.1.1 | Windows | 10 or higher | x86_64 | CUDA 11, CUDA 12 | ⬆️ |