Kind
ClusterPolicy
Group
nvidia.com
Version
v1
apiVersion: nvidia.com/v1
kind: ClusterPolicy
metadata:
  name: example
apiVersion string
APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
kind string
Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
metadata object
spec object
ClusterPolicySpec defines the desired state of ClusterPolicy
ccManager object
CCManager component spec
args []string
Optional: List of arguments
defaultMode string
Default CC mode setting for compatible GPUs on the node
enum: on, off, devtools
enabled boolean
Enabled indicates if deployment of CC Manager is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
CC Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
CC Manager image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
CC Manager image tag
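As a rough illustration of the ccManager fields above, the fragment below enables CC Manager and sets a default mode. It is a sketch meant to be merged into a complete ClusterPolicy, not a full manifest; the environment variable name and value are hypothetical placeholders.

```yaml
spec:
  ccManager:
    enabled: true
    # One of: on, off, devtools. Quoted because many YAML parsers
    # read a bare on/off as a boolean rather than a string.
    defaultMode: "off"
    env:
      - name: EXAMPLE_VAR   # hypothetical variable, for illustration only
        value: "example"
```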
cdi object
CDI configures how the Container Device Interface is used in the cluster
default boolean
Default indicates whether to use CDI as the default mechanism for providing GPU access to containers.
enabled boolean
Enabled indicates whether CDI can be used to make GPUs accessible to containers.
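The two cdi booleans are independent, so one plausible combination is to allow CDI without making it the default mechanism. A minimal fragment, assuming it is merged into a complete ClusterPolicy:

```yaml
spec:
  cdi:
    enabled: true    # CDI can be used to make GPUs accessible to containers
    default: false   # but it is not the default mechanism for GPU access
```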
daemonsets object required
Daemonset defines common configuration for all Daemonsets
annotations object
Optional: Annotations is an unstructured key value map stored with a resource that may be set by external tools to store and retrieve arbitrary metadata. They are not queryable and should be preserved when modifying objects.
labels object
Optional: Map of string keys and values that can be used to organize and categorize (scope and select) objects. May match selectors of replication controllers and services.
priorityClassName string
rollingUpdate object
Optional: Configuration for rolling update of all DaemonSet pods
maxUnavailable string
tolerations []object
Optional: Set tolerations
effect string
Effect indicates the taint effect to match. Empty means match all taint effects. When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
key string
Key is the taint key that the toleration applies to. Empty means match all taint keys. If the key is empty, operator must be Exists; this combination means to match all values and all keys.
operator string
Operator represents a key's relationship to the value. Valid operators are Exists and Equal. Defaults to Equal. Exists is equivalent to wildcard for value, so that a pod can tolerate all taints of a particular category.
tolerationSeconds integer
TolerationSeconds represents the period of time the toleration (which must be of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default, it is not set, which means tolerate the taint forever (do not evict). Zero and negative values will be treated as 0 (evict immediately) by the system.
format: int64
value string
Value is the taint value the toleration matches to. If the operator is Exists, the value should be empty, otherwise just a regular string.
updateStrategy string
enum: RollingUpdate, OnDelete
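The daemonsets settings above could be combined as in the following sketch. The label and the taint key are assumptions chosen for illustration; substitute whatever your nodes actually use. Note that maxUnavailable is typed as a string here, so the value is quoted.

```yaml
spec:
  daemonsets:
    priorityClassName: system-node-critical
    updateStrategy: RollingUpdate       # or OnDelete
    rollingUpdate:
      maxUnavailable: "1"
    labels:
      example.com/team: gpu-platform    # hypothetical label
    tolerations:
      # Tolerate any NoSchedule taint with this key, regardless of value.
      - key: nvidia.com/gpu             # assumed taint key
        operator: Exists
        effect: NoSchedule
```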
dcgm object required
DCGM component spec
args []string
Optional: List of arguments
enabled boolean
Enabled indicates if deployment of NVIDIA DCGM Hostengine as a separate pod is enabled.
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
hostPort integer
Deprecated: HostPort represents host port that needs to be bound for DCGM engine (Default: 5555)
format: int32
image string
NVIDIA DCGM image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA DCGM image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
NVIDIA DCGM image tag
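For example, a fragment that runs the DCGM Hostengine as a separate pod with explicit resource bounds might look like this. The quantities are arbitrary illustrative values, not recommendations; the only constraint taken from the schema is that requests must not exceed limits.

```yaml
spec:
  dcgm:
    enabled: true
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 512Mi
```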
dcgmExporter object required
DCGMExporter spec
args []string
Optional: List of arguments
config object
Optional: Custom metrics configuration for NVIDIA DCGM Exporter
name string
ConfigMap name with file dcgm-metrics.csv for metrics to be collected by NVIDIA DCGM Exporter
enabled boolean
Enabled indicates if deployment of NVIDIA DCGM Exporter through operator is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA DCGM Exporter image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA DCGM Exporter image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
service object
Optional: Service configuration for NVIDIA DCGM Exporter
internalTrafficPolicy string
InternalTrafficPolicy describes how nodes distribute service traffic they receive on the ClusterIP.
type string
Type represents the ServiceType which describes ingress methods for a service
serviceMonitor object
Optional: ServiceMonitor configuration for NVIDIA DCGM Exporter
additionalLabels object
AdditionalLabels to add to ServiceMonitor instance for NVIDIA DCGM Exporter
enabled boolean
Enabled indicates if ServiceMonitor is deployed for NVIDIA DCGM Exporter
honorLabels boolean
HonorLabels chooses the metric’s labels on collisions with target labels.
interval string
Interval at which metrics should be scraped from NVIDIA DCGM Exporter. If not specified, Prometheus’ global scrape interval is used. Supported units: y, w, d, h, m, s, ms
pattern: ^(0|(([0-9]+)y)?(([0-9]+)w)?(([0-9]+)d)?(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?(([0-9]+)ms)?)$
relabelings []object
Relabelings allows rewriting labels on metric sets for NVIDIA DCGM Exporter
action string
Action to perform based on the regex matching. `Uppercase` and `Lowercase` actions require Prometheus >= v2.36.0. `DropEqual` and `KeepEqual` actions require Prometheus >= v2.41.0. Default: "Replace"
enum: replace, Replace, keep, Keep, drop, Drop, hashmod, HashMod, labelmap, LabelMap, labeldrop, LabelDrop, labelkeep, LabelKeep, lowercase, Lowercase, uppercase, Uppercase, keepequal, KeepEqual, dropequal, DropEqual
modulus integer
Modulus to take of the hash of the source label values. Only applicable when the action is `HashMod`.
format: int64
regex string
Regular expression against which the extracted value is matched.
replacement string
Replacement value against which a Replace action is performed if the regular expression matches. Regex capture groups are available.
separator string
Separator is the string between concatenated SourceLabels.
sourceLabels []string
The source labels select values from existing labels. Their content is concatenated using the configured Separator and matched against the configured regular expression.
targetLabel string
Label to which the resulting string is written in a replacement. It is mandatory for `Replace`, `HashMod`, `Lowercase`, `Uppercase`, `KeepEqual` and `DropEqual` actions. Regex capture groups are available.
version string
NVIDIA DCGM Exporter image tag
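To show how the dcgmExporter service, serviceMonitor and relabelings fields fit together, here is a hedged sketch. The ConfigMap name, the additional label and the choice of source label are assumptions; the interval value is simply one string that satisfies the duration pattern above.

```yaml
spec:
  dcgmExporter:
    enabled: true
    config:
      name: custom-dcgm-metrics          # hypothetical ConfigMap holding dcgm-metrics.csv
    service:
      type: ClusterIP
      internalTrafficPolicy: Cluster
    serviceMonitor:
      enabled: true
      interval: 30s                      # omit to use Prometheus' global scrape interval
      honorLabels: false
      additionalLabels:
        release: prometheus              # hypothetical label
      relabelings:
        # Copy the scheduling node's name into a "node" label.
        - action: replace
          sourceLabels:
            - __meta_kubernetes_pod_node_name
          targetLabel: node              # mandatory for the replace action
```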
devicePlugin object required
DevicePlugin component spec
args []string
Optional: List of arguments
config object
Optional: Configuration for the NVIDIA Device Plugin via the ConfigMap
default string
Default config name within the ConfigMap for the NVIDIA Device Plugin config
name string
ConfigMap name for NVIDIA Device Plugin config including shared config between plugin and GFD
enabled boolean
Enabled indicates if deployment of NVIDIA Device Plugin through operator is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA Device Plugin image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
mps object
Optional: MPS related configuration for the NVIDIA Device Plugin
root string
Root defines the MPS root path on the host
repository string
NVIDIA Device Plugin image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
NVIDIA Device Plugin image tag
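A possible devicePlugin fragment using a ConfigMap-based configuration is sketched below. The ConfigMap name, the default config key and the MPS root path are all assumed values for illustration.

```yaml
spec:
  devicePlugin:
    enabled: true
    config:
      name: device-plugin-configs   # hypothetical ConfigMap, shared between the plugin and GFD
      default: default              # hypothetical key within that ConfigMap
    mps:
      root: /run/nvidia/mps         # assumed host path
```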
driver object required
Driver component spec
args []string
Optional: List of arguments
certConfig object
Optional: Custom certificates configuration for NVIDIA Driver container
name string
enabled boolean
Enabled indicates if deployment of NVIDIA Driver through operator is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA Driver image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
kernelModuleConfig object
Optional: Kernel module configuration parameters for the NVIDIA Driver
name string
kernelModuleType string
KernelModuleType represents the type of driver kernel modules to be used when installing the GPU driver. Accepted values are auto, proprietary and open. NOTE: If auto is chosen, the recommended kernel module type is selected based on the GPU devices on the host and the driver branch used.
enum: auto, open, proprietary
licensingConfig object
Optional: Licensing configuration for NVIDIA vGPU licensing
configMapName string
nlsEnabled boolean
NLSEnabled indicates if NVIDIA Licensing System is used for licensing.
livenessProbe object
NVIDIA Driver container liveness probe settings
failureThreshold integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
format: int32
minimum: 1
initialDelaySeconds integer
Number of seconds after the container has started before liveness probes are initiated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
format: int32
periodSeconds integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
format: int32
minimum: 1
successThreshold integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
format: int32
minimum: 1
timeoutSeconds integer
Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
format: int32
minimum: 1
manager object
Manager represents configuration for NVIDIA Driver Manager initContainer
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
Image represents NVIDIA Driver Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
Repository represents NVIDIA Driver Manager repository path
version string
Version represents NVIDIA Driver Manager image tag (version)
rdma object
GPUDirectRDMASpec defines the properties for nvidia-peermem deployment
enabled boolean
Enabled indicates if GPUDirect RDMA is enabled through GPU operator
useHostMofed boolean
UseHostMOFED indicates to use MOFED drivers directly installed on the host to enable GPUDirect RDMA
readinessProbe object
NVIDIA Driver container readiness probe settings
failureThreshold integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
format: int32
minimum: 1
initialDelaySeconds integer
Number of seconds after the container has started before readiness probes are initiated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
format: int32
periodSeconds integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
format: int32
minimum: 1
successThreshold integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
format: int32
minimum: 1
timeoutSeconds integer
Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
format: int32
minimum: 1
repoConfig object
Optional: Custom repo configuration for NVIDIA Driver container
configMapName string
repository string
NVIDIA Driver image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
startupProbe object
NVIDIA Driver container startup probe settings
failureThreshold integer
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
format: int32
minimum: 1
initialDelaySeconds integer
Number of seconds after the container has started before startup probes are initiated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
format: int32
periodSeconds integer
How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
format: int32
minimum: 1
successThreshold integer
Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1.
format: int32
minimum: 1
timeoutSeconds integer
Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
format: int32
minimum: 1
upgradePolicy object
Driver auto-upgrade settings
autoUpgrade boolean
AutoUpgrade is a global switch for the automatic upgrade feature. If set to false, all other options are ignored.
drain object
DrainSpec describes configuration for node drain during automatic upgrade
deleteEmptyDir boolean
DeleteEmptyDir indicates whether the drain should continue even if there are pods using emptyDir (local data that will be deleted when the node is drained)
enable boolean
Enable indicates if node draining is allowed during upgrade
force boolean
Force indicates if force draining is allowed
podSelector string
PodSelector specifies a label selector to filter pods on the node that need to be drained. For more details on label selectors, see: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors
timeoutSeconds integer
TimeoutSeconds specifies the length of time in seconds to wait before giving up on the drain. Zero means infinite.
minimum: 0
maxParallelUpgrades integer
MaxParallelUpgrades indicates how many nodes can be upgraded in parallel. 0 means no limit: all nodes will be upgraded in parallel.
minimum: 0
maxUnavailable string | integer
MaxUnavailable is the maximum number of nodes with the driver installed, that can be unavailable during the upgrade. Value can be an absolute number (ex: 5) or a percentage of total nodes at the start of upgrade (ex: 10%). Absolute number is calculated from percentage by rounding up. By default, a fixed value of 25% is used.
podDeletion object
PodDeletionSpec describes configuration for deletion of pods using special resources during automatic upgrade
deleteEmptyDir boolean
DeleteEmptyDir indicates whether pod deletion should continue even if there are pods using emptyDir (local data that will be deleted when the pod is deleted)
force boolean
Force indicates if force deletion is allowed
timeoutSeconds integer
TimeoutSeconds specifies the length of time in seconds to wait before giving up on pod termination. Zero means infinite.
minimum: 0
waitForCompletion object
WaitForCompletionSpec describes the configuration for waiting on job completions
podSelector string
PodSelector specifies a label selector for the pods to wait for completion. For more details on label selectors, see: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors
timeoutSeconds integer
TimeoutSeconds specifies the length of time in seconds to wait before giving up on pod termination. Zero means infinite.
minimum: 0
useNvidiaDriverCRD boolean
UseNvidiaDriverCRD indicates if the deployment of NVIDIA Driver is managed by the NVIDIADriver CRD type
useOpenKernelModules boolean
Deprecated: This field is no longer honored by the gpu-operator. Please use KernelModuleType instead. UseOpenKernelModules indicates if the open GPU kernel modules should be used
usePrecompiled boolean
UsePrecompiled indicates if deployment of NVIDIA Driver using pre-compiled modules is enabled
version string
NVIDIA Driver image tag
virtualTopology object
Optional: Virtual Topology Daemon configuration for NVIDIA vGPU drivers
config string
Optional: Config name representing virtual topology daemon configuration file nvidia-topologyd.conf
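The driver upgradePolicy and probe fields interact in ways that are easier to see in a concrete fragment. The sketch below uses arbitrary illustrative timeouts and a hypothetical pod selector; it is meant to be merged into a complete ClusterPolicy rather than applied on its own.

```yaml
spec:
  driver:
    enabled: true
    kernelModuleType: auto        # auto, open, or proprietary
    upgradePolicy:
      autoUpgrade: true           # if false, every option below is ignored
      maxParallelUpgrades: 1      # 0 would mean no limit
      # Percentages are converted to a node count by rounding up:
      # with 10 driver nodes, 25% -> 2.5 -> 3 nodes may be unavailable.
      maxUnavailable: "25%"
      waitForCompletion:
        podSelector: "app=example-training"   # hypothetical label selector
        timeoutSeconds: 0                     # zero means wait indefinitely
      podDeletion:
        force: false
        deleteEmptyDir: false
        timeoutSeconds: 300
      drain:
        enable: false
        force: false
        deleteEmptyDir: false
        timeoutSeconds: 300
    startupProbe:
      initialDelaySeconds: 60
      periodSeconds: 10
      failureThreshold: 120
      timeoutSeconds: 60
```

Leaving drain.enable set to false keeps node draining out of the upgrade flow, so only the pods handled by podDeletion are removed.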
gdrcopy object
GDRCopy component spec
args []string
Optional: List of arguments
enabled boolean
Enabled indicates if GDRCopy is enabled through GPU Operator
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA GDRCopy driver image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA GDRCopy driver image repository
version string
NVIDIA GDRCopy driver image tag
gds object
GPUDirectStorage defines the spec for GDS components (Experimental)
args []string
Optional: List of arguments
enabled boolean
Enabled indicates if GPUDirect Storage is enabled through GPU operator
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA GPUDirect Storage Driver image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA GPUDirect Storage Driver image repository
version string
NVIDIA GPUDirect Storage Driver image tag
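Most components in this schema share the same repository, image and version triple. The fragment below illustrates it for gds with placeholder values; the registry host, image name and tag are hypothetical, and the assumption that the three fields are combined as repository/image:version is mine rather than something this schema states.

```yaml
spec:
  gds:
    enabled: true
    repository: registry.example.com/nvidia   # hypothetical registry path
    image: example-gds-driver                  # hypothetical; must match [a-zA-Z0-9\-]+
    version: "1.0.0"                           # hypothetical tag
    imagePullPolicy: IfNotPresent
    imagePullSecrets:
      - example-registry-secret                # hypothetical Secret name
```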
gfd object required
GPUFeatureDiscovery spec
args []string
Optional: List of arguments
enabled boolean
Enabled indicates if deployment of GPU Feature Discovery Plugin is enabled.
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
GFD image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
GFD image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
GFD image tag
hostPaths object
HostPaths defines various paths on the host needed by GPU Operator components
driverInstallDir string
DriverInstallDir represents the root at which driver files including libraries, config files, and executables can be found.
rootFS string
RootFS represents the path to the root filesystem of the host. This is used by components that need to interact with the host filesystem and as such this must be a chroot-able filesystem. Examples include the MIG Manager and Toolkit Container which may need to stop, start, or restart systemd services.
kataManager object
KataManager component spec
args []string
Optional: List of arguments
config object
Kata Manager config
artifactsDir string
ArtifactsDir is the directory where kata artifacts (e.g. kernel / guest images, configuration, etc.) are placed on the local filesystem.
runtimeClasses []object
RuntimeClasses is a list of kata runtime classes to configure.
artifacts object required
Artifacts are the kata artifacts associated with the runtime class.
pullSecret string
PullSecret is the secret used to pull the OCI artifact.
url string required
URL is the path to the OCI artifact (payload) containing all artifacts associated with a kata runtime class.
name string required
Name is the name of the kata runtime class.
nodeSelector object
NodeSelector specifies the nodeSelector for the RuntimeClass object. This ensures pods running with the RuntimeClass only get scheduled onto nodes which support it.
enabled boolean
Enabled indicates if deployment of Kata Manager is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
Kata Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
Kata Manager image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
Kata Manager image tag
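The nesting of kataManager.config.runtimeClasses is the least obvious part of this component, so a small sketch may help. Every name, path and URL below is a hypothetical placeholder.

```yaml
spec:
  kataManager:
    enabled: true
    config:
      artifactsDir: /opt/example/kata-artifacts          # hypothetical host directory
      runtimeClasses:
        - name: kata-example                             # required
          nodeSelector:
            example.com/kata-capable: "true"             # hypothetical node label
          artifacts:                                     # required
            url: registry.example.com/kata/artifacts:1.0 # required; hypothetical OCI reference
            pullSecret: example-pull-secret              # hypothetical Secret name
```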
mig object
MIG spec
strategy string
Optional: MIGStrategy to apply for GFD and NVIDIA Device Plugin
enum: none, single, mixed
migManager object
MIGManager for configuration to deploy MIG Manager
args []string
Optional: List of arguments
config object
Optional: Custom mig-parted configuration for NVIDIA MIG Manager container
default string
Default MIG config to be applied on the node, when there is no config specified with the node label nvidia.com/mig.config
enum: all-disabled, "" (empty string)
name string
ConfigMap name
enabled boolean
Enabled indicates if deployment of NVIDIA MIG Manager is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
gpuClientsConfig object
Optional: Custom gpu-clients configuration for NVIDIA MIG Manager container
name string
ConfigMap name
image string
NVIDIA MIG Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA MIG Manager image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
NVIDIA MIG Manager image tag
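The mig and migManager sections are typically set together. In the sketch below the ConfigMap name is hypothetical; the default value applies only to nodes that carry no nvidia.com/mig.config label of their own.

```yaml
spec:
  mig:
    strategy: single                  # none, single, or mixed
  migManager:
    enabled: true
    config:
      name: custom-mig-parted-config  # hypothetical ConfigMap with mig-parted configuration
      default: all-disabled           # used when the node label nvidia.com/mig.config is absent
```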
nodeStatusExporter object required
NodeStatusExporter spec
args []string
Optional: List of arguments
enabled boolean
Enabled indicates if deployment of Node Status Exporter is enabled.
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
Node Status Exporter image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
Node Status Exporter image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
Node Status Exporter image tag
operator object required
Operator component spec
annotations object
Optional: Annotations is an unstructured key value map stored with a resource that may be set by external tools to store and retrieve arbitrary metadata. They are not queryable and should be preserved when modifying objects.
defaultRuntime string required
Runtime defines container runtime type
enum: docker, crio, containerd
initContainer object
InitContainerSpec describes configuration for initContainer image used with all components
image string
Image represents image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
Repository represents image repository path
version string
Version represents image tag (version)
labels object
Optional: Map of string keys and values that can be used to organize and categorize (scope and select) objects. May match selectors of replication controllers and services.
runtimeClass string
use_ocp_driver_toolkit boolean
UseOpenShiftDriverToolkit indicates if DriverToolkit image should be used on OpenShift to build and install driver modules
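Since defaultRuntime is the only required field under operator, a minimal fragment can be very short. The runtimeClass value here is an assumed name, not one mandated by the schema.

```yaml
spec:
  operator:
    defaultRuntime: containerd       # docker, crio, or containerd
    runtimeClass: nvidia             # assumed RuntimeClass name
    use_ocp_driver_toolkit: false    # only relevant on OpenShift
```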
psa object
PSA defines spec for PodSecurityAdmission configuration
enabled boolean
Enabled indicates if PodSecurityAdmission configuration needs to be enabled for all Pods
psp object
Deprecated: Pod Security Policies are no longer supported. Please use PodSecurityAdmission instead. PSP defines spec for handling PodSecurityPolicies
enabled boolean
Enabled indicates if PodSecurityPolicies needs to be enabled for all Pods
sandboxDevicePlugin object
SandboxDevicePlugin component spec
args []string
Optional: List of arguments
enabled boolean
Enabled indicates if deployment of NVIDIA Sandbox Device Plugin through operator is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA Sandbox Device Plugin image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA Sandbox Device Plugin image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
NVIDIA Sandbox Device Plugin image tag
sandboxWorkloads object
SandboxWorkloads defines the spec for handling sandbox workloads (i.e. Virtual Machines)
defaultWorkload string
DefaultWorkload indicates the default GPU workload type to configure worker nodes in the cluster for
enum: container, vm-passthrough, vm-vgpu
enabled boolean
Enabled indicates if the GPU Operator should manage additional operands required for sandbox workloads (i.e. VFIO Manager, vGPU Manager, and additional device plugins)
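A cluster that mixes container and virtual machine workloads might, under these definitions, enable sandbox operands while keeping containers as the default workload type:

```yaml
spec:
  sandboxWorkloads:
    enabled: true
    defaultWorkload: container   # container, vm-passthrough, or vm-vgpu
```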
toolkit object required
Toolkit component spec
args []string
Optional: List of arguments
enabled boolean
Enabled indicates if deployment of NVIDIA Container Toolkit through operator is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA Container Toolkit image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
installDir string
Toolkit install directory on the host
repository string
NVIDIA Container Toolkit image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
NVIDIA Container Toolkit image tag
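The toolkit fields above might be combined as in the following sketch. The registry, image name, tag, secret name, and environment variable are hypothetical placeholders, not values taken from this schema; the image name is chosen only so that it matches the documented pattern, and the install directory is an assumed example path.

```yaml
# Fragment of a ClusterPolicy spec; all concrete values are placeholders.
spec:
  toolkit:
    enabled: true
    repository: registry.example.com/nvidia   # hypothetical registry path
    image: example-container-toolkit          # must match [a-zA-Z0-9\-]+
    version: v0.0.0-example                   # hypothetical image tag
    imagePullPolicy: IfNotPresent
    imagePullSecrets:
      - example-registry-secret               # hypothetical secret name
    installDir: /usr/local/nvidia             # assumed host path
    env:
      - name: EXAMPLE_VAR                     # hypothetical variable
        value: "1"
    resources:
      # Requests cannot exceed limits.
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 256Mi
```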
validator object
Validator defines the spec for operator-validator daemonset
args []string
Optional: List of arguments
cuda object
CUDA validator spec
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
driver object
Driver validator spec
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
Validator image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
plugin object
Plugin validator spec
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
repository string
Validator image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
toolkit object
Toolkit validator spec
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
version string
Validator image tag
vfioPCI object
VfioPCI validator spec
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
vgpuDevices object
VGPUDevices validator spec
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
vgpuManager object
VGPUManager validator spec
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
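To illustrate how the validator object nests its per-component sub-objects, here is a hedged sketch. Each sub-object (cuda, driver, plugin, toolkit, vfioPCI, vgpuDevices, vgpuManager) accepts only an env list, while the image settings live on the validator object itself. The registry, image name, tag, and variable names are hypothetical placeholders.

```yaml
# Fragment of a ClusterPolicy spec; all concrete values are placeholders.
spec:
  validator:
    repository: registry.example.com/nvidia   # hypothetical registry path
    image: example-validator                  # must match [a-zA-Z0-9\-]+
    version: v0.0.0-example                   # hypothetical image tag
    imagePullPolicy: IfNotPresent
    # Environment variables set on the validator object itself.
    env:
      - name: EXAMPLE_COMMON_VAR              # hypothetical variable
        value: "1"
    # Per-component sub-objects carry only an env list.
    plugin:
      env:
        - name: EXAMPLE_PLUGIN_VAR            # hypothetical variable
          value: "false"
    cuda:
      env:
        - name: EXAMPLE_CUDA_VAR              # hypothetical variable
          value: "true"
```

Values are quoted because the value field is typed as a string; an unquoted true or 1 would be parsed by YAML as a boolean or integer.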
vfioManager object
VFIOManager defines the configuration for deploying VFIO-PCI Manager
args []string
Optional: List of arguments
driverManager object
DriverManager represents configuration for NVIDIA Driver Manager
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
Image represents NVIDIA Driver Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
Repository represents Driver Manager repository path
version string
Version represents NVIDIA Driver Manager image tag (version)
enabled boolean
Enabled indicates if deployment of VFIO Manager is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
VFIO Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
VFIO Manager image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
VFIO Manager image tag
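The vfioManager object carries two sets of image settings: its own, and a nested driverManager block with a separate repository, image, and version. The sketch below shows that nesting. Every registry path, image name, and tag is a hypothetical placeholder.

```yaml
# Fragment of a ClusterPolicy spec; all concrete values are placeholders.
spec:
  vfioManager:
    enabled: true
    repository: registry.example.com/nvidia   # hypothetical registry path
    image: example-vfio-manager               # must match [a-zA-Z0-9\-]+
    version: v0.0.0-example                   # hypothetical image tag
    imagePullPolicy: IfNotPresent
    # Nested NVIDIA Driver Manager settings, configured independently
    # of the VFIO Manager image above.
    driverManager:
      repository: registry.example.com/nvidia # hypothetical registry path
      image: example-driver-manager           # must match [a-zA-Z0-9\-]+
      version: v0.0.0-example                 # hypothetical image tag
      imagePullPolicy: IfNotPresent
    resources:
      # Requests cannot exceed limits.
      requests:
        cpu: 50m
        memory: 64Mi
      limits:
        cpu: 200m
        memory: 128Mi
```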
vgpuDeviceManager object
VGPUDeviceManager spec
args []string
Optional: List of arguments
config object
NVIDIA vGPU devices configuration for NVIDIA vGPU Device Manager container
default string
Default config name within the ConfigMap
name string
ConfigMap name
enabled boolean
Enabled indicates if deployment of NVIDIA vGPU Device Manager is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA vGPU Device Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA vGPU Device Manager image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
NVIDIA vGPU Device Manager image tag
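The config object of vgpuDeviceManager points at a ConfigMap by name and selects a default entry inside it. A minimal sketch, assuming a ConfigMap with the hypothetical name shown already exists and contains an entry called "default":

```yaml
# Fragment of a ClusterPolicy spec; all concrete values are placeholders.
spec:
  vgpuDeviceManager:
    enabled: true
    config:
      name: example-vgpu-devices-config       # hypothetical ConfigMap name
      default: default                        # assumed entry within that ConfigMap
    repository: registry.example.com/nvidia   # hypothetical registry path
    image: example-vgpu-device-manager        # must match [a-zA-Z0-9\-]+
    version: v0.0.0-example                   # hypothetical image tag
```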
vgpuManager object
VGPUManager component spec
args []string
Optional: List of arguments
driverManager object
DriverManager represents configuration for NVIDIA Driver Manager initContainer
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
Image represents NVIDIA Driver Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
Repository represents Driver Manager repository path
version string
Version represents NVIDIA Driver Manager image tag (version)
enabled boolean
Enabled indicates if deployment of NVIDIA vGPU Manager through operator is enabled
env []object
Optional: List of environment variables
name string required
Name of the environment variable.
value string
Value of the environment variable.
image string
NVIDIA vGPU Manager image name
pattern: [a-zA-Z0-9\-]+
imagePullPolicy string
Image pull policy
imagePullSecrets []string
Image pull secrets
repository string
NVIDIA vGPU Manager image repository
resources object
Optional: Define resources requests and limits for each pod
limits object
Limits describes the maximum amount of compute resources allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
requests object
Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. Requests cannot exceed Limits. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
version string
NVIDIA vGPU Manager image tag
status object
ClusterPolicyStatus defines the observed state of ClusterPolicy
conditions []object
Conditions is a list of conditions representing the ClusterPolicy's current state.
lastTransitionTime string required
lastTransitionTime is the last time the condition transitioned from one status to another. This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
message string required
message is a human readable message indicating details about the transition. This may be an empty string.
maxLength: 32768
observedGeneration integer
observedGeneration represents the .metadata.generation that the condition was set based upon. For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date with respect to the current state of the instance.
format: int64
minimum: 0
reason string required
reason contains a programmatic identifier indicating the reason for the condition's last transition. Producers of specific condition types may define expected values and meanings for this field, and whether the values are considered a guaranteed API. The value should be a CamelCase string. This field may not be empty.
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
minLength: 1
maxLength: 1024
status string required
status of the condition, one of True, False, Unknown.
enum: True, False, Unknown
type string required
type of condition in CamelCase or in foo.example.com/CamelCase.
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
maxLength: 316
namespace string
Namespace indicates a namespace in which the operator is installed
state string required
State indicates status of ClusterPolicy
enum: ignored, ready, notReady
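Because status is written by the operator rather than by the user, the following is only an illustration of what a populated status block could look like under the constraints listed above. The namespace, condition type, reason, timestamp, and generation are made-up example values chosen to satisfy the documented patterns and enums.

```yaml
# Illustrative observed state; these values are not taken from a real cluster.
status:
  namespace: example-namespace                # hypothetical install namespace
  state: ready                                # one of: ignored, ready, notReady
  conditions:
    - type: Ready                             # hypothetical CamelCase condition type
      status: "True"                          # string enum: True, False, Unknown
      reason: ExampleReason                   # hypothetical; CamelCase, non-empty
      message: ""                             # may be an empty string
      lastTransitionTime: "2024-01-01T00:00:00Z"   # date-time format
      observedGeneration: 1                   # int64, minimum 0
```

The condition status is quoted because the schema types it as a string; an unquoted True would be read by YAML as a boolean.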
