Maintenance policy
Denvr Cloud hardware maintenance requirements
To ensure the performance, reliability, and ongoing improvement of our AI cloud platform, we have implemented the following guidelines for managing both planned and unplanned maintenance activities. These guidelines form an integral part of our platform's service commitments and are incorporated into our Terms of Service (TOS).
Periodic issues and maintenance should be expected, as AI Factory hardware has higher expected failure rates than traditional CPU products.
Scheduled Maintenance
Scheduled maintenance refers to mandatory activities to upgrade, patch, and ensure the availability of our AI systems. This may include device firmware, driver, and hypervisor upgrades.
These activities are communicated in advance to tenant administrators to minimize client disruption. Changes can be temporarily deferred if required, but only for 1-2 months, depending on the criticality of the upgrade.
Notifications are provided by email and in the in-app dashboard
Off-peak hours are preferred for maintenance windows
Expected downtime will be communicated
Return to production will be coordinated with the tenant
Emergency Maintenance
Some situations may require emergency maintenance, for example to address critical security vulnerabilities (CVEs) or to respond to system failures, including hardware failures (GPU drops, transceiver and fibre link quality issues, etc.).
Notifications will be provided, even if on short notice
Maintenance will be coordinated with the tenant if required. Clients may choose to accept degraded performance if they prefer a less disruptive change window
Denvr will provide a summary of what was fixed, a root cause analysis (RCA), and any follow-up actions.
Bare Metal Maintenance
Denvr may detect system issues via network monitoring and BMC management. Because we have no access to bare metal nodes, we may require the tenant to assist with diagnostic commands and output to ensure system health.
Clients may choose to defer diagnostics and accept existing or potential degradation in order to minimize impact. However, diagnostics and change windows may still be required per the Scheduled Maintenance requirements.
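As an illustration of the tenant-assisted diagnostics described above, the following is a minimal Python sketch of how a tenant might run a set of commands on a bare metal node and collect the output to share with support. The policy does not name specific diagnostic commands, so the command list, the function names `collect_diagnostics` and `format_report`, and the report layout are all assumptions for illustration; the actual commands would be supplied by Denvr for a given incident.

```python
import shutil
import subprocess

# Hypothetical command set: the policy does not specify which diagnostics
# are requested. Replace with the commands supplied by support.
DEFAULT_COMMANDS = [
    ["nvidia-smi", "-q"],  # full GPU state query (needs the NVIDIA driver)
    ["uname", "-a"],       # kernel and host information
]


def collect_diagnostics(commands=DEFAULT_COMMANDS, timeout=60):
    """Run each command and return a list of result dicts.

    A command that is not installed, or that times out, is recorded
    rather than raising, so one failure does not hide other output.
    """
    results = []
    for cmd in commands:
        entry = {"command": " ".join(cmd), "returncode": None, "output": ""}
        if shutil.which(cmd[0]) is None:
            entry["output"] = "command not found"
        else:
            try:
                proc = subprocess.run(
                    cmd, capture_output=True, text=True, timeout=timeout
                )
                entry["returncode"] = proc.returncode
                entry["output"] = proc.stdout + proc.stderr
            except subprocess.TimeoutExpired:
                entry["output"] = f"timed out after {timeout}s"
        results.append(entry)
    return results


def format_report(results):
    """Render the collected results as plain text for sharing."""
    sections = []
    for r in results:
        header = f"$ {r['command']} (exit code: {r['returncode']})"
        sections.append(header + "\n" + r["output"].rstrip())
    return "\n\n".join(sections)
```

Calling `print(format_report(collect_diagnostics()))` on the node would produce a single text report. Review the output before sharing it, since diagnostic commands can include hostnames or other environment details.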