Virtana Teams with NVIDIA to Boost AI Factory Observability

Virtana, a global leader in AI Factory observability, announced a strategic partnership with NVIDIA to enhance observability across enterprise AI Factories. By integrating Virtana’s observability platform with NVIDIA’s AI and accelerated computing technologies, the collaboration is designed to help IT teams manage distributed and complex AI environments with greater efficiency, reliability, and scalability.

As organizations move beyond AI experimentation into full-scale industrialization, the challenge of monitoring and optimizing performance across AI infrastructure layers is intensifying. Virtana, an NVIDIA Connect program member, delivers unified observability across on-premises, cloud, and containerized environments. Its platform now provides deeper visibility into NVIDIA GPU-accelerated infrastructure, enabling faster insights, automation, and performance optimization.

Industry analysts underscore the urgency of this shift. According to Gartner, “By 2029, 70% of large enterprises failing to effectively utilize AI factories will cease to exist.” The message is that AI success is no longer optional: enterprises must be infrastructure-ready to remain competitive.

Echoing this sentiment, NVIDIA’s founder and CEO, Jensen Huang, stated during his GTC 2025 keynote: “AI infrastructure must account for more than just raw performance, it must also consider energy consumption, physical space, and operational costs. Optimizing workloads to use only the compute resources truly necessary will be critical for scaling AI responsibly. Businesses will increasingly need to strike a balance between performance requirements and sustainability constraints if ‘AI everywhere’ is to become a reality.”

Paul Appleby, CEO and President of Virtana, emphasized the importance of this collaboration: “To accelerate Virtana’s mission to deliver AI Factory Observability, powered by AI, at industrial scale, our collaboration with NVIDIA is critical. By combining Virtana’s deep expertise in hybrid cloud performance with NVIDIA’s market-leading computing and AI capabilities, we’re empowering enterprises to improve application performance, accelerate root cause analysis, and reduce infrastructure costs. Our collaboration gives IT teams the intelligence they need to support AI-native workloads with confidence and efficiency.”

Driving AI-Optimized IT Operations

The partnership focuses on equipping IT teams with intelligent, real-time insights that accelerate decision-making, improve resource efficiency, and enhance application performance. With deeper observability into NVIDIA GPU-powered environments, enterprises can reduce mean time to resolution (MTTR), align infrastructure with cost and performance goals, and ensure readiness for AI-native applications.

Key capabilities of the Virtana Platform include:

  • Automated Topology Discovery: Machine learning-powered mapping of interdependencies across AI applications, GPUs, storage, and networks for real-time system visibility.

  • AI-Based Root Cause Analysis: Leveraging NVIDIA AI Enterprise to quickly pinpoint issues, minimizing downtime and service disruption.

  • Predictive Performance Management: Proactive issue resolution through predictive analytics from historical and real-time data.

  • Cost and Capacity Optimization: AI-driven forecasting that helps align GPU usage with business requirements while controlling costs.

  • Natural Language Query via Virtana Copilot: A generative AI assistant enabling non-technical users to query infrastructure data using simple natural language.

Expanding Observability for NVIDIA NIM

Virtana is also extending observability for NVIDIA NIM by integrating OpenTelemetry standards. This provides deep visibility into application performance, availability, and health, helping teams monitor, trace, and optimize AI workloads running on NIM.

Supporting Enterprise AI Deployments

Through this integration, enterprises will be able to:

  • Detect performance anomalies in real time.

  • Evaluate the infrastructure impact of AI workloads.

  • Plan for future resource requirements.

  • Minimize downtime with proactive monitoring.

Long-Term Collaboration

Virtana and NVIDIA are committed to ongoing collaboration to expand observability solutions for enterprise AI workloads. Future developments under consideration include:

  • Broader support for NVIDIA DGX systems.

  • Deeper integration with NIM for workload optimization and token cost management.

  • Joint development of observability best practices tailored for AI-driven enterprises.

This long-term alliance is set to equip enterprises with advanced observability tools and intelligence to optimize infrastructure, streamline AI operations, and scale innovation responsibly.