OLCF and Providentia develop an intelligence system

14 August 2019

The Oak Ridge Leadership Computing Facility (OLCF) and technology consulting company Providentia Worldwide LLC recently collaborated to develop an intelligence system that combines real-time updates from the IBM AC922 Summit supercomputer with local weather and operational data from its adjacent cooling plant, with the goal of optimizing Summit’s energy efficiency. The OLCF proposed the idea and provided facility data, and Providentia developed a scalable platform to integrate and analyze the data.

On each Summit node, IBM’s baseboard management controller (OpenBMC) provides real-time readings from dozens of sensors on Summit’s Power9 processors and NVIDIA GPUs. Across the entire supercomputer, these readings total more than 460,000 metrics per second describing power consumption, temperature, and performance. Although these data streams were not designed for controlling Summit’s cooling system, Rogers recognized early on that they could inform Summit’s cooling operations.

Providentia built a framework to pull from four main data sources: per-second sensor data (from the OpenBMC boards on each Summit node), job data at 15-second intervals, data from the cooling plant’s programmable logic controller, and local weather data from the National Oceanic and Atmospheric Administration.
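
Because the sensor and job streams arrive at different rates, any integration of them has to put the readings on a common time base. The following is a minimal Python sketch of one way that could be done, not Providentia's actual implementation: it averages per-second power readings into 15-second buckets and joins them with job samples. The function names, the tuple layouts, and the choice of a simple mean are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

JOB_INTERVAL_S = 15  # job data interval stated in the article


def bucket_start(ts, interval=JOB_INTERVAL_S):
    """Round a timestamp (in seconds) down to the start of its bucket."""
    return ts - (ts % interval)


def aggregate_sensor_power(readings, interval=JOB_INTERVAL_S):
    """Average per-second readings into fixed-width time buckets.

    readings: iterable of (timestamp_s, node, watts) tuples.
    This record layout is hypothetical, not the OpenBMC format.
    Returns a dict mapping (bucket_start, node) to mean watts.
    """
    grouped = defaultdict(list)
    for ts, node, watts in readings:
        grouped[(bucket_start(ts, interval), node)].append(watts)
    return {key: mean(values) for key, values in grouped.items()}


def join_with_jobs(power_by_bucket, job_samples, interval=JOB_INTERVAL_S):
    """Attach mean power to each job sample that has a matching bucket.

    job_samples: iterable of (timestamp_s, node, job_id) tuples
    (again a hypothetical layout).
    """
    joined = []
    for ts, node, job_id in job_samples:
        key = (bucket_start(ts, interval), node)
        if key in power_by_bucket:
            joined.append({
                "bucket": key[0],
                "node": node,
                "job_id": job_id,
                "mean_watts": power_by_bucket[key],
            })
    return joined


# Made-up sample data for a single node.
readings = [(0, "n1", 100.0), (1, "n1", 200.0), (15, "n1", 300.0)]
jobs = [(0, "n1", "jobA"), (15, "n1", "jobB"), (30, "n1", "jobC")]
power = aggregate_sensor_power(readings)
joined = join_with_jobs(power, jobs)
```

In this sketch, the job sample at 30 seconds is dropped because no sensor readings fall in its bucket. A real pipeline would also need to handle late or missing data, and would fold in the cooling plant and weather feeds in a similar way.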