Test challenges in calibrating power for server designs

Article by: Aik-Moh Ng and Lauren Getz, Teradyne

An insatiable appetite for powerful processors in high-compute servers drives an increase in MOSFETs, calling for precise power measurements.

The global pandemic has accelerated the adoption of emerging semiconductor technologies to meet market demands, which has enabled companies with superior technology to outperform their competition. More than 50% of companies will need to build new digital businesses to stay economically viable, and recovery from the pandemic will involve permanent changes to many dimensions of an organization, including the pace at which it conducts business, its core value proposition, and its talent.

With digital and technology-driven disruptions creating a winner-takes-all dynamic in an expanding number of industries, only a subset of organizations is likely to thrive. In today’s competitive semiconductor market, where top companies have robust and expansive technology portfolios that are always evolving, a strong technology foundation is critical for success. The time is now for these companies to make bold and innovative investments in advanced technology and digital capabilities.

Effects of pandemic on digital ecosystem

The pandemic has amplified the need for technology growth and has encouraged innovation across the entire digital ecosystem, from big data and artificial intelligence (AI) to cloud computing and the Internet of Things (IoT). Traditional brick-and-mortar retail companies have embraced technology to remain relevant and meet the demands of tech-savvy consumers. Big data has facilitated the digitization of various industries and exponential growth in e-commerce. Additionally, with travel restrictions imposed nearly worldwide, work from home became ubiquitous, resulting in an unanticipated surge in the uptake of cloud gaming and high-performance computing (HPC) as a service.

Figure 1: The above chart shows the compound annual growth rate for server sales. Source: Teradyne

The server market supporting AI hyperscalers is expected to grow by 50% year over year through 2025, while cloud gaming is expected to grow at an astonishing 72% CAGR through 2025. However, super-high-performance systems have especially challenging power requirements, drawing up to 10.2 kW per server. The emerging class of exascale high-performance computers (computing systems capable of at least 10^18 IEEE 754 double-precision operations per second) is arriving alongside trillion-parameter AI models for tasks such as accurate conversational AI, which require months to train even with the processing power of today’s supercomputers.

Figure 2: In a typical server architecture, all processing components require power. Source: Teradyne

Power management challenges due to higher demand for computing power

As computing power increases, transistor count per die increases in tandem. Although process node sizes have shrunk over the years, die size is increasing as transistor counts double every 18 months. As a result, the board real estate left for power management devices shrinks to accommodate larger processors. Rising current draw, coupled with less board area for the silicon-based MOSFETs that supply current to the processors, creates an interesting power management challenge.

Figure 3: The above data highlights battling current requirements and available design space. Source: Teradyne

An insatiable appetite for higher power processors, for applications such as AI training servers, drives a substantial increase in MOSFET drivers. To keep heat generation as low as possible and maximize energy efficiency, these devices are designed with low RDS(on) to deliver hundreds of amps of current to the processors they are powering. However, high-volume MOSFET drivers with extremely low RDS(on), measuring less than 1 mΩ, create challenges for semiconductor test.

Test challenges for achieving precise measurements

Testing semiconductors prior to installation in the final application is critical to ensure devices meet specified requirements for the lifetime of their use. Sustaining a competitive cost of test (COT) while providing complete test coverage requires precision high-power instrumentation that operates accurately and efficiently.

Measuring precision voltages across the 1 mΩ on-resistance of the MOSFET driver requires tens of amps of current to flow through the device. High-bandwidth, high-precision power instruments from automated test equipment (ATE) can measure RDS(on) efficiently and accurately. However, parasitic resistance from the device interface board (DIB) and the contact resistance of the device test socket, which together can measure up to 50 times the MOSFET driver’s RDS(on), push the boundaries of maintaining optimal utilization of the test cell.
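To give a sense of scale, the short sketch below applies Ohm's law to the figures in the preceding paragraph. The 20 A force current is an assumed value standing in for "tens of amps", and the parasitic path is taken at the quoted upper bound of 50 times RDS(on); neither number describes a specific test setup.

```python
# Illustrative arithmetic only; the force current is an assumed value.
R_DSON_OHMS = 1e-3        # 1 mOhm on-resistance, as quoted in the text
PARASITIC_RATIO = 50      # path + contact resistance, up to 50x RDS(on)
FORCE_CURRENT_A = 20.0    # assumption standing in for "tens of amps"


def voltage_drop(current_a, resistance_ohms):
    """Ohm's law: V = I * R."""
    return current_a * resistance_ohms


v_device = voltage_drop(FORCE_CURRENT_A, R_DSON_OHMS)
v_parasitic = voltage_drop(FORCE_CURRENT_A, R_DSON_OHMS * PARASITIC_RATIO)

print(f"Signal across the device:      {v_device * 1e3:.1f} mV")
print(f"Drop across path and contacts: {v_parasitic * 1e3:.1f} mV")
```

Under these assumptions the voltage of interest is 50 times smaller than the drop across the interconnect, which is why the measurement is so sensitive to how the path and contacts are handled.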

Additionally, high-current pulsing can cause magnetic coupling into adjacent traces, compromising measured values at adjacent sites in high-parallelism solutions. Unfortunately, the industry practice of shielding or closely coupling high-current traces is not viable when dealing with current-induced magnetic coupling. To address this challenge, the high-current traces must be laid out as broadside differential pairs to optimize the magnetic field cancelation. Contact wear is a further concern: pulsing substantial current through high contact resistance generates excessive heat and damages contact pins over time.

Precise RDS(on) measurements can be achieved by meticulously designing onboard circuitry that supports application calibration to eliminate path and contact resistance. The onboard circuit has a secondary function: ensuring the safe operation of all instruments deployed. Optimal throughput can then be achieved by maintaining high equipment efficiency in an ideal test environment.
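The article does not describe how the calibration is implemented. As a rough mental model only, the sketch below assumes the path and contact resistance is characterized once against a reference of known resistance and then subtracted from each device reading. The function names, the shorting reference and all voltages are hypothetical.

```python
# Hypothetical model of offset calibration; not the authors' circuit or method.

def measured_resistance(v_measured, i_forced):
    """Total resistance seen by the instrument: R = V / I."""
    return v_measured / i_forced


def calibrate_path(v_ref, i_forced, r_ref_known):
    """Estimate path + contact resistance using a reference of known resistance."""
    return measured_resistance(v_ref, i_forced) - r_ref_known


def corrected_rdson(v_dut, i_forced, r_path):
    """Remove the calibrated path resistance from a device reading."""
    return measured_resistance(v_dut, i_forced) - r_path


I_FORCED_A = 20.0                                    # assumed force current
r_path = calibrate_path(v_ref=1.0, i_forced=I_FORCED_A, r_ref_known=0.0)
rdson = corrected_rdson(v_dut=1.02, i_forced=I_FORCED_A, r_path=r_path)

print(f"Calibrated path resistance: {r_path * 1e3:.1f} mOhm")
print(f"Corrected RDS(on):          {rdson * 1e3:.3f} mOhm")
```

A simple subtraction like this is only as good as the repeatability of the contact resistance between insertions, which is one reason a contact check before each measurement is useful.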

New power instruments with improved bandwidth, coupled with innovative test techniques, help to prolong contact pin lifespan by shortening the pulse width and incorporating an ultra-efficient contact resistance check before each test execution. Increasing the power instrument’s bandwidth delivers faster di/dt, translating to shorter test times and the possibility of increasing site counts, resulting in higher overall throughput. Prolonging contact pin lifespan also reduces spending on consumables.
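The link between pulse width and pin wear follows from the energy dissipated in the contact during each pulse, E = I²Rt for a rectangular pulse. The sketch below compares two pulse widths; the 20 A current, the 50 mΩ contact resistance and both pulse widths are assumed values chosen for illustration, not figures from a specific instrument.

```python
# Illustrative arithmetic only; all numeric inputs are assumed values.
PULSE_CURRENT_A = 20.0            # assumption standing in for "tens of amps"
CONTACT_RESISTANCE_OHMS = 0.05    # assumption: 50 mOhm at the contact


def pulse_energy_joules(current_a, resistance_ohms, pulse_width_s):
    """Energy dissipated by one rectangular current pulse: E = I^2 * R * t."""
    return current_a ** 2 * resistance_ohms * pulse_width_s


long_pulse = pulse_energy_joules(PULSE_CURRENT_A, CONTACT_RESISTANCE_OHMS, 1e-3)
short_pulse = pulse_energy_joules(PULSE_CURRENT_A, CONTACT_RESISTANCE_OHMS, 100e-6)

print(f"1 ms pulse:   {long_pulse * 1e3:.1f} mJ per pulse")
print(f"100 us pulse: {short_pulse * 1e3:.1f} mJ per pulse")
```

Because energy scales linearly with pulse width, a tenfold shorter pulse deposits one tenth of the heat in the contact for the same current.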

Boosting energy efficiency with new materials like GaN

As deep-learning AI becomes more pervasive, insatiable demand for computing power will follow, and the supporting power management semiconductors will experience intense growth. Meanwhile, the carbon footprint of data centers is attracting attention, and regulatory policies are being enacted to ensure data centers use energy-efficient equipment. In 2019, data centers consumed about 2% of the world’s electricity, but this number is expected to rise to as much as 8% by 2030.

The efficiency of MOSFETs typically tops out at 95%. To meet the growing energy consumption of data centers, new materials and processes such as gallium nitride (GaN) are being developed to address the shortcomings of traditional semiconductor materials. With higher efficiency and switching frequencies, GaN power supplies deliver more power than their silicon-based predecessors with a similar footprint. “Turbocharged” GaN power supplies delivering higher power on a similar footprint could increase overall server density by up to 56% on existing racks.
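To see why a few points of efficiency matter at server scale, the sketch below estimates the power lost in a conversion stage that delivers the 10.2 kW per-server figure quoted earlier. Treating the 95% figure as the efficiency of a single conversion stage is a simplification, and the 98% comparison point is a hypothetical value for illustration, not a GaN specification from this article.

```python
# Illustrative arithmetic only; the 98% comparison point is hypothetical.
SERVER_LOAD_W = 10_200.0   # 10.2 kW per server, as quoted in the text


def conversion_loss_watts(output_power_w, efficiency):
    """Power dissipated by a stage delivering output_power_w at a given efficiency."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    input_power_w = output_power_w / efficiency
    return input_power_w - output_power_w


loss_at_95 = conversion_loss_watts(SERVER_LOAD_W, 0.95)
loss_at_98 = conversion_loss_watts(SERVER_LOAD_W, 0.98)   # hypothetical

print(f"Loss at 95% efficiency: {loss_at_95:.0f} W")
print(f"Loss at 98% efficiency: {loss_at_98:.0f} W")
```

Every watt lost in conversion also has to be removed as heat, so efficiency gains reduce both the electricity bill and the cooling load.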

Figure 4: The above table compares material properties of silicon and gallium nitride. Source: Teradyne

Gallium nitride power supplies deliver three benefits compared to silicon-based power supplies. First, existing data centers can increase their data density. Second, more efficient power supplies translate into lower operating costs. Finally, the data center can reduce its CO2 emissions as part of the global goal to achieve net-zero emissions by the year 2050.

The primary industry challenge with GaN transistors is their high dynamic on-resistance, which is difficult to measure when the devices are switched at the required high frequencies. Top test equipment manufacturers are working to develop the precision instrumentation required to guarantee GaN RDS(on) specifications. Soon, GaN will replace silicon as the preferred material technology for delivering power, but until GaN’s hard-switching dynamic on-resistance can be measured consistently and accurately, silicon-based power devices will continue to be used.

As demand for high-performance computing applications, often delivered as a service, increases, semiconductor companies must rise to the challenge to remain competitive by adopting new technologies and processes. Advances in power management and new materials like GaN will ensure the technology is able to keep up with the applications driving it. However, with these new technologies come a number of challenges, both for manufacturing and test. Those that can be nimble enough to adapt will find success in these new and emerging markets.

This article was originally published on EDN.

Aik-Moh Ng is a product manager for analog power test products at Teradyne.

Lauren Getz is a product manager for analog power test products at Teradyne.
