The Challenge and Opportunity of Reducing Power on AI Processors
By David Lammers
“For the edge, it is all about consuming the least power while optimizing for the required performance.”
Given the new reality of industry events going virtual during the pandemic, I have been able to log in to half a dozen semiconductor conferences so far this year. A recurring theme was silicon for artificial intelligence (AI) and deep learning (DL), an evolving field that spans a broad spectrum of technologies and device types. One common thread running through them was a focus on memory optimization and solving the power/memory bottleneck.
AI is a hot market. ABI Research estimates that the overall AI silicon market will hit $21 billion in 2024. A surprisingly large fraction of that is held by ASIC-based AI accelerators, predicted to triple in value to a $9 billion total available market (TAM) by 2024, with a 30 percent compound annual growth rate (CAGR).
For both training and inference processing, companies are racking their brains trying to come up with power-saving solutions. While machine learning is only part of total data center power consumption, it is expanding rapidly. Data centers consumed about 3 percent of all power in the United States in 2017, and that doubled to 6 percent in 2020. The proliferation of intelligent edge devices is accelerating as well. According to market research firm IDC, over the next decade 125 billion “things” will be connected to the internet, and by then close to 60 zettabytes of data will be created, captured, copied, and consumed annually.
It is crystal clear that our industry faces a major challenge: how to deploy many intelligent devices at the edge, run inference on all of that data at the edge with very low power consumption, and manage, process, and train on exponentially growing data in the cloud, all while keeping energy under control.
AI Reference Package Evolving
Hiren Majmudar, vice president of the computing business unit at GLOBALFOUNDRIES, said “there is a power bottleneck in both inference and training” that plays well into GF’s technology offerings, both its FinFET-based 12LP (12nm FinFET) platform and 12LP+ solutions, as well as its fully depleted SOI-based planar 22FDXTM (22nm FD-SOI) platform.
The FinFET-based technology has power and cost advantages for AI processors, whether in the cloud or at the edge. The 12LP+ solution is capable of running AI cores at >1 GHz, and features a new low-voltage SRAM and a standard cell library capable of 0.55V operation. GF’s most advanced FinFET solution, 12LP+, moved into production this year and has a dual-work-function FET delivering up to 20 percent faster logic performance or up to 40 percent lower power compared to the 12LP base platform.
“Our customers have unique architectures that often depend on a limited set of standard cells,” he said. “We’ve worked hard on our DTCO (design technology co-optimization), and have developed an AI reference package, with a pre-packaged set of components to demonstrate the potential. Through a collaborative DTCO model, our customers can quickly bring their SoCs to market. The DTCO effort can include design analysis services based on the customer’s own architecture for optimized Performance, Power and Area (PPA).”
Optimal PPA looks different depending on the specific application, Majmudar said.
“All segments are cost conscious. For the cloud, it is about TOPS per Watt, getting the best performance at lowest power. For the edge, it is all about lowest cost and consuming the least power while optimizing for the required performance,” he said.
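TOPS per Watt is simply throughput divided by power drawn while sustaining it. As a minimal sketch, with purely illustrative numbers (not the specs of any GF-based chip):

```python
# TOPS/W = sustained throughput / power consumed while sustaining it.
# All values below are hypothetical, for illustration only.
tops = 4.0          # tera-operations per second delivered by the accelerator
power_watts = 2.5   # power drawn under that load

efficiency = tops / power_watts  # 1.6 TOPS/W
print(f"{efficiency:.1f} TOPS/W")
```

Two chips with the same raw TOPS can differ sharply on this metric, which is why it, rather than peak throughput alone, drives cloud accelerator comparisons.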
The eMRAM offering for 22FDX has advantages for customers developing AI applications “looking for instant on, or always on,” Majmudar said. “There are many applications for eMRAM, with customers using it for better density and non-volatility. Another is analog compute in memory,” he added.
AI workloads are broad ranging, including voice, vision, and imaging, on top of the requirements for training and inference. “We are a very specialized foundry, constantly innovating our IP offerings. We continue to invest in IP, die-to-die interconnect, memory, and interface IP. We have a well-defined roadmap that we continue to improve with inputs from customers,” he said.
In future blogs I plan to detail how GF is working with startups in this field, but one of them deserves brief mention here just to provide a glimpse of how much innovation is going on among GF’s customers in AI silicon.
Fully depleted silicon-on-insulator platforms are well suited to dynamic voltage and frequency scaling (DVFS) and automatic clock gating. The result is ultra-low power consumption for signal processing and neural network algorithms running in battery-powered IoT devices.
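The reason DVFS pays off so well is the classic first-order model of dynamic CMOS power, P = αCV²f: because voltage enters squared, lowering supply voltage and frequency together cuts power much faster than it cuts performance. A minimal sketch, using made-up activity, capacitance, and frequency values (only the 0.55 V figure echoes the low-voltage operation mentioned above; nothing here is a measured GF number):

```python
# First-order dynamic power model for CMOS logic: P = alpha * C * V^2 * f.
# All numbers are illustrative assumptions, not platform specifications.

def dynamic_power(alpha: float, c_farads: float, v_volts: float, f_hertz: float) -> float:
    """Dynamic power in watts from activity factor, switched capacitance,
    supply voltage, and clock frequency."""
    return alpha * c_farads * v_volts**2 * f_hertz

# Hypothetical nominal operating point: 0.8 V at 1 GHz.
p_nominal = dynamic_power(alpha=0.2, c_farads=1e-9, v_volts=0.8, f_hertz=1e9)

# DVFS-scaled point: drop to 0.55 V and halve the clock.
p_scaled = dynamic_power(alpha=0.2, c_farads=1e-9, v_volts=0.55, f_hertz=0.5e9)

print(f"nominal: {p_nominal * 1e3:.1f} mW, scaled: {p_scaled * 1e3:.2f} mW")
print(f"scaled point uses {100 * p_scaled / p_nominal:.0f}% of nominal power")
```

Halving frequency alone would halve power; because the voltage drop is squared, the combined scaling lands below a quarter of nominal, which is the effect DVFS controllers and clock gating exploit in battery-powered parts.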
Perceive, a majority-owned subsidiary of Xperi Corp., is aimed at AI inference for sensor data in ultra-low-power consumer devices. Perceive’s “Ergo” edge inference processor is capable of processing large neural networks on the device with efficiency 20 to 100 times higher than seen on today’s inference-capable processors.
The company is focused on security cameras, smart appliances, and mobile devices with integrated neural network processing, eliminating the need to send data to the cloud for inference processing.
Please watch the short video below to hear Perceive CEO Steve Teig speak with GF SVP Mike Hogan about Perceive’s approach to AI and machine learning: