Research Topic

Power-Efficient Scalable Manycore Architectures

Current Researchers: Dr. Hao Zheng and Dr. Ke Wang

Over the last decade, Moore’s Law has slowed, while Dennard Scaling has ended. The end of voltage scaling has made power dissipation the fundamental barrier to scaling computing performance on all platforms, from mobile devices to embedded systems, laptops, servers, and datacenters.


This challenge, often called the power wall, affects every class of platform. To address it, recent research has proposed various low-power techniques. Power-gating, for example, is an effective technique that powers off under-utilized components to reduce static power consumption.

Dynamic voltage and frequency scaling (DVFS) is another technique that saves power by dynamically adjusting voltage and frequency to match the application load. Using several low-power techniques in one system can save more power, but it also creates problems. For example, techniques that are employed concurrently can conflict with each other when they make decisions at inappropriate times, and these conflicts can degrade both performance and power savings.
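To make the two savings mechanisms concrete, the sketch below uses the standard first-order CMOS power model: dynamic power scales with activity × capacitance × V² × f, and leakage power disappears when a block is power-gated. The function names and all numeric values are illustrative assumptions, not measurements from our designs.

```python
def dynamic_power(activity, capacitance, voltage, frequency):
    """First-order CMOS switching power: P = a * C * V^2 * f."""
    return activity * capacitance * voltage ** 2 * frequency


def static_power(leakage_current, voltage, powered_on=True):
    """Leakage power; a power-gated block is modeled as drawing none."""
    return leakage_current * voltage if powered_on else 0.0


# Illustrative (made-up) operating points for a single router.
nominal = dynamic_power(0.5, 1e-9, 1.0, 2e9)
scaled = dynamic_power(0.5, 1e-9, 0.5, 1e9)  # half the voltage, half the frequency

# DVFS targets the dynamic term; power-gating targets the static term.
print("dynamic power ratio after scaling:", scaled / nominal)
print("static power when gated:", static_power(0.01, 1.0, powered_on=False))
```

Because voltage enters quadratically, halving both voltage and frequency cuts the modeled dynamic power to roughly one eighth, while power-gating removes only the leakage term. The two techniques therefore act on different parts of the power budget, which is why combining them is attractive.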

In our research, we combine various power-saving techniques while avoiding their shortcomings. Combining different techniques leads to an explosion of the design space, so we further explore the use of machine learning to optimize the combined system.

01.

H. Zheng and A. Louri, “Agile: A Learning-Enabled Power and Performance-Efficient Network-on-Chip Design”, IEEE Transactions on Emerging Topics in Computing, 10.1 (2022): 223-236.

A number of techniques to achieve power-efficient Network-on-Chips (NoCs) have been proposed, two of which are power-gating and dynamic voltage and frequency scaling (DVFS). Power-gating reduces static power, and DVFS reduces dynamic power. With the goal of reducing both static and dynamic power, it is intuitive to simultaneously deploy both techniques. However, we observe that the straightforward combination of power-gating and DVFS can result in reduced power benefits and degraded performance. In this project, we uniquely combine power-gating and DVFS with the aim of maximizing the NoC power savings and improving performance. The proposed NoC design, called Agile, consists of several architectural designs and a reinforcement learning (RL) based control policy to mitigate the negative effects induced by the combined power-gating and DVFS. Specifically, a simple bypass switch is deployed to maintain network connectivity, avoiding frequent wake-ups of powered-off routers. An optimized pipeline simplifies the pipeline stages of the bypass switch to reduce network latency. Reversible link channel buffers can be dynamically allocated to where they are needed to improve throughput. In addition, the RL control policy predicts NoC traffic and selects optimal power-gating decisions, voltage/frequency levels, and NoC architecture configurations at runtime. Furthermore, we explore the use of an artificial neural network (ANN) to efficiently reduce the area overhead of implementing RL. We evaluate our design using the PARSEC benchmark suite. The full system simulation results show that the proposed design improves the overall power savings by up to 58 percent while improving performance by up to 11 percent as compared to state-of-the-art designs. The ANN-based RL implementation and bypass switch incur a nominal area overhead of 5 percent, as compared to a conventional router.
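As a rough illustration of what an RL-based control policy involves, the sketch below shows generic tabular Q-learning with an epsilon-greedy action choice. It is not the Agile controller: the action names, the discretized traffic states, and the reward value are hypothetical placeholders chosen only to show the update rule.

```python
import random

# Hypothetical per-router actions; the real design's action set may differ.
ACTIONS = ["gate_off", "low_vf", "high_vf"]


def make_q_table(num_states):
    """One row per discretized traffic state, one column per action."""
    return [[0.0] * len(ACTIONS) for _ in range(num_states)]


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q_table[state]
    return row.index(max(row))


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning step toward reward + discounted best next value."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])


# One control epoch with made-up numbers: observe state 0, act, see a reward.
q = make_q_table(num_states=4)
rng = random.Random(0)
a = choose_action(q, state=0, epsilon=0.1, rng=rng)
q_update(q, state=0, action=a, reward=1.0, next_state=1)
```

In a setting like the one described above, the reward would typically combine power and performance measurements, and the table grows with the number of states and actions, which is the storage cost that an ANN approximation aims to reduce.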

02.

H. Zheng and A. Louri, “An Energy-Efficient Network-on-Chip Design using Reinforcement Learning”, in Proceedings of 56th Design Automation Conference (DAC’19), Las Vegas, NV, June 2-6, 2019.

The design space for energy-efficient Network-on-Chips (NoCs) has expanded significantly, comprising a number of techniques. The simultaneous application of these techniques to yield maximum energy efficiency requires the monitoring of a large number of system parameters, which often results in substantial engineering efforts and complicated control policies. This motivates us to explore the use of a reinforcement learning (RL) approach that automatically learns an optimal control policy to improve NoC energy efficiency. First, we deploy power-gating (PG) and dynamic voltage and frequency scaling (DVFS) to simultaneously reduce both static and dynamic power. Second, we use RL to automatically explore the dynamic interactions among PG, DVFS, and system parameters, learn the critical system parameters contained in the router and cache, and eventually evolve optimal per-router control policies that significantly improve energy efficiency. Moreover, we introduce an artificial neural network (ANN) to efficiently implement the large state-action table required by RL. Simulation results using the PARSEC benchmark suite show that the proposed RL approach reduces power consumption by 26%, while improving system performance by 7%, as compared to a combined PG and DVFS design without RL. Additionally, the ANN design yields a 67% area reduction, as compared to a conventional RL implementation.

03.

Q. Fettes, M. Clark, R. Bunescu, A. Karanth, and A. Louri, “Dynamic Voltage and Frequency Scaling in NoCs with Supervised and Reinforcement Learning Techniques,” in IEEE Transactions on Computers (TC), Volume 68, Issue 3, pp. 375-389, March 2019.

Network-on-Chips (NoCs) are the de facto choice for designing the interconnect fabric in multicore chips due to their regularity, efficiency, simplicity, and scalability. However, NoCs suffer from excessive static power and dynamic energy due to transistor leakage current and data movement between the cores and caches. Power consumption issues are only exacerbated by ever-decreasing technology sizes. Dynamic Voltage and Frequency Scaling (DVFS) is one technique that seeks to reduce dynamic energy; however, this often occurs at the expense of performance. In this paper, we propose LEAD (Learning-enabled Energy-Aware Dynamic voltage/frequency scaling) for multicore architectures using both supervised learning and reinforcement learning approaches. LEAD groups the router and its outgoing links into the same V/F domain and implements proactive DVFS mode management strategies that rely on offline-trained machine learning models in order to provide optimal V/F mode selection between different voltage/frequency pairs. We present three supervised learning versions of LEAD that are based on buffer utilization, change in buffer utilization, and change in energy/throughput, which allow proactive mode selection based on accurate prediction of future network parameters. We then describe a reinforcement learning approach to LEAD that optimizes the DVFS mode selection directly, obviating the need for label and threshold engineering. Simulation results using PARSEC and Splash-2 benchmarks on a 4 × 4 concentrated mesh architecture show that by using supervised learning LEAD can achieve an average dynamic energy savings of 15.4 percent for a loss in throughput of 0.8 percent with no significant impact on latency. When reinforcement learning is used, LEAD increases average dynamic energy savings to 20.3 percent at the cost of a 1.5 percent decrease in throughput and a 1.7 percent increase in latency. Overall, the more flexible reinforcement learning approach enables learning an optimal behavior for a wider range of load environments under any desired energy versus throughput tradeoff.
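The supervised flavor of proactive mode selection can be sketched as two steps: fit a model offline that predicts next-epoch buffer utilization, then map the prediction to a V/F mode through thresholds. The sketch below is only an illustration under stated assumptions: the single-feature least-squares model, the threshold values, the number of modes, and the training pairs are all hypothetical and are not taken from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def select_mode(predicted_utilization, thresholds=(0.1, 0.3, 0.5)):
    """Map predicted buffer utilization to a V/F mode index (0 = lowest)."""
    mode = 0
    for limit in thresholds:
        if predicted_utilization >= limit:
            mode += 1
    return mode


# Offline "training" on made-up (current, next-epoch) utilization pairs.
current = [0.0, 0.2, 0.4, 0.6]
following = [0.05, 0.25, 0.45, 0.65]
slope, intercept = fit_line(current, following)

# Proactive selection for the coming epoch from the current measurement.
mode = select_mode(slope * 0.3 + intercept)
```

The thresholds in `select_mode` are exactly the kind of hand-tuned parameter that the reinforcement learning variant avoids, since it optimizes the mode choice directly from a reward signal.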

04.

M. Clark, A. Kodi, R. Bunescu and A. Louri, “LEAD: Learning-enabled Energy-Aware Dynamic Voltage/Frequency Scaling in NoCs,” in Proceedings of the 55th Design Automation Conference (DAC), San Francisco, CA, 2018.

Network-on-Chips (NoCs) are the interconnect fabric of choice for multicore processors due to their superiority over traditional buses and crossbars in terms of scalability. While NoCs offer several advantages, they still suffer from high static and dynamic power consumption. Dynamic Voltage and Frequency Scaling (DVFS) is a popular technique that allows dynamic energy to be saved, but it can potentially lead to loss in throughput. In this paper, we propose LEAD (Learning-enabled Energy-Aware Dynamic voltage/frequency scaling) for NoC architectures, wherein we use machine learning techniques to enable energy-performance trade-offs at reduced overhead cost. LEAD enables a proactive energy management strategy that relies on an offline-trained regression model and provides a wide variety of voltage/frequency pairs (modes). LEAD groups each router and the router’s outgoing links locally into the same V/F domain, allowing energy management at a finer granularity without additional timing complications and overhead. Our simulation results using PARSEC and Splash-2 benchmarks on a 4 × 4 concentrated mesh architecture show an average dynamic energy savings of 17% with a minimal loss of 4% in throughput and no latency increase.

05.

H. Zheng and A. Louri, “EZ-Pass: An Energy & Performance-Efficient Power-gating Router Architecture for Scalable NoCs,” in IEEE Computer Architecture Letters, vol. PP, no. 99, Dec. 2017.

With technology scaling into the nanometer regime, static power is becoming the dominant factor in the overall power consumption of Network-on-Chips (NoCs). Static power can be reduced by powering off routers during consecutive idle time through power-gating techniques. However, power-gating techniques suffer from a large wake-up latency to wake up the powered-off routers. Recent research aims to reduce the wake-up latency penalty by hiding it through early wake-up techniques. However, because they wake routers early, these techniques do not exploit the full advantage of power-gating. Consequently, they do not achieve significant power savings. In this paper, we propose an architecture called Easy Pass (EZ-Pass) router that remedies the large wake-up latency overheads while providing significant static power savings. The proposed architecture takes advantage of idle resources in the network interface to transmit packets without waking up the router. Additionally, the technique hides the wake-up latency by continuing to provide packet transmission during the wake-up phase. We use full system simulation to evaluate our EZ-Pass router on a 64-core NoC with a mesh topology using the PARSEC benchmark suite. Our results show that the proposed router reduces static power by up to 31% and overall network latency by up to 32% as compared to early-wakeup optimized power-gating techniques.
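The trade-off behind power-gating can be illustrated with a common textbook break-even model: switching a router off and on costs some energy, so gating only pays off when the idle period is long enough for the saved leakage to cover that cost. This sketch is an assumption-laden illustration, not the EZ-Pass mechanism. The function names, the per-cycle energy figure, and the overhead value are hypothetical, and the model covers only the energy side, not the wake-up latency that EZ-Pass addresses.

```python
import math


def net_energy_saved(idle_cycles, static_energy_per_cycle, gating_overhead):
    """Leakage energy saved over one idle period, minus the on/off overhead."""
    return idle_cycles * static_energy_per_cycle - gating_overhead


def break_even_cycles(static_energy_per_cycle, gating_overhead):
    """Shortest idle period (in cycles) for which gating does not lose energy."""
    return math.ceil(gating_overhead / static_energy_per_cycle)


def should_gate(predicted_idle_cycles, static_energy_per_cycle, gating_overhead):
    """Gate only when the expected idle period reaches the break-even point."""
    threshold = break_even_cycles(static_energy_per_cycle, gating_overhead)
    return predicted_idle_cycles >= threshold


# Made-up numbers in arbitrary energy units.
print(break_even_cycles(2.0, 20.0))
print(should_gate(5, 2.0, 20.0), should_gate(50, 2.0, 20.0))
```

Under this model, waking a router early shortens the effective gated period and so reduces the savings, which is consistent with the limitation of early wake-up techniques noted above.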

06.

D. DiTomaso, A. Sikder, A. Kodi and A. Louri, “Machine learning enabled power-aware Network-on-Chip design,” in Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE), Lausanne, 2017, pp. 1354-1359.

Although Network-on-Chips (NoCs) are fast becoming pervasive as the interconnect fabric for multicore architectures and systems-on-chips, they still suffer from excessive static and dynamic power consumption. High dynamic power consumption results from switching and storing data within routers/links, while excess static power is consumed when routers and links are not utilized for communication and yet have to be powered up. In this paper, we propose LESSON (Learning Enabled Sleepy Storage Links and Routers in NoCs) to reduce both static and dynamic power consumption by power-gating the links and routers at low network utilization and moving the data storage from within the routers to the links at high network utilization. As the network utilization increases from low to high, to accommodate more traffic, we design the same channels to flow traffic in either direction, thereby avoiding complex routing or look-ahead wake-up algorithms. Machine learning algorithms predict when to power-gate the channels and routers and when to increase the channel bandwidths such that power savings are maximized while performance penalty is minimized. Our results show that we can improve total network power consumption when compared to conventional NoC buffer designs by 85.6% and when compared with aggressive NoC buffer designs by 31.7%. Our predictor shows marginal performance penalties and by dynamically changing the direction of the links, we can improve packet latency by 14%.

HPCAT Lab
High Performance Computing Architectures & Technologies Lab

Department of Electrical and Computer Engineering
School of Engineering and Applied Science
The George Washington University


800 22nd Street NW
Washington, DC 20052
United States of America 

Contact

Ahmed Louri, IEEE Fellow
David and Marilyn Karlgaard Endowed Chair Professor of ECE
Director,  HPCAT Lab 


Email: louri@gwu.edu                    
Phone: +1 (202) 994 8241