A Taxonomy of GPGPU Performance Scaling


Published in the Proceedings of the 2015 IEEE International Symposium on Workload Characterization (IISWC 2015), October 2015 (acceptance rate: 29/61 ≈ 48%)


Abhinandan Majumdar, Gene Wu, Kapil Dev, Joseph L. Greathouse, Indrani Paul, Wei Huang, Arjun Karthik Venugopal, Leonardo Piga, Chip Freitag, Sooraj Puthoor


Graphics processing units (GPUs) range from small, embedded designs to large, high-powered discrete cards. While the performance of graphics workloads is generally understood, there has been little study of the performance of GPGPU applications across a variety of hardware configurations. This work presents performance scaling data gathered for 267 GPGPU kernels from 97 programs run on 891 hardware configurations of a modern GPU. We study the performance of these kernels across a 5x change in core frequency, 8.3x change in memory bandwidth, and 11x difference in compute units. We illustrate that many kernels scale in intuitive ways, such as those that scale directly with added computational capabilities or memory bandwidth. We also find a number of kernels that scale in nonobvious ways, such as losing performance when more processing units are added or plateauing as frequency and bandwidth are increased. In addition, we show that a number of current benchmark suites do not scale to modern GPU sizes, implying that either new benchmarks or new inputs are warranted.




Copyright © 2015 IEEE. Hosted on this personal website as per IEEE policy.