Curriculum Vitae
Updated July 3, 2024
Contact Information
Address: Joseph Lee Greathouse
Austin, TX USA
E-mail: joseph.l.greathouse@gmail.com
WWW: https://www.computermachines.org/
Research Interests
My work sits at the interface between computer hardware and software. This includes creating new software-visible hardware features in heterogeneous systems and optimizing software for novel hardware platforms.
My Ph.D. dissertation focused on hardware methods for accelerating software analyses such as data race detectors and memory checkers. I used existing hardware, like performance counters, for this in addition to designing new hardware mechanisms.
I have applied these interests and background towards AMD's Instinct accelerators and ROCm GPGPU software. I am a software architect and have helped design multiple hardware and software features for these products. Hardware examples include heterogeneous cache coherence protocols, virtual memory optimizations, new GPU performance monitoring infrastructure, multi-device power control algorithms, and RAS features. In the software domain, I have designed multiple user-visible software APIs that surface new hardware features, hardware-optimized algorithms for math libraries, and numerous workarounds for unexpected hardware behavior.
Professional Experience
- Advanced Micro Devices, Inc., Austin, TX USA
Fellow
July 2022 - Present
- I am a software architect in the performance engineering team in AMD's Adaptive, Embedded and AI Group. My work focuses on hardware/software interface topics for AMD's ROCm platform for general-purpose GPU computing.
- I am a software architect for our AMD Instinct products, including MI200, MI300, and future designs. I create high-performance sparse linear algebra algorithms for AMD GPUs, optimize our machine learning and high-performance software for power and performance, architect GPU software, firmware, and hardware solutions, write training materials, provide internal and external support on advanced technical topics, and gather customer technical requirements.
- My responsibilities cover nearly every stage in the lifetime of our products. I interface with customers, internal developers, and research teams to set advanced product development plans. I collaborate with hardware, firmware, and software architects to define these products. I work with multiple development and verification teams to define how these products will be built and tested. I provide deep technical support and debugging expertise during both pre- and post-silicon bringup; this includes designing software workarounds for hardware issues, as well as root-causing and providing fixes in RTL. After our hardware is in production, I design and optimize our software, provide customer training and support, and gather feedback to feed into the next generation of our products.
Principal Member of Technical Staff
July 2019 - June 2022
- As part of the performance engineering team in AMD's Radeon Technologies Group, this work focused on software, firmware, and hardware optimizations for AMD's ROCm software platform and Instinct accelerator hardware.
- Software architect for the AMD Instinct MI200 and MI300 programs. This included hardware feature design in areas such as cache coherence, virtual memory, performance monitoring, power control, and RAS. I also developed software interfaces for multiple new hardware features, created new user-level APIs, and delivered software and firmware workarounds for unexpected hardware behavior.
- I wrote hundreds of pages of documentation about AMD's hardware and software, created hundreds of slides of training material, and presented dozens of training sessions to large internal and external audiences.
- Beyond the HW/SW architecture role, I continued to create optimized sparse linear algebra algorithms for AMD GPUs and to optimize our machine learning and high-performance computing codes for power and energy usage.
Senior Member of Technical Staff
July 2016 - June 2019
- I started this position in AMD Research, leading a team of 10 engineers studying performance and power monitoring, estimation, and management mechanisms for CPUs and GPUs as part of AMD's PathForward exascale research program.
- We published research focusing on power and thermal management at HPCA 2017, ARCS 2017, ITherm 2018, ICCAD 2018, and MICRO 2019. The simulation tools we built were also used in a major HPCA 2017 industry track publication.
- This group also researched system and software topics for heterogeneous systems such as GPGPU buffer overflow protection (published at CGO 2017 and IWOCL 2018) and GPGPU system call overheads (published at IISWC 2018).
- I then moved to be a performance engineer in AMD's ROCm GPGPU software engineering team, where I worked on optimizing our GPU software, firmware, and hardware in order to meet the demands of our GPU compute customers.
- My work in this group included user support of our software stack, new hardware bring-up and software optimization for the Radeon Instinct MI60 and Instinct MI100 accelerators, and software feature development such as HIP Cooperative Groups and rocSPARSE algorithmic development.
Member of Technical Staff
July 2014 - June 2016
- For AMD's FastForward 2 research, a major focus was extending the high-level simulation work from the FastForward program. Details of some of our extended power and performance models can be found in our papers at HPCA 2015, IISWC 2015, and IISWC 2016.
- Beyond modeling and simulation work, I continued my GPGPU software work, including additional sparse linear algebra work described at HiPC 2015 and IWOCL 2015 and hash table algorithms published at USENIX ATC 2016.
- I also performed system administration tasks during this period of time, setting up clusters of computers running new HSA software on AMD heterogeneous processors for multiple government labs to use in their own research.
August 2012 - June 2014
- As part of AMD Research's FastForward research on exascale computing, my research broadly centered on creating a high-level performance and power simulator based on analytic scaling of real hardware measurements.
- This simulator was described in a ModSim 2013 paper. We invented multiple new CPU and GPU performance and power estimation algorithms for it, including one described at USENIX ATC 2014.
- This simulator was used as a major part of AMD Research's studies of heterogeneous CPU-GPU PIM systems, as described in MSPC 2013 and HPDC 2014.
- During this time, I also worked with exascale proxy applications to formulate new GPGPU algorithms for AMD GPUs and APUs, such as the GPU-based sparse matrix-vector multiplication algorithm published at SC14.
- University of Michigan, Ann Arbor, MI, USA
Research Assistant
May 2007 - August 2012
- Identified methods of distributing software analyses across many users to reduce slowdowns.
- Managed graduate and undergraduate students through development of prototype systems.
- Kelly Services / Intel Corp., Champaign, IL, USA
Research Contractor
May 2010 - October 2010
- Researched approaches for improving speed and accuracy of Intel Inspector XE data race detector.
- Utilized unique features of Intel processors to yield orders-of-magnitude performance gains for this tool; details can be found in our ISCA 2011 publication.
- International Business Machines Corp., Rochester, MN, USA
Speed Team Intern
May 2008 - August 2008
- Designed and constructed an InfiniBand compliance verification suite that caught numerous bugs.
- Added the suite into the IBM PowerVM I/O firmware development process and found multiple bugs.
Education
- University of Michigan, Ann Arbor
Ph.D., Computer Science and Engineering
May 2012
Thesis Topic: Hardware Mechanisms for Distributed Dynamic Software Analysis
Advisor: Professor Todd Austin
- University of Michigan, Ann Arbor
M.S.E. Computer Science and Engineering
May 2008
Concentration: Hardware Systems
GPA: 7.73/9.0 (3.79/4.0)
- University of Illinois at Urbana-Champaign
B.S. Computer Engineering
With Honors
May 2006
Minor: International Engineering – Japanese
GPA: 3.71/4.0
Conference Publications
- Alan Smith, Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Samuel Naffziger, Mike Mantor, Mark Fowler, Nathan Kalyanasundharam, Vamsi Alla, Nicholas Malaya, Joseph L. Greathouse, Eric Chapman, Raja Swaminathan, "Realizing the AMD Exascale Heterogeneous Processor Vision," Published in the Proceedings of the 51st International Symposium on Computer Architecture (ISCA 2024), June, 2024
- Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Vignesh Adhinarayanan, Shaizeen Aga, Derrick Aguren, Varun Agrawal, Ashwin M. Aji, John Alsop, Paul Bauman, Bradford M. Beckmann, Majed Valad Beigi, Sergey Blagodurov, Travis Boraten, Michael Boyer, William Brantley, Noel Chalmers, Shaoming Chen, Kevin Cheng, Michael L. Chu, David Cownie, Nicholas Curtis, Joris Del Pino, Nam Duong, Alexandru Dutu, Yasuko Eckert, Christopher Erb, Chip Freitag, Joseph L. Greathouse, Sudhanva Gurumurthi, Anthony Gutierrez, Khaled Hamidouche, Sachin Hossamani, Wei Huang, Mahzabeen Islam, Nuwan Jayasena, John Kalamatianos, Onur Kayiran, Jagadish Kotra, Alan Lee, Daniel Lowell, Niti Madan, Abhinandan Majumdar, Nicholas Malaya, Srilatha Manne, Susumu Mashimo, Damon McDougall, Elliott Mednick, Michael Mishkin, Mark Nutter, Indrani Paul, Matthew Poremba, Brandon Potter, Kishore Punniyamurthy, Sooraj Puthoor, Steven E. Raasch, Karthik Rao, Greg Rodgers, Marko Scrbak, Mohammad Seyedzadeh, John Slice, Vilas Sridharan, Rene van Oostrum, Eric van Tassell, Abhinav Vishnu, Samuel Wasmundt, Mark Wilkening, Noah Wolfe, Mark Wyse, Adithya Yalavarti, Dmitri Yudanov, "A Research Retrospective on AMD's Exascale Computing Journey," Published in the Proceedings of the 50th International Symposium on Computer Architecture (ISCA 2023), June, 2023
- Raghavendra Pradyumna Pothukuchi, Joseph L. Greathouse, Karthik Rao, Christopher Erb, Leonardo Piga, Petros Voulgaris, Josep Torrellas, "Tangram: Integrated Control of Heterogeneous Computers," Published in the Proceedings of the 52nd IEEE/ACM International Symposium on Microarchitecture (MICRO-52), October, 2019
- Joseph L. Greathouse, Gabriel H. Loh, "Machine Learning for Performance and Power Modeling of Heterogeneous Systems," Published in the Proceedings of the 2018 International Conference on Computer Aided Design (ICCAD 2018), November, 2018
- Arkaprava Basu, Joseph L. Greathouse, Guru Venkataramani, Ján Veselý, "Interference from GPU System Service Requests," Published in the Proceedings of the 2018 IEEE International Symposium on Workload Characterization (IISWC 2018), September, 2018 (Nominated for Best Paper)
- Xudong An, Manish Arora, Wei Huang, William C. Brantley, Joseph L. Greathouse, "3D Numerical Analysis of Two-Phase Immersion Cooling for Electronic Components," Published in the Proceedings of the 17th IEEE Intersociety Conference on Thermomechanical Phenomena in Electronic Systems (ITherm 2018), May, 2018
- Nicholas Malaya, Shuai Che, Joseph L. Greathouse, René van Oostrum, Michael J. Schulte, "Accelerating Matrix Processing with GPUs," Published in the Proceedings of the 24th IEEE Symposium on Computer Arithmetic (ARITH 24), July, 2017
- Marko Ščrbak, Joseph L. Greathouse, Nuwan Jayasena, Krishna Kavi, "DVFS Space Exploration in Power Constrained Processing-in-Memory Systems," Published in the Proceedings of the 30th International Conference on Architecture of Computing Systems (ARCS 2017), April, 2017
- Abhinandan Majumdar, Leonardo Piga, Indrani Paul, Joseph L. Greathouse, Wei Huang, David H. Albonesi, "Dynamic GPGPU Power Management using Adaptive Model Predictive Control," Published in the Proceedings of the 23rd IEEE Symposium on High Performance Computer Architecture (HPCA 2017), February, 2017
- Thiruvengadam Vijayaraghavan, Yasuko Eckert, Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Indrani Paul, Bradford M. Beckmann, Steven K. Reinhardt, William C. Brantley, Joseph L. Greathouse, Onur Kayiran, Matthew Poremba, Wei Huang, Arun Karunanithi, Greg Sadowski, Vilas Sridharan, Steven E. Raasch, Mitesh Meswani, "Design and Analysis of an APU for Exascale Computing," Published in the Proceedings of the 23rd IEEE Symposium on High Performance Computer Architecture (HPCA 2017 Industry Track), February, 2017
- Christopher Erb, Mike Collins, Joseph L. Greathouse, "Dynamic Buffer Overflow Detection for GPGPUs," Published in the Proceedings of the 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2017), February, 2017
- Vignesh Adhinarayanan, Indrani Paul, Joseph L. Greathouse, Wei Huang, Ashutosh Pattnaik, Wu-chun Feng, "Measuring and Modeling On-Chip Interconnect Power on Real Hardware," Published in the Proceedings of the 2016 IEEE International Symposium on Workload Characterization (IISWC 2016), September, 2016 (Awarded Best Paper)
- Alex D. Breslow, Dong Ping Zhang, Joseph L. Greathouse, Nuwan Jayasena, Dean M. Tullsen, "Horton Tables: Fast Hash Tables for In-Memory Data-Intensive Computing," Published in the Proceedings of the 2016 USENIX Annual Technical Conference (USENIX ATC), June, 2016
- Mayank Daga, Joseph L. Greathouse, "Structural Agnostic SpMV: Adapting CSR-Adaptive for Irregular Matrices," Published in the Proceedings of the 2015 IEEE International Conference on High Performance Computing (HiPC 2015), December, 2015
- Abhinandan Majumdar, Gene Wu, Kapil Dev, Joseph L. Greathouse, Indrani Paul, Wei Huang, Arjun Karthik Venugopal, Leonardo Piga, Chip Freitag, Sooraj Puthoor, "A Taxonomy of GPGPU Performance Scaling," Published in the Proceedings of the 2015 IEEE International Symposium on Workload Characterization (IISWC 2015), October, 2015
- Gene Wu, Joseph L. Greathouse, Alexander Lyashevsky, Nuwan Jayasena, Derek Chiou, "GPGPU Performance and Power Estimation Using Machine Learning," Published in the Proceedings of the 21st IEEE Symposium on High Performance Computer Architecture (HPCA 2015), February, 2015
- Bo Su, Junli Gu, Li Shen, Wei Huang, Joseph L. Greathouse, Zhiying Wang, "PPEP: Online Performance, Power, and Energy Prediction Framework and DVFS Space Exploration," Published in the Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-47), December, 2014
- Joseph L. Greathouse, Mayank Daga, "Efficient Sparse Matrix-Vector Multiplication on GPUs using the CSR Storage Format," Published in the Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC14), November, 2014
- Dong Ping Zhang, Nuwan Jayasena, Alexander Lyashevsky, Joseph L. Greathouse, Lifan Xu, Michael Ignatowski, "TOP-PIM: Throughput-Oriented Programmable Processing in Memory," Published in the Proceedings of the 23rd International Symposium on High Performance Parallel and Distributed Computing (HPDC '14), June, 2014 (Nominated for Best Paper)
- Bo Su, Joseph L. Greathouse, Junli Gu, Michael Boyer, Li Shen, Zhiying Wang, "Implementing a Leading Loads Performance Predictor on Commodity Processors," Published in the Proceedings of the 2014 USENIX Annual Technical Conference (USENIX ATC '14), June, 2014
- Andrea Pellegrini, Joseph L. Greathouse, Valeria Bertacco, "Viper: Virtual Pipelines for Enhanced Reliability," Published in the Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA 2012), June, 2012
- Joseph L. Greathouse, Hongyi Xin, Yixin Luo, Todd Austin, "A Case for Unlimited Watchpoints," Published in the Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2012), March, 2012
- Joseph L. Greathouse, Zhiqiang Ma, Matthew I. Frank, Ramesh Peri, Todd Austin, "Demand-Driven Software Race Detection using Hardware Performance Counters," Published in the Proceedings of the 38th Annual International Symposium on Computer Architecture (ISCA 2011), June, 2011
- Joseph L. Greathouse, Chelsea LeBlanc, Todd Austin, Valeria Bertacco, "Highly Scalable Distributed Dataflow Analysis," Published in the Proceedings of the 2011 International Symposium on Code Generation and Optimization (CGO 2011), April, 2011 (Awarded Best Student Presentation at CGO 2011)
- Joseph L. Greathouse, Ilya Wagner, David A. Ramos, Gautam Bhatnagar, Todd Austin, Valeria Bertacco and Seth Pettie, "Testudo: Heavyweight Security Analysis via Statistical Sampling," Published in the Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-41), November, 2008
Workshop Publications
- Christopher Erb, Joseph L. Greathouse, "clARMOR: A Dynamic Buffer Overflow Detector for OpenCL Kernels," Published in the Proceedings of the International Workshop on OpenCL (IWOCL 2018), May, 2018
- Joseph L. Greathouse, Kent Knox, Jakub Poła, Kiran Varaganti, Mayank Daga, "clSPARSE: A Vendor-Optimized Open-Source Sparse BLAS Library," Published in the Proceedings of the International Workshop on OpenCL (IWOCL 2016), April, 2016
- Yingying Tian, Sooraj Puthoor, Joseph L. Greathouse, Bradford M. Beckmann, Daniel Jiménez, "Adaptive GPU Cache Bypassing," Published in the Proceedings of the 8th Workshop on General Purpose Processing on GPUs (GPGPU-8), February, 2015
- Adam McLaughlin, Indrani Paul, Joseph L. Greathouse, Srilatha Manne, Sudhakar Yalamanchili, "A Power Characterization and Management of GPU Graph Traversal," Published at the Fourth Workshop on Architectures and Systems for Big Data (ASBD 2014), June, 2014
- Joseph L. Greathouse, Alexander Lyashevsky, Mitesh Meswani, Nuwan Jayasena, Michael Ignatowski, "Simulation of Exascale Nodes through Runtime Hardware Monitoring," Published at the Workshop on Modeling & Simulation of Exascale Systems & Applications (ModSim 2013), September, 2013
- Dong Ping Zhang, Nuwan Jayasena, Alexander Lyashevsky, Joseph L. Greathouse, Mitesh Meswani, Mark Nutter, Michael Ignatowski, "A New Perspective on Processing-in-memory Architecture Design," Published at the ACM SIGPLAN Workshop on Memory Systems Performance and Correctness (MSPC 2013), June, 2013.
- Joseph L. Greathouse, Todd Austin, "Position Paper: The Potential of Sampling for Dynamic Analysis," Published in the Proceedings of the 6th ACM SIGPLAN Workshop on Programming Languages and Analysis for Security (PLAS 2011), June, 2011
Software Projects
- AMD Matrix Instruction Calculator
https://github.com/RadeonOpenCompute/amd_matrix_instruction_calculator
- AMD Research Instruction Based Sampling (IBS) Toolkit
https://github.com/jlgreathouse/AMD_IBS_Toolkit
- clSPARSE - A Vendor-Optimized Sparse BLAS Library for GPUs Using OpenCL
https://github.com/clMathLibraries/clSPARSE
- clARMOR - A Buffer Overflow Detector for OpenCL GPU Kernels
https://github.com/ROCm-Developer-Tools/clARMOR
Patents
- Joseph L. Greathouse, Sean Keely, Alan D. Smith, Anthony Asaro, Ling-Ling Wang, Milind N. Nemlekar, Hari Thangirala, Felix Kuehling, "DMA Engines Configured to Perform First Portion Data Transfer Commands with a First DMA Engine and Second Portion Data Transfer Commands with Second DMA Engine". U.S. Patent Number 11,995,351, Granted May 28, 2024.
- Vydhyanathan Kalyanasundharam, Joseph L. Greathouse, Shyam Sekhar, "Hardware Device for Enforcing Atomicity for Memory Operations". U.S. Patent Number 11,972,261, Granted April 30, 2024.
- Gregory P. Rodgers, Joseph L. Greathouse, "Compiler-initiated Tile Replacement to Enable Hardware Acceleration Resources". U.S. Patent Number 11,853,734 (continuation of U.S. Patent 11,347,486), Granted December 26, 2023.
- Joseph L. Greathouse, Alan D. Smith, Francisco L. Duran, Felix Kuehling, Anthony Asaro, "Dynamic Repartition of Memory Physical Address Mapping". U.S. Patent Number 11,687,251, Granted June 27, 2023.
- Abhinav Vishnu, Joseph L. Greathouse, "Allreduce Enhanced Direct Memory Access Functionality". U.S. Patent Number 11,669,473, Granted June 6, 2023.
- Sanchari Sen, Derrick Allen Aguren, Joseph L. Greathouse, "Family of Lossy Sparse Load SIMD Instructions". U.S. Patent Number 11,663,001, Granted May 30, 2023.
- Joseph L. Greathouse, Steven Tony Tye, Mark Fowler, Milind N. Nemlekar, "Dynamic Modification of Coherent Atomic Memory Operations". U.S. Patent Number 11,604,737, Granted March 14, 2023.
- Shijia Wei, Joseph L. Greathouse, John Kalamatianos, "Per-instruction Energy Debugging Using Instruction Sampling Hardware". U.S. Patent Number 11,556,162, Granted January 17, 2023.
- Gregory P. Rodgers, Joseph L. Greathouse, "Compiler-initiated Tile Replacement to Enable Hardware Acceleration Resources". U.S. Patent Number 11,347,486, Granted May 31, 2022.
- Arkaprava Basu, Joseph L. Greathouse, "Enforcing Central Processing Unit Quality of Service Guarantees When Servicing Accelerator Requests". U.S. Patent Number 11,275,613, Granted March 15, 2022.
- Karthik Rao, Wei Huang, Xudong An, Manish Arora, Joseph L. Greathouse, "Runtime Localized Cooling of High-Performance Processors". U.S. Patent Number 11,137,809, Granted October 5, 2021.
- Khaled Hamidouche, Michael W. LeBeane, Nicholas P. Malaya, Joseph L. Greathouse, "Optimized and Scalable Sparse Triangular Linear Systems on Networks of Accelerators". U.S. Patent Number 10,936,697, Granted March 2, 2021.
- Raghavendra Pradyumna Pothukuchi, Joseph L. Greathouse, Leonardo de Paula Rosa Piga, "Distributed Multi-Input Multi-Output Control Theoretic Method to Manage Heterogeneous Systems". U.S. Patent Number 10,928,789, Granted February 23, 2021.
- Jagadish B. Kotra, Karthik Rao, Joseph L. Greathouse, "Method and Apparatus for Temperature-Gradient Aware Data-Placement for 3D Stacked DRAMs". U.S. Patent Number 10,725,670, Granted July 28, 2020.
- Joseph L. Greathouse, Mitesh R. Meswani, Sooraj Puthoor, Dmitri Yudanov, James M. O'Connor, "Heterogeneous Graphics Processing Unit for Scheduling Thread Groups for Execution on Variable Width SIMD Units". U.S. Patent Number 10,713,059, Granted July 14, 2020.
- Joseph L. Greathouse, "High-Performance Sparse Triangular Solve on Graphics Processing Units". U.S. Patent Number 10,691,772, Granted June 23, 2020.
- Arkaprava Basu, Joseph L. Greathouse, "Dynamically Adapting Mechanism for Translation Lookaside Buffer Shootdowns". U.S. Patent Number 10,552,339, Granted February 4, 2020.
- Joseph L. Greathouse, Christopher D. Erb, Michael G. Collins, "Detecting Buffer Overflows in General-Purpose GPU Applications". U.S. Patent Number 10,067,710, Granted September 4, 2018.
- Dmitri Yudanov, Sergey Blagodurov, Arkaprava Basu, Sooraj Puthoor, Joseph L. Greathouse, "Predicting a Context Portion to Move Between a Context Buffer and Registers Based on Context Portions Previously Used by at least One Other Thread". U.S. Patent Number 10,019,283, Granted July 10, 2018.
- Leonardo de Paula Rosa Piga, Abhinandan Majumdar, Indrani Paul, Wei Huang, Manish Arora, Joseph L. Greathouse, "Hardware Accuracy Counters for Application Precision and Quality Feedback". U.S. Patent Number 9,990,203, Granted June 5, 2018.
- Mayank Daga, Joseph L. Greathouse, "Efficient Sparse Matrix-Vector Multiplication on Parallel Processors". U.S. Patent Number 9,697,176, Granted July 4, 2017.
- Joseph L. Greathouse, David S. Christie, "Randomly Branching Using Hardware Watchpoints". U.S. Patent Number 9,483,379, Granted November 1, 2016.
- Joseph L. Greathouse, David S. Christie, "Randomly Branching Using Performance Counters". U.S. Patent Number 9,448,909, Granted September 20, 2016.
- Joseph L. Greathouse, Anton Chernoff, "User-level Hardware Branch Records". U.S. Patent Number 9,372,733, Granted June 21, 2016.
Presentations
- "Accelerating Dynamic Software Analyses," Microsoft Research, Feb. 23, 2012
- "On-Demand Dynamic Software Analysis," AMD Tech Topic Series, Dec. 12, 2011
- "Hardware Support for On-Demand Software Analysis," University of Michigan CSE Graduate Student Honors Competition, Dec. 8, 2011
- "Accelerating Dynamic Software Analyses," Microsoft Research Silicon Valley, Dec. 2, 2011
- "Accelerating Dynamic Software Analyses," VMware, Dec. 1, 2011
- "On-Demand Dynamic Software Analysis," Intel Labs, Nov. 29, 2011
- "Sampling Dynamic Dataflow Analyses," University of British Columbia, Jun. 10, 2011
Posters
- Scalable Security Vulnerability Analysis via Sampling, 2011 GSRC Annual Symposium, Nov. 16, 2011
- Testudo: Heavyweight Security Analysis via Statistical Sampling, 2008 Engineering Graduate Symposium, University of Michigan, Nov. 7, 2008.
Teaching Experience
- University of Michigan – Graduate Student Instructor
January 2012 - April 2012
EECS 570 - Parallel Computer Architecture
Responsible for guiding multiple graduate student research projects related to parallel computing.
Set up software infrastructure for assignments on parallel programming and cache coherency protocols.
- University of Illinois – Undergraduate Teaching Assistant
January 2005 - August 2006
ECE 290 - Computer Engineering I
Graded homework assignments and tests for four semesters.
Taught discussion section for this undergraduate digital logic course during the summer of 2006.
- University of Illinois – Grader
August 2005 - December 2005
CS 433 - Computer System Organization
Graded homework assignments for this undergraduate computer architecture course.
Professional Activities
- Program committee member for ASPLOS (2025), MICRO (2022), IISWC (2020), ICPP (2020), ISPASS (2015), HPPAC (2015–2018)
- External reviewer for MICRO (2009, 2013, 2014, 2017, 2020), HPCA (2013, 2014), IEEE CAL (2015–2017), IEEE TPDS (2017), IEEE TCAD (2017, 2018), IEEE TMSCS (2018), SC (2017), SRCS (2013), FMCAD (2010), and MDPI Computation (2018, 2020)
- External reviewer (through Todd Austin) for ASPLOS (2012, 2013), CODES (2011), DATE (2008–2012), FMCAD (2010), HPCA (2009, 2010, 2012), ISCA (2009, 2010, 2012), MICRO (2008, 2011, 2012), and PACT (2012)
- Judge for SRC TechCon (2015)
- Association for Computing Machinery, Senior Member
- Institute of Electrical and Electronics Engineers, Senior Member
- U of M Advanced Computer Architecture Laboratory Reading Group organizer (2009–2010), compute cluster administrator (2008–2011)
Awards and Honors
- Awards at Advanced Micro Devices, Inc.
- AMD Q1 2024 Next 5% Award for work on AMD Instinct MI300 execution
- AMD Q3 2022 Next 5% Award for work on breaking the exaflop barrier
- AMD Q1 2020 Next 5% Award for work on AMD's Frontier supercomputer design win
- AMD Executive Spotlight Award: Q4 2019, Q4 2020, Q2 2021, Q2 2023 (2x)
- AMD DCGPU Spotlight Award: Q2 2020, Q1 2021, Q3 2021, Q2 2022, Q4 2022, Q2 2023, Q4 2023 (2x)
- AMD Research Spotlight Award: Q2 2017
- Academic Awards and Honors
- IISWC 2016 Best Paper Award
- CGO 2011 Best Student Presentation Award
- Nomination for Best Paper at IISWC 2018
- Nomination for Best Paper at HPDC 2014
- Awards and Honors at the University of Michigan
- 2011 University of Michigan CSE Graduate Student Honors Competition 1st Place
- University of Michigan EECS Departmental Fellowship, 2006-2007
- Honors at the University of Illinois
- Eta Kappa Nu Electrical and Computer Engineering Honor Society
- Tau Beta Pi Engineering Honor Society
- Illinois Chancellor's Scholar
- Illinois Engineering James Scholar
Skills
- Programming Languages
C, C++, HIP, CUDA, OpenCL, x86 assembly, AMD GCN, CDNA, and RDNA assembly, Python
- Software Systems
Linux kernel, multiple AMD-internal simulation, firmware, and analysis tools