Guoyang Chen

Senior Engineer
Qualcomm Inc.

About me My_CV.pdf

I have been a senior engineer at Qualcomm since January 2017. I currently work on optimizing the graphics compiler to generate efficient code for different generations of GPUs. Before joining Qualcomm, I received my BS degree from the Department of Computer Science at USTC in 2012 and my PhD degree from the Department of Computer Science at NCSU in 2016. During my PhD studies, I worked with my advisor, Prof. Xipeng Shen. My research interests include compilers, program analysis and optimization, heterogeneous computing, and high-performance computing.


  • Fellowship, IBM Center of Advanced Study, 2014-2016
  • Cisco Connected Recognition, You Accelerate, 2015

Professional Activities

  • Artifact Committee: CGO'17
  • Program Committee: IPDPS'18, PPoPP'18 (ERC), ToPC'17, TC'17, HUCAA workshop'17 (TPC), ISMM'17 (ERC)
  • Organizing Volunteer: LCPC'15
  • Reviewer: SC'17, CCGrid'17, LCPC'17, IPDPS'17, CC'16, ISCAHPC'16, PACT'16, AsHES'16, ASPLOS'16, ICS'16, ICPDS'15, PACT'15, AsHES'15, ASPLOS'15, LCPC'15, SC'15, IISWC'15, ICPP'14, PACT'14

Publications [Google Citation]

  Refereed Conference Publications:

[Micro'17] "Efficient Support of Position Independence on Non-Volatile Memory" [to appear]
Guoyang Chen, Lei Zhang, Richa Budhiraja, Xipeng Shen, Youfeng Wu
The 50th Annual IEEE/ACM International Symposium on Microarchitecture, Boston, Oct 14-18, 2017 (acceptance rate: 61/327 = 18.65%)
[ICDE'17] "Sweet KNN: An Efficient KNN on GPU through Reconciliation of Redundancy and Regularity" [PDF]
Guoyang Chen, Yufei Ding, Xipeng Shen
2017 IEEE International Conference on Data Engineering, San Diego, CA, US, April, 2017
[PPoPP'17] "EffiSha: A Software Framework for Enabling Efficient Preemptive Scheduling of GPU" [PDF]
Guoyang Chen, Yue Zhao, Xipeng Shen, Huiyang Zhou
The 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Austin, Texas, US, Feb, 2017
[ASAP'16] "OpenCL-Based Erasure Coding on Heterogeneous Architecture" [PDF]
Guoyang Chen, Huiyang Zhou, Xipeng Shen, Josh Gahm, Narayan Venkat, Skip Booth, John Marshall
The 27th Annual IEEE International Conference on Application-specific Systems, Architectures and Processors, London, UK, July, 2016
[ECOOP'16] "Towards Ontology-Based Program Analysis" [PDF]
Yue Zhao, Guoyang Chen, Chunhua Liao, Xipeng Shen
The European Conference on Object-Oriented Programming, 2016
[ICS'16] "Coherence-Free Multiview: Enabling Reference-Discerning Data Placement on GPU" [PDF]
Guoyang Chen, Xipeng Shen
ACM International Conference on Supercomputing, 2016
[PPoPP'16 poster] "Data-centric combinatorial optimization of parallel code" [PDF]
Hao Luo, Guoyang Chen, Pengcheng Li, Chen Ding, Xipeng Shen
The 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Spain, 2016
[Micro'15] "Free Launch: Optimizing GPU Dynamic Kernel Launches through Thread Reuse" [PDF]
Guoyang Chen, Xipeng Shen
The 48th Annual IEEE/ACM International Symposium on Microarchitecture, Waikiki, Hawaii, USA, Dec, 2015
[ICS'15] "Enabling and Exploiting Flexible Task Assignment on GPU through SM-Centric Program Transformations" [PDF]
Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, Jeffrey Vetter
ACM International Conference on Supercomputing, Newport Beach, CA, 2015. (25% acceptance rate)
[Micro'14] "PORPLE: An Extensible Optimizer for Portable Data Placement on GPU" [PDF]
Guoyang Chen, Bo Wu, Dong Li, Xipeng Shen
The 47th Annual IEEE/ACM International Symposium on Microarchitecture, Cambridge, UK, December, 2014. (19% acceptance rate)
[PACT'14 poster] "SM-centric transformation: circumventing hardware restrictions for flexible GPU scheduling" [PDF]
Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen
Proceedings of the 23rd International Conference on Parallel Architectures and Compilation, Alberta, Canada, August, 2014

  Refereed Journal Publications:

[TC'16] "Optimizing Data Placement on GPU Memory: A Portable Approach" [PDF]
Guoyang Chen, Xipeng Shen, Bo Wu, and Dong Li
The IEEE Transactions on Computers, 2016
[IEEE Micro'15] "Enabling Portable Optimizations of Data Placement on GPU" [PDF]
Guoyang Chen, Bo Wu, Dong Li, Xipeng Shen
July/August Issue, the Heterogeneous Computing special issue of IEEE Micro, 2015
[FWNIS'12] "Optimizing Compressive Sensing in the Internet of Things" [PDF]
Guoyang Chen, Hao Yang, Liusheng Huang
Future Wireless Networks and Information Systems, 2012


  • Data Placement on GPU
      A GPU has several types of memory. This project explores automatic approaches to finding the best placement of data across them.
  • Free Launch on GPU
      Allows programmers to use subkernel launches to express dynamic parallelism with little overhead.
  • Erasure Coding on Heterogeneous Architectures
      Leverages the OpenCL framework to accelerate erasure coding on heterogeneous architectures, including GPUs, APUs, and FPGAs, and proposes code optimizations for each target architecture.
  • GPU Scheduling (Co-author)
      Current GPUs do not allow flexible control of thread scheduling. This project proposes a pure software solution that circumvents this restriction and hence opens up new opportunities for GPU program optimizations.
      Papers: ICS'15, PACT'14 poster
  • Efficient Preemptive Scheduling on GPU
      A software framework that supports fairness and priority-aware scheduling (FIFO, round-robin, etc.) of GPU kernels with little overhead and memory usage.
  • Efficient KNN implementation on GPU
      By applying the triangle inequality, this KNN implementation on GPU achieves a consistent 8x speedup over the fastest prior GPU version.


  • Senior Engineer: Qualcomm Inc. 2017.1-current
  • Research Assistant: North Carolina State University 2014.5-2016.12
  • PhD internship: Cisco Systems Inc. 2015.5-2015.12
  • Teaching Assistant: The College of William & Mary 2012.8-2014.5
Find me on: LinkedIn