Faculty Profile

梁云  Yun (Eric) Liang

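Assistant Professor, Center for Energy-Efficient Computing and Applications, Peking University
Assistant Director, Center for Energy-Efficient Computing and Applications, Peking University
Tenure-track Research Professor, School of Electronics Engineering and Computer Science, Peking University
Phone: +86-10-6276-0779
Address: Room 518N, Science Building No. 5, Peking University, 100871
Email: ericlyun [at] pku.edu.cn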

 

 

 

 

BIOGRAPHY

      Yun (Eric) Liang is an assistant professor in the School of EECS, Peking University, China. He received his PhD in Computer Science from the National University of Singapore in 2010 and worked as a Research Scientist at UIUC before joining PKU. His research focuses on heterogeneous computing, energy-efficient computing, computer architecture, compilation techniques, embedded system design, and real-time systems. He has authored over 60 scientific publications in premier international journals and conferences in these areas. His research has been recognized with Best Paper Awards at FCCM 2011 and ICCAD 2017 and Best Paper nominations at DAC 2016, ASPDAC 2016, DAC 2012, FPT 2011, and CODES+ISSS 2008. Prof. Liang serves as an Associate Editor for ACM Transactions on Embedded Computing Systems (TECS) and on the program committees of premier conferences in related areas, including HPCA, PACT, CGO, ICCAD, CC, DATE, CASES, ASPDAC, and ICCD.

 

 

 

AWARDS AND HONORS

 

  •   Best Paper Award, International Conference on Computer Aided Design (ICCAD), November, 2017.

  •   Best Paper Award Nomination, Design Automation Conference (DAC), June 2016.

    (14 nominations out of 676 submissions).

  •   Best Paper Award Nomination, Asia and South Pacific Design Automation Conference (ASP-DAC), January, 2016.

  •   Best Paper Award Nomination, Design Automation Conference (DAC), June 2012.

    (7 nominations out of 741 submissions).

  •   Best Paper Award Nomination, International Conference on Field Programmable Technology (FPT), December 2011.

  •   Best Paper Award,  IEEE International Symposium on Field-Programmable Custom Computing Machines 2011 (FCCM), May 2011. (1 out of 119 submissions)

  •   Best Paper Award Nomination, ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2008.

 

PUBLICATIONS

Journal Publications 

 

[J18]

Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. "CRAT: Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," IEEE Transactions on Computers (TC)

[J17]

Yun Liang, Xiaolong Xie, Yu Wang, Guangyu Sun, Tao Wang. "Optimizing Cache Bypassing and Warp Scheduling for GPUs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)

[J16]

Xinfeng Xie, Dayou Du, Qian Li, Yun Liang, Wai Teng Tang, Zhong Liang Ong, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. "Exploiting Sparsity to Accelerate Fully Connected Layers of CNN-based Applications on Mobile SoCs," ACM Transactions on Embedded Computing Systems (TECS)

[J15]

Yun Liang, Waiteng Tang, Ruizhe Zhao, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. "Scale-free Sparse Matrix-Vector Multiplication on Many-Core Architectures," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD).

[J14]

Yun Liang, Xiuhong Li. “Efficient Kernel Management on GPUs.” ACM Transactions on Embedded Computing Systems (TECS), Vol 16, Issue 4, May 2017.

 

[J13]   

Yun Liang, Muhammad T. Satria, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 7, July 2016.

 

[J12]

Yao Chen, Swathi T. Gurumani, Yun Liang, Guofeng Li, Donghui Guo, Kyle Rupnow, Deming Chen. "FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow, " IEEE Transactions on Very Large Scale Integration Systems (TVLSI), Vol. 24, No. 6, pp. 2220–2233, June 2016.

 

[J11]

Ying Chen, Tan Nguyen, Yao Chen, Swathi Gurumani, Yun Liang, Kyle Rupnow, Jason Cong, Wen-mei Hwu, Deming Chen. "FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 12, April 2016.

 

[J10]

Yun Liang, Shuo Wang. "Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs," Journal of Computer Science and Technology (JCST), Vol. 31, No. 1, January 2016.

 

[J9]

Mian Lu, Yun Liang, Huynh Phung Huynh, Zhongliang Ong, Bingsheng He, Rick Siow Mong Goh. "MrPhi: An Optimized MapReduce Framework on Intel Xeon Phi Coprocessors," IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 11, pp. 3066-3078, November 2015.

[J8]

Yun Liang, Xiaolong Xie, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 10, pp. 1677-1690, October 2015.

[J7]

Yun Liang, Tulika Mitra, Lei Ju. "Instruction Cache Locking using Temporal Reuse Profile," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 9, pp. 1387-1400, August 2015.

[J6]

Yun Liang, Huynh Phung Huynh, Kyle Rupnow, Rick Siow Mong Goh, Deming Chen. "Efficient GPU Spatial-Temporal Multitasking," IEEE Transactions on Parallel and Distributed Systems (TPDS),  Vol. 26, No. 3, pp. 748-760, March 2015.

[J5]

Yun Liang, Tulika Mitra. "An Analytical Approach for Fast and Accurate Design Space Exploration of Instruction Caches," ACM Transactions on Embedded Computing Systems (TECS), 13(3), Article 43, December, 2013.

[J4]

Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, Vivy Suhendra. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-cores," Real-Time Systems Journal (RTS) 48(6), November, 2012.

[J3]

Yun Liang, Kyle Rupnow, Yinan Li, Dongbo Min, Minh Do, and Deming Chen. "High Level Synthesis: Productivity, Performance and Software Constraints," Journal of Electrical and Computer Engineering, Special Issue on ESL Design Methodology, Volume 2012 (2012), 649057, 2012.

[J2]

Lei Ju, Yun Liang, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware optimization of BAN applications," Journal of Design Automation for Embedded System, Volume 13 (3), September, 2009.

[J1]

Xianfeng Li, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Chronos: A Timing Analyzer for Embedded Software,"  Science of Computer Programming, Special issue on Experimental Software and Toolkit, 69(1-3), December 2007.

Conference Papers

 

[C51]

Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Qinru Qiu, Yanzhi Wang, Yun Liang. "C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs," to appear in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Feb 2018.

[C50]

Yun Liang, Xiuhong Li, Xiaolong Xie. "Exploring Cache Bypassing and Partitioning for MultiTasking on GPUs, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017.

[C49]

Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang. "A Hybrid Approach to Cache Management in Heterogeneous CPU-FPGA Platforms," in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017 (invited paper).

[C48]

Jieru Zhao, Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang, Bingsheng He. "COMBA: A Comprehensive Model-Based Analysis Framework for High Level Synthesis of Real Applications, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. Best Paper Award.

[C47]

Xiaolong Xie, Wei Tan, Liana L. Fong, Yun Liang. "CUMF_SGD: Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUs," to appear in the proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing (HPDC), June 2017.

[C46]

Shuo Wang, Yun Liang. "A Comprehensive Framework for Synthesizing Stencil Algorithms on FPGAs using OpenCL Model, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

[C45]

Shuo Wang, Yun Liang, Wei Zhang. "FlexCL: An Analytical Performance Model for OpenCL Workloads on Flexible FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

[C44]

Qingcheng Xiao, Yun Liang, Liqiang Lu, Shengen Yan, Yu-Wing Tai. "Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

[C43]

Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, Yuxin Wang, Han Hu, Yun Liang, Jason Cong. "Automating the systolic array generation and optimizations for high throughput convolution neural network," to appear in the proceedings of the Design Automation Conference (DAC), June 2017. Best Paper Award Nomination.

[C42]

Liqiang Lu, Yun Liang, Qingcheng Xiao and Shengen Yan. "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs," to appear in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2017.

[C41]

Guanwen Zhong, Alok Prakash, Siqi Wang, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of FPGA-based Accelerators with Multi-level Parallelism," to appear in the proceedings of the Design Automation and Test in Europe (DATE), March, 2017.

[C40]

Xuechao Wei, Yun Liang, Tao Wang, Songwu Lu, Jason Cong.  "Throughput Optimization for Streaming Applications on CPU-FPGA Heterogeneous Systems," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC) , January, 2017.

[C39]

Guanwen Zhong, Alok Prakash, Yun Liang, Tulika Mitra, Smail Niar. "Lin-Analyzer: A High-level Performance Analysis Tool for FPGA-based Accelerators,"  in the proceedings of the Design Automation Conference (DAC), June, 2016.

[C38]             

Xiuhong Li, Yun Liang. "Efficient Kernel Management on GPUs,"  in the proceedings of the Design Automation and Test in Europe (DATE), March, 2016.

[C37]

Shuo Wang, Yun Liang, Chao Zhang, Xiaolong Xie, Guangyu Sun, Yongpan Liu, Yu Wang, Xiuhong Li. "Performance-centric Register File Design for GPUs using Racetrack Memory," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC) , January, 2016. Best Paper Award Nomination.

[C36]

Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. "Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," in the proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.

[C35]

Xian Zhang, Guangyu Sun, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, and Jia Di. "Fork Path: Improving Efficiency of ORAM by Removing Redundant Memory Accesses," in the proceedings of 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.

[C34]

Yun Liang, Shuo Wang. "Quantitative Performance and Power Analysis of LTE using High Level Synthesis," in the proceedings of International Conference on ASIC, November, 2015 (invited paper).

[C33]

Xuechao Wei, Yun Liang, Xibai Li, Tao Wang, Songwu Lu, Jason Cong. "Evaluation of Software Defined Radio on Heterogeneous Systems," in the proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT), 2015 (Poster).

[C32]

Chao Zhang, Guangyu Sun, Xian Zhang, Weiqi Zhang, Weisheng Zhao, Tao Wang, Yun Liang, Yongpan Liu, Yu Wang, and Jiwu Shu, "Hi-fi Playback: Tolerating Position Errors in Shift Operations of Racetrack Memory, " in the proceedings of the 42nd International Symposium on Computer Architecture (ISCA), June 2015.

[C31]

Xiaolong Xie, Yun Liang, Yu Wang, Guangyu Sun, Tao Wang. "Coordinated Static and Dynamic Cache Bypassing on GPUs,"  in the proceedings of 21st IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2015.

[C30]

Waiteng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh. "Optimizing and Auto-Tuning Scale-Free Sparse Matrix-Vector Multiplication on Intel Xeon Phi," in the proceedings of the International Symposium on Code Generation and Optimization (CGO), February 2015.

[C29]

Guanwen Zhong, Vanchinathan Venkataramani, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of Multiple Loops on FPGAs using High Level Synthesis," in the proceedings of IEEE International Conference on Computer Design (ICCD), October 2014.

[C28]

Xiaoming Chen, Yu Wang, Yun Liang, Yuan Xie, Huazhong Yang. "Run-time Techniques for Simultaneous Aging and Power Optimization in GPGPUs," in the proceedings of the 51st Design Automation Conference (DAC), June, 2014.

[C27]

Jingyu Deng, Yun Liang, Guojie Luo, Guangyu Sun. "Rapid Design Space Exploration of Two-level Unified Caches," in the proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), June 2014.

[C26]

Swathi Gurumani, Jacob Tolar, Yao Chen, Yun Liang, Kyle Rupnow, Deming Chen. "Integrated CUDA-to-FPGA Synthesis with Network-on-Chip," in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2014.

[C25]

Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Dynamic Instruction Cache Locking," in the proceedings of the Design Automation and Test in Europe (DATE), March, 2014.

[C24]

Zhimin Wu, Yang Liu, Yun Liang, Jun Sun. "GPU Accelerated Counterexample Generation in LTL Model Checking," in the proceedings of the International Conference on Formal Engineering Methods (ICFEM), November, 2014.

[C23]

Xiaolong Xie, Yun Liang, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," in the proceedings of International Conference on Computer Aided Design (ICCAD) , Nov, 2013.

[C22]

Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh. "Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor," in the proceedings of the IEEE International Conference on Big Data (BigData), Oct, 2013.

[C21]

Alexandros Papakonstantinou, Deming Chen, Wen Mei Hwu, Yun Liang, Jason Cong. "Throughput-Oriented Kernel Porting onto FPGAs," in the proceedings of the 50th Design Automation Conference (DAC), June, 2013.

[C20]

Huping Ding, Yun Liang, Tulika Mitra. "Integrated Instruction Cache Analysis and Locking in Multitasking Real-time Systems," in the proceedings of the 50th Design Automation Conference (DAC),  June, 2013.

[C19]

Wei Zuo, Yun Liang, Peng Li, Kyle Rupnow, Deming Chen, Jason Cong. "Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations," in the proceedings of the 21st ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2013.

[C18]

Swathi T. Gurumani, Hisham Cholakkal, Yun Liang, Kyle Rupnow, Deming Chen. "High-Level Synthesis of Multiple Dependent CUDA Kernels on FPGA," in the proceedings of the 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013 (invited paper).

[C17]

Yun Liang, Zheng Cui, Kyle Rupnow, Deming Chen. "Register and Thread Structure Optimization for GPUs," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.

[C16]

Huping Ding, Yun Liang, Tulika Mitra. "Shared Cache Aware Task Mapping for WCRT Minimization," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.

[C15]

Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Partial Instruction Cache Locking," in the proceedings of ACM 49th Design Automation Conference (DAC), June 2012. Best Paper Award Nomination (7 out of 741 submissions).

[C14]

Zheng Cui, Yun Liang, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization," in the proceedings of IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2012.

[C13]

Yun Liang, Zheng Cui, Shengkui Zhao, Kyle Rupnow, Yihao Zhang, Douglas L. Jones, Deming Chen. "Real-time Implementation and Performance Optimization of 3D Sound Localization on GPUs," in the proceedings of Design Automation and Test in Europe (DATE), March 2012.

[C12]

Shengkui Zhao, Saima Ahmed, Yun Liang, Kyle Rupnow, Deming Chen, Douglas L Jones. "A real-time 3D sound localization system with miniature microphone array for virtual reality," in the proceedings of the 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), July 2012.

[C11]

Kyle Rupnow, Yun Liang, Yinan Li, Dongbo Min, Minh Do, Deming Chen. "High Level Synthesis of Stereo Matching: Productivity, Performance, and Software Constraints, " in the proceedings of International Conference on Field Programmable Technology (FPT), December 2011. Best Paper Award Nomination (4 out of 94 submissions).

[C10]

Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-Mei W. Hwu, Jason Cong. "Multilevel Granularity Parallelism Synthesis on FPGAs," in the proceedings of the 19th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2011.  Best Paper Award (1 out of 119 submissions).

[C9]

Kyle Rupnow, Yun Liang, Yinan Li, Deming Chen. "A study of high-level synthesis: Promises and challenges," in the proceedings of IEEE 9th International Conference on ASIC (ASICON), October, 2011.

[C8]

Yun Liang, Tulika Mitra. "Improved procedure placement for set associative caches," in the proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems (CASES’10), October, 2010.

[C7]

Yun Liang, Tulika Mitra. "Instruction Cache Locking using Temporal Reuse Profile," in the proceedings of the ACM 47th Design Automation Conference (DAC), June 2010.

[C6]

Huynh Phung Huynh, Yun Liang, Tulika Mitra. "Efficient custom instructions generation for system-level design," in the proceedings of the International Conference on Field-Programmable Technology (FPT), December, 2010.

[C5]

Yan Li, Vivy Suhendra, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores," in the proceedings of the 30th IEEE Real-Time Systems Symposium (RTSS), December, 2009.

[C4]

Yun Liang, Tulika Mitra. "Static Analysis for Fast and Accurate Design Space Exploration of Caches," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008.

[C3]

Yun Liang, Lei Ju, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware Optimization of BAN Applications," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. Best Paper Award Nomination.

[C2]

Yun Liang, Tulika Mitra.  "Cache Modeling in Probabilistic Execution Time Analysis," in the proceedings of the 45th Design Automation Conference (DAC), June, 2008.

[C1]

Yun Liang, Abhik Roychoudhury, Tulika Mitra. "Timing analysis of body area network application," in the proceedings of the 7th International Workshop on Worst Case Execution Time Analysis (WCET), 2007.

 

 

PROFESSIONAL SERVICE

Editorial Board

  •   Associate Editor, ACM Transactions on Embedded Computing Systems (TECS), 2017-.

 

Conference Organizing Committee Member

  •   Special Session Organizer and Chair, the 18th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.     

  •   Subcommittee Chair.  System Level Synthesis and Optimization, the 19th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.     

  •   Publication Chair. 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.     

 

 

Conference Program Committee Member

  •   International Symposium on High-Performance Computer Architecture (HPCA), 2018.

  •   International Conference on Compiler Construction (CC), 2018.

  •   International Conference on High Performance Computing, Data, and Analytics (HiPC) 2017.

  •   International Conference on Parallel Architectures and Compilation Techniques (PACT) 2015, 2016.     

  •   International Conference on Computer Aided Design (ICCAD) 2016, 2017.     

  •   International Symposium on Code Generation and Optimization (CGO) 2017.     

  •   Asia South Pacific Design Automation Conference (ASP-DAC) 2012, 2013, 2014, 2016, 2017.     

  •   Design Automation and Test in Europe (DATE) 2013, 2014, 2015, 2016, 2017.     

  •   International Conference on Compilers Architecture and Synthesis for Embedded Systems (CASES) 2013, 2014, 2015, 2016.     

  •   IEEE International Conference on Computer Design (ICCD) 2016.     

 

COURSES

  •   Programming Practice (English),  2013, 2014, 2015, 2016, 2017.     

  •   Compiler Design, 2016, 2017.     

 

STUDENTS SUPERVISED

Ph.D Students.

1. Xiaolong Xie: “GPU Optimization: Algorithms, Systems and Architecture”, 2013-present
   Winner: Top 10 Academic Achievement Award 2016, Qualcomm PhD Scholarship 2016, Merit Student of Peking University 2015, National Graduate Scholarship 2015.

2. Xuechao Wei: “Algorithms Accelerations using Systolic Array on FPGAs” (co-advised with Prof. Jason Cong), 2013-present

3. Xiuhong Li: “Accelerating Irregular Applications on GPUs”, 2014-present
   Winner: National Graduate Scholarship 2016. Academic Excellence Award.

4. Shuo Wang: “Performance Modeling for Heterogeneous Systems”, 2015-present

5. Qingcheng Xiao: TBD, 2016-present

6. Liqiang Lu: TBD, 2017-present

Undergraduate Students. 

 

     Student          Graduation Year   Employment after Graduation
 1.  Xiaolong Xie     2013              PhD student in Peking University, China
 2.  Siyuan Ouyang    2013              Master student in CMU, USA
 3.  Jingyu Deng      2014              Master student in NYU, USA
 4.  Xiuhong Li       2014              PhD student in Peking University, China
 5.  Xibai Li         2015              Software Engineer in a startup, China
 6.  Ruizhe Zhao      2016              PhD student in Imperial College London, UK
 7.  Yudong Wu        2016              PhD student in UCSD, USA
 8.  Zhaowen Zou      2016              Master student in UCSD, USA
 9.  Qiqi Xiao        2016              Software Engineer in Face++, China
10.  Qingcheng Xiao   2016              PhD student in Peking University, China
11.  Qian Li          2017              PhD student in Stanford, USA
12.  Xinfeng Xie      2017              PhD student in UCSB, USA
13.  Liqiang Lu       2017              PhD student in Peking University, China
14.  Yilong Li        2017              Master student in Stanford, USA
15.  Dayou Du         2017              Master student in NYU, USA
16.  Han Qiu          2017              Software Engineer in Samsung, China

 

Top-10 School of EECS Bachelor Thesis Award Winner

 

  •   Ruizhe Zhao 2016, Xinfeng Xie 2017

 

 

Thesis Committee

梁云  Yun (Eric) Liang

北京大学高能效计算与应用中心助理教授 
北京大学高能效计算与应用中心助理主任 (Assistant director) 
北京大学信息科学技术学院新体制研究员 
电话:+86-10-6276-0779 
地址:北京大学理科5号楼518N室, 100871 
邮箱:ericlyun [at] pku.edu.cn

 

 

 

 

BIOGRAPHY

      Yun (Eric) Liang is an assistant professor in School of EECS, Peking University, China. He received his PhD in Computer Science from the National University of Singapore in 2010 and worked as a Research Scientist in UIUC before he joins PKU. His research focuses on heterogeneous computing, energy-efficient computing, computer architecture, compilation techniques, embedded system design, and real-time system. He has authored over 60 scientific publications in premier international journals and conferences in this domain. His research has been recognized by best paper award at FCCM 2011 and ICCAD 2017 and best paper nominations at DAC 2016, ASPDAC 2016, DAC 2012, FPT 2011, CODES+ISSS 2008. Prof Liang serves as Associate Editor for ACM Transactions in Embedded Computing Systems (TECS) and serves in the program committees in the premier conferences in the related domain including (HPCA, PACT, CGO, ICCAD, CC, DATE, CASES, ASPDAC, ICCD).  

 

 

 

AWARDS AND HONORS

 

  •   Best Paper Award, International Conference on Computer Aided Design (ICCAD), November, 2017.

  •   Best Paper Award Nomination, Design Automation Conference (DAC), June 2016.

    (14 nominations out of 676 submissions).

  •   Best Paper Award Nomination, Asia and South Pacific Design Automation Conference (ASP-DAC), January, 2016.

  •   Best Paper Award Nomination, Design Automation Conference (DAC), June 2012.

    (7 nominations out of 741 submissions).

  •   Best Paper Award Nomination, International Conference on Field Programmable Technology (FPT), December 2011.

  •   Best Paper Award,  IEEE International Symposium on Field-Programmable Custom Computing Machines 2011 (FCCM), May 2011. (1 out of 119 submissions)

  •   Best Paper Award NominationACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2008.

 

PUBLICATIONS

Journal Publications 

 

[J18]

Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. ""CRAT: Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," IEEE Transactions on Computer (TC)

[J17]

Yun Liang, Xiaolong Xie, Yu Wang, Guangyu Sun, Tao Wang. ""Optimizing Cache Bypassing and Warp Scheduling for GPUs, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)

[J16]

Xinfeng Xie, Dayou Du, Qian Li, Yun Liang, Wai Teng Tang, Zhong Liang Ong,Mian Lu, Huynh Phung Huynh , Rick Siow Mong Goh. "Exploiting Sparsity to Accelerate Fully Connected Layers of CNN-based Applications on Mobile SoCs, " ACM Transactions on Embedded Computing Systems (TECS)

[J15]

Yun Liang, Waiteng Tang, Ruizhe Zhao, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. "Scale-free Sparse Matrix-Vector Multiplication on Many-Core Architectures, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems(TCAD).

[J14]

Yun Liang, Xiuhong Li. “Efficient Kernel Management on GPUs.” ACM Transactions on Embedded Computing Systems (TECS), Vol 16, Issue 4, May 2017.

 

[J13]   

Yun Liang, Muhammad T. Satria, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 7, July 2016.

 

[J12]

Yao Chen, Swathi T. Gurumani, Yun Liang, Guofeng Li, Donghui Guo, Kyle Rupnow, Deming Chen. "FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow, " IEEE Transactions on Very Large Scale Integration Systems (TVLSI), Vol. 24, No. 6, pp. 2220–2233, June 2016.

 

[J11]

Ying Chen, Tan Nguyen, Yao Chen, Swathi Gurumani, Yun Liang, Kyle Rupnow, Jason Cong, Wen-mei Hwu, Deming Chen. "FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 12, April 2016.

 

[J10]

Yun Liang, Shuo Wang. "Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs, " Journal of Computer Science and Technology (JCST), Vol 31, No.1, Janurary. 2016.

 

[J9]

Mian Lu, Yun Liang,  Huynh Phung Huynh, Zhongliang Ong, Bingsheng He, Rick Siow Mong Goh. "MrPhi : An Optimized MapReduce Framework on Intel Xeon Phi Coprocessors, " IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 11, pp. 3066-3078, November 2015.

[J8]

Yun Liang, Xiaolong Xie, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 10, pp. 1677-1690, October 2015.

[J7]

Yun Liang, Tulika Mitra, Lei Ju. "Instruction Cache Locking using Temporal Reuse Profile," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 9, pp. 1387-1400, August 2015.

[J6]

Yun Liang, Huynh Phung Huynh, Kyle Rupnow, Rick Siow Mong Goh, Deming Chen. "Efficient GPU Spatial-Temporal Multitasking," IEEE Transactions on Parallel and Distributed Systems (TPDS),  Vol. 26, No. 3, pp. 748-760, March 2015.

[J5]

Yun Liang, Tulika Mitra. "An Analytical Approach for Fast and Accurate Design Space Exploration of Instruction Caches," ACM Transactions on Embedded Computing Systems (TECS), 13(3), Article 43, December, 2013.

[J4]

Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, Vivy Suhendra. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-cores," Real-Time Systems Journal (RTS) 48(6), November, 2012.

[J3]

Yun Liang, Kyle Rupnow, Yinan Li, Dongbo Min, Minh Do, and Deming Chen. "High Level Synthesis: Productivity, Performance and Software Constraints, "

Journal of Electrical and Computer Engineering, Special Issue on ESL Design Methodology, Volume 2012 (2012), 649057, 2012.

[J2]

Lei Ju, Yun Liang, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware optimization of BAN applications," Journal of Design Automation for Embedded System, Volume 13 (3), September, 2009.

[J1]

Xianfeng Li, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Chronos: A Timing Analyzer for Embedded Software,"  Science of Computer Programming, Special issue on Experimental Software and Toolkit, 69(1-3), December 2007.

Conference Papers

 

[C51]

Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Qinru Qiu, Yanzhi Wang ,Yun Liang. "C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs, " to appear in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Feb 2018. 

[C50]

Yun Liang, Xiuhong Li, Xiaolong Xie. "Exploring Cache Bypassing and Partitioning for MultiTasking on GPUs, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017.

[C49]

Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang. ""A Hybrid Approach to Cache Management in Heterogeneous CPU-FPGA Platforms, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. (invited paper).

[C48]

Jieru Zhao, Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang, Bingsheng He. "COMBA: A Comprehensive Model-Based Analysis Framework for High Level Synthesis of Real Applications, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. Best Paper Award.

[C47]

Xiaolong Xie, Wei Tan, Liana L. Fong,Yun Liang. "CUMF_SGD:Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUS, " to appear in the proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing (HPDC), June 2017.

[C46]

Shuo Wang, Yun Liang. "A Comprehensive Framework for Synthesizing Stencil Algorithms on FPGAs using OpenCL Model, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

[C45]

Shuo Wang, Yun Liang, Wei Zhang. "FlexCL: An Analytical Performance Model for OpenCL Workloads on Flexible FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

[C44]

Qingcheng Xiao, Yun Liang, Liqiang Lu, Shengen Yan, Yu-Wing Tai. "Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

[C43]

Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, Yuxin Wang, Han Hu, Yun Liang, Jason Cong. "Automating the systolic array generation and

optimizations for high throughput convolution neural network," to appear in the

proceedings of the Design Automation Conference (DAC), June 2017. Best Paper Award Nomination.

[C42]

Liqiang Lu, Yun Liang, Qingcheng Xiao and Shengen Yan. "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs," to appear in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2017.

[C41]

Guanwen Zhong, Alok Prakash, Siqi Wang, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of FPGA-based Accelerators with Multi-level Parallelism," to appear in the proceedings of the Design Automation and Test in Europe (DATE), March, 2017.

[C40]

Xuechao Wei, Yun Liang, Tao Wang, Songwu Lu, Jason Cong. "Throughput Optimization for Streaming Applications on CPU-FPGA Heterogeneous Systems," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2017.

[C39]

Guanwen Zhong, Alok Prakash, Yun Liang, Tulika Mitra, Smail Niar. "Lin-Analyzer: A High-level Performance Analysis Tool for FPGA-based Accelerators,"  in the proceedings of the Design Automation Conference (DAC), June, 2016.

[C38]             

Xiuhong Li, Yun Liang. "Efficient Kernel Management on GPUs,"  in the proceedings of the Design Automation and Test in Europe (DATE), March, 2016.

[C37]

Shuo Wang, Yun Liang, Chao Zhang, Xiaolong Xie, Guangyu Sun, Yongpan Liu, Yu Wang, Xiuhong Li. "Performance-centric Register File Design for GPUs using Racetrack Memory," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2016. Best Paper Award Nomination.

[C36]

Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. "Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," in the proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.

[C35]

Xian Zhang, Guangyu Sun, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, and Jia Di. "Fork Path: Improving Efficiency of ORAM by Removing Redundant Memory Accesses," in the proceedings of 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.

[C34]

Yun Liang, Shuo Wang. "Quantitative Performance and Power Analysis of LTE using High Level Synthesis," in the proceedings of International Conference on ASIC, November, 2015 (invited paper).

[C33]

Xuechao Wei, Yun Liang, Xibai Li, Tao Wang, Songwu Lu, Jason Cong. "Evaluation of Software Defined Radio on Heterogeneous Systems," in the proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT), 2015 (Poster).

[C32]

Chao Zhang, Guangyu Sun, Xian Zhang, Weiqi Zhang, Weisheng Zhao, Tao Wang, Yun Liang, Yongpan Liu, Yu Wang, and Jiwu Shu. "Hi-fi Playback: Tolerating Position Errors in Shift Operations of Racetrack Memory," in the proceedings of the 42nd International Symposium on Computer Architecture (ISCA), June 2015.

[C31]

Xiaolong Xie, Yun Liang, Yu Wang, Guangyu Sun, Tao Wang. "Coordinated Static and Dynamic Cache Bypassing on GPUs,"  in the proceedings of 21st IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2015.

[C30]

Waiteng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh. "Optimizing and Auto-Tuning Scale-Free Sparse Matrix-Vector Multiplication on Intel Xeon Phi," in the proceedings of the International Symposium on Code Generation and Optimization (CGO), February 2015.

[C29]

Guanwen Zhong, Vanchinathan Venkataramani, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of Multiple Loops on FPGAs using High Level Synthesis," in the proceedings of IEEE International Conference on Computer Design (ICCD), October 2014.

[C28]

Xiaoming Chen, Yu Wang, Yun Liang, Yuan Xie, Huazhong Yang. "Run-time Techniques for Simultaneous Aging and Power Optimization in GPGPUs," in the proceedings of the 51st Design Automation Conference (DAC), June, 2014.

[C27]

Jingyu Deng, Yun Liang, Guojie Luo, Guangyu Sun. "Rapid Design Space Exploration of Two-level Unified Caches," in the proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), June 2014.

[C26]

Swathi Gurumani, Jacob Tolar, Yao Chen, Yun Liang, Kyle Rupnow, Deming Chen. "Integrated CUDA-to-FPGA Synthesis with Network-on-Chip," in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2014.

[C25]

Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Dynamic Instruction Cache Locking," in the proceedings of the Design Automation and Test in Europe (DATE), March, 2014.

[C24]

Zhimin Wu, Yang Liu, Yun Liang, Jun Sun. "GPU Accelerated Counterexample Generation in LTL Model Checking," in the proceedings of the International Conference on Formal Engineering Methods (ICFEM), November, 2014.

[C23]

Xiaolong Xie, Yun Liang, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov, 2013.

[C22]

Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh. "Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor," in the proceedings of the IEEE International Conference on Big Data (BigData), Oct, 2013.

[C21]

Alexandros Papakonstantinou, Deming Chen, Wen Mei Hwu, Yun Liang, Jason Cong. "Throughput-Oriented Kernel Porting onto FPGAs," in the proceedings of the 50th Design Automation Conference (DAC), June, 2013.

[C20]

Huping Ding, Yun Liang, Tulika Mitra. "Integrated Instruction Cache Analysis and Locking in Multitasking Real-time Systems," in the proceedings of the 50th Design Automation Conference (DAC),  June, 2013.

[C19]

Wei Zuo, Yun Liang, Peng Li, Kyle Rupnow, Deming Chen, Jason Cong. "Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations," in the proceedings of the 21st ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2013.

[C18]

Swathi T. Gurumani, Hisham Cholakkal, Yun Liang, Kyle Rupnow, Deming Chen. "High-Level Synthesis of Multiple Dependent CUDA Kernels on FPGA," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013 (invited paper).

[C17]

Yun Liang, Zheng Cui, Kyle Rupnow, Deming Chen. "Register and Thread Structure Optimization for GPUs," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.

[C16]

Huping Ding, Yun Liang, Tulika Mitra. "Shared Cache Aware Task Mapping for WCRT Minimization," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.

[C15]

Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Partial Instruction Cache Locking," in the proceedings of ACM 49th Design Automation Conference (DAC), June 2012. Best Paper Award Nomination (7 out of 741 submissions).

[C14]

Zheng Cui, Yun Liang, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization," in the proceedings of IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2012.

[C13]

Yun Liang, Zheng Cui, Shengkui Zhao, Kyle Rupnow, Yihao Zhang, Douglas L. Jones, Deming Chen. "Real-time Implementation and Performance Optimization of 3D Sound Localization on GPUs," in the proceedings of Design Automation and Test in Europe (DATE), March 2012.

[C12]

Shengkui Zhao, Saima Ahmed, Yun Liang, Kyle Rupnow, Deming Chen, Douglas L Jones. "A real-time 3D sound localization system with miniature microphone array for virtual reality," in the proceedings of the 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), July 2012.

[C11]

Kyle Rupnow, Yun Liang, Yinan Li, Dongbo Min, Minh Do, Deming Chen. "High Level Synthesis of Stereo Matching: Productivity, Performance, and Software Constraints," in the proceedings of International Conference on Field Programmable Technology (FPT), December 2011. Best Paper Award Nomination (4 out of 94 submissions).

[C10]

Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-Mei W. Hwu, Jason Cong. "Multilevel Granularity Parallelism Synthesis on FPGAs," in the proceedings of the 19th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2011.  Best Paper Award (1 out of 119 submissions).

[C9]

Kyle Rupnow, Yun Liang, Yinan Li, Deming Chen. "A study of high-level synthesis: Promises and challenges," in the proceedings of IEEE 9th International Conference on ASIC (ASICON), October, 2011.

[C8]

Yun Liang, Tulika Mitra. "Improved procedure placement for set associative caches," in the proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems (CASES’10), October, 2010.

[C7]

Yun Liang, Tulika Mitra. "Instruction Cache Locking using Temporal Reuse Profile," in the proceedings of the ACM 47th Design Automation Conference (DAC), June 2010.

[C6]

Huynh Phung Huynh, Yun Liang, Tulika Mitra. "Efficient custom instructions generation for system-level design," in the proceedings of the International Conference on Field-Programmable Technology (FPT), December, 2010.

[C5]

Yan Li, Vivy Suhendra, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores," in the proceedings of the 30th IEEE Real-Time Systems Symposium (RTSS), December, 2009.

[C4]

Yun Liang, Tulika Mitra. "Static Analysis for Fast and Accurate Design Space Exploration of Caches," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008.

[C3]

Yun Liang, Lei Ju, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware Optimization of BAN Applications," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. Best Paper Award Nomination.

[C2]

Yun Liang, Tulika Mitra.  "Cache Modeling in Probabilistic Execution Time Analysis," in the proceedings of the 45th Design Automation Conference (DAC), June, 2008.

[C1]

Yun Liang, Abhik Roychoudhury, Tulika Mitra. "Timing analysis of body area network application," in the proceedings of the 7th International Workshop on Worst Case Execution Time Analysis (WCET), 2007.

 

 

PROFESSIONAL SERVICE

Editorial Board

  •   Associate Editor, ACM Transactions on Embedded Computing Systems (TECS), 2017-.

 

Conference Organizing Committee Member

  •   Special Session Organizer and Chair, the 18th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.     

  •   Subcommittee Chair.  System Level Synthesis and Optimization, the 19th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.     

  •   Publication Chair. 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.     

 

 

Conference Program Committee Member

  •   International Symposium on High-Performance Computer Architecture (HPCA), 2018.

  •   International Conference on Compiler Construction (CC), 2018.

  •   International Conference on High Performance Computing, Data, and Analytics (HiPC) 2017.

  •   International Conference on Parallel Architectures and Compilation Techniques (PACT) 2015, 2016.     

  •   International Conference on Computer Aided Design (ICCAD) 2016, 2017.     

  •   International Symposium on Code Generation and Optimization (CGO) 2017.     

  •   Asia South Pacific Design Automation Conference (ASP-DAC) 2012, 2013, 2014, 2016, 2017.     

  •   Design Automation and Test in Europe (DATE) 2013, 2014, 2015, 2016, 2017.     

  •   International Conference on Compilers Architecture and Synthesis for Embedded Systems (CASES) 2013, 2014, 2015, 2016.     

  •   IEEE International Conference on Computer Design (ICCD) 2016.     

 

COURSES

  •   Programming Practice (English),  2013, 2014, 2015, 2016, 2017.     

  •   Compiler Design, 2016, 2017.     

 

STUDENTS SUPERVISED

Ph.D Students.

1.

Xiaolong Xie: “GPU Optimization: Algorithms, Systems and Architecture”

Winner: Top 10 Academic Achievement Award 2016, Qualcomm PhD Scholarship 2016, Merit Student of Peking University 2015, National Graduate Scholarship 2015.

2013-present

2.

Xuechao Wei:  “Algorithms Accelerations using Systolic Array on FPGAs”. (co-advised with Prof Jason Cong)

2013-present

3.

Xiuhong Li:    “Accelerating Irregular Applications on GPUs”

Winner: National Graduate Scholarship 2016. Academic Excellence Award.

2014-present

4.

Shuo Wang:   “Performance Modeling for Heterogeneous Systems”

2015-present

5.

Qingcheng Xiao: TBD

2016-present

6.

Liqiang Lu: TBD

 

2017-present

Undergraduate Students. 

 

No.  Student          Graduation Year  Employment after Graduation
 1.  Xiaolong Xie     2013             PhD student at Peking University, China
 2.  Siyuan Ouyang    2013             Master student at CMU, USA
 3.  Jingyu Deng      2014             Master student at NYU, USA
 4.  Xiuhong Li       2014             PhD student at Peking University, China
 5.  Xibai Li         2015             Software Engineer at a startup, China
 6.  Ruizhe Zhao      2016             PhD student at Imperial College London, UK
 7.  Yudong Wu        2016             PhD student at UCSD, USA
 8.  Zhaowen Zou      2016             Master student at UCSD, USA
 9.  Qiqi Xiao        2016             Software Engineer at Face++, China
10.  Qingcheng Xiao   2016             PhD student at Peking University, China
11.  Qian Li          2017             PhD student at Stanford, USA
12.  Xinfeng Xie      2017             PhD student at UCSB, USA
13.  Liqiang Lu       2017             PhD student at Peking University, China
14.  Yilong Li        2017             Master student at Stanford, USA
15.  Dayou Du         2017             Master student at NYU, USA
16.  Han Qiu          2017             Software Engineer at Samsung, China

 

Top-10 School of EECS Bachelor Thesis Award Winner

 

  •   Ruizhe Zhao 2016, Xinfeng Xie 2017

 

 

Thesis Committee

 

  •   Peng Wang (PKU), Yuxin Wang (PKU), Chao Zhang (PKU), Chen Zhang (PKU)     

  •   Fubing Mao (Nanyang Technological University, Singapore)

 

Software Release

 CRAT:  A PTX-to-PTX compiler that enables register allocation for GPUs. CRAT stands for Coordinated Register Allocation and Thread-level parallelism. Register allocation on GPUs plays an important role in performance, as it affects both single-thread performance and thread-level parallelism. CRAT enables flexible register allocation at the compiler intermediate language level. CRAT has been downloaded by users from Michigan, CMU, and other universities. I am the lead faculty member for this project. URL:  http://ceca.pku.edu.cn/crat/

 
