Full-time Faculty

Yun (Eric) Liang (梁云)

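Assistant Professor, Center for Energy-efficient Computing and Applications, Peking University
Assistant Director, Center for Energy-efficient Computing and Applications, Peking University
Distinguished Research Fellow, School of Electronics Engineering and Computer Science, Peking University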
Phone: +86-10-6276-0779
Address: Room 515S, Science Building No. 5, Peking University, Beijing 100871
Email: ericlyun [at] pku.edu.cn

 

BIOGRAPHY

       Yun (Eric) Liang is an assistant professor in the School of EECS, Peking University, China. He received his PhD in Computer Science from the National University of Singapore in 2010 and worked as a Research Scientist at UIUC before joining PKU. His research focuses on heterogeneous computing, energy-efficient computing, computer architecture, compilation techniques, embedded system design, and real-time systems. He has authored over 50 scientific publications in premier international journals and conferences in these areas. His research has been recognized by a Best Paper Award at FCCM 2011 and Best Paper nominations at ASPDAC 2016, DAC 2012, FPT 2011, and CODES+ISSS 2008. Prof. Liang serves as an Associate Editor for ACM Transactions on Embedded Computing Systems (TECS) and on the program committees of premier conferences in related areas, including PACT, CGO, ICCAD, DATE, CASES, ASPDAC, and ICCD.

 

AWARDS AND HONORS

       ● Best Paper Award Nomination, Asia and South Pacific Design Automation Conference (ASP-DAC), January, 2016. 

       ● Best Paper Award Nomination, Design Automation Conference (DAC), June 2012. (7 nominations out of 741 submissions).

       ● Best Paper Award Nomination, International Conference on Field Programmable Technology (FPT), December 2011.

       ● Best Paper Award, IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2011. (1 out of 119 submissions).

       ● Best Paper Award Nomination, ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2008.

 

PUBLICATIONS

Journal Publications 

TECS17

Yun Liang, Xiuhong Li. "Efficient Kernel Management on GPUs," ACM Transactions on Embedded Computing Systems (TECS), to appear, June 2017.

 TCAD17

Yun Liang, Waiteng Tang, Ruizhe Zhao, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. "Scale-free Sparse Matrix-Vector Multiplication on Many-Core Architectures, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD).

 TCAD16   

Yun Liang, Muhammad T. Satria, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 7, July 2016.

 TVLSI16

Yao Chen, Swathi T. Gurumani, Yun Liang, Guofeng Li, Donghui Guo, Kyle Rupnow, Deming Chen. "FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow, " IEEE Transactions on Very Large Scale Integration Systems (TVLSI), Vol. 24, No. 6, pp. 2220–2233, June 2016.

 TCAD16

Ying Chen, Tan Nguyen, Yao Chen, Swathi Gurumani, Yun Liang, Kyle Rupnow, Jason Cong, Wen-mei Hwu, Deming Chen. "FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 12, April 2016.

 JCST16

Yun Liang, Shuo Wang. "Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs," Journal of Computer Science and Technology (JCST), Vol. 31, No. 1, January 2016.

 TPDS15

Mian Lu, Yun Liang, Huynh Phung Huynh, Zhongliang Ong, Bingsheng He, Rick Siow Mong Goh. "MrPhi: An Optimized MapReduce Framework on Intel Xeon Phi Coprocessors," IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 11, pp. 3066-3078, November 2015.

TCAD15

Yun Liang, Xiaolong Xie, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 10, pp. 1677-1690, October 2015.

TCAD15

Yun Liang, Tulika Mitra, Lei Ju. "Instruction Cache Locking using Temporal Reuse Profile," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 9, pp. 1387-1400, August 2015.

TPDS15

Yun Liang, Huynh Phung Huynh, Kyle Rupnow, Rick Siow Mong Goh, Deming Chen. "Efficient GPU Spatial-Temporal Multitasking," IEEE Transactions on Parallel and Distributed Systems (TPDS),  Vol. 26, No. 3, pp. 748-760, March 2015.

TECS13

Yun Liang, Tulika Mitra. "An Analytical Approach for Fast and Accurate Design Space Exploration of Instruction Caches," ACM Transactions on Embedded Computing Systems (TECS), 13(3), Article 43, December, 2013.

RTS12

Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, Vivy Suhendra. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-cores," Real-Time Systems Journal (RTS) 48(6), November, 2012.

JECE12

Yun Liang, Kyle Rupnow, Yinan Li, Dongbo Min, Minh Do, and Deming Chen. "High Level Synthesis: Productivity, Performance and Software Constraints," Journal of Electrical and Computer Engineering, Special Issue on ESL Design Methodology, Volume 2012 (2012), 649057, 2012.

JDAES09

Lei Ju, Yun Liang, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware optimization of BAN applications," Journal of Design Automation for Embedded Systems, Volume 13 (3), September, 2009.

SCP07

Xianfeng Li, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Chronos: A Timing Analyzer for Embedded Software,"  Science of Computer Programming, Special issue on Experimental Software and Toolkit, 69(1-3), December 2007.

 

Conference Papers

 

HPDC17

Xiaolong Xie, Wei Tan, Liana L. Fong, Yun Liang. "CUMF_SGD: Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUs," to appear in the proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing (HPDC), June 2017.

DAC17

Shuo Wang, Yun Liang. "A Comprehensive Framework for Synthesizing Stencil Algorithms on FPGAs using OpenCL Model, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

DAC17

Shuo Wang, Yun Liang, Wei Zhang. "FlexCL: An Analytical Performance Model for OpenCL Workloads on Flexible FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

DAC17

Qingcheng Xiao, Yun Liang, Liqiang Lu, Shengen Yan, Yu-Wing Tai. "Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

DAC17

Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, Yuxin Wang, Han Hu, Yun Liang, Jason Cong. "Automating the systolic array generation and optimizations for high throughput convolution neural network," to appear in the proceedings of the Design Automation Conference (DAC), June 2017.

FCCM17

Liqiang Lu, Yun Liang, Qingcheng Xiao and Shengen Yan. "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs," to appear in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2017.

DATE17

Guanwen Zhong, Alok Prakash, Siqi Wang, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of FPGA-based Accelerators with Multi-level Parallelism," to appear in the proceedings of the Design Automation and Test in Europe (DATE), March, 2017.

ASPDAC17

Xuechao Wei, Yun Liang, Tao Wang, Songwu Lu, Jason Cong. "Throughput Optimization for Streaming Applications on CPU-FPGA Heterogeneous Systems," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2017.

DAC16

Guanwen Zhong, Alok Prakash, Yun Liang, Tulika Mitra, Smail Niar. "Lin-Analyzer: A High-level Performance Analysis Tool for FPGA-based Accelerators,"  in the proceedings of the Design Automation Conference (DAC), June, 2016.

DATE16             

Xiuhong Li, Yun Liang. "Efficient Kernel Management on GPUs,"  in the proceedings of the Design Automation and Test in Europe (DATE), March, 2016.

ASPDAC16

Shuo Wang, Yun Liang, Chao Zhang, Xiaolong Xie, Guangyu Sun, Yongpan Liu, Yu Wang, Xiuhong Li. "Performance-centric Register File Design for GPUs using Racetrack Memory," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2016. Best Paper Award Nomination.

MICRO15

Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. "Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," in the proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.

MICRO15

Xian Zhang, Guangyu Sun, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, and Jia Di. "Fork Path: Improving Efficiency of ORAM by Removing Redundant Memory Accesses," in the proceedings of 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015.

ISCA15

Chao Zhang, Guangyu Sun, Xian Zhang, Weiqi Zhang, Weisheng Zhao, Tao Wang, Yun Liang, Yongpan Liu, Yu Wang, and Jiwu Shu, "Hi-fi Playback: Tolerating Position Errors in Shift Operations of Racetrack Memory, " in the proceedings of the 42nd International Symposium on Computer Architecture (ISCA), June 2015.

HPCA15

Xiaolong Xie, Yun Liang, Yu Wang, Guangyu Sun, Tao Wang. "Coordinated Static and Dynamic Cache Bypassing on GPUs,"  in the proceedings of 21st IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2015.

CGO15

Waiteng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh. "Optimizing and Auto-Tuning Scale-Free Sparse Matrix-Vector Multiplication on Intel Xeon Phi, " in the proceedings of the International Symposium on Code Generation and Optimization (CGO), February 2015.

ICCD14

Guanwen Zhong, Vanchinathan Venkataramani, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of Multiple Loops on FPGAs using High Level Synthesis," in the proceedings of IEEE International Conference on Computer Design (ICCD), October 2014.

DAC14

Xiaoming Chen, Yu Wang, Yun Liang, Yuan Xie, Huazhong Yang. "Run-time Techniques for Simultaneous Aging and Power Optimization in GPGPUs," in the proceedings of the 51st Design Automation Conference (DAC), June, 2014.

ISCAS14

Jingyu Deng, Yun Liang, Guojie Luo, Guangyu Sun. "Rapid Design Space Exploration of Two-level Unified Caches," in the proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), June 2014.

FCCM14

Swathi Gurumani, Jacob Tolar, Yao Chen, Yun Liang, Kyle Rupnow, Deming Chen. "Integrated CUDA-to-FPGA Synthesis with Network-on-Chip," in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2014.

DATE14

Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Dynamic Instruction Cache Locking," in the proceedings of the Design Automation and Test in Europe (DATE), March, 2014.

ICFEM14

Zhimin Wu, Yang Liu, Yun Liang, Jun Sun. "GPU Accelerated Counterexample Generation in LTL Model Checking," in the proceedings of the International Conference on Formal Engineering Methods (ICFEM), November, 2014.

ICCAD13

Xiaolong Xie, Yun Liang, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," in the proceedings of the International Conference on Computer Aided Design (ICCAD), November, 2013.

BigData13

Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh. "Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor," in the proceedings of the IEEE International Conference on Big Data (BigData), October, 2013.

DAC13

Alexandros Papakonstantinou, Deming Chen, Wen-mei Hwu, Yun Liang, Jason Cong. "Throughput-Oriented Kernel Porting onto FPGAs," in the proceedings of the 50th Design Automation Conference (DAC), June, 2013.

DAC13

Huping Ding, Yun Liang, Tulika Mitra. "Integrated Instruction Cache Analysis and Locking in Multitasking Real-time Systems," in the proceedings of the 50th Design Automation Conference (DAC),  June, 2013.

FPGA13

Wei Zuo, Yun Liang, Peng Li, Kyle Rupnow, Deming Chen, Jason Cong. "Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations," in the proceedings of the 21st ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2013.

ASPDAC13

Swathi T. Gurumani, Hisham Cholakkal, Yun Liang, Kyle Rupnow, Deming Chen. "High-Level Synthesis of Multiple Dependent CUDA Kernels on FPGA," in the proceedings of the 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013 (invited paper).

ASPDAC13

Yun Liang, Zheng Cui, Kyle Rupnow, Deming Chen. "Register and Thread Structure Optimization for GPUs," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.

ASPDAC13

Huping Ding, Yun Liang, Tulika Mitra. "Shared Cache Aware Task Mapping for WCRT Minimization," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013.

DAC12

Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Partial Instruction Cache Locking," in the proceedings of ACM 49th Design Automation Conference (DAC), June 2012. Best Paper Award Nomination (7 out of 741 submissions).

IPDPS12

Zheng Cui, Yun Liang, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization," in the proceedings of IEEE International Parallel & Distributed Processing Symposium (IPDPS), May, 2012.

DATE12

Yun Liang, Zheng Cui, Shengkui Zhao, Kyle Rupnow, Yihao Zhang, Douglas L. Jones, Deming Chen. "Real-time Implementation and Performance Optimization of 3D Sound Localization on GPUs," in the proceedings of Design Automation and Test in Europe (DATE), March, 2012.

FPT11

Kyle Rupnow, Yun Liang, Yinan Li, Dongbo Min, Minh Do, Deming Chen. "High Level Synthesis of Stereo Matching: Productivity, Performance, and Software Constraints, " in the proceedings of International Conference on Field Programmable Technology (FPT), December 2011. Best Paper Award Nomination (4 out of 94 submissions).

FCCM11

Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-Mei W. Hwu, Jason Cong. "Multilevel Granularity Parallelism Synthesis on FPGAs," in the proceedings of the 19th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2011.  Best Paper Award (1 out of 119 submissions).

DAC10

Yun Liang, Tulika Mitra. "Instruction Cache Locking using Temporal Reuse Profile," in the proceedings of the ACM 47th Design Automation Conference (DAC), June 2010.

FPT10

Huynh Phung Huynh, Yun Liang, Tulika Mitra. "Efficient custom instructions generation for system-level design," in the proceedings of the International Conference on Field-Programmable Technology (FPT), December, 2010.

CASES10

Yun Liang, Tulika Mitra. "Improved procedure placement for set associative caches," in the proceedings of the 2010 International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES), October, 2010.

RTSS09

Yan Li, Vivy Suhendra, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores," in the proceedings of the 30th IEEE Real-Time Systems Symposium (RTSS), December, 2009.

CODES08

Yun Liang, Tulika Mitra. "Static Analysis for Fast and Accurate Design Space Exploration of Caches," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008.

CODES08

Yun Liang, Lei Ju, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware Optimization of BAN Applications," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. Best Paper Award Nomination.

DAC08

Yun Liang, Tulika Mitra.  "Cache Modeling in Probabilistic Execution Time Analysis," in the proceedings of the 45th Design Automation Conference (DAC), June, 2008.

 

 

PROFESSIONAL SERVICE

Editorial Board

   ● Associate Editor, ACM Transactions on Embedded Computing Systems (TECS), 2017.

 

Conference Organizing Committee Member

   ● Special Session Organizer and Chair, the 18th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.

   ● Subcommittee Chair.  System Level Synthesis and Optimization, the 19th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.

   ● Publication Chair. 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.

 

 

Conference Program Committee Member

   ● International Conference on Parallel Architectures and Compilation Techniques (PACT) 2015, 2016.

   ● International Conference on Computer Aided Design (ICCAD) 2016, 2017.

   ● International Symposium on Code Generation and Optimization (CGO) 2017.

   ● Asia South Pacific Design Automation Conference (ASP-DAC) 2012, 2013, 2014, 2016, 2017.

   ● Design Automation and Test in Europe (DATE) 2013, 2014, 2015, 2016, 2017.

   ● International Conference on Compilers Architecture and Synthesis for Embedded Systems (CASES) 2013, 2014, 2015, 2016.

   ● IEEE International Conference on Computer Design (ICCD) 2016.

 

COURSES

   ● Programming Practice (English),  2013, 2014, 2015, 2016, 2017.

   ● Compiler Design, 2016, 2017.

 

STUDENTS SUPERVISED

Ph.D. Students.

1. Xiaolong Xie: “GPU Optimization: Algorithms, Systems and Architecture”. 2013-present.
   Winner: Top 10 Academic Achievement Award 2016, Qualcomm PhD Scholarship 2016, Merit Student of Peking University 2015, National Graduate Scholarship 2015.

2. Xiuhong Li: TBD. 2014-present.
   Winner: National Graduate Scholarship 2016, Academic Excellence Award.

3. Xuechao Wei: “Algorithms Accelerations using Systolic Array” (co-advised with Prof. Jason Cong). 2014-present.

4. Shuo Wang: “Performance Modeling for Heterogeneous Systems”. 2015-present.

5. Qingcheng Xiao: “Energy-efficient Machine Learning Designs”. 2016-present.

Undergraduate Students. 

 

No.  Student         Graduation Year  Employment after Graduation
1.   Xiaolong Xie    2013             PhD student at Peking University, China
2.   Siyuan Ouyang   2013             Master's student at CMU, USA
3.   Jingyu Deng     2014             Master's student at NYU, USA
4.   Xiuhong Li      2014             PhD student at Peking University, China
5.   Xibai Li        2015             Software Engineer at a startup, China
6.   Ruizhe Zhao     2016             PhD student at Imperial College London, UK
7.   Yudong Wu       2016             PhD student at UCSD, USA
8.   Zhaowen Zou     2016             Master's student at UCSD, USA
9.   Qiqi Xiao       2016             Software Engineer at Face++, China
10.  Qingcheng Xiao  2016             PhD student at Peking University, China

 

Software Release

 CRAT: A PTX-to-PTX compiler that enables register allocation for GPUs. CRAT stands for Coordinated Register Allocation and Thread-level parallelism. Register allocation on GPUs plays an important role in performance because it affects both single-thread performance and thread-level parallelism. CRAT enables flexible register allocation at the compiler intermediate-language level. CRAT has been downloaded by users from Michigan, CMU, and other universities. I am the lead faculty member for this project. URL: http://ceca.pku.edu.cn/crat/