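Assistant Professor, Center for Energy-efficient Computing and Applications, Peking University
Assistant Director, Center for Energy-efficient Computing and Applications, Peking University
Research Professor (new faculty system), School of Electronics Engineering and Computer Science, Peking University
Phone: +86-10-6276-0779
Address: Room 518N, Science Building No. 5, Peking University, Beijing 100871
Email: ericlyun [at] pku.edu.cn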
BIOGRAPHY
Yun (Eric) Liang is an assistant professor in the School of EECS, Peking University, China. He received his PhD in Computer Science from the National University of Singapore in 2010 and worked as a Research Scientist at UIUC before joining PKU. His research focuses on heterogeneous computing, energy-efficient computing, computer architecture, compilation techniques, embedded system design, and real-time systems. He has authored over 60 scientific publications in premier international journals and conferences in these areas. His research has been recognized with Best Paper Awards at FCCM 2011 and ICCAD 2017 and Best Paper nominations at DAC 2016, ASPDAC 2016, DAC 2012, FPT 2011, and CODES+ISSS 2008. Prof. Liang serves as an Associate Editor for ACM Transactions on Embedded Computing Systems (TECS) and on the program committees of premier conferences in related areas, including HPCA, PACT, CGO, ICCAD, CC, DATE, CASES, ASPDAC, and ICCD.
AWARDS AND HONORS
Best Paper Award, International Conference on Computer Aided Design (ICCAD), November, 2017.
Best Paper Award Nomination, Design Automation Conference (DAC), June 2016.
(14 nominations out of 676 submissions).
Best Paper Award Nomination, Asia and South Pacific Design Automation Conference (ASP-DAC), January, 2016.
Best Paper Award Nomination, Design Automation Conference (DAC), June 2012.
(7 nominations out of 741 submissions).
Best Paper Award Nomination, International Conference on Field Programmable Technology (FPT), December 2011.
Best Paper Award, IEEE International Symposium on Field-Programmable Custom Computing Machines 2011 (FCCM), May 2011. (1 out of 119 submissions)
Best Paper Award Nomination, ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2008.
PUBLICATIONS
Journal Publications
[J18] |
Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. "CRAT: Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," IEEE Transactions on Computers (TC). |
[J17] |
Yun Liang, Xiaolong Xie, Yu Wang, Guangyu Sun, Tao Wang. "Optimizing Cache Bypassing and Warp Scheduling for GPUs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD). |
[J16] |
Xinfeng Xie, Dayou Du, Qian Li, Yun Liang, Wai Teng Tang, Zhong Liang Ong, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. "Exploiting Sparsity to Accelerate Fully Connected Layers of CNN-based Applications on Mobile SoCs," ACM Transactions on Embedded Computing Systems (TECS). |
[J15] |
Yun Liang, Waiteng Tang, Ruizhe Zhao, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. "Scale-free Sparse Matrix-Vector Multiplication on Many-Core Architectures," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD). |
[J14] |
Yun Liang, Xiuhong Li. “Efficient Kernel Management on GPUs.” ACM Transactions on Embedded Computing Systems (TECS), Vol 16, Issue 4, May 2017. |
[J13] |
Yun Liang, Muhammad T. Satria, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 7, July 2016. |
[J12] |
Yao Chen, Swathi T. Gurumani, Yun Liang, Guofeng Li, Donghui Guo, Kyle Rupnow, Deming Chen. "FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow," IEEE Transactions on Very Large Scale Integration Systems (TVLSI), Vol. 24, No. 6, pp. 2220–2233, June 2016. |
[J11] |
Ying Chen, Tan Nguyen, Yao Chen, Swathi Gurumani, Yun Liang, Kyle Rupnow, Jason Cong, Wen-mei Hwu, Deming Chen. "FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 12, April 2016. |
[J10] |
Yun Liang, Shuo Wang. "Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs," Journal of Computer Science and Technology (JCST), Vol. 31, No. 1, January 2016. |
[J9] |
Mian Lu, Yun Liang, Huynh Phung Huynh, Zhongliang Ong, Bingsheng He, Rick Siow Mong Goh. "MrPhi: An Optimized MapReduce Framework on Intel Xeon Phi Coprocessors," IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 11, pp. 3066-3078, November 2015. |
[J8] |
Yun Liang, Xiaolong Xie, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 10, pp. 1677-1690, October 2015. |
[J7] |
Yun Liang, Tulika Mitra, Lei Ju. "Instruction Cache Locking using Temporal Reuse Profile," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 9, pp. 1387-1400, August 2015. |
[J6] |
Yun Liang, Huynh Phung Huynh, Kyle Rupnow, Rick Siow Mong Goh, Deming Chen. "Efficient GPU Spatial-Temporal Multitasking," IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 3, pp. 748-760, March 2015. |
[J5] |
Yun Liang, Tulika Mitra. "An Analytical Approach for Fast and Accurate Design Space Exploration of Instruction Caches," ACM Transactions on Embedded Computing Systems (TECS), 13(3), Article 43, December, 2013. |
[J4] |
Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, Vivy Suhendra. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-cores," Real-Time Systems Journal (RTS) 48(6), November, 2012. |
[J3] |
Yun Liang, Kyle Rupnow, Yinan Li, Dongbo Min, Minh Do, and Deming Chen. "High Level Synthesis: Productivity, Performance and Software Constraints," Journal of Electrical and Computer Engineering, Special Issue on ESL Design Methodology, Volume 2012 (2012), 649057, 2012. |
[J2] |
Lei Ju, Yun Liang, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware optimization of BAN applications," Journal of Design Automation for Embedded Systems, Volume 13 (3), September, 2009. |
[J1] |
Xianfeng Li, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Chronos: A Timing Analyzer for Embedded Software," Science of Computer Programming, Special issue on Experimental Software and Toolkit, 69(1-3), December 2007. |
Conference Papers
[C51] |
Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Qinru Qiu, Yanzhi Wang, Yun Liang. "C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs," to appear in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Feb 2018. |
[C50] |
Yun Liang, Xiuhong Li, Xiaolong Xie. "Exploring Cache Bypassing and Partitioning for MultiTasking on GPUs," in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. |
[C49] |
Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang. "A Hybrid Approach to Cache Management in Heterogeneous CPU-FPGA Platforms," in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017 (invited paper). |
[C48] |
Jieru Zhao, Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang, Bingsheng He. "COMBA: A Comprehensive Model-Based Analysis Framework for High Level Synthesis of Real Applications," in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. Best Paper Award. |
[C47] |
Xiaolong Xie, Wei Tan, Liana L. Fong, Yun Liang. "CUMF_SGD: Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUs," to appear in the proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing (HPDC), June 2017. |
[C46] |
Shuo Wang, Yun Liang. "A Comprehensive Framework for Synthesizing Stencil Algorithms on FPGAs using OpenCL Model," to appear in the proceedings of the Design Automation Conference (DAC), June 2017. |
[C45] |
Shuo Wang, Yun Liang, Wei Zhang. "FlexCL: An Analytical Performance Model for OpenCL Workloads on Flexible FPGAs," to appear in the proceedings of the Design Automation Conference (DAC), June 2017. |
[C44] |
Qingcheng Xiao, Yun Liang, Liqiang Lu, Shengen Yan, Yu-Wing Tai. "Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs," to appear in the proceedings of the Design Automation Conference (DAC), June 2017. |
[C43] |
Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, Yuxin Wang, Han Hu, Yun Liang, Jason Cong. "Automating the systolic array generation and optimizations for high throughput convolution neural network," to appear in the proceedings of the Design Automation Conference (DAC), June 2017. Best Paper Award Nomination. |
[C42] |
Liqiang Lu, Yun Liang, Qingcheng Xiao and Shengen Yan. "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs," to appear in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2017. |
[C41] |
Guanwen Zhong, Alok Prakash, Siqi Wang, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of FPGA-based Accelerators with Multi-level Parallelism," to appear in the proceedings of the Design Automation and Test in Europe (DATE), March, 2017. |
[C40] |
Xuechao Wei, Yun Liang, Tao Wang, Songwu Lu, Jason Cong. "Throughput Optimization for Streaming Applications on CPU-FPGA Heterogeneous Systems," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2017. |
[C39] |
Guanwen Zhong, Alok Prakash, Yun Liang, Tulika Mitra, Smail Niar. "Lin-Analyzer: A High-level Performance Analysis Tool for FPGA-based Accelerators," in the proceedings of the Design Automation Conference (DAC), June, 2016. |
[C38] |
Xiuhong Li, Yun Liang. "Efficient Kernel Management on GPUs," in the proceedings of the Design Automation and Test in Europe (DATE), March, 2016. |
[C37] |
Shuo Wang, Yun Liang, Chao Zhang, Xiaolong Xie, Guangyu Sun, Yongpan Liu, Yu Wang, Xiuhong Li. "Performance-centric Register File Design for GPUs using Racetrack Memory," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2016. Best Paper Award Nomination. |
[C36] |
Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. "Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," in the proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015. |
[C35] |
Xian Zhang, Guangyu Sun, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, and Jia Di. "Fork Path: Improving Efficiency of ORAM by Removing Redundant Memory Accesses," in the proceedings of 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015. |
[C34] |
Yun Liang, Shuo Wang. "Quantitative Performance and Power Analysis of LTE using High Level Synthesis," in the proceedings of International Conference on ASIC, November, 2015 (invited paper). |
[C33] |
Xuechao Wei, Yun Liang, Xibai Li, Tao Wang, Songwu Lu, Jason Cong. "Evaluation of Software Defined Radio on Heterogeneous Systems," in the proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT), 2015 (Poster). |
[C32] |
Chao Zhang, Guangyu Sun, Xian Zhang, Weiqi Zhang, Weisheng Zhao, Tao Wang, Yun Liang, Yongpan Liu, Yu Wang, and Jiwu Shu. "Hi-fi Playback: Tolerating Position Errors in Shift Operations of Racetrack Memory," in the proceedings of the 42nd International Symposium on Computer Architecture (ISCA), June 2015. |
[C31] |
Xiaolong Xie, Yun Liang, Yu Wang, Guangyu Sun, Tao Wang. "Coordinated Static and Dynamic Cache Bypassing on GPUs," in the proceedings of 21st IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2015. |
[C30] |
Waiteng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh. "Optimizing and Auto-Tuning Scale-Free Sparse Matrix-Vector Multiplication on Intel Xeon Phi," in the proceedings of the International Symposium on Code Generation and Optimization (CGO), February 2015. |
[C29] |
Guanwen Zhong, Vanchinathan Venkataramani, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of Multiple Loops on FPGAs using High Level Synthesis," in the proceedings of IEEE International Conference on Computer Design (ICCD), October 2014. |
[C28] |
Xiaoming Chen, Yu Wang, Yun Liang, Yuan Xie, Huazhong Yang. "Run-time Techniques for Simultaneous Aging and Power Optimization in GPGPUs," in the proceedings of the 51st Design Automation Conference (DAC), June, 2014. |
[C27] |
Jingyu Deng, Yun Liang, Guojie Luo, Guangyu Sun. "Rapid Design Space Exploration of Two-level Unified Caches," in the proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), June 2014. |
[C26] |
Swathi Gurumani, Jacob Tolar, Yao Chen, Yun Liang, Kyle Rupnow, Deming Chen. "Integrated CUDA-to-FPGA Synthesis with Network-on-Chip," in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2014. |
[C25] |
Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Dynamic Instruction Cache Locking," in the proceedings of the Design Automation and Test in Europe (DATE), March, 2014. |
[C24] |
Zhimin Wu, Yang Liu, Yun Liang, Jun Sun. "GPU Accelerated Counterexample Generation in LTL Model Checking," in the proceedings of the International Conference on Formal Engineering Methods (ICFEM), November, 2014. |
[C23] |
Xiaolong Xie, Yun Liang, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov, 2013. |
[C22] |
Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh. "Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor," in the proceedings of the IEEE International Conference on Big Data (BigData), Oct, 2013. |
[C21] |
Alexandros Papakonstantinou, Deming Chen, Wen Mei Hwu, Yun Liang, Jason Cong. "Throughput-Oriented Kernel Porting onto FPGAs," in the proceedings of the 50th Design Automation Conference (DAC), June, 2013. |
[C20] |
Huping Ding, Yun Liang, Tulika Mitra. "Integrated Instruction Cache Analysis and Locking in Multitasking Real-time Systems," in the proceedings of the 50th Design Automation Conference (DAC), June, 2013. |
[C19] |
Wei Zuo, Yun Liang, Peng Li, Kyle Rupnow, Deming Chen, Jason Cong. "Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations," in the proceedings of the 21st ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2013. |
[C18] |
Swathi T. Gurumani, Hisham Cholakkal, Yun Liang, Kyle Rupnow, Deming Chen. "High-Level Synthesis of Multiple Dependent CUDA Kernels on FPGA," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013 (invited paper). |
[C17] |
Yun Liang, Zheng Cui, Kyle Rupnow, Deming Chen. "Register and Thread Structure Optimization for GPUs," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013. |
[C16] |
Huping Ding, Yun Liang, Tulika Mitra. "Shared Cache Aware Task Mapping for WCRT Minimization," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013. |
[C15] |
Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Partial Instruction Cache Locking," in the proceedings of ACM 49th Design Automation Conference (DAC), June 2012. Best Paper Award Nomination (7 out of 741 submissions). |
[C14] |
Zheng Cui, Yun Liang, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization," in the proceedings of IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2012. |
[C13] |
Yun Liang, Zheng Cui, Shengkui Zhao, Kyle Rupnow, Yihao Zhang, Douglas L. Jones, Deming Chen. "Real-time Implementation and Performance Optimization of 3D Sound Localization on GPUs," in the proceedings of Design Automation and Test in Europe (DATE), March 2012. |
[C12] |
Shengkui Zhao, Saima Ahmed, Yun Liang, Kyle Rupnow, Deming Chen, Douglas L Jones. "A real-time 3D sound localization system with miniature microphone array for virtual reality," in the proceedings of 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), July 2012. |
[C11] |
Kyle Rupnow, Yun Liang, Yinan Li, Dongbo Min, Minh Do, Deming Chen. "High Level Synthesis of Stereo Matching: Productivity, Performance, and Software Constraints," in the proceedings of International Conference on Field Programmable Technology (FPT), December 2011. Best Paper Award Nomination (4 out of 94 submissions). |
[C10] |
Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-Mei W. Hwu, Jason Cong. "Multilevel Granularity Parallelism Synthesis on FPGAs," in the proceedings of the 19th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2011. Best Paper Award (1 out of 119 submissions). |
[C9] |
Kyle Rupnow, Yun Liang, Yinan Li, Deming Chen. "A study of high-level synthesis: Promises and challenges," in the proceedings of IEEE 9th International Conference on ASIC (ASICON), October, 2011. |
[C8] |
Yun Liang, Tulika Mitra. "Improved procedure placement for set associative caches," in the proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems (CASES’10), October, 2010. |
[C7] |
Yun Liang, Tulika Mitra. "Instruction Cache Locking using Temporal Reuse Profile," in the proceedings of the ACM 47th Design Automation Conference (DAC), June 2010. |
[C6] |
Huynh Phung Huynh, Yun Liang, Tulika Mitra. "Efficient custom instructions generation for system-level design," in the proceedings of the International Conference on Field-Programmable Technology (FPT), December, 2010. |
[C5] |
Yan Li, Vivy Suhendra, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores," in the proceedings of the 30th IEEE Real-Time Systems Symposium (RTSS), December, 2009. |
[C4] |
Yun Liang, Tulika Mitra. "Static Analysis for Fast and Accurate Design Space Exploration of Caches," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. |
[C3] |
Yun Liang, Lei Ju, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware Optimization of BAN Applications," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. Best Paper Award Nomination. |
[C2] |
Yun Liang, Tulika Mitra. "Cache Modeling in Probabilistic Execution Time Analysis," in the proceedings of the 45th Design Automation Conference (DAC), June, 2008. |
[C1] |
Yun Liang, Abhik Roychoudhury, Tulika Mitra. "Timing analysis of body area network application," in the proceedings of the 7th International Workshop on Worst Case Execution Time Analysis (WCET), 2007. |
PROFESSIONAL SERVICE
Editorial Board
Associate Editor, ACM Transactions on Embedded Computing Systems (TECS), 2017-.
Conference Organizing Committee Member
Special Session Organizer and Chair, the 18th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.
Subcommittee Chair. System Level Synthesis and Optimization, the 19th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.
Publication Chair. 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.
Conference Program Committee Member
International Symposium on High-Performance Computer Architecture (HPCA), 2018.
International Conference on Compiler Construction (CC), 2018.
International Conference on High Performance Computing, Data, and Analytics (HiPC) 2017.
International Conference on Parallel Architectures and Compilation Techniques (PACT) 2015, 2016.
International Conference on Computer Aided Design (ICCAD) 2016, 2017.
International Symposium on Code Generation and Optimization (CGO) 2017.
Asia South Pacific Design Automation Conference (ASP-DAC) 2012, 2013, 2014, 2016, 2017.
Design Automation and Test in Europe (DATE) 2013, 2014, 2015, 2016, 2017.
International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) 2013, 2014, 2015, 2016.
IEEE International Conference on Computer Design (ICCD) 2016.
COURSES
Programming Practice (English), 2013, 2014, 2015, 2016, 2017.
Compiler Design, 2016, 2017.
STUDENTS SUPERVISED
Ph.D Students.
1. | Xiaolong Xie: “GPU Optimization: Algorithms, Systems and Architecture” Winner: Top 10 Academic Achievement Award 2016, Qualcomm PhD Scholarship 2016, Merit Student of Peking University 2015, National Graduate Scholarship 2015. | 2013-present
2. | Xuechao Wei: “Algorithms Accelerations using Systolic Array on FPGAs”. (co-advised with Prof Jason Cong) | 2013-present
3. | Xiuhong Li: “Accelerating Irregular Applications on GPUs” Winner: National Graduate Scholarship 2016. Academic Excellence Award. | 2014-present
4. | Shuo Wang: “Performance Modeling for Heterogeneous Systems” | 2015-present
5. | Qingcheng Xiao: TBD | 2016-present
6. | Liqiang Lu: TBD | 2017-present
Undergraduate Students.
No. | Student | Graduation Year | Employment after Graduation
1. | Xiaolong Xie | 2013 | PhD student at Peking University, China
2. | Siyuan Ouyang | 2013 | Master's student at CMU, USA
3. | Jingyu Deng | 2014 | Master's student at NYU, USA
4. | Xiuhong Li | 2014 | PhD student at Peking University, China
5. | Xibai Li | 2015 | Software Engineer at a startup, China
6. | Ruizhe Zhao | 2016 | PhD student at Imperial College London, UK
7. | Yudong Wu | 2016 | PhD student at UCSD, USA
8. | Zhaowen Zou | 2016 | Master's student at UCSD, USA
9. | Qiqi Xiao | 2016 | Software Engineer at Face++, China
10. | Qingcheng Xiao | 2016 | PhD student at Peking University, China
11. | Qian Li | 2017 | PhD student at Stanford, USA
12. | Xinfeng Xie | 2017 | PhD student at UCSB, USA
13. | Liqiang Lu | 2017 | PhD student at Peking University, China
14. | Yilong Li | 2017 | Master's student at Stanford, USA
15. | Dayou Du | 2017 | Master's student at NYU, USA
16. | Han Qiu | 2017 | Software Engineer at Samsung, China
Top-10 School of EECS Bachelor Thesis Award Winner
Ruizhe Zhao 2016, Xinfeng Xie 2017
Thesis Committee
北京大学高能效计算与应用中心助理教授
北京大学高能效计算与应用中心助理主任 (Assistant director)
北京大学信息科学技术学院新体制研究员
电话:+86-10-6276-0779
地址:北京大学理科5号楼518N室, 100871
邮箱:ericlyun [at] pku.edu.cn
BIOGRAPHY
Yun (Eric) Liang is an assistant professor in School of EECS, Peking University, China. He received his PhD in Computer Science from the National University of Singapore in 2010 and worked as a Research Scientist in UIUC before he joins PKU. His research focuses on heterogeneous computing, energy-efficient computing, computer architecture, compilation techniques, embedded system design, and real-time system. He has authored over 60 scientific publications in premier international journals and conferences in this domain. His research has been recognized by best paper award at FCCM 2011 and ICCAD 2017 and best paper nominations at DAC 2016, ASPDAC 2016, DAC 2012, FPT 2011, CODES+ISSS 2008. Prof Liang serves as Associate Editor for ACM Transactions in Embedded Computing Systems (TECS) and serves in the program committees in the premier conferences in the related domain including (HPCA, PACT, CGO, ICCAD, CC, DATE, CASES, ASPDAC, ICCD).
AWARDS AND HONORS
Best Paper Award, International Conference on Computer Aided Design (ICCAD), November, 2017.
Best Paper Award Nomination, Design Automation Conference (DAC), June 2016.
(14 nominations out of 676 submissions).
Best Paper Award Nomination, Asia and South Pacific Design Automation Conference (ASP-DAC), January, 2016.
Best Paper Award Nomination, Design Automation Conference (DAC), June 2012.
(7 nominations out of 741 submissions).
Best Paper Award Nomination, International Conference on Field Programmable Technology (FPT), December 2011.
Best Paper Award, IEEE International Symposium on Field-Programmable Custom Computing Machines 2011 (FCCM), May 2011. (1 out of 119 submissions)
Best Paper Award Nomination, ACM International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October 2008.
PUBLICATIONS
Journal Publications
[J18] |
Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. ""CRAT: Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," IEEE Transactions on Computer (TC). |
[J17] |
Yun Liang, Xiaolong Xie, Yu Wang, Guangyu Sun, Tao Wang. ""Optimizing Cache Bypassing and Warp Scheduling for GPUs, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD). |
[J16] |
Xinfeng Xie, Dayou Du, Qian Li, Yun Liang, Wai Teng Tang, Zhong Liang Ong,Mian Lu, Huynh Phung Huynh , Rick Siow Mong Goh. "Exploiting Sparsity to Accelerate Fully Connected Layers of CNN-based Applications on Mobile SoCs, " ACM Transactions on Embedded Computing Systems (TECS). |
[J15] |
Yun Liang, Waiteng Tang, Ruizhe Zhao, Mian Lu, Huynh Phung Huynh, Rick Siow Mong Goh. "Scale-free Sparse Matrix-Vector Multiplication on Many-Core Architectures, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems(TCAD). |
[J14] |
Yun Liang, Xiuhong Li. “Efficient Kernel Management on GPUs.” ACM Transactions on Embedded Computing Systems (TECS), Vol 16, Issue 4, May 2017. |
[J13] |
Yun Liang, Muhammad T. Satria, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization, " IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 7, July 2016. |
[J12] |
Yao Chen, Swathi T. Gurumani, Yun Liang, Guofeng Li, Donghui Guo, Kyle Rupnow, Deming Chen. "FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow, " IEEE Transactions on Very Large Scale Integration Systems (TVLSI), Vol. 24, No. 6, pp. 2220–2233, June 2016. |
[J11] |
Ying Chen, Tan Nguyen, Yao Chen, Swathi Gurumani, Yun Liang, Kyle Rupnow, Jason Cong, Wen-mei Hwu, Deming Chen. "FCUDA-HB: Hierarchical and Scalable Bus Architecture Generation on FPGAs With the FCUDA Flow", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 35, No. 12, April 2016. |
[J10] |
Yun Liang, Shuo Wang. "Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs, " Journal of Computer Science and Technology (JCST), Vol 31, No.1, Janurary. 2016. |
[J9] |
Mian Lu, Yun Liang, Huynh Phung Huynh, Zhongliang Ong, Bingsheng He, Rick Siow Mong Goh. "MrPhi : An Optimized MapReduce Framework on Intel Xeon Phi Coprocessors, " IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 11, pp. 3066-3078, November 2015. |
[J8] |
Yun Liang, Xiaolong Xie, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 10, pp. 1677-1690, October 2015. |
[J7] |
Yun Liang, Tulika Mitra, Lei Ju. "Instruction Cache Locking using Temporal Reuse Profile," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 34, No. 9, pp. 1387-1400, August 2015. |
[J6] |
Yun Liang, Huynh Phung Huynh, Kyle Rupnow, Rick Siow Mong Goh, Deming Chen. "Efficient GPU Spatial-Temporal Multitasking," IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 26, No. 3, pp. 748-760, March 2015. |
[J5] |
Yun Liang, Tulika Mitra. "An Analytical Approach for Fast and Accurate Design Space Exploration of Instruction Caches," ACM Transactions on Embedded Computing Systems (TECS), 13(3), Article 43, December, 2013. |
[J4] |
Yun Liang, Huping Ding, Tulika Mitra, Abhik Roychoudhury, Yan Li, Vivy Suhendra. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-cores," Real-Time Systems Journal (RTS) 48(6), November, 2012. |
[J3] |
Yun Liang, Kyle Rupnow, Yinan Li, Dongbo Min, Minh Do, and Deming Chen. "High Level Synthesis: Productivity, Performance and Software Constraints, " Journal of Electrical and Computer Engineering, Special Issue on ESL Design Methodology, Volume 2012 (2012), 649057, 2012. |
[J2] |
Lei Ju, Yun Liang, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware optimization of BAN applications," Journal of Design Automation for Embedded System, Volume 13 (3), September, 2009. |
[J1] |
Xianfeng Li, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Chronos: A Timing Analyzer for Embedded Software," Science of Computer Programming, Special issue on Experimental Software and Toolkit, 69(1-3), December 2007. |
Conference Papers
[C51] |
Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Qinru Qiu, Yanzhi Wang ,Yun Liang. "C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs, " to appear in the proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Feb 2018. |
[C50] |
Yun Liang, Xiuhong Li, Xiaolong Xie. "Exploring Cache Bypassing and Partitioning for MultiTasking on GPUs, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. |
[C49] |
Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang. ""A Hybrid Approach to Cache Management in Heterogeneous CPU-FPGA Platforms, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. (invited paper). |
[C48] |
Jieru Zhao, Liang Feng, Sharad Sinha, Wei Zhang, Yun Liang, Bingsheng He. "COMBA: A Comprehensive Model-Based Analysis Framework for High Level Synthesis of Real Applications, " in the proceedings of International Conference on Computer Aided Design (ICCAD), Nov 2017. Best Paper Award. |
[C47] |
Xiaolong Xie, Wei Tan, Liana L. Fong,Yun Liang. "CUMF_SGD:Parallelized Stochastic Gradient Descent for Matrix Factorization on GPUS, " to appear in the proceedings of the 26th International Symposium on High Performance Parallel and Distributed Computing (HPDC), June 2017. |
[C46] |
Shuo Wang, Yun Liang. "A Comprehensive Framework for Synthesizing Stencil Algorithms on FPGAs using OpenCL Model, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017. |
[C45] |
Shuo Wang, Yun Liang, Wei Zhang. "FlexCL: An Analytical Performance Model for OpenCL Workloads on Flexible FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017. |
[C44] |
Qingcheng Xiao, Yun Liang, Liqiang Lu, Shengen Yan, Yu-Wing Tai. "Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs, " to appear in the proceedings of the Design Automation Conference (DAC), June 2017. |
[C43] |
Xuechao Wei, Cody Hao Yu, Peng Zhang, Youxiang Chen, Yuxin Wang, Han Hu, Yun Liang, Jason Cong. "Automating the systolic array generation and optimizations for high throughput convolution neural network," to appear in the proceedings of the Design Automation Conference (DAC), June 2017. Best Paper Award Nomination. |
[C42] |
Liqiang Lu, Yun Liang, Qingcheng Xiao and Shengen Yan. "Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs," to appear in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2017. |
[C41] |
Guanwen Zhong, Alok Prakash, Siqi Wang, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of FPGA-based Accelerators with Multi-level Parallelism," to appear in the proceedings of the Design Automation and Test in Europe (DATE), March, 2017. |
[C40] |
Xuechao Wei, Yun Liang, Tao Wang, Songwu Lu, Jason Cong. "Throughput Optimization for Streaming Applications on CPU-FPGA Heterogeneous Systems," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC) , January, 2017. |
[C39] |
Guanwen Zhong, Alok Prakash, Yun Liang, Tulika Mitra, Smail Niar. "Lin-Analyzer: A High-level Performance Analysis Tool for FPGA-based Accelerators," in the proceedings of the Design Automation Conference (DAC), June, 2016. |
[C38] |
Xiuhong Li, Yun Liang. "Efficient Kernel Management on GPUs," in the proceedings of the Design Automation and Test in Europe (DATE), March, 2016. |
[C37] |
Shuo Wang, Yun Liang, Chao Zhang, Xiaolong Xie, Guangyu Sun, Yongpan Liu, Yu Wang, Xiuhong Li. "Performance-centric Register File Design for GPUs using Racetrack Memory," in the proceedings of the Asia and South Pacific Design Automation Conference (ASPDAC), January, 2016. Best Paper Award Nomination. |
[C36] |
Xiaolong Xie, Yun Liang, Xiuhong Li, Yudong Wu, Guangyu Sun, Tao Wang, and Dongrui Fan. "Enabling Coordinated Register Allocation and Thread-level Parallelism Optimization for GPUs," in the proceedings of the 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015. |
[C35] |
Xian Zhang, Guangyu Sun, Chao Zhang, Weiqi Zhang, Yun Liang, Tao Wang, Yiran Chen, and Jia Di. "Fork Path: Improving Efficiency of ORAM by Removing Redundant Memory Accesses," in the proceedings of 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), December, 2015. |
[C34] |
Yun Liang, Shuo Wang. "Quantitative Performance and Power Analysis of LTE using High Level Synthesis," in the proceedings of International Conference on ASIC, November, 2015 (invited paper). |
[C33] |
Xuechao Wei, Yun Liang, Xibai Li, Tao Wang, Songwu Lu, Jason Cong. "Evaluation of Software Defined Radio on Heterogeneous Systems," in the proceedings of International Conference on Parallel Architectures and Compilation Techniques (PACT), 2015 (poster). |
[C32] |
Chao Zhang, Guangyu Sun, Xian Zhang, Weiqi Zhang, Weisheng Zhao, Tao Wang, Yun Liang, Yongpan Liu, Yu Wang, and Jiwu Shu. "Hi-fi Playback: Tolerating Position Errors in Shift Operations of Racetrack Memory," in the proceedings of the 42nd International Symposium on Computer Architecture (ISCA), June 2015. |
[C31] |
Xiaolong Xie, Yun Liang, Yu Wang, Guangyu Sun, Tao Wang. "Coordinated Static and Dynamic Cache Bypassing on GPUs," in the proceedings of 21st IEEE International Symposium on High Performance Computer Architecture (HPCA), February 2015. |
[C30] |
Waiteng Tang, Ruizhe Zhao, Mian Lu, Yun Liang, Huynh Phung Huynh, Xibai Li, Rick Siow Mong Goh. "Optimizing and Auto-Tuning Scale-Free Sparse Matrix-Vector Multiplication on Intel Xeon Phi," in the proceedings of the International Symposium on Code Generation and Optimization (CGO), February 2015. |
[C29] |
Guanwen Zhong, Vanchinathan Venkataramani, Yun Liang, Tulika Mitra, Smail Niar. "Design Space Exploration of Multiple Loops on FPGAs using High Level Synthesis," in the proceedings of IEEE International Conference on Computer Design (ICCD), October 2014. |
[C28] |
Xiaoming Chen, Yu Wang, Yun Liang, Yuan Xie, Huazhong Yang. "Run-time Techniques for Simultaneous Aging and Power Optimization in GPGPUs," in the proceedings of the 51st Design Automation Conference (DAC), June, 2014. |
[C27] |
Jingyu Deng, Yun Liang, Guojie Luo, Guangyu Sun. "Rapid Design Space Exploration of Two-level Unified Caches," in the proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), June 2014. |
[C26] |
Swathi Gurumani, Jacob Tolar, Yao Chen, Yun Liang, Kyle Rupnow, Deming Chen. "Integrated CUDA-to-FPGA Synthesis with Network-on-Chip," in the proceedings of the IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2014. |
[C25] |
Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Dynamic Instruction Cache Locking," in the proceedings of the Design Automation and Test in Europe (DATE), March, 2014. |
[C24] |
Zhimin Wu, Yang Liu, Yun Liang, Jun Sun. "GPU Accelerated Counterexample Generation in LTL Model Checking," in the proceedings of the International Conference on Formal Engineering Methods (ICFEM), November, 2014. |
[C23] |
Xiaolong Xie, Yun Liang, Guangyu Sun, Deming Chen. "An Efficient Compiler Framework for Cache Bypassing on GPUs," in the proceedings of International Conference on Computer Aided Design (ICCAD), November, 2013. |
[C22] |
Mian Lu, Lei Zhang, Huynh Phung Huynh, Zhongliang Ong, Yun Liang, Bingsheng He, Rick Siow Mong Goh, Richard Huynh. "Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor," in the proceedings of the IEEE Big Data (BigData), October, 2013. |
[C21] |
Alexandros Papakonstantinou, Deming Chen, Wen Mei Hwu, Yun Liang, Jason Cong. "Throughput-Oriented Kernel Porting onto FPGAs," in the proceedings of the 50th Design Automation Conference (DAC), June, 2013. |
[C20] |
Huping Ding, Yun Liang, Tulika Mitra. "Integrated Instruction Cache Analysis and Locking in Multitasking Real-time Systems," in the proceedings of the 50th Design Automation Conference (DAC), June, 2013. |
[C19] |
Wei Zuo, Yun Liang, Peng Li, Kyle Rupnow, Deming Chen, Jason Cong. "Improving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations," in the proceedings of the 21st ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), February, 2013. |
[C18] |
Swathi T. Gurumani, Hisham Cholakkal, Yun Liang, Kyle Rupnow, Deming Chen. "High-Level Synthesis of Multiple Dependent CUDA Kernels on FPGA," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013 (invited paper). |
[C17] |
Yun Liang, Zheng Cui, Kyle Rupnow, Deming Chen. "Register and Thread Structure Optimization for GPUs," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013. |
[C16] |
Huping Ding, Yun Liang, Tulika Mitra. "Shared Cache Aware Task Mapping for WCRT Minimization," in the proceedings of 18th Asia and South Pacific Design Automation Conference (ASPDAC), January, 2013. |
[C15] |
Huping Ding, Yun Liang, Tulika Mitra. "WCET-Centric Partial Instruction Cache Locking," in the proceedings of ACM 49th Design Automation Conference (DAC), June 2012. Best Paper Award Nomination (7 out of 741 submissions). |
[C14] |
Zheng Cui, Yun Liang, Kyle Rupnow, Deming Chen. "An Accurate GPU Performance Model for Effective Control Flow Divergence Optimization," in the proceedings of IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2012. |
[C13] |
Yun Liang, Zheng Cui, Shengkui Zhao, Kyle Rupnow, Yihao Zhang, Douglas L. Jones, Deming Chen. "Real-time Implementation and Performance Optimization of 3D Sound Localization on GPUs," in the proceedings of Design Automation and Test in Europe (DATE), March 2012. |
[C12] |
Shengkui Zhao, Saima Ahmed, Yun Liang, Kyle Rupnow, Deming Chen, Douglas L. Jones. "A real-time 3D sound localization system with miniature microphone array for virtual reality," in the proceedings of 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), July 2012. |
[C11] |
Kyle Rupnow, Yun Liang, Yinan Li, Dongbo Min, Minh Do, Deming Chen. "High Level Synthesis of Stereo Matching: Productivity, Performance, and Software Constraints," in the proceedings of International Conference on Field Programmable Technology (FPT), December 2011. Best Paper Award Nomination (4 out of 94 submissions). |
[C10] |
Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-Mei W. Hwu, Jason Cong. "Multilevel Granularity Parallelism Synthesis on FPGAs," in the proceedings of the 19th Annual IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2011. Best Paper Award (1 out of 119 submissions). |
[C9] |
Kyle Rupnow, Yun Liang, Yinan Li, Deming Chen. "A study of high-level synthesis: Promises and challenges," in the proceedings of IEEE 9th International Conference on ASIC (ASICON), October, 2011. |
[C8] |
Yun Liang, Tulika Mitra. "Improved procedure placement for set associative caches," in the proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems (CASES’10), October, 2010. |
[C7] |
Yun Liang, Tulika Mitra. "Instruction Cache Locking using Temporal Reuse Profile," in the proceedings of the ACM 47th Design Automation Conference (DAC), June 2010. |
[C6] |
Huynh Phung Huynh, Yun Liang, Tulika Mitra. "Efficient custom instructions generation for system-level design," in the proceedings of the International Conference on Field-Programmable Technology (FPT), December, 2010. |
[C5] |
Yan Li, Vivy Suhendra, Yun Liang, Tulika Mitra, Abhik Roychoudhury. "Timing Analysis of Concurrent Programs Running on Shared Cache Multi-Cores," in the proceedings of the 30th IEEE Real-Time Systems Symposium (RTSS), December, 2009. |
[C4] |
Yun Liang, Tulika Mitra. "Static Analysis for Fast and Accurate Design Space Exploration of Caches," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. |
[C3] |
Yun Liang, Lei Ju, Samarjit Chakraborty, Tulika Mitra, Abhik Roychoudhury. "Cache-aware Optimization of BAN Applications," in the proceedings of the International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS), October, 2008. Best Paper Award Nomination. |
[C2] |
Yun Liang, Tulika Mitra. "Cache Modeling in Probabilistic Execution Time Analysis," in the proceedings of the 45th Design Automation Conference (DAC), June, 2008. |
[C1] |
Yun Liang, Abhik Roychoudhury, Tulika Mitra. "Timing analysis of body area network application," in the proceedings of the 7th International Workshop on Worst Case Execution Time Analysis (WCET), 2007. |
PROFESSIONAL SERVICE
Editorial Board
Associate Editor, ACM Transactions on Embedded Computing Systems (TECS), 2017-.
Conference Organizing Committee Member
Special Session Organizer and Chair, the 18th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.
Subcommittee Chair. System Level Synthesis and Optimization, the 19th Asia South Pacific Design Automation Conference (ASP-DAC), 2014.
Publication Chair. 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2017.
Conference Program Committee Member
International Symposium on High-Performance Computer Architecture (HPCA), 2018.
International Conference on Compiler Construction (CC), 2018.
International Conference on High Performance Computing, Data, and Analytics (HiPC) 2017.
International Conference on Parallel Architectures and Compilation Techniques (PACT) 2015, 2016.
International Conference on Computer Aided Design (ICCAD) 2016, 2017.
International Symposium on Code Generation and Optimization (CGO) 2017.
Asia South Pacific Design Automation Conference (ASP-DAC) 2012, 2013, 2014, 2016, 2017.
Design Automation and Test in Europe (DATE) 2013, 2014, 2015, 2016, 2017.
International Conference on Compilers Architecture and Synthesis for Embedded Systems (CASES) 2013, 2014, 2015, 2016.
IEEE International Conference on Computer Design (ICCD) 2016.
COURSES
Programming Practice (English), 2013, 2014, 2015, 2016, 2017.
Compiler Design, 2016, 2017.
STUDENTS SUPERVISED
Ph.D. Students.
1. |
Xiaolong Xie: “GPU Optimization: Algorithms, Systems and Architecture” Winner: Top 10 Academic Achievement Award 2016, Qualcomm PhD Scholarship 2016, Merit Student of Peking University 2015, National Graduate Scholarship 2015. |
2013-present |
2. |
Xuechao Wei: “Algorithms Accelerations using Systolic Array on FPGAs”. (co-advised with Prof Jason Cong) |
2013-present |
3. |
Xiuhong Li: “Accelerating Irregular Applications on GPUs” Winner: National Graduate Scholarship 2016. Academic Excellence Award. |
2014-present |
4. |
Shuo Wang: “Performance Modeling for Heterogeneous Systems” |
2015-present |
5. |
Qingcheng Xiao: TBD |
2016-present |
6. |
Liqiang Lu: TBD |
2017-present |
Undergraduate Students.
|
Student |
Graduation Year |
Employment after Graduation |
1. |
Xiaolong Xie |
2013 |
PhD student in Peking University, China |
2. |
Siyuan Ouyang |
2013 |
Master student in CMU, USA |
3. |
Jingyu Deng |
2014 |
Master student in NYU, USA |
4. |
Xiuhong Li |
2014 |
PhD student in Peking University, China |
5. |
Xibai Li |
2015 |
Software Engineer in a Startup, China |
6. |
Ruizhe Zhao |
2016 |
PhD student in Imperial College London, UK |
7. |
Yudong Wu |
2016 |
PhD student in UCSD, USA |
8. |
Zhaowen Zou |
2016 |
Master student in UCSD, USA |
9. |
Qiqi Xiao |
2016 |
Software Engineer in Face++, China |
10. |
Qingcheng Xiao |
2016 |
PhD student in Peking University, China |
11. |
Qian Li |
2017 |
PhD student in Stanford, USA |
12. |
Xinfeng Xie |
2017 |
PhD student in UCSB, USA |
13. |
Liqiang Lu |
2017 |
PhD student in Peking University, China |
14. |
Yilong Li |
2017 |
Master student in Stanford, USA |
15. |
Dayou Du |
2017 |
Master student in NYU, USA |
16. |
Han Qiu |
2017 |
Software Engineer in Samsung, China |
Top-10 School of EECS Bachelor Thesis Award Winner
Ruizhe Zhao 2016, Xinfeng Xie 2017
Thesis Committee
Peng Wang (PKU), Yuxin Wang (PKU), Chao Zhang (PKU), Chen Zhang (PKU)
Fubing Mao (Nanyang Technological University, Singapore)
Software Release
CRAT: A PTX-to-PTX compiler that enables register allocation for GPUs. CRAT stands for Coordinated Register Allocation and Thread-level parallelism. Register allocation on GPUs plays an important role in performance, as it affects both single-thread performance and thread-level parallelism. CRAT enables flexible register allocation at the compiler intermediate-language level. CRAT has been downloaded by users from Michigan, CMU, and other universities. I am the lead faculty member for this project. URL: http://ceca.pku.edu.cn/crat/