Publications

Book Chapters (1) | Journal Articles (19) | Conference Publications (Full Papers) (44) | Conference Publications (Short Papers) (5) | Patents (3) | Technical Reports (4) | Conference Abstracts (15)


Copyright Claim

Most of the papers are copyrighted by IEEE or ACM. They are posted here for personal, non-commercial use to ensure timely dissemination of research work.

However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE or ACM.


Book Chapters

BC1

Development Report of China’s General-Purpose Processors

Zhenman Fang, Weihua Zhang, Binyu Zang
Chapter 1 in "2010 China Computer Science and Technology Development Report". ISBN: 9787111364450. (In Chinese)


Journal Articles

J19

PASTA: Programming and Automation Support for Scalable Task-Parallel HLS Programs on Modern Multi-Die FPGAs TRETS 2024

Moazin Khatti, Xingyu Tian, Ahmad Sedigh Baroughi, Akhil Raj Baranwal, Yuze Chi, Licheng Guo, Jason Cong, Zhenman Fang.
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2024), Volume 17, Issue 3, Sept 2024, Article No.: 42, Pages 1 - 31.

J18

SQL2FPGA: Automated Acceleration of SQL Query Processing on Modern CPU-FPGA Platforms TRETS 2024

Alec Lu, Jahanvi Narendra Agrawal, Zhenman Fang.
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2024), Volume 17, Issue 3, Sept 2024, Article No.: 39, Pages 1 - 28.

J17

Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation TIP 2024

Haisheng Fu, Feng Liang, Jie Liang, Yongqiang Wang, Zhenman Fang, Guohe Zhang, Jingning Han.
IEEE Transactions on Image Processing (TIP 2024), vol. 33, pp. 4702-4715, Aug 2024.

J16

HyBNN: Quantifying and Optimizing Hardware Efficiency of Binary Neural Networks TRETS 2024

Geng Yang, Jie Lei, Zhenman Fang, Yunsong Li, Jiaqing Zhang, Weiying Xie.
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2024), Volume 17, Issue 2, Article No.: 25, pp 1–24. FPT 2023 Journal Track.

J15

CHIP-KNNv2: A Configurable and High-Performance K-Nearest Neighbors Accelerator on HBM-based FPGAs TRETS 2023

Kenny Liu, Alec Lu, Kartik Samtani, Zhenman Fang, and Licheng Guo
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2023), Volume 16, Issue 4, Dec 2023, Article No.: 62, pp 1–26.

J14

TAPA: A Scalable Task-Parallel Dataflow Programming Framework for Modern FPGAs with Co-Optimization of HLS and Physical Design TRETS 2023

Licheng Guo, Yuze Chi, Jason Lau, Linghao Song, Xingyu Tian, Moazin Khatti, Weikang Qiao, Jie Wang, Ecenur Ustun, Zhenman Fang, Zhiru Zhang, and Jason Cong
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2023), Volume 16, Issue 4, Dec 2023, Article No.: 63, pp 1–31.

J13

SASA: A Scalable and Automatic Stencil Acceleration Framework for Optimized Hybrid Spatial and Temporal Parallelism on HBM-based FPGAs TRETS 2023

Xingyu Tian, Zhifan Ye, Alec Lu, Licheng Guo, Yuze Chi, and Zhenman Fang
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2023), Volume 16, Issue 2, Apr 2023, Article No.: 28, pp 1–33.

J12

SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery TGRS 2023

Jiaqing Zhang, Jie Lei, Weiying Xie, Zhenman Fang, Yunsong Li, Qian Du
IEEE Transactions on Geoscience and Remote Sensing (TGRS 2023), vol. 61, pp. 1-15, Mar 2023, Art no. 5605415.
This paper has been cited more than 120 times.

J11

[Invited Paper] TopSort: A High-Performance Two-Phase Sorting Accelerator Optimized on HBM-based FPGAs TETC 2023

Weikang Qiao, Licheng Guo, Zhenman Fang, Mau-Chung Frank Chang, and Jason Cong
IEEE Transactions on Emerging Topics in Computing (TETC 2023 Invited Paper), vol. 11, no. 2, pp. 404-419, April-June 2023.

J10

Algorithm/Hardware Co-Design for Real-Time On-Satellite CNN based Ship Detection in SAR Imagery TGRS 2022

Geng Yang, Jie Lei, Weiying Xie, Zhenman Fang, Yunsong Li, Jiaxuan Wang, Xin Zhang
The IEEE Transactions on Geoscience and Remote Sensing (TGRS 2022), vol. 60, pp. 1-18, Mar 2022, Art no. 5226018.

J9

Demystifying the Soft and Hardened Memory Systems of Modern FPGAs for Software Programmers through Microbenchmarking TRETS 2022

Alec Lu, Zhenman Fang, and Lesley Shannon
The ACM Transactions on Reconfigurable Technology and Systems (TRETS 2022 Special Issue on FPGA 2021 Highlights), Volume 15, Issue 4, December 2022, Article No.: 43, pp 1–33.

J8

SyncNN: Evaluating and Accelerating Spiking Neural Networks on FPGAs TRETS 2022

Sathish Panchapakesan, Zhenman Fang, and Jian Li.
The ACM Transactions on Reconfigurable Technology and Systems (TRETS 2022), Volume 15, Issue 4, December 2022, Article No.: 48, pp 1–27.

J7

Quick-Div: Rethinking Integer Divider Design for FPGA-based Soft-Processors TRETS 2022

Eric Matthews, Alec Lu, Zhenman Fang, and Lesley Shannon
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2022), Volume 15, Issue 3, September 2022, Article No.: 32, pp 1–27.

J6

[Invited Paper] Programming and Synthesis for Software-defined FPGA Acceleration: Status and Future Prospects TRETS 2021

Yi-Hsiang Lai, Ecenur Ustun, Shaojie Xiang, Zhenman Fang, Hongbo Rong, Zhiru Zhang
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2021), Volume 14, Issue 4, December 2021, Article No.: 17, pp 1–39.

J5

Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks TCAD 2019 Donald O. Pederson Best Paper Award 2019

Chen Zhang, Guangyu Sun, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2019 Best Paper), Volume 38, Issue 11, Pages 2072 - 2085, Nov. 2019.
This paper has been cited more than 700 times.

J4

In-depth Analysis on Microarchitectures of Modern Heterogeneous CPU-FPGA Platforms TRETS 2019

Young-kyu Choi, Jason Cong, Zhenman Fang, Yuchen Hao, Glenn Reinman, Peng Wei
ACM Transactions on Reconfigurable Technology and Systems (TRETS 2019), Volume 12, Issue 1, February 2019, Article No. 4.

J3

[Invited Paper] Customizable Computing: From Single-Chip to Datacenters PIEEE 2019

Jason Cong, Zhenman Fang, Muhuan Huang, Peng Wei, Di Wu, Cody Hao Yu
Proceedings of the IEEE (PIEEE 2019), Volume 107, Issue 1, Pages 185 - 203, Jan. 2019.

J2

[Invited Paper] CPU-FPGA Co-Optimization for Big Data Applications D&T 2018

Jason Cong, Zhenman Fang, Muhuan Huang, Libo Wang, Di Wu
IEEE Design & Test (D&T 2018). 35(1): 16-22.

J1

Measuring Microarchitectural Details of Multi- and Many-core Memory Systems Through Microbenchmarking TACO 2015

Zhenman Fang, Sanyam Mehta, Pen-Chung Yew, Antonia Zhai, James Greensky, Gautham Beeraka, Binyu Zang
ACM Transactions on Architecture and Code Optimization (TACO 2015). 11(4): 55:1-26.


Conference Publications (Full Papers)

C44

FLUD: A Scalable and Configurable Systolic Array Design for LU Decomposition on FPGAs FPT 2024

Xingyu Tian, Geng Yang, Zhenman Fang.
Accepted by the 2024 IEEE International Conference on Field-Programmable Technology (FPT 2024), Sydney, Australia, December 2024.
Acceptance Rate: 27.5%, 19 out of 69.

C43

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model ECCV 2024

Haisheng Fu, Jie Liang, Zhenman Fang, Jingning Han, Feng Liang, Guohe Zhang.
Accepted by the 2024 European Conference on Computer Vision (ECCV 2024), Milano, Italy, Sept-Oct 2024.
Acceptance Rate: 27.9%, 2395 out of 8585.

C42

SERI: High-Throughput Streaming Acceleration of Electron Repulsion Integral Computation in Quantum Chemistry using HBM-based FPGAs FPL 2024

Philip Stachura, Guanyu Li, Xin Wu, Christian Plessl, Zhenman Fang.
The 34th IEEE International Conference on Field-Programmable Logic and Applications (FPL 2024 Stamatis Vassiliadis Best Paper Award), Turin, Italy, September 2024, pp. 60-68.
Acceptance Rate: 22.5%, 29 out of 129.

C41

SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs FPL 2024

Geng Yang, Yanyue Xie, Zhong Jia Xue, Sung-En Chang, Yanyu Li, Peiyan Dong, Jie Lei, Weiying Xie, Yanzhi Wang, Xue Lin, Zhenman Fang.
The 34th IEEE International Conference on Field-Programmable Logic and Applications (FPL 2024), Turin, Italy, September 2024, pp. 264-273.
Acceptance Rate: 22.5%, 29 out of 129.

C40

SA4: A Comprehensive Analysis and Optimization of Systolic Array Architecture for 4-bit Convolutions FPL 2024

Geng Yang, Jie Lei, Zhenman Fang, Jiaqing Zhang, Junrong Zhang, Weiying Xie and Yunsong Li.
The 34th IEEE International Conference on Field-Programmable Logic and Applications (FPL 2024), Turin, Italy, September 2024, pp. 204-212.
Acceptance Rate: 22.5%, 29 out of 129.

C39

BitBlender: Scalable Bloom Filter Acceleration on FPGAs with Dynamic Scheduling FPL 2024

Kenny Liu, Alec Lu, Zhenman Fang.
The 34th IEEE International Conference on Field-Programmable Logic and Applications (FPL 2024), Turin, Italy, September 2024, pp. 325-331.
Acceptance Rate: 22.5%, 29 out of 129.

C38

FORC: A High-Throughput Streaming FPGA Accelerator for Optimized Row Columnar File Decoders in Big Data Engines FPL 2024

Abdul Wadood, Alec Lu, Ken Zhang, Zhenman Fang.
The 34th IEEE International Conference on Field-Programmable Logic and Applications (FPL 2024), Turin, Italy, September 2024, pp. 318-324.
Acceptance Rate: 22.5%, 29 out of 129.

C37

HiTC: High-Performance Triangle Counting on HBM-Equipped FPGAs using HLS PacRim 2024

Junzhe Liang, Manoj Bheemasandra Rajashekar, Xingyu Tian, Zhenman Fang.
The 2024 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PacRim 2024), Victoria, BC, Aug 2024, pp. 1-6.
This paper received the highest review score among all submissions in the Computers track of PacRim 2024.

C36

QUASAR-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers ICS 2024

Zhengang Li, Alec Lu, Yanyue Xie, Zhenglun Kong, Mengshu Sun, Hao Tang, Zhong Jia Xue, Peiyan Dong, Caiwen Ding, Yanzhi Wang, Xue Lin, Zhenman Fang.
The 38th ACM International Conference on Supercomputing (ICS 2024), Kyoto, Japan, Jun 2024, pp. 324–337.
Acceptance Rate: 36%, 45 out of 125.

C35

Efficient Learned Image Compression with Selective Kernel Residual Module and Channel-wise Causal Context Model ICASSP 2024

Haisheng Fu, Feng Liang, Jie Liang, Zhenman Fang, Guohe Zhang, Jingning Han.
The 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024), Seoul, Korea, April 2024, pp. 4040-4044.

C34

Learned Image Compression with Dual-Branch Encoder and Conditional Information Coding DCC 2024

Haisheng Fu, Feng Liang, Jie Liang, Zhenman Fang, Guohe Zhang, Jingning Han.
The 2024 IEEE Data Compression Conference (DCC 2024), Snowbird, Utah, March 2024, pp. 173-182.

C33

HiSpMV: Hybrid Row Distribution and Vector Buffering for Imbalanced SpMV Acceleration on FPGAs FPGA 2024

Manoj Bheemasandra Rajashekar, Xingyu Tian, Zhenman Fang.
The 32nd ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2024), Monterey, CA, Mar 2024, pp. 154-164.
Acceptance Rate: 23.6%, 21 out of 89.
This paper received the highest review score among all FPGA 2024 submissions, tied with two other papers.

C32

PASTA: Programming and Automation Support for Scalable Task-Parallel HLS Programs on Modern Multi-Die FPGAs FCCM 2023

Moazin Khatti, Xingyu Tian, Yuze Chi, Licheng Guo, Jason Cong, Zhenman Fang.
The 31st IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM 2023), Marina Del Rey, CA, May 2023, pp. 12-22.
Acceptance Rate: 21.4%, 15 out of 70.

C31

SQL2FPGA: Automatic Acceleration of SQL Query Processing on Modern CPU-FPGA Platforms FCCM 2023

Alec Lu, Zhenman Fang.
The 31st IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM 2023), Marina Del Rey, CA, May 2023, pp. 184-194.
Acceptance Rate: 21.4%, 15 out of 70.

C30

ESRU: Extremely Low-Bit and Hardware-Efficient Stochastic Rounding Unit Design for 8-Bit DNN Training DATE 2023

Sung-En Chang, Geng Yuan, Alec Lu, Mengshu Sun, Yanyu Li, Xiaolong Ma, Zhengang Li, Yanyue Xie, Minghai Qin, Xue Lin, Zhenman Fang, and Yanzhi Wang.
Design, Automation and Test in Europe Conference (DATE 2023), Antwerp, Belgium, Apr 2023, pp. 1-6.
Acceptance Rate: 25%.

C29

HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers HPCA 2023

Peiyan Dong, Mengshu Sun, Alec Lu, Yanyue Xie, Kenneth Liu, Zhenglun Kong, Xin Meng, Zhengang Li, Xue Lin, Zhenman Fang, and Yanzhi Wang.
The 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2023), Montreal, QC, Canada, Feb-Mar 2023, pp. 442-455.
Acceptance Rate: 25%, 91 out of 360.

C28

You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding ECCV 2022

Geng Yuan, Sung-En Chang, Qing Jin, Alec Lu, Yanyu Li, Yushu Wu, Zhenglun Kong, Yanyue Xie, Peiyan Dong, Minghai Qin, Xiaolong Ma, Xulong Tang, Zhenman Fang, and Yanzhi Wang.
The European Conference on Computer Vision (ECCV 2022), Tel Aviv, Israel, Oct 2022. Lecture Notes in Computer Science, vol 13672, pp. 34–51.
Acceptance Rate: 28%, 1,645 out of 5,804.

C27

Blind Data Adversarial Bit-flip Attack against Deep Neural Networks DSD 2022

Behnam Ghavami, Mani Sadati, Mohammad Shahidzadeh, Zhenman Fang and Lesley Shannon.
The Euromicro Conference on Digital Systems Design (DSD 2022), Virtual Conference, Aug-Sept 2022, pp. 899-904.

C26

A Majority-based Approximate Adder for FPGAs DSD 2022

Behnam Ghavami, Mahdi Sajedi, Mohsen Raji, Zhenman Fang and Lesley Shannon.
The Euromicro Conference on Digital Systems Design (DSD 2022), Virtual Conference, Aug-Sept 2022, pp. 53-59.

C25

Auto-ViT-Acc: FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization FPL 2022

Zhengang Li, Mengshu Sun, Alec Lu, Haoyu Ma, Geng Yuan, Yanyue Xie, Hao Tang, Yanyu Li, Miriam Leeser, Zhangyang Wang, Xue Lin, Zhenman Fang.
The 32nd International Conference on Field-Programmable Logic and Applications (FPL 2022), Belfast, UK, Aug-Sept 2022, pp. 109-116.
Acceptance Rate: 25.6%, 33 out of 129.

C24

Stealthy Attack on Algorithmic-Protected DNNs via Smart Bit Flipping ISQED 2022

Behnam Ghavami, Sayed Hamid Reza Mousavi, Zhenman Fang, Lesley Shannon.
The 23rd International Symposium on Quality Electronic Design (ISQED 2022), Virtual Conference, Apr 2022, pp. 358-364.

C23

FILM-QNN: Efficient FPGA Acceleration of Deep Neural Networks with Intra-Layer, Mixed-Precision Quantization FPGA 2022

Mengshu Sun, Zhengang Li, Alec Lu, Yanyu Li, Sung-En Chang, Xiaolong Ma, Xue Lin, and Zhenman Fang.
The 30th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2022), Virtual Conference, Feb/Mar 2022, pp. 134–145.
Acceptance Rate: 20.8%, 15 out of 72.

C22

FitAct: Error Resilient Deep Neural Networks via Fine-Grained Post-Trainable Activation Functions DATE 2022

Behnam Ghavami, Mani Sadati, Zhenman Fang, and Lesley Shannon.
The Design, Automation and Test in Europe Conference (DATE 2022), Virtual Conference, Mar 2022, pp. 1239-1244.
Acceptance Rate: 25%.

C21

SyncNN: Evaluating and Accelerating Spiking Neural Networks on FPGAs FPL 2021

Sathish Panchapakesan, Zhenman Fang, and Jian Li.
The 31st International Conference on Field-Programmable Logic and Applications (FPL 2021), Virtual Conference, Sept 2021, pp. 286-293.
Acceptance Rate: 22.2%, 32 out of 144.

C20

Demystifying the Memory System of Modern Datacenter FPGAs for Software Programmers through Microbenchmarking FPGA 2021

Alec Lu, Zhenman Fang, Weihua Liu, and Lesley Shannon
The 29th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2021), Virtual Conference, Mar 2021, pp. 105–115.
Acceptance Rate: 19.8%, 22 out of 111.
This is one of the top papers at FPGA 2021 that are highlighted in the FPGA 2021 Special Issue in ACM TRETS.

C19

CHIP-KNN: A Configurable and High-Performance K-Nearest Neighbors Accelerator on Cloud FPGAs FPT 2020

Alec Lu, Zhenman Fang, Nazanin Farahpour, and Lesley Shannon
The 2020 International Conference on Field-Programmable Technology (FPT 2020), Virtual Conference, Dec 2020, pp. 139-147.
Acceptance Rate: 24.7%, 21 out of 85.

C18

Reconfigurable Accelerator Compute Hierarchy: A Case Study Using Content-Based Image Retrieval IISWC 2020

Nazanin Farahpour, Yuchen Hao, Zhenman Fang and Glenn Reinman
The 2020 IEEE International Symposium on Workload Characterization (IISWC 2020), Virtual Conference, Oct 2020, pp. 276-287.
Acceptance Rate: 37.1%, 26 out of 70.

C17

Aadam: A Fast, Accurate, and Versatile Aging-Aware Cell Library Delay Model using Feed-Forward Neural Network ICCAD 2020

Seyed Milad Ebrahimipour, Behnam Ghavami, Hamid Mousavi, Mohsen Raji, Zhenman Fang and Lesley Shannon
The 2020 International Conference On Computer Aided Design (ICCAD 2020), Virtual Conference, Nov 2020, pp. 1-9.
Acceptance Rate: 27.0%, 127 out of 470.

C16

Algorithm-Hardware Co-design for BQSR Acceleration in Genome Analysis ToolKit FCCM 2020

Michael Lo, Zhenman Fang, Jie Wang, Peipei Zhou, Mau-Chung Frank Chang and Jason Cong
The 28th IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM 2020), Fayetteville, AR, USA, May 2020, pp. 157-166.
Acceptance Rate: 20.7%, 19 out of 92.

C15

[Invited Paper] Understanding Performance Gains of Accelerator-Rich Architectures ASAP 2019

Zhenman Fang, Farnoosh Javadi, Jason Cong, Glenn Reinman
The 30th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2019), New York NY, Jul 2019, pp. 239-246.

C14

Rethinking Integer Divider Design for FPGA-based Soft-Processors FCCM 2019

Eric Matthews, Alec Lu, Lesley Shannon, Zhenman Fang
The 27th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2019), San Diego CA, Apr 2019, pp. 289-291.
Acceptance Rate: 25.8%, 31 out of 120.

C13

High-Throughput Lossless Compression on Tightly-Coupled CPU-FPGA Platforms FCCM 2018

Weikang Qiao, Jieqiong Du, Zhenman Fang, Jason Cong, Mau-Chung Frank Chang
The 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2018), Boulder CO, May 2018, pp. 37-44.
Acceptance Rate: 20.8%, 22 out of 106.

C12

Doppio: I/O-Aware Performance Analysis, Modeling and Optimization for In-Memory Computing Framework ISPASS 2018 Best Paper Nominee

Peipei Zhou, Zhenyuan Ruan, Zhenman Fang, Jason Cong, Megan Shand, David Roazen
The 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2018 Best Paper Nominee), Belfast, Northern Ireland, UK, Apr 2018, pp. 22-32.
Acceptance Rate: 31.3%, 21 out of 67. Best paper nominee rate: 6.0%, 4 out of 67.

C11

AIM: Accelerating Computational Genomics through Scalable and Noninvasive Accelerator-Interposed Memory MEMSYS 2017 Best Paper

Jason Cong, Zhenman Fang, Michael Gill, Farnoosh Javadi, Glenn Reinman
The International Symposium on Memory Systems (MEMSYS 2017 Best Paper Award), Alexandria, VA, Oct 2017, pp. 3-14.
Acceptance Rate: N/A. Best paper award rate: << 2.4%, 1 out of 42 accepted papers.

C10

Supporting Address Translation for Accelerator-Centric Architectures HPCA 2017 Best Paper Nominee

Jason Cong, Zhenman Fang, Yuchen Hao, Glenn Reinman
The 23rd IEEE Symposium on High Performance Computer Architecture (HPCA 2017 Best Paper Nominee), Austin TX, Feb 2017, pp. 37-48.
Acceptance Rate: 22.3%, 50 out of 224. Best paper nominee rate: 1.8%, 4 out of 224.
This paper has been cited more than 120 times.

C9

Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks ICCAD 2016

Chen Zhang, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong
The 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2016), Austin TX, Nov 2016, pp. 1-8.
Acceptance Rate: 23.7%, 97 out of 409.

C8

Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale ACM SoCC 2016

Muhuan Huang, Di Wu, Cody Hao Yu, Zhenman Fang, Matteo Interlandi, Tyson Condie, Jason Cong
The ACM Symposium on Cloud Computing (ACM SoCC 2016), Santa Clara, CA, Oct 2016, pp. 456-469.
Acceptance Rate: 25.2%, 38 out of 151.
This paper has been cited more than 100 times.

C7

When Apache Spark Meets FPGAs: A Case Study for Next-Generation DNA Sequencing Acceleration HotCloud 2016

Yu-Ting Chen, Jason Cong, Zhenman Fang, Jie Lei, Peng Wei
The 8th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 2016), Denver CO, Jun 2016, pp. 64-70.
Acceptance Rate: 30.9%, 21 out of 68.
This paper has been cited more than 100 times.

C6

A Quantitative Analysis on Microarchitectures of Modern CPU-FPGA Platforms DAC 2016

Young-kyu Choi, Jason Cong, Zhenman Fang, Yuchen Hao, Glenn Reinman, Peng Wei
53rd Design Automation Conference (DAC 2016), Austin TX, Jun 2016, pp. 1-6.
Acceptance Rate: 22.6%, 152 out of 674.
This paper has been cited more than 190 times.

C5

PARADE: A Cycle-Accurate Full-System Simulation Platform for Accelerator-Rich Architectural Design and Exploration ICCAD 2015

Jason Cong, Zhenman Fang, Michael Gill, Glenn Reinman
2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2015), Austin TX, Nov 2015, pp. 380-387.
Acceptance Rate: 24.6%, 94 out of 382.

C4

Multi-Stage Coordinated Prefetching for Present-day Processors ICS 2014

Sanyam Mehta, Zhenman Fang, Antonia Zhai, Pen-Chung Yew
Proceedings of the 28th International Conference on Supercomputing (ICS 2014), Munich, Germany, Jun 2014, pp. 73-82.
Acceptance Rate: 21.3%, 34 out of 160.

C3

Transformer: A Functional-Driven Cycle-Accurate Multicore Simulator DAC 2012

Zhenman Fang, Qinghao Min, Keyong Zhou, Yi Lu, Yibin Hu, Weihua Zhang, Haibo Chen, Jian Li, Binyu Zang
The 49th Design Automation Conference (DAC 2012), San Francisco CA, Jun 2012, pp. 106-114.
Acceptance Rate: 22.7%, 168 out of 741.

C2

Improving Dynamic Prediction Accuracy Through Multi-Level Phase Analysis LCTES 2012

Zhenman Fang, Jiaxin Li, Weihua Zhang, Yi Li, Haibo Chen, Binyu Zang
Proceedings of the 2012 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2012), Beijing, China, Jun 2012, pp. 89-98.
Acceptance Rate: 22.7%, 15 out of 66.

C1

A Comprehensive Analysis and Parallelization of an Image Retrieval Algorithm ISPASS 2011

Zhenman Fang, Donglei Yang, Weihua Zhang, Haibo Chen, Binyu Zang
The 2011 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2011), Austin TX, Apr 2011, pp. 154-164.
Acceptance Rate: 37.5%, 24 out of 64.


Conference Publications (Short Papers)

SC5

MAPLE: A Machine Learning based Aging-Aware FPGA Architecture Exploration Framework FPL 2021 Short Paper

Behnam Ghavami, Milad Ebrahimi, Zhenman Fang, Lesley Shannon
The 31st International Conference on Field-Programmable Logic and Applications (FPL 2021 Short Paper), Virtual Conference, Sept 2021, pp. 369-373.
Acceptance Rate: 37.5%, 54 out of 144.

SC4

FPGA-based Near Data Processing Platform Selection Using Fast Performance Modeling LCTES 2020 Short WIP Paper

Nazanin Farahpour, Zhenman Fang, and Glenn Reinman
The 21st ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2020 Short WIP Paper), June 2020, pp. 151-155.

SC3

An FPGA-based BWT Accelerator for Bzip2 Data Compression FCCM 2019 Short Paper

Mau-Chung Frank Chang, Jason Cong, Zhenman Fang, Weikang Qiao
The 27th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2019 short paper), San Diego CA, Apr 2019, pp. 96-99.
Acceptance Rate: 17.9%, 7 out of 39.

SC2

Understanding Performance Differences of FPGAs and GPUs FCCM 2018 Short Paper

Jason Cong, Zhenman Fang, Michael Lo, Hanrui Wang, Jingxian Xu, Shaochong Zhang
The 26th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2018 short paper), Boulder CO, May 2018, pp. 172-175.
Acceptance Rate: 14.6%, 7 out of 48.
This paper has been cited more than 160 times.

SC1

Energy Efficiency of Fully Pipelining: A Case Study for Matrix Multiplication FCCM 2016 Short Paper

Peipei Zhou, HyunSeok Park, Zhenman Fang, Jason Cong, Andre DeHon
The 24th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2016 short paper), Washington DC, May 2016, pp. 172-175.
Acceptance Rate: 24.1%, 32 out of 133.


Patents

P3

Heterogeneous instantiation of high-level language callable library for hardware core

Zhenman Fang, James L Hwang, Alfred Huang, Michael Gill, Tom Shui
US patent: 10762265 B1. Publication date: Sept 1, 2020.

P2

Automatic creation of high-level language callable library for a hardware core

Zhenman Fang, James L Hwang, Samuel A Skalicky, Tom Shui, Michael Gill, Welson Sun, Alfred Huang, Jorge E Carrillo, Chen Pan
US patent: 10755013 B1. Publication date: Aug 25, 2020.

P1

Image/video feature extraction parallel algorithm based on multi-core system structure

Weihua Zhang, Zhenman Fang, Donglei Yang, Binyu Zang
China patent: CN102495725 A. Publication date: Jun 13, 2012.


Technical Reports

TR4

SeaPlace: Process Variation Aware Placement for Reliable Combinational Circuits against SETs and METs arXiv 2021

Kiarash Saremi, Hossein Pedram, Behnam Ghavami, Mohsen Raji, Zhenman Fang, and Lesley Shannon
arXiv:2112.04136 [cs.AR], 2021.

TR3

Best-Effort FPGA Programming: A Few Steps Can Go a Long Way arXiv 2018

Jason Cong, Zhenman Fang, Yuchen Hao, Peng Wei, Cody Hao Yu, Chen Zhang, Peipei Zhou
arXiv:1807.01340 [cs.AR] 2018.

TR2

ARAPrototyper: Enabling Rapid Prototyping and Evaluation for the Accelerator-Rich Architecture arXiv 2016

Yu-Ting Chen, Jason Cong, Zhenman Fang, Bingjun Xiao, Peipei Zhou
arXiv:1610.09761 [cs.AR] 2016.

TR1

Revisiting FPGA Acceleration of Molecular Dynamics Simulation with Dynamic Data Flow Behavior in High-Level Synthesis arXiv 2016

Jason Cong, Zhenman Fang, Hassan Kianinejad, Peng Wei
arXiv:1611.04474 [physics.comp-ph] 2016.


Conference Abstracts

A15

E4SA: An Ultra-Efficient Systolic Array Architecture for 4-Bit Convolutional Neural Networks FPGA 2024 poster

Geng Yang, Jie Lei, Zhenman Fang, Jiaqing Zhang, Junrong Zhang, Weiying Xie, Yunsong Li.
The 32nd ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2024 poster), Monterey, CA, Mar 2024.

A14

HyBNN: Quantifying and Optimizing Hardware Efficiency of Binary Neural Networks FCCM 2023 poster

Geng Yang, Jie Lei, Zhenman Fang, Yunsong Li, Jiaqing Zhang, Weiying Xie
The 31st IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM 2023 poster), Marina Del Rey, CA, May 2023, p. 203.

A13

FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization DAC 2022 LBR

Mengshu Sun, Zhengang Li, Alec Lu, Haoyu Ma, Geng Yuan, Yanyue Xie, Hao Tang, Yanyu Li, Miriam Leeser, Zhangyang Wang, Xue Lin, and Zhenman Fang.
In Proceedings of the 59th ACM/IEEE Design Automation Conference (DAC 2022 Late Breaking Results), San Francisco, CA, USA, Jul 2022, pp. 1394–1395.

A12

Hardware-Efficient Stochastic Rounding Unit Design for DNN Training DAC 2022 LBR

Sung-En Chang, Geng Yuan, Alec Lu, Mengshu Sun, Yanyu Li, Xiaolong Ma, Zhengang Li, Yanyue Xie, Minghai Qin, Xue Lin, Zhenman Fang, and Yanzhi Wang.
In Proceedings of the 59th ACM/IEEE Design Automation Conference (DAC 2022 Late Breaking Results), San Francisco, CA, USA, Jul 2022, pp. 1396–1397.

A11

You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding DAC 2022 WIP

Geng Yuan, Sung-En Chang, Qing Jin, Alec Lu, Yanyu Li, Yushu Wu, Zhenglun Kong, Yanyue Xie, Peiyan Dong, Xiaolong Ma, Xulong Tang, Minghai Qin, Zhenman Fang, and Yanzhi Wang.
The 59th ACM/IEEE Design Automation Conference 2022 (DAC 2022 WIP), San Francisco, CA, USA, Jul 2022.

A10

TopSort: A High-Performance Two-Phase Sorting Accelerator Optimized on HBM-based FPGAs FCCM 2022 poster

Weikang Qiao, Licheng Guo, Zhenman Fang, Mau-Chung Frank Chang and Jason Cong
The 30th IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM 2022 poster), New York City, NY, USA, May 2022.

A9

Stealth Attack on Protected DNNs: Compromising Robustness without Losing Accuracy via Smart Bit Flipping DAC 2021 WIP

Behnam Ghavami, Sayed Hamid Reza Mousavi, Zhenman Fang and Lesley Shannon
The Design Automation Conference 2021 (DAC 2021 WIP), San Francisco, CA, USA, Dec 2021.

A8

LEAP: A Deep Learning based Aging-Aware Architecture Exploration Framework for FPGAs FPGA 2021 poster

Behnam Ghavami, S. Milad Ebrahimipour, Zhenman Fang and Lesley Shannon
The 29th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2021 poster), Virtual Conference, Mar 2021.

A7

EASpiNN: Effective Automated Spiking Neural Network Evaluation on FPGA FCCM 2020 poster

Sathish Panchapakesan, Zhenman Fang and Nitin Chandrachoodan
The 28th IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM 2020 poster), Fayetteville, AR, USA, May 2020.

A6

K-Flow: A Dynamic Job Scheduling Framework to Optimize Dataflow Execution on CPU-FPGA Platforms FPGA 2018 poster

Jason Cong, Zhenman Fang, Yao Hu and Di Wu
The 26th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2018 poster), 2018.

A5

Understanding Performance Differences of FPGAs and GPUs FPGA 2018 poster

Jason Cong, Zhenman Fang, Michael Lo, Hanrui Wang, Jingxian Xu, Shaochong Zhang
The 26th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2018 poster), 2018.

A4

High-Throughput Lossless Compression on Tightly-Coupled CPU-FPGA Platforms FPGA 2018 poster

Weikang Qiao, Jieqiong Du, Zhenman Fang, Jason Cong, Mau-Chung Frank Chang
The 26th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2018 poster), 2018.

A3

CPU-FPGA Co-Optimization for Big Data Applications: A Case Study of In-Memory Samtool Sorting FPGA 2017 poster

Jason Cong, Zhenman Fang, Muhuan Huang, Libo Wang, Di Wu
The 25th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2017 poster), 2017.

A2

When Spark Meets FPGAs: A Case Study for Next-Generation DNA Sequencing Acceleration FCCM 2016 poster

Yu-Ting Chen, Jason Cong, Zhenman Fang, Jie Lei, Peng Wei
The 24th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2016 poster), 2016.

A1

ARAPrototyper: Enabling Rapid Prototyping and Evaluation for the Accelerator-Rich Architecture FPGA 2016 Poster

Yu-Ting Chen, Jason Cong, Zhenman Fang, Peipei Zhou
The 24th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2016 poster), 2016.