A Vectorization Method for Accelerating FFT Computation Based on FMA (一种基于FMA加速FFT计算的向量化方法)


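Cite this article:	Liu Zhong, Chen Haiyan, Xiang Hongwei. A vectorization method for accelerating FFT computation based on FMA [J]. Journal of National University of Defense Technology (国防科技大学学报), 2015, 37(2).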

Authors:	Liu Zhong (刘仲), Chen Haiyan (陈海燕), Xiang Hongwei (向宏卫)

Affiliation:	College of Computer, National University of Defense Technology (all three authors)

Funding:	National Natural Science Foundation of China (General Program, Key Program, Major Program)

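Abstract (translated from the Chinese):	This paper proposes a vectorization method that accelerates FFT computation with fused multiply-add (FMA) instructions. By transforming the computation flow of the FFT butterfly unit, the independent multiplications and additions of the conventional formulation are combined into a smaller number of FMA operations. The real floating-point work of a radix-2 DIT FFT butterfly drops from 10 multiply or add operations to 6 FMA operations, and that of a radix-4 DIT FFT butterfly drops from 34 multiply or add operations to 24 FMA operations. Vector access to the twiddle factors is also optimized to reduce storage overhead. Experimental results show that the proposed method significantly accelerates FFT computation and achieves high computational performance and efficiency.
Keywords:	fast Fourier transform; fused multiply-add; vectorization; vector processor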
A Vectorization of Accelerating FFT Computation Based on FMA Instruction

Abstract:	A vectorization method for accelerating FFT computation based on the fused multiply-add (FMA) instruction is presented. By transforming the FFT butterfly computation, the separate multiplications and additions of the conventional formulation are merged into fewer FMA operations. This reduces the real floating-point operations of a radix-2 DIT FFT butterfly from 10 multiplications and additions to 6 FMA operations, and those of a radix-4 DIT FFT butterfly from 34 multiplications and additions to 24 FMA operations. Moreover, vector access to the twiddle factors is optimized to reduce memory cost. Experimental results show that the presented method greatly accelerates FFT computation and achieves high performance and efficiency.

Keywords:	Fast Fourier Transform; Fused Multiply-Add; Vectorization; Vector Processor
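
The abstract gives the operation counts but not the transformed butterfly itself. The sketch below shows one well-known way to reach the same counts; it may differ from the authors' formulation. It computes the radix-2 sum output a + w*b with four chained FMAs and recovers the difference output as 2a - (a + w*b) with two more, then builds a radix-4 DIT butterfly from four such radix-2 butterflies. The function names, the `counter` dictionary, and the unfused fallback for `fma` are illustrative assumptions, not taken from the paper.

```python
import cmath
import math

# Use a true fused multiply-add when the runtime provides one (math.fma was
# added in Python 3.13). The fallback is an assumption for portability: it has
# the same operation count but rounds twice instead of once.
_fma = getattr(math, "fma", lambda a, b, c: a * b + c)

counter = {"fma": 0}  # illustrative bookkeeping, used only to check counts


def fma(a, b, c):
    """Return a*b + c and record one FMA operation."""
    counter["fma"] += 1
    return _fma(a, b, c)


def butterfly2_fma(a, b, w):
    """Radix-2 DIT butterfly (a + w*b, a - w*b) using 6 FMAs."""
    xr = fma(-w.imag, b.imag, fma(w.real, b.real, a.real))  # Re(a + w*b)
    xi = fma(w.imag, b.real, fma(w.real, b.imag, a.imag))   # Im(a + w*b)
    # a - w*b = 2a - (a + w*b); the sign flip on xr/xi is assumed to be
    # absorbed by a negated-addend FMA variant on real hardware.
    yr = fma(2.0, a.real, -xr)
    yi = fma(2.0, a.imag, -xi)
    return complex(xr, xi), complex(yr, yi)


def butterfly4_fma(x0, x1, x2, x3, w1, w2):
    """Radix-4 DIT butterfly as four radix-2 butterflies (4 * 6 = 24 FMAs).

    w1 is the twiddle factor W^k and w2 is W^(2k); both are assumed to come
    from a precomputed table.
    """
    a, b = butterfly2_fma(x0, x2, w2)
    c, d = butterfly2_fma(x1, x3, w2)
    y0, y2 = butterfly2_fma(a, c, w1)
    # -1j * w1 only swaps the parts of w1 and flips one sign.
    y1, y3 = butterfly2_fma(b, d, complex(w1.imag, -w1.real))
    return y0, y1, y2, y3


def demo():
    """Run one radix-4 butterfly and report how many FMAs it used."""
    w = cmath.exp(-2j * cmath.pi * 3 / 16)
    start = counter["fma"]
    outputs = butterfly4_fma(1 + 2j, -0.5 + 0.25j, 3 - 1j, 0.75 + 4j, w, w * w)
    return outputs, counter["fma"] - start
```

For comparison, the conventional radix-2 butterfly needs 4 multiplications and 6 additions (10 operations), and the conventional radix-4 butterfly needs 3 complex multiplications and 8 complex additions (34 real operations), which matches the baseline figures quoted in the abstract.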
