
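A unified memory address space access method for heterogeneous kilo-core processor systems
Cite this article: PEI Songwen, WU Xiaodong, TANG Zuoqi, XIONG Naixue. A unified memory address space access method for heterogeneous kilo-core processor systems[J]. Journal of National University of Defense Technology, 2015, 37(1): 28-33.
Authors: PEI Songwen  WU Xiaodong  TANG Zuoqi  XIONG Naixue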
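Affiliations: 1. Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093; State Key Laboratory of Computer Architecture, Chinese Academy of Sciences, Beijing 100190; Department of Electrical Engineering and Computer Science, University of California, California 92697
2. Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093
3. School of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025
4. School of Computer Science, Colorado Technical University, Colorado 80907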
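Funding: Open Project of the State Key Laboratory of Computer Architecture (CARCH201206); National-Level Project Cultivation Fund of University of Shanghai for Science and Technology (12XGQ07); Guiyang Science and Technology Plan Project (2011101414); Guizhou Province Science and Technology Support Project (20123050)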
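Abstract: To enable heterogeneous multi-core processors to directly cross-access each other's memory address spaces, a unified memory address space access method for heterogeneous kilo-core processor systems is proposed. It builds a unified level-3 Cache structure and a data block state tagging method, and optimizes the algorithm for modifying Cache block states. This avoids the large extra memory access overhead caused by copying and transferring data blocks in current discrete heterogeneous computer architectures. Tests with a subset of the Rodinia benchmark programs yielded a system speedup of up to 9.8x and reduced memory access frequency by up to 90%. The method therefore effectively reduces the system overhead of exchanging data blocks between heterogeneous cores and improves the system speedup of heterogeneous kilo-core processors.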

Keywords: heterogeneous kilo-core processors  memory address space  direct cross access  Cache
Received: 2014/6/10

An approach to accessing unified memory address space of heterogeneous kilo-cores system
PEI Songwen,WU Xiaodong,TANG Zuoqi and XIONG Naixue.An approach to accessing unified memory address space of heterogeneous kilo-cores system[J].Journal of National University of Defense Technology,2015,37(1):28-33.
Authors:PEI Songwen  WU Xiaodong  TANG Zuoqi and XIONG Naixue
Institution:Department of Computer Science & Engineering, University of Shanghai for Science and Technology; State Key Laboratory of Computer Architecture, Chinese Academy of Sciences; Department of Electrical Engineering and Computer Science, University of California; School of Computer Science and Technology, University of Guizhou; School of Computer Science, Colorado Technical University
Abstract:To let the CPU and GPU directly access each other's memory address spaces, an approach to accessing a unified memory address space in heterogeneous kilo-core systems is proposed. It is implemented by building a unified level-3 Cache, tagging the state of blocks in the Cache, and optimizing the algorithms that modify block states. The approach thereby avoids the substantial extra memory access overhead of copying and transferring data blocks that arises in current discrete heterogeneous computer systems, where GPUs are attached via PCI-E. In experiments with a subset of the Rodinia benchmark programs, it achieved a maximum speedup of 9.8x and reduced load/store instructions by up to 90%. These results indicate that the approach effectively reduces the overhead of transferring data among computing units in a heterogeneous system and significantly improves overall system performance.
Keywords:heterogeneous kilo-core processors  memory address space  direct access from opposite directions  Cache
This article is indexed in CNKI, Wanfang Data, and other databases.