gemm_hls: Scalable systolic array-based matrix-matrix multiplication implemented in Vivado HLS for Xilinx FPGAs (source code)

Uploader: 42121058 | Uploaded: 2021-09-16 09:28:32 | File size: 46KB | File type: ZIP
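Scalable matrix-matrix multiplication on FPGA

This repository contains a pure Vivado HLS implementation of matrix-matrix multiplication (A * B = C) for Xilinx FPGAs. It uses Xilinx Vitis / SDx / SDAccel to instantiate the memory and PCIe controllers and to interface with the host. In the reported experiments, the kernel achieves 462 GFLOP/s, 301 GFLOP/s, and 132 GFLOP/s for half, single, and double precision, respectively; routing across the three SLRs is the primary bottleneck that prevents further scaling. The code is not device-specific and can be configured for any Xilinx FPGA supported by the Xilinx OpenCL runtime. The kernel has also been verified to execute on TUL KU115 and Alveo U250 boards, with similar results. The implementation uses a systolic array approach, in which linearly connected processing elements compute distinct contributions to the outer product of tiles of the output matrix. The approach used to implement this kernel is presented in [1]. For a general description of the optimization techniques applied, see the article [2].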
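To make the outer-product formulation concrete, here is a minimal software sketch in plain C++. It is not the repository's kernel and contains no HLS pragmas or streams. The names `Matrix`, `OuterProductGemm`, `num_pes`, and `tile_cols` are hypothetical, and the mapping of one tile row per processing element is an assumption made for illustration. For each index `k`, a column segment of A and a row segment of B are combined into a rank-1 update of the output tile, and each simulated processing element contributes one row of that update.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major dense matrix. Hypothetical helper type for this sketch only.
struct Matrix {
  std::size_t rows, cols;
  std::vector<float> data;
  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
  float &at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  float at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Computes C = A * B one output tile at a time. Each tile has up to
// num_pes rows and tile_cols columns. Assumption: simulated processing
// element `pe` owns row `pe` of the current tile.
Matrix OuterProductGemm(const Matrix &a, const Matrix &b,
                        std::size_t num_pes, std::size_t tile_cols) {
  assert(a.cols == b.rows);
  assert(num_pes > 0 && tile_cols > 0);
  Matrix c(a.rows, b.cols);
  for (std::size_t i0 = 0; i0 < a.rows; i0 += num_pes) {
    for (std::size_t j0 = 0; j0 < b.cols; j0 += tile_cols) {
      const std::size_t i_end = std::min(i0 + num_pes, a.rows);
      const std::size_t j_end = std::min(j0 + tile_cols, b.cols);
      // Each k adds one outer product (column of A times row of B)
      // to the output tile.
      for (std::size_t k = 0; k < a.cols; ++k) {
        for (std::size_t pe = 0; i0 + pe < i_end; ++pe) {
          // In hardware this value would be held locally by the PE while
          // the row of B is forwarded along the chain; here it is a loop.
          const float a_val = a.at(i0 + pe, k);
          for (std::size_t j = j0; j < j_end; ++j) {
            c.at(i0 + pe, j) += a_val * b.at(k, j);
          }
        }
      }
    }
  }
  return c;
}
```

In this sketch, each value of A is reused across a whole row of the tile and each value of B across all processing elements, which is the data reuse that tiling is meant to expose. Calling `OuterProductGemm(a, b, 4, 3)` on a 2x3 and a 3x2 matrix gives the same result as any other tile shape, since tiling only changes the order of accumulation across tiles, not the order of the `k` sums within one output element.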

Resource details

Archive contents (24 files, 46KB):

gemm_hls-master/
    .gitignore (31B)
    powermeter
    src/
        PrintSpecifications.cpp (3.23KB)
    LICENSE.md (1.53KB)
    kernel/
        Top.cpp (4.50KB)
        Memory.cpp (14.79KB)
        Compute.cpp (8.60KB)
    .gitmodules (183B)
    host/
        RunHardware.cpp (7.29KB)
    scripts/
        optimal_memory_tile_size.py (1.44KB)
        Synthesis.tcl.in (591B)
        build_manager.py (27.08KB)
    README.md (6.06KB)
    cmake/
        CheckFunctionExists.cmake (6.20KB)
        CheckFunctionExists.c (417B)
        CMakePushCheckState.cmake (6.10KB)
        FindBLAS.cmake (22.03KB)
        CheckFortranFunctionExists.cmake (5.03KB)
    include/
        Utility.h (4.32KB)
        Compute.h (647B)
        Memory.h (2.32KB)
        Config.h.in (1.77KB)
        MatrixMultiplication.h (5.69KB)
    hlslib
    test/
        TestSimulation.cpp (3.12KB)
    CMakeLists.txt (12.52KB)
