Gemm/Block Class Template Overview
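catlass is the operator template library for CANN, providing template samples for high-performance matrix multiplication and related fused operators on NPUs. Project address: https://gitcode.com/cann/catlass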
API List
blockMmad List
| Component | Description |
|---|---|
| block_mmad | Basic template, including BlockMmad. |
| block_mmad_pingpong | Partial specialization of BlockMmad implementing ping-pong matrix multiplication. |
Swizzle List
| Component | Description |
|---|---|
| block_swizzle | Basic swizzle methods |
| GemmIdentityBlockSwizzle | Basic swizzle policy for the GEMM operator |
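To make the role of a swizzle policy concrete, here is a minimal host-side sketch of an identity-style mapping from a linear block index to tile coordinates. It is not the catlass implementation: `TileCoord`, `IdentitySwizzleSketch`, and all member names are hypothetical, and the plain row-major walk over the tile grid is an assumption chosen for simplicity. A real policy may reorder tiles, for example to improve data reuse.

```cpp
#include <cstdint>

// Hypothetical tile coordinate type (not a catlass type).
struct TileCoord {
    uint32_t m;  // tile index along the M dimension
    uint32_t n;  // tile index along the N dimension
};

// Hypothetical identity swizzle: walks the tile grid in row-major order.
struct IdentitySwizzleSketch {
    uint32_t problemM;  // rows of the output matrix
    uint32_t problemN;  // columns of the output matrix
    uint32_t tileM;     // rows covered by one block tile
    uint32_t tileN;     // columns covered by one block tile

    // Round-up integer division, so partial edge tiles are counted.
    static uint32_t CeilDiv(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

    uint32_t TilesM() const { return CeilDiv(problemM, tileM); }
    uint32_t TilesN() const { return CeilDiv(problemN, tileN); }
    uint32_t TileCount() const { return TilesM() * TilesN(); }

    // Map a linear block index to the (m, n) tile it should compute.
    TileCoord GetTileCoord(uint32_t blockIdx) const {
        return TileCoord{blockIdx / TilesN(), blockIdx % TilesN()};
    }
};
```

With a 300 x 500 output and 128 x 256 tiles, this sketch yields a 3 x 2 tile grid, and block index 3 maps to tile (1, 1).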
API Breakdown
blockMmad
The blockMmad structure encapsulates the MMAD computation at the Block layer, mapping directly to execution on a single AI Core of the Ascend NPU. Through template parameters, it receives configuration details defining the matrix shapes, tensor layouts (such as row-major or column-major), and data types (DType).
The namespace is Catlass::Gemm::Block. Core members:
| Type | Name | Function |
|---|---|---|
| Constructor | BlockMmad() | Initializes buffers, registers event IDs, and inserts setFlag primitives for pipeline synchronization. |
| Destructor | ~BlockMmad() | Inserts waitFlag primitives for pipeline synchronization. |
| Function | void operator() | Executes the matrix multiplication for a Block task. |
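
The constructor/destructor pairing in the table above can be illustrated with a host-side mock. This is only a sketch under stated assumptions: `MockEvents`, `RowMajor`, `ColumnMajor`, and `BlockMmadSketch` are hypothetical names, the flag counter stands in for hardware synchronization primitives, and the template parameter list is simplified rather than copied from catlass.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for hardware event flags: counts outstanding flags.
struct MockEvents {
    int outstanding = 0;
    void SetFlag() { ++outstanding; }
    void WaitFlag() { --outstanding; }
};

// Hypothetical layout tags that turn (row, col) into a linear offset.
struct RowMajor {
    static std::size_t Offset(std::size_t r, std::size_t c, std::size_t, std::size_t cols) {
        return r * cols + c;
    }
};
struct ColumnMajor {
    static std::size_t Offset(std::size_t r, std::size_t c, std::size_t rows, std::size_t) {
        return c * rows + r;
    }
};

// Shape, element type, and layouts arrive as template parameters.
template <std::size_t M, std::size_t N, std::size_t K,
          class Element, class LayoutA, class LayoutB>
class BlockMmadSketch {
public:
    // Constructor: set a flag to mark the (mock) buffer as free.
    explicit BlockMmadSketch(MockEvents &events) : events_(events) { events_.SetFlag(); }

    // Destructor: wait on the flag so nothing is left outstanding.
    ~BlockMmadSketch() { events_.WaitFlag(); }

    // Compute C (row-major, M x N) += A (M x K) * B (K x N) for one block task.
    void operator()(const std::vector<Element> &a,
                    const std::vector<Element> &b,
                    std::vector<Element> &c) {
        events_.WaitFlag();  // wait until the buffer is free
        for (std::size_t i = 0; i < M; ++i) {
            for (std::size_t j = 0; j < N; ++j) {
                Element acc = c[i * N + j];
                for (std::size_t k = 0; k < K; ++k) {
                    acc += a[LayoutA::Offset(i, k, M, K)] * b[LayoutB::Offset(k, j, K, N)];
                }
                c[i * N + j] = acc;
            }
        }
        events_.SetFlag();  // release the buffer for the next task
    }

private:
    MockEvents &events_;
};
```

Because every wait inside operator() is matched by a set, the flag raised in the constructor is still outstanding when the object goes out of scope, and the destructor consumes it. This mirrors the setFlag/waitFlag pairing described for BlockMmad, without claiming to reproduce its actual event handling.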
Authorship statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.