news 2026/5/9 13:59:40

CANN PTO-ISA 矩阵乘法

作者头像

张小明

前端开发工程师

1.2k 24
文章封面图
CANN PTO-ISA 矩阵乘法

Matrix Multiply

【免费下载链接】pto-isaParallel Tile Operation (PTO) is a virtual instruction set architecture designed by Ascend CANN, focusing on tile-level operations. This repository offers high-performance, cross-platform tile operations across Ascend platforms.项目地址: https://gitcode.com/cann/pto-isa

This document describes matrix multiplication and matrix-vector operations.

Total Operations:8


Operations

TGEMV_MX

For detailed instruction documentation, see isa/TGEMV_MX

AS Level 1 (SSA):

%acc = pto.tgemv.mx %a, %a_scale, %b, %b_scale : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tgemv.mx ins(%a, %a_scale, %b, %b_scale : (!pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>)) outs(%acc : !pto.tile_buf<...>)

TMATMUL_MX

For detailed instruction documentation, see isa/TMATMUL_MX

AS Level 1 (SSA):

%c = pto.tmatmul.mx %a, %a_scale, %b, %b_scale : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c_out = pto.tmatmul.mx.acc %c_in, %a, %a_scale, %b, %b_scale : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c = pto.tmatmul.mx.bias %a, %a_scale, %b, %b_scale, %bias : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tmatmul.mx ins(%a, %a_scale, %b, %b_scale : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>) pto.tmatmul.mx.acc ins(%c_in, %a, %a_scale, %b, %b_scale : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c_out : !pto.tile_buf<...>) pto.tmatmul.mx.bias ins(%a, %a_scale, %b, %b_scale, %bias : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>)

TMATMUL

For detailed instruction documentation, see isa/TMATMUL

AS Level 1 (SSA):

%c = pto.tmatmul %a, %b : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tmatmul ins(%a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>)

TMATMUL_ACC

For detailed instruction documentation, see isa/TMATMUL_ACC

AS Level 1 (SSA):

%c_out = pto.tmatmul.acc %c_in, %a, %b : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tmatmul.acc ins(%c_in, %a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c_out : !pto.tile_buf<...>)

TMATMUL_BIAS

For detailed instruction documentation, see isa/TMATMUL_BIAS

AS Level 1 (SSA):

%c = pto.tmatmul.bias %a, %b, %bias : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tmatmul.bias ins(%a, %b, %bias : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>)

TGEMV

For detailed instruction documentation, see isa/TGEMV

AS Level 1 (SSA):

%c = pto.tgemv %a, %b : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c_out = pto.tgemv.acc %c_in, %a, %b : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c = pto.tgemv.bias %a, %b, %bias : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tgemv ins(%a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>) pto.tgemv.acc ins(%c_in, %a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c_out : !pto.tile_buf<...>) pto.tgemv.bias ins(%a, %b, %bias : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>)

TGEMV_ACC

For detailed instruction documentation, see isa/TGEMV_ACC

AS Level 1 (SSA):

%c = pto.tgemv %a, %b : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c_out = pto.tgemv.acc %c_in, %a, %b : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c = pto.tgemv.bias %a, %b, %bias : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tgemv ins(%a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>) pto.tgemv.acc ins(%c_in, %a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c_out : !pto.tile_buf<...>) pto.tgemv.bias ins(%a, %b, %bias : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>)

TGEMV_BIAS

For detailed instruction documentation, see isa/TGEMV_BIAS

AS Level 1 (SSA):

%c = pto.tgemv %a, %b : (!pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c_out = pto.tgemv.acc %c_in, %a, %b : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...> %c = pto.tgemv.bias %a, %b, %bias : (!pto.tile<...>, !pto.tile<...>, !pto.tile<...>) -> !pto.tile<...>

AS Level 2 (DPS):

pto.tgemv ins(%a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>) pto.tgemv.acc ins(%c_in, %a, %b : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c_out : !pto.tile_buf<...>) pto.tgemv.bias ins(%a, %b, %bias : !pto.tile_buf<...>, !pto.tile_buf<...>, !pto.tile_buf<...>) outs(%c : !pto.tile_buf<...>)

【免费下载链接】pto-isaParallel Tile Operation (PTO) is a virtual instruction set architecture designed by Ascend CANN, focusing on tile-level operations. This repository offers high-performance, cross-platform tile operations across Ascend platforms.项目地址: https://gitcode.com/cann/pto-isa

创作声明:本文部分内容由AI辅助生成(AIGC),仅供参考

版权声明: 本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若内容造成侵权/违法违规/事实不符,请联系邮箱:809451989@qq.com进行投诉反馈,一经查实,立即删除!
网站建设 2026/5/9 13:58:33

Python量化投资终极指南:如何使用pywencai快速获取同花顺问财数据

Python量化投资终极指南&#xff1a;如何使用pywencai快速获取同花顺问财数据 【免费下载链接】pywencai 获取同花顺问财数据 项目地址: https://gitcode.com/gh_mirrors/py/pywencai 在量化投资和金融数据分析领域&#xff0c;获取高质量、结构化的金融数据是每个分析师…

作者头像 李华
网站建设 2026/5/9 13:49:38

行深智能开源EdgeFM推理框架:为物流小车解锁灵魂的底层技术实践

点击下方卡片&#xff0c;关注“自动驾驶之心”公众号戳我-> 领取自动驾驶近30个方向学习路线编辑 | 自动驾驶之心>>自动驾驶前沿信息获取→自动驾驶之心知识星球01.让具身智能在国产芯片上跑通确定性低延迟&#xff0c;行深智能如何打破边缘AI的生态垄断在行深智能的…

作者头像 李华
网站建设 2026/5/9 13:46:24

第五篇:锻造大脑——为什么算法公开,你却造不出 GPT?

书接上文。同学问&#xff1a;“既然 CNN、Transformer 的论文和代码都是开源的&#xff0c;我能不能在寝室里手搓一个 DeepSeek 或者 GPT-4&#xff1f;” 这就像虽然米其林餐厅的菜谱&#xff08;算法&#xff09;是公开的&#xff0c;但要把菜做成艺术品&#xff0c;你还需要…

作者头像 李华
网站建设 2026/5/9 13:45:46

外贸版GEO优化和海外版GEO区别?

在全球数字经济一体化的背景下&#xff0c;生成式引擎优化&#xff08;GEO&#xff09;作为应对AI搜索变革的关键技术&#xff0c;其应用策略因目标市场与生态系统的不同而产生显著分野。本文旨在从行业分析视角&#xff0c;厘清面向中国出口企业的“外贸版GEO优化”与广义上面…

作者头像 李华
网站建设 2026/5/9 13:45:38

CANN/ge图编译器API文档

SetCompileConfig&#xff08;GraphPp类&#xff09; 【免费下载链接】ge GE&#xff08;Graph Engine&#xff09;是面向昇腾的图编译器和执行器&#xff0c;提供了计算图优化、多流并行、内存复用和模型下沉等技术手段&#xff0c;加速模型执行效率&#xff0c;减少模型内存占…

作者头像 李华
网站建设 2026/5/9 13:44:50

CANN自动融合精度测试报告

自动融合精度测试报告 【免费下载链接】graph-autofusion Graph-autofusion 是一个面向昇腾&#xff08;Ascend&#xff09;芯片的轻量级、解耦式组件集合&#xff0c;旨在通过自动融合技术加速模型执行。 目前已开源 SuperKernel 组件&#xff0c;未来将持续开放更多自动融合相…

作者头像 李华