news 2026/5/9 12:36:15

CANN torchtitan-npu Test Guide

张小明

Front-end Development Engineer


Test Guide

torchtitan-npu (Ascend Extension for torchtitan). Project repository: https://gitcode.com/cann/torchtitan-npu

Core Commands

Unit Tests

    # Run all unit tests and generate reports
    sh build.sh -u --generate-report

    # Run only local `torchtitan-npu` unit tests
    RUN_TORCHTITAN_UT=false sh build.sh -u --generate-report

Smoke Tests

    # Run the default smoke suite (core + extended)
    sh build.sh -s --generate-report

    # Run only core smoke
    ONLY_CORE_SMOKE=true sh build.sh -s --generate-report

    # Run only extended smoke
    ONLY_EXTENDED_SMOKE=true sh build.sh -s --generate-report

    # Run only upstream smoke
    ONLY_UPSTREAM_SMOKE=true sh build.sh -s --generate-report

Integration Test

tests/smoke_tests/integration_test.py is the entry point for end-to-end integration tests, used to validate:

  • New model functionality support
  • Feature compatibility
  • Parallelism strategy compatibility
Running

    # Via build.sh (-s runs core + extended smoke by default;
    # ONLY_CORE_SMOKE=true limits the run to core smoke)
    ONLY_CORE_SMOKE=true sh build.sh -s --generate-report

    # Run integration_test.py directly
    python tests/smoke_tests/integration_test.py output_dir \
        --test_name all \
        --ngpu 2
Command-line Arguments

| Argument | Default | Description |
| --- | --- | --- |
| output_dir | None (required) | Output directory for test results |
| --config_path | ./tests/smoke_tests/base_test.toml | Base config file path |
| --test_name | all | Specific test case name |
| --ngpu | 2 | Maximum GPU count |
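The argument table above can be mirrored with a small `argparse` sketch. This is an illustration only, not the parser that `integration_test.py` actually defines; `build_parser` is a hypothetical name, and only the argument names, defaults, and descriptions are taken from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented arguments."""
    parser = argparse.ArgumentParser(description="Integration test runner (sketch)")
    # Positional and required, as stated in the table.
    parser.add_argument("output_dir", help="Output directory for test results")
    parser.add_argument(
        "--config_path",
        default="./tests/smoke_tests/base_test.toml",
        help="Base config file path",
    )
    parser.add_argument("--test_name", default="all", help="Specific test case name")
    parser.add_argument("--ngpu", type=int, default=2, help="Maximum GPU count")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["./outputs", "--test_name", "your_model_tp"])
print(args.output_dir, args.config_path, args.test_name, args.ngpu)
```

Omitting `--ngpu` and `--config_path` falls back to the defaults listed in the table.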
OverrideDefinitions Usage

OverrideDefinitions is the configuration class for defining integration test cases:

    OverrideDefinitions(
        override_args=[[...]],  # Required: command-line argument list
        test_descr="...",       # Required: test description
        test_name="...",        # Required: test name
        ngpu=2,                 # Optional: required GPU count
        disabled=False,         # Optional: whether disabled
    )
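To make the shape of `override_args` concrete, here is a minimal sketch that uses a stand-in dataclass with the fields listed above. It is not the project's class definition, and `expand_runs` is a hypothetical helper. The sketch assumes that each inner list describes one run and that each entry is a flag followed by its value.

```python
import shlex
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class OverrideDefinitions:
    """Stand-in with the documented fields; not the project's actual class."""

    override_args: Sequence[Sequence[str]]
    test_descr: str
    test_name: str
    ngpu: int = 2
    disabled: bool = False


def expand_runs(case: OverrideDefinitions) -> List[List[str]]:
    """Assumption: one inner list per run; split each entry into argv tokens."""
    runs: List[List[str]] = []
    for run_args in case.override_args:
        tokens: List[str] = []
        for entry in run_args:
            tokens.extend(shlex.split(entry))
        runs.append(tokens)
    return runs


case = OverrideDefinitions(
    override_args=[["--model.name your_model", "--parallelism.tensor_parallel_degree 2"]],
    test_descr="Your Model TP Test",
    test_name="your_model_tp",
)
runs = expand_runs(case)
print(runs)
# [['--model.name', 'your_model', '--parallelism.tensor_parallel_degree', '2']]
```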
Steps to Add a New Test Case

  1. Open tests/smoke_tests/integration_test.py
  2. Add a new configuration to the smoke_cases list in generate_smoke_tests():

    OverrideDefinitions(
        [
            [
                "--model.name your_model",
                "--model.flavor your_flavor",
                "--parallelism.tensor_parallel_degree 2",
            ],
        ],
        "Your Model TP Test",
        "your_model_tp",
        ngpu=2,
    )

  3. Run tests to verify:

    python tests/smoke_tests/integration_test.py ./outputs --test_name your_model_tp
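The sketch below shows one way `--test_name` and `--ngpu` could select cases from the `smoke_cases` list. The selection logic is an assumption for illustration and may differ from what `integration_test.py` does; `Case` and `select_cases` are hypothetical names, and `big_case` and `old_case` are made-up entries.

```python
from typing import List, NamedTuple


class Case(NamedTuple):
    """Hypothetical, reduced stand-in for one OverrideDefinitions entry."""

    test_name: str
    ngpu: int = 2
    disabled: bool = False


def select_cases(cases: List[Case], test_name: str, max_ngpu: int) -> List[str]:
    """Return the names of the cases that would run under the assumed rules."""
    selected = []
    for case in cases:
        if test_name != "all" and case.test_name != test_name:
            continue  # a specific name was requested and this is not it
        if case.disabled:
            continue  # disabled cases never run
        if case.ngpu > max_ngpu:
            continue  # the case needs more devices than --ngpu allows
        selected.append(case.test_name)
    return selected


smoke_cases = [
    Case("your_model_tp", ngpu=2),
    Case("big_case", ngpu=8),
    Case("old_case", disabled=True),
]
print(select_cases(smoke_cases, "all", max_ngpu=2))            # ['your_model_tp']
print(select_cases(smoke_cases, "your_model_tp", max_ngpu=2))  # ['your_model_tp']
```

Under these assumptions, a newly added case runs only if it is enabled, its name matches the request (or `all` is used), and its `ngpu` does not exceed the value passed on the command line.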
base_test.toml Configuration

tests/smoke_tests/base_test.toml is the base configuration for integration tests. All tests run based on this configuration file, and parameters in override_args override identically-named parameters in the base configuration.
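The override rule can be sketched as a merge over a nested dictionary. The dictionary below is a hypothetical stand-in for the parsed base config (the real keys and values in `base_test.toml` are not shown in this guide), and `apply_overrides` is an illustrative helper rather than the project's implementation.

```python
import copy
from typing import Any, Dict, List

# Hypothetical stand-in for the parsed contents of base_test.toml.
BASE_CONFIG: Dict[str, Dict[str, Any]] = {
    "model": {"name": "base_model", "flavor": "debug"},
    "parallelism": {"tensor_parallel_degree": 1},
}


def apply_overrides(
    base: Dict[str, Dict[str, Any]], override_args: List[str]
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of base where '--section.option value' entries win."""
    merged = copy.deepcopy(base)
    for entry in override_args:
        key, _, raw = entry.partition(" ")
        section, _, option = key.lstrip("-").partition(".")
        current = merged.setdefault(section, {}).get(option)
        # Keep integer options as integers; everything else stays a string.
        is_int = isinstance(current, int) and not isinstance(current, bool)
        merged[section][option] = int(raw) if is_int else raw
    return merged


merged = apply_overrides(
    BASE_CONFIG,
    ["--model.name your_model", "--parallelism.tensor_parallel_degree 2"],
)
print(merged)
```

Options that are not overridden (here `model.flavor`) keep their base values, and the base dictionary itself is left unchanged.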

Model Parallel Commands

    # Basic model-parallel smoke
    python3 -m pytest -v tests/smoke_tests/model_parallel/

    # Multi-rank model-parallel smoke
    RUN_MODEL_PARALLEL_MULTI_RANK=true torchrun --nproc_per_node=4 -m pytest -v tests/smoke_tests/model_parallel/

When to Use Which Command

| Command | Use It When |
| --- | --- |
| build.sh -u | You changed hardware-independent logic such as converters, config, helpers, or patches |
| build.sh -s | You changed real NPU execution paths or wrapper behavior and want the default core + extended smoke set |
| ONLY_CORE_SMOKE=true | You changed the minimal training path (i.e., end-to-end integration tests defined in integration_test) |
| ONLY_EXTENDED_SMOKE=true | You changed local feature or model-parallel behavior |
| ONLY_UPSTREAM_SMOKE=true | You changed logic that depends on reused torchtitan upstream integration, or want to run the heavier upstream smoke path separately |

Quick Decision Rule

  • Changed only hardware-independent logic: start with build.sh -u
  • Changed NPU feature paths or wrappers: run build.sh -s
  • Changed training-path wiring: at least run ONLY_CORE_SMOKE=true build.sh -s
  • Changed model-parallel behavior: run ONLY_EXTENDED_SMOKE=true build.sh -s
  • Upstream integration compatibility needs a separate check: run ONLY_UPSTREAM_SMOKE=true build.sh -s
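The decision rule can also be written down as a lookup, for example in a local pre-push helper. This is only a sketch: the change-kind labels and `commands_for` are hypothetical, and the command strings follow the `sh build.sh` form used in the Core Commands section.

```python
from typing import Dict, List

# Keys are hypothetical labels for the kinds of change listed in the rule.
COMMANDS: Dict[str, str] = {
    "hardware_independent": "sh build.sh -u",
    "npu_feature_or_wrapper": "sh build.sh -s",
    "training_path": "ONLY_CORE_SMOKE=true sh build.sh -s",
    "model_parallel": "ONLY_EXTENDED_SMOKE=true sh build.sh -s",
    "upstream_integration": "ONLY_UPSTREAM_SMOKE=true sh build.sh -s",
}


def commands_for(change_kinds: List[str]) -> List[str]:
    """Return the matching commands in rule order, without duplicates."""
    wanted = set(change_kinds)
    unknown = wanted - COMMANDS.keys()
    if unknown:
        raise ValueError(f"unknown change kinds: {sorted(unknown)}")
    return [cmd for kind, cmd in COMMANDS.items() if kind in wanted]


print(commands_for(["model_parallel", "hardware_independent"]))
# ['sh build.sh -u', 'ONLY_EXTENDED_SMOKE=true sh build.sh -s']
```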

Test Reports

  • Output directory: test_reports/
  • Common artifacts:
    • *.xml: JUnit results
    • *.html: HTML reports when --generate-report is enabled
    • coverage/: UT coverage reports
    • README.md: generated index of report artifacts
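The JUnit XML files can be summarized with the standard library. The sketch below assumes the common JUnit layout in which each `testsuite` element carries `tests`, `failures`, `errors`, and `skipped` attributes; the sample document is made up, and `summarize_junit` is a hypothetical helper, not part of the project.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Made-up sample in the common JUnit layout.
SAMPLE = """<testsuites>
  <testsuite name="unit" tests="3" failures="1" errors="0" skipped="1">
    <testcase classname="tests.sample" name="test_ok"/>
    <testcase classname="tests.sample" name="test_bad"><failure message="boom"/></testcase>
    <testcase classname="tests.sample" name="test_skip"><skipped/></testcase>
  </testsuite>
</testsuites>"""


def summarize_junit(xml_text: str) -> Dict[str, int]:
    """Sum the per-suite counters found in one JUnit XML document."""
    root = ET.fromstring(xml_text)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # iter() also yields the root itself when the root is a <testsuite>.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals


summary = summarize_junit(SAMPLE)
print(summary)
# {'tests': 3, 'failures': 1, 'errors': 0, 'skipped': 1}
```

To apply it to real reports, read each file matched by `Path("test_reports").glob("*.xml")` and pass its text to `summarize_junit`.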

Quick Tips

  1. Start with the smallest command that matches your change.
  2. Prefer build.sh -u when NPU is not required.
  3. Use targeted smoke variants instead of full smoke when possible.
  4. Update docs when test layout or execution changes.


Disclosure: parts of this article were generated with AI assistance (AIGC) and are for reference only.
