ResNet18 Beginner Tutorial: Image Classification Step by Step - 程序员充电站

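1. Introduction: Why Use ResNet18 for Image Classification?

Image classification is one of the foundational tasks in computer vision. Whether the goal is recognizing a cat or deciding whether a landscape photo shows a snowy mountain, the work is done by a convolutional neural network (CNN). ResNet18 is one of the lightest models in the ResNet family, and its simple structure, solid accuracy, and low compute cost make it a popular choice both for beginners and for production deployment.

This tutorial builds a complete image classification system from scratch on top of the official TorchVision implementation of ResNet-18. The system recognizes the 1000 ImageNet object and scene classes (for example "alp" for alpine scenery, or "ski"), includes a visual WebUI, and is tuned for CPU inference, so it is well suited to quick local deployment and testing.

By the end of this article you will know how to:
- Load a pretrained ResNet18 model
- Apply the standard image preprocessing pipeline
- Build a Flask WebUI for interactive recognition
- Run deep learning inference efficiently on a CPU

No GPU is needed to try out general-purpose image recognition.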

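2. ResNet18: Core Principles and Technical Advantages

2.1 What Is ResNet18?

ResNet (Residual Network) was proposed by Microsoft Research in 2015 to address the vanishing-gradient and degradation problems of very deep networks. Its key innovation is the residual connection, which lets information skip over a few layers and pass through directly, so that networks can be made much deeper without accuracy degrading.

ResNet18 is one of the shallowest members of the family. It has 18 weighted layers (17 convolutional layers, not counting the 1×1 shortcut projections, plus one fully connected layer) and only about 11.7 million parameters; the weight file is under 45 MB, which makes it a good fit for edge devices and CPU inference.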
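The parameter count quoted above can be reproduced by hand. The sketch below is plain Python arithmetic, not TorchVision code; it assumes the standard ResNet18 layout (a 7×7 stem, four stages of two basic blocks with 64/128/256/512 channels, bias-free convolutions, and a 1000-way classifier), and the helper names are made up for illustration.

```python
def conv_params(c_in, c_out, kernel):
    """Weights of a bias-free convolution."""
    return c_in * c_out * kernel * kernel


def bn_params(channels):
    """Learnable scale and shift of a batch-norm layer (running stats are buffers)."""
    return 2 * channels


def basic_block_params(c_in, c_out, downsample):
    """Two 3x3 convolutions with batch norm, plus an optional 1x1 shortcut projection."""
    total = conv_params(c_in, c_out, 3) + bn_params(c_out)
    total += conv_params(c_out, c_out, 3) + bn_params(c_out)
    if downsample:
        total += conv_params(c_in, c_out, 1) + bn_params(c_out)
    return total


def resnet18_params(num_classes=1000):
    total = conv_params(3, 64, 7) + bn_params(64)  # stem: 7x7 conv + batch norm
    c_in = 64
    for c_out in (64, 128, 256, 512):  # four stages, two blocks each
        total += basic_block_params(c_in, c_out, downsample=(c_in != c_out))
        total += basic_block_params(c_out, c_out, downsample=False)
        c_in = c_out
    total += 512 * num_classes + num_classes  # fully connected layer: weights + bias
    return total


total = resnet18_params()
size_mib = total * 4 / 2**20  # 4 bytes per float32 parameter
print(f"{total:,} parameters, about {size_mib:.1f} MiB as float32")
```

This prints 11,689,512 parameters and about 44.6 MiB, consistent with the "about 11.7 million" and "under 45 MB" figures.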

2.2 How a Residual Block Works

The basic unit of ResNet is the residual block, which can be written as:

$$ y = F(x) + x $$

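Here $F(x)$ is the transformation computed by the convolutions on the main path, $x$ is the input, and $y$ is the output. Because of this skip connection, the network only has to learn the residual $F(x)$ between input and output rather than the full mapping, which makes training considerably more stable.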

    import torch
    import torch.nn as nn


    class BasicBlock(nn.Module):
        expansion = 1  # a basic block does not widen its output channels

        def __init__(self, in_channels, out_channels, stride=1, downsample=None):
            super().__init__()
            # Main path: two 3x3 convolutions, each followed by batch norm
            self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                                   stride=stride, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(out_channels)
            self.relu = nn.ReLU(inplace=True)
            self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                                   padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(out_channels)
            # Optional projection so the shortcut matches the main path's shape
            self.downsample = downsample

        def forward(self, x):
            identity = x
            if self.downsample is not None:
                identity = self.downsample(x)

            out = self.conv1(x)
            out = self.bn1(out)
            out = self.relu(out)

            out = self.conv2(out)
            out = self.bn2(out)

            out += identity  # residual connection
            out = self.relu(out)
            return out

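🔍 Code notes: the `BasicBlock` above is the core module used by ResNet18. The line `out += identity` implements the residual connection, which gives gradients a direct path back to the earlier layers.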
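Why the skip connection helps gradients can be seen by differentiating the block equation $y = F(x) + x$. As a simplifying assumption, ignore the final ReLU and write $\mathcal{L}$ for the training loss (a symbol introduced here for illustration). The chain rule then gives:

$$ \frac{\partial \mathcal{L}}{\partial x} = \frac{\partial \mathcal{L}}{\partial y}\left(\frac{\partial F(x)}{\partial x} + I\right) $$

The identity term $I$ comes from the shortcut. Even when $\partial F(x)/\partial x$ is very small, the gradient $\partial \mathcal{L}/\partial y$ still reaches $x$ unchanged through that term, which is the sense in which the gradient has a direct path to earlier layers.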

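2.3 Why Use the Official TorchVision Model?

Loading the model with `torchvision.models.resnet18(pretrained=True)` has three main advantages. (Since torchvision 0.13 the `pretrained` argument is deprecated; the equivalent call is `resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)`.)

Advantage	Notes
✅ Stability	The weights come from the official ImageNet training run, with no risk of third-party modifications
✅ Ease of use	A consistent interface; a single line of code loads the complete model
✅ Matching preprocessing	The standard normalization parameters (mean and standard deviation) are documented, and the weights enum exposes them through `weights.transforms()`

One caveat on offline use: the weights are downloaded the first time the model is constructed and are then cached locally. After that first download no network access or authorization check is needed, so the model runs fully offline and is robust in restricted environments.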

3. Hands-On: Building a ResNet18 Image Classification System

3.1 Environment Setup

First install the required dependencies:

pip install torch torchvision flask pillow numpy

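Python 3.8+ and PyTorch 1.12+ (which pairs with torchvision 0.13+) are recommended for best compatibility.

3.2 Loading ResNet18 and Running a Prediction

The following script is a complete image classification example: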

    import torch
    from PIL import Image
    from torchvision import models, transforms

    # 1. Load the pretrained ResNet18 model.
    #    weights=... replaces the deprecated pretrained=True (torchvision >= 0.13).
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    model.eval()  # switch to evaluation mode

    # 2. Define the image preprocessing pipeline
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    # 3. Load the ImageNet class labels (one name per line, in class-index order)
    with open("imagenet_classes.txt", "r", encoding="utf-8") as f:
        labels = [line.strip() for line in f]


    # 4. Image recognition function
    def predict_image(image_path, top_k=3):
        img = Image.open(image_path).convert("RGB")
        input_tensor = preprocess(img)
        input_batch = input_tensor.unsqueeze(0)  # add the batch dimension

        with torch.no_grad():
            output = model(input_batch)

        # Convert logits to probabilities and keep the top_k most likely classes
        probabilities = torch.nn.functional.softmax(output[0], dim=0)
        top_probs, top_indices = torch.topk(probabilities, top_k)

        results = []
        for i in range(top_k):
            idx = top_indices[i].item()
            label = labels[idx]
            prob = top_probs[i].item()
            results.append({"label": label, "probability": round(prob, 4)})
        return results

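📌 Key points:
- `transforms.Normalize` uses the ImageNet channel means and standard deviations; they must match the values the model was trained with.
- `torch.no_grad()` turns off gradient tracking, which saves memory and speeds up inference.
- `imagenet_classes.txt` can be downloaded from public sources and contains the 1000 class names, one per line in class-index order.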
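To make the two numeric steps in the script concrete, here is a dependency-free sketch of what normalization does to a single pixel and what softmax plus top-k do to the model output. The helper names and the example logits are made up for illustration; real inference should keep using the torchvision and torch calls shown in the script.

```python
import math

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb):
    """Mimic ToTensor (scale 0..255 to 0..1) followed by Normalize for one RGB pixel."""
    return tuple((value / 255.0 - mean) / std
                 for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))


def softmax(logits):
    """Turn raw scores into probabilities; subtracting the max keeps exp() stable."""
    shift = max(logits)
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k):
    """Return the k (index, probability) pairs with the highest probability."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return [(i, probabilities[i]) for i in order[:k]]


white = normalize_pixel((255, 255, 255))
black = normalize_pixel((0, 0, 0))
probs = softmax([2.0, 1.0, 0.1, -1.0])  # made-up logits for four classes
best = top_k(probs, 3)
print(white, black, best)
```

A white pixel ends up at roughly (2.25, 2.43, 2.64) and a black one at roughly (-2.12, -2.04, -1.80), which is why feeding unnormalized 0 to 255 values into the model produces poor predictions.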

3.3 Building the Flask WebUI

Next we create a simple web page that accepts an image upload and displays the Top-3 classification results.

(1) Flask backend (app.py)

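    import os

    from flask import Flask, jsonify, render_template, request
    from werkzeug.utils import secure_filename

    # predict_image comes from the section 3.2 script. The module name
    # "classifier" is an assumption: save that script as classifier.py,
    # or paste its contents above this line and drop the import.
    from classifier import predict_image

    app = Flask(__name__)
    UPLOAD_FOLDER = 'static/uploads'
    os.makedirs(UPLOAD_FOLDER, exist_ok=True)


    @app.route('/')
    def index():
        return render_template('index.html')


    @app.route('/predict', methods=['POST'])
    def predict():
        if 'file' not in request.files:
            return jsonify({"error": "No file uploaded"}), 400
        file = request.files['file']
        if file.filename == '':
            return jsonify({"error": "No selected file"}), 400

        # Strip directory components so a crafted name cannot escape UPLOAD_FOLDER
        filename = secure_filename(file.filename)
        if not filename:
            return jsonify({"error": "Invalid file name"}), 400
        filepath = os.path.join(UPLOAD_FOLDER, filename)
        file.save(filepath)

        try:
            results = predict_image(filepath)
            return jsonify(results)
        except Exception as e:
            return jsonify({"error": str(e)}), 500


    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000, debug=False)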
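The upload handler accepts any file the browser sends. One common hardening step is an extension allow-list checked before the file is saved. The sketch below is one possible version; the helper name and the set of extensions are assumptions, not part of the original backend.

```python
from pathlib import Path

# Assumed allow-list; adjust it to the formats your Pillow installation can open
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def is_allowed_image(filename: str) -> bool:
    """Return True when the file name ends in one of the allowed image extensions."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


print(is_allowed_image("cat.JPG"))    # extension check is case-insensitive
print(is_allowed_image("notes.txt"))
```

In the `/predict` route this would be called right after the empty-filename check, returning a 400 response when it fails. An extension check does not prove the content is an image, so the `try`/`except` around `predict_image` is still needed.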

(2) Frontend HTML template (templates/index.html)

    <!DOCTYPE html>
    <html>
    <head>
      <meta charset="utf-8">
      <title>ResNet18 Image Classifier</title>
      <style>
        body { font-family: Arial; text-align: center; margin-top: 50px; }
        .upload-box { border: 2px dashed #ccc; padding: 30px; width: 400px; margin: 0 auto; }
        button { margin-top: 10px; padding: 10px 20px; background: #007bff; color: white; border: none; cursor: pointer; }
        img { max-width: 100%; margin-top: 20px; }
        .result { margin-top: 20px; font-weight: bold; }
      </style>
    </head>
    <body>
      <h1>🔍 ResNet18 Image Recognition System</h1>
      <div class="upload-box">
        <input type="file" id="imageInput" accept="image/*">
        <br><br>
        <button onclick="upload()">🔍 Start recognition</button>
      </div>
      <img id="preview" style="display:none;">
      <div id="result" class="result"></div>

      <script>
        function upload() {
          const input = document.getElementById('imageInput');
          const file = input.files[0];
          if (!file) return alert("Please select an image first");

          const formData = new FormData();
          formData.append('file', file);

          // Show a local preview of the selected image
          const reader = new FileReader();
          reader.onload = function(e) {
            document.getElementById('preview').src = e.target.result;
            document.getElementById('preview').style.display = 'block';
          };
          reader.readAsDataURL(file);

          // Send the image to the backend and render the Top-3 results
          fetch('/predict', { method: 'POST', body: formData })
            .then(res => res.json())
            .then(data => {
              if (data.error) throw data.error;
              let resultText = "<h2>Top-3 results:</h2>";
              data.forEach(item => {
                resultText += `<p>${item.label} (${(item.probability*100).toFixed(2)}%)</p>`;
              });
              document.getElementById('result').innerHTML = resultText;
            })
            .catch(err => {
              document.getElementById('result').innerHTML = `<p style="color:red;">Error: ${err}</p>`;
            });
        }
      </script>
    </body>
    </html>

3.4 Directory Layout and Startup

The project directory is laid out as follows:

    resnet18_classifier/
    ├── app.py
    ├── classifier.py        (the section 3.2 script, if kept in its own file)
    ├── imagenet_classes.txt
    ├── static/
    │   └── uploads/
    ├── templates/
    │   └── index.html
    └── requirements.txt
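The tree lists a `requirements.txt` whose contents are not shown. A minimal version that simply mirrors the pip command from section 3.1 might look like this; versions are left unpinned here, and pinning them is a choice left to the reader.

```text
torch
torchvision
flask
pillow
numpy
```

It can then be installed with `pip install -r requirements.txt`.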

Start the service:

python app.py

Then open http://localhost:5000 in a browser to classify images through the WebUI.
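The `/predict` endpoint can also be called from a script instead of the browser. The following sketch uses only the standard library and builds the multipart upload by hand; the function names are illustrative, and it assumes the server from section 3.3 is running on `localhost:5000` and reads the upload from the form field `file`.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, payload,
                    content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def classify(image_path, url="http://localhost:5000/predict"):
    """POST an image to the running Flask app and return the parsed JSON response."""
    with open(image_path, "rb") as f:
        body, header = build_multipart("file", os.path.basename(image_path), f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server running and a local image file available:
#     for item in classify("cat.jpg"):
#         print(item["label"], item["probability"])
```

Error responses (HTTP 400 or 500) raise `urllib.error.HTTPError` here, so a real client would likely want to catch that and read the JSON error message from the exception body.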

4. Performance Tuning and Troubleshooting

4.1 Speeding Up CPU Inference

ResNet18 is already lightweight, but CPU inference can be sped up further in the following ways:

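- Export to TorchScript or ONNX. For example:

      scripted_model = torch.jit.script(model)
      scripted_model.save("resnet18_scripted.pt")

  The serialized model carries its own graph, so it can be loaded without the Python model definition and can also be run from C++ through LibTorch, without a Python interpreter.
- Tune the CPU thread count. `torch.set_num_threads(4)` sets how many threads PyTorch uses inside each operator (intra-op parallelism); match it to the number of physical cores available. This is separate from `DataLoader` worker processes, which are controlled by `num_workers` and only affect data loading.
- Quantization. Converting FP32 weights to INT8 reduces memory use and can speed up inference:

      import torch.nn as nn

      model_quantized = torch.quantization.quantize_dynamic(
          model, {nn.Linear}, dtype=torch.qint8
      )

  Note that dynamic quantization with `{nn.Linear}` only converts the final fully connected layer of ResNet18; the convolutional layers, which account for most of the compute, stay in FP32 and would need static (post-training) quantization instead.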
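Whether any of these options helps depends on the machine, so it is worth measuring before and after. Below is a minimal timing harness in plain Python; the function name and the default warm-up and run counts are arbitrary choices for illustration.

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=20):
    """Call fn repeatedly and report latency statistics in milliseconds."""
    for _ in range(warmup):  # warm-up calls are not timed (caches, lazy initialization)
        fn()
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "min_ms": min(timings_ms),
    }


# Stand-in workload so the snippet runs on its own
stats = benchmark(lambda: sum(range(10_000)), warmup=1, runs=5)
print(stats)
```

For the classifier, the callable would wrap a forward pass, for example `benchmark(lambda: model(input_batch))` inside a `torch.no_grad()` block, repeated once for each variant (original, scripted, quantized, different thread counts).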

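4.2 Common Problems and Solutions

Problem	Cause	Solution
`urllib.error.URLError` is raised	The code is trying to download the weights online	Construct the model with `pretrained=False` (`weights=None` in torchvision 0.13+) and load a local `.pth` file manually
Out of memory	Batch too large or tensors not released	Reduce the batch size, `del input_batch`, and call `torch.cuda.empty_cache()` if a GPU is in use
Inaccurate predictions	Input images differ from the ImageNet distribution	Make sure the preprocessing parameters are correct and avoid cropping away the subject
Upload fails in the web page	Insufficient permissions on the upload path	Check that `UPLOAD_FOLDER` exists and is writable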