VibeThinker-1.5B Hands-On Tutorial: Building an Intelligent Agent with LangChain - 程序员充电站


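1. Introduction

1.1 Learning Objectives

This tutorial shows developers how to combine VibeThinker-1.5B, the small language model open-sourced by Weibo, with the LangChain application framework to build an intelligent agent capable of mathematical reasoning and code generation. By the end, you will know how to:

- Deploy and call a locally running VibeThinker-1.5B model
- Wrap the model in a custom LangChain LLM interface
- Build an automated agent that solves LeetCode-style programming problems
- Improve a small model's performance on specific tasks

After completing the tutorial, you will be able to build efficient task automation on top of a low-cost, low-resource small model.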

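1.2 Prerequisites

To follow along, you should have:

- Python programming experience (familiarity with requests, asyncio, and similar libraries)
- A basic understanding of LangChain (concepts such as LLM, PromptTemplate, and Agent)
- Familiarity with calling REST APIs
- A deployed WebUI or APP inference environment that supports VibeThinker-1.5B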


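2. VibeThinker-1.5B Model Characteristics

2.1 Background and Positioning

VibeThinker-1.5B is an experimental small language model released by the Weibo team. It has only 1.5 billion parameters and focuses on competitive programming and mathematical reasoning. Its design goal is not general conversation or text generation, but to test whether "small model + strong data" can match much larger models in a specific domain.

The total training cost was kept under USD 7,800, yet on the math benchmarks listed below (AIME24, AIME25, HMMT25) it scores higher than DeepSeek R1, a model with more than 400 times as many parameters, which points to strong cost-effectiveness.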

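2.2 Key Performance Metrics

| Benchmark | VibeThinker-1.5B score | Comparison model score |
| --- | --- | --- |
| AIME24 (math) | 80.3 | DeepSeek R1: 79.8 |
| AIME25 (math) | 74.4 | DeepSeek R1: 70.0 |
| HMMT25 (math) | 50.4 | DeepSeek R1: 41.7 |
| LiveCodeBench v5 | 55.9 | - |
| LiveCodeBench v6 | 51.1 | Magistral Medium: 50.3 |

These figures show that VibeThinker-1.5B performs well on math and programming tasks, which makes it a good fit for competitive-programming assistance and automated problem-solving systems.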

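2.3 Limitations and Best Practices

Despite the strong results, keep the following in mind:

- Not suited to general tasks: the training data focuses on technical reasoning, so content creation, customer-service Q&A, and similar uses are a poor fit.
- English prompts work better: describe problems in English for higher accuracy.
- Always set a system prompt: in the inference interface, entering something like "You are a programming assistant." in the system prompt box noticeably improves output quality.
- Prompt design matters: small models are sensitive to input structure, so instructions need to be constructed carefully.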
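The system-prompt advice above can be made concrete with a small helper. This is a minimal sketch, assuming the User/Assistant prompt layout that this tutorial uses in its request examples; `build_prompt` and `DEFAULT_SYSTEM_PROMPT` are illustrative names, not part of any library.

```python
DEFAULT_SYSTEM_PROMPT = "You are a programming assistant."


def build_prompt(question: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Prefix the question with a system prompt in a User/Assistant layout."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    # The trailing "Assistant:" cues the model to start its answer.
    return f"{system_prompt}\n\nUser: {question}\nAssistant:"


prompt = build_prompt("Write a Python function to check if a number is prime.")
print(prompt)
```

Keeping the layout in one function makes it easy to apply the same system prompt to every request.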

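3. Environment Setup and Model Invocation

3.1 Deployment and Startup

According to the official instructions, the quick-start steps are:

1. Deploy the image that contains VibeThinker-1.5B.
2. Log in to Jupyter Notebook and go to the /root directory.
3. Run the script 1键推理.sh to start the local inference service.
4. Return to the console and click the "网页推理" (web inference) button to open the interactive interface.

This starts a local WebUI-based API service, which typically listens on http://localhost:8080 or a similar port.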
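Before sending requests, it can help to confirm that something is actually listening on the expected port. The following is a rough sketch using only the standard library; `is_port_open` is a hypothetical helper, and port 8080 is the assumption stated above, so adjust it to whatever your deployment reports.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


# Example: is_port_open("localhost", 8080)
```

A successful TCP connection only shows that a process is bound to the port; it does not prove that the model has finished loading.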

3.2 Getting API Access

Assume the model service exposes the following endpoint:

POST http://localhost:8080/v1/completions

Example request body:

```json
{
  "prompt": "You are a programming assistant.\n\nUser: Write a Python function to check if a number is prime.\nAssistant:",
  "max_tokens": 200,
  "temperature": 0.7,
  "top_p": 0.9
}
```

The response is standard JSON and contains the generated text in a "text" field.
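As a quick way to exercise this endpoint outside LangChain, here is a minimal standard-library sketch. It assumes the endpoint URL and the top-level "text" field described above; the fallback to `choices[0]["text"]` is an extra assumption for servers that follow the OpenAI completions shape. The function names are illustrative.

```python
import json
import urllib.request

ENDPOINT = "http://localhost:8080/v1/completions"  # assumed, as stated in the text


def build_payload(prompt: str, max_tokens: int = 200,
                  temperature: float = 0.7, top_p: float = 0.9) -> dict:
    """Assemble a request body in the shape of the example request."""
    return {"prompt": prompt, "max_tokens": max_tokens,
            "temperature": temperature, "top_p": top_p}


def extract_text(response_body: dict) -> str:
    """Read the generated text from a decoded JSON response."""
    if "text" in response_body:
        return response_body["text"].strip()
    # Assumption: some servers nest the text under choices[0], OpenAI-style.
    choices = response_body.get("choices") or [{}]
    return choices[0].get("text", "").strip()


def complete(prompt: str, endpoint: str = ENDPOINT, timeout: float = 60.0) -> str:
    """POST the prompt to the endpoint and return the generated text."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_text(json.load(response))
```

Calling `complete(...)` requires the inference service to be running; `build_payload` and `extract_text` can be checked without it.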

We will wrap this endpoint in a LangChain-compatible LLM class.

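4. Integrating LangChain: Building a Custom LLM

4.1 Defining the Wrapper Class (VibeThinker1_5B)

We need to subclass langchain.llms.base.LLM and implement the _call method along with the _llm_type and _identifying_params properties.

```python
from typing import Any, List, Mapping, Optional

import requests
from langchain.llms.base import LLM  # newer releases: langchain_core.language_models.llms


class VibeThinker1_5B(LLM):
    """Custom LangChain wrapper around a local VibeThinker-1.5B endpoint."""

    endpoint: str = "http://localhost:8080/v1/completions"
    max_tokens: int = 200
    temperature: float = 0.7
    top_p: float = 0.9
    system_prompt: str = "You are a programming assistant."

    @property
    def _llm_type(self) -> str:
        return "vibethinker"

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {
            "endpoint": self.endpoint,
            "max_tokens": self.max_tokens,
            "temperature": self.temperature,
            "top_p": self.top_p,
        }

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[Any] = None,
        **kwargs: Any,
    ) -> str:
        # Build the full input: system prompt followed by the user turn.
        full_prompt = f"{self.system_prompt}\n\nUser: {prompt}\nAssistant:"
        payload = {
            "prompt": full_prompt,
            "max_tokens": self.max_tokens,
            "temperature": self.temperature,
            "top_p": self.top_p,
        }
        try:
            response = requests.post(self.endpoint, json=payload, timeout=60)
            response.raise_for_status()
            text = response.json().get("text", "")
        except (requests.RequestException, ValueError) as e:
            # Raise instead of returning the error as if it were model output,
            # so callers and retry logic can tell a failure from an answer.
            raise RuntimeError(f"Error calling VibeThinker API: {e}") from e

        # Keep only what follows the last "Assistant:" marker, in case the prompt is echoed.
        if "Assistant:" in text:
            text = text.split("Assistant:")[-1]
        # Apply stop sequences client-side; agents rely on them to end a turn.
        for token in stop or []:
            text = text.split(token)[0]
        return text.strip()
```

Note: make sure your runtime can reach localhost:8080. If the model runs on a remote server, replace localhost with that server's actual IP address.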
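One way to avoid hard-coding the address is to read it from the environment. This is only a sketch: the variable names `VIBETHINKER_HOST` and `VIBETHINKER_PORT` are hypothetical, and the `/v1/completions` path is the one assumed earlier in this tutorial.

```python
import os
from typing import Mapping, Optional

DEFAULT_ENDPOINT = "http://localhost:8080/v1/completions"


def resolve_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the endpoint URL from environment variables, falling back to localhost."""
    env = os.environ if env is None else env
    host = env.get("VIBETHINKER_HOST")          # hypothetical variable name
    port = env.get("VIBETHINKER_PORT", "8080")  # hypothetical variable name
    if not host:
        return DEFAULT_ENDPOINT
    return f"http://{host}:{port}/v1/completions"


# Example: llm = VibeThinker1_5B(endpoint=resolve_endpoint())
```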

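4.2 Testing a Basic Call

```python
# Create a model instance
llm = VibeThinker1_5B()

# Quick smoke test
result = llm.invoke("Write a Python function to reverse a string.")
print(result)
```

Expected output (the exact text may vary between runs):

```python
def reverse_string(s):
    return s[::-1]
```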

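5. Building an Agent That Solves Programming Problems

5.1 Designing the Agent Workflow

The goal is for the agent to complete the following task:

Given a LeetCode-style problem, output a runnable Python function with a brief explanation.

To do this, we use LangChain's initialize_agent together with Tool and a conversational ReAct agent (AgentType.CONVERSATIONAL_REACT_DESCRIPTION).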
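To make the workflow less of a black box, it helps to see how a ReAct-style turn can be interpreted: the model either names a tool and its input, or gives a final answer. The sketch below is a simplified illustration of that parsing step. It is not LangChain's actual output parser, and `parse_step` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# Matches "Action: <tool>" followed on a later line by "Action Input: <input>".
ACTION_PATTERN = re.compile(
    r"Action:\s*(?P<tool>.+?)\s*\n\s*Action Input:\s*(?P<input>.+)",
    re.DOTALL,
)


def parse_step(model_output: str) -> Tuple[str, Optional[str], str]:
    """Classify one model turn as ('final', None, answer) or ('action', tool, tool_input)."""
    if "Final Answer:" in model_output:
        answer = model_output.split("Final Answer:", 1)[1].strip()
        return ("final", None, answer)
    match = ACTION_PATTERN.search(model_output)
    if match is None:
        raise ValueError("could not parse model output as an action or final answer")
    return ("action", match.group("tool").strip(), match.group("input").strip())


step = parse_step(
    "Thought: I need code.\nAction: Programming Assistant\nAction Input: Write two_sum"
)
print(step)
```

An agent loop would dispatch the parsed tool name to the matching tool, append the result as an observation, and call the model again until it sees a final answer.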

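5.2 Creating Tools

VibeThinker can write code on its own, but adding extra tools improves reliability.

```python
from langchain.agents import Tool
from langchain.tools import BaseTool


class CodeExecutionTool(BaseTool):
    """Executes generated code and reports which functions it defined."""

    name: str = "run_code"
    description: str = "Executes the provided Python code and returns the result"

    def _run(self, code: str) -> str:
        try:
            # WARNING: exec runs arbitrary code in this process; use only in a sandbox.
            exec_globals = {}
            exec(code, exec_globals)
            # Report the callables the code actually defined instead of parsing its text.
            defined = [
                name for name, value in exec_globals.items()
                if callable(value) and not name.startswith("__")
            ]
            if not defined:
                return "Code executed, but no function was defined."
            return f"Function(s) defined successfully: {', '.join(defined)}"
        except Exception as e:
            return f"Error executing code: {e}"

    async def _arun(self, code: str) -> str:
        raise NotImplementedError("run_code does not support async execution")


# Register the tools
tools = [
    Tool(
        name="Programming Assistant",
        func=llm.invoke,
        description="Useful for writing Python functions for algorithmic problems.",
    ),
    CodeExecutionTool(),
]
```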
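Because the tool above runs generated code inside the agent's own process, a common precaution is to run it in a separate interpreter with a timeout. The following is a sketch of that idea using the standard library; `run_in_subprocess` is a hypothetical helper, and a child process with a timeout is not a real sandbox, since the code can still touch the file system and network.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout: float = 5.0) -> str:
    """Run code in a separate Python process and summarize the outcome."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,  # the child process is killed if it runs longer
        )
    except subprocess.TimeoutExpired:
        return f"Timed out after {timeout} seconds."
    if completed.returncode != 0:
        return f"Error (exit code {completed.returncode}):\n{completed.stderr.strip()}"
    return completed.stdout.strip() or "Code ran without output."


print(run_in_subprocess("print(sum(range(5)))"))
```

A function like this could replace the body of `_run`, so that an infinite loop or crash in generated code does not take the agent down with it.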

5.3 Initializing the Agent

```python
from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferMemory

# Conversation history is stored under the key that the conversational agent's prompt expects.
memory = ConversationBufferMemory(memory_key="chat_history")

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION,
    verbose=True,
    memory=memory,
)
```

5.4 Running the Agent

```python
question = """
Write a Python function called 'two_sum' that takes a list of integers `nums`
and an integer `target`, and returns the indices of the two numbers such that
they add up to target. You may assume that each input has exactly one solution.
"""

response = agent.run(f"Solve this coding problem:\n{question}")
print(response)
```

Example output (simplified):

I'll solve this step by step using the programming assistant tool.

Action: Programming Assistant
Action Input: Write a Python function called 'two_sum' that takes a list of integers `nums` and an integer `target`, and returns the indices of the two numbers such that they add up to target. You may assume that each input has exactly one solution.

Observation:

```python
def two_sum(nums, target):
    num_map = {}
    for i, num in enumerate(nums):
        complement = target - num
        if complement in num_map:
            return [num_map[complement], i]
        num_map[num] = i
    return []
```

This function uses a hash map to store previously seen numbers and their indices, achieving O(n) time complexity.

Final Answer: The `two_sum` function has been written with optimal performance.

6. Performance Optimization and Engineering Advice

6.1 Strategies for More Stable Inference

Because VibeThinker-1.5B is a small model, its output can fluctuate. A few practical suggestions:

- Fix the system prompt: always put a role statement at the start of the prompt, such as "You are a helpful coding assistant."
- Use few-shot examples: include one or two input/output samples in the prompt to steer the model toward the expected format.
- Limit output length: avoid generating very long code blocks that end up truncated.
- Filter in post-processing: when extracting code, use a regular expression that matches the region between the opening python fence and the closing fence.

6.2 Error Handling and Retries

In production, add exception handling and automatic retries:

```python
import time
from functools import wraps


def retry_on_failure(max_retries=3, delay=2):
    """Retry the wrapped function up to max_retries times, sleeping delay seconds in between."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Out of attempts: re-raise the last error with its original traceback.
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


@retry_on_failure(max_retries=3)
def safe_llm_call(prompt):
    # Retries only help if the underlying call raises on failure.
    return llm.invoke(prompt)
```
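Two of the tips from section 6.1, few-shot examples and regex post-processing, can be sketched as follows. This is an illustrative example rather than a tuned prompt: the helper names and the single worked example are assumptions, and the prompt is meant to be sent to the endpoint directly, since the LangChain wrapper adds its own User/Assistant framing.

```python
import re

FENCE = "`" * 3  # three backticks, built this way so the snippet nests cleanly
CODE_BLOCK = re.compile(FENCE + r"(?:python)?\s*\n(.*?)" + FENCE, re.DOTALL)

# One worked example; real prompts might use one or two, as suggested in 6.1.
FEW_SHOT_EXAMPLES = [
    ("Write a Python function to reverse a string.",
     "def reverse_string(s):\n    return s[::-1]"),
]


def build_few_shot_prompt(question: str,
                          system_prompt: str = "You are a helpful coding assistant.") -> str:
    """Prepend worked examples so the model imitates the fenced-code answer format."""
    parts = [system_prompt]
    for example_question, example_code in FEW_SHOT_EXAMPLES:
        parts.append(
            f"User: {example_question}\nAssistant:\n{FENCE}python\n{example_code}\n{FENCE}"
        )
    parts.append(f"User: {question.strip()}\nAssistant:")
    return "\n\n".join(parts)


def extract_code(response: str) -> str:
    """Return the first fenced code block, or the whole response if none is found."""
    match = CODE_BLOCK.search(response)
    return (match.group(1) if match else response).strip()
```

Falling back to the whole response when no fence is found keeps the pipeline working for replies that contain bare code.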

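6.3 Concurrent Multi-Task Support

To process many programming problems in a batch, asynchronous requests improve throughput:

```python
import asyncio

import aiohttp

ENDPOINT = "http://localhost:8080/v1/completions"


async def async_query_vibethinker(session, prompt):
    """Send one prompt to the endpoint and return the generated text."""
    payload = {
        "prompt": f"You are a programming assistant.\n\nUser: {prompt}\nAssistant:",
        "max_tokens": 200,
    }
    async with session.post(ENDPOINT, json=payload) as resp:
        resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
        data = await resp.json()
    return data.get("text", "").split("Assistant:")[-1].strip()


async def batch_solve(questions):
    """Query all questions concurrently; results keep the order of the input."""
    async with aiohttp.ClientSession() as session:
        tasks = [async_query_vibethinker(session, q) for q in questions]
        return await asyncio.gather(*tasks)
```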
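Launching every request at once can overwhelm a single local inference server, so it may be worth capping the number of requests in flight. Below is a minimal sketch of that pattern with `asyncio.Semaphore`. The names `gather_limited` and `fake_worker` are hypothetical, and the fake worker stands in for a real model call so the example runs without a server.

```python
import asyncio
from typing import Awaitable, Callable, List


async def gather_limited(
    prompts: List[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> List[str]:
    """Run worker over prompts with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt: str) -> str:
        async with semaphore:  # waits here while the limit is reached
            return await worker(prompt)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(guarded(p) for p in prompts))


async def fake_worker(prompt: str) -> str:
    """Stand-in for a real model call."""
    await asyncio.sleep(0.01)
    return prompt.upper()


results = asyncio.run(gather_limited(["a", "b", "c"], fake_worker, max_concurrency=2))
print(results)  # ['A', 'B', 'C']
```

In practice the worker would wrap a per-prompt request function that shares one client session, and the concurrency limit would be tuned to what the server can handle.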

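7. Summary

7.1 Technical Takeaways

This tutorial showed how to integrate VibeThinker-1.5B, the small model open-sourced by Weibo, with the LangChain framework to build an intelligent agent for programming and mathematical reasoning tasks. The key results are:

- Wrapped the local model as a LangChain-compatible LLM interface
- Implemented an automated problem-solving flow based on a conversational agent
- Covered the full loop of deployment, invocation, and optimization

Although VibeThinker-1.5B has only 1.5B parameters, it shows capability close to that of large models on specific tasks, which makes it well suited to lightweight AI applications in resource-constrained environments.

7.2 Practical Recommendations

- Focus on vertical scenarios: prioritize structured tasks such as algorithm problems, code completion, and mathematical derivation.
- Strengthen prompt engineering: use system prompts and few-shot examples to stabilize output.
- Combine with external toolchains: connect unit tests and static analysis tools to form a complete code-verification loop.
- Monitor output quality: regularly evaluate the correctness and readability of generated code.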
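The code-verification loop mentioned above can start very small: load the generated function and run it against a few known cases. The following is a sketch under stated assumptions. `load_function` and `check_cases` are hypothetical helpers, and the `generated` string stands in for real model output.

```python
from typing import Any, Callable, Dict, List, Tuple


def load_function(source: str, name: str) -> Callable[..., Any]:
    """Execute generated source in a fresh namespace and return the named function."""
    namespace: Dict[str, Any] = {}
    exec(source, namespace)  # only for code you are willing to run locally
    func = namespace.get(name)
    if not callable(func):
        raise ValueError(f"{name!r} was not defined by the generated code")
    return func


def check_cases(func: Callable[..., Any], cases: List[Tuple[tuple, Any]]) -> List[str]:
    """Return human-readable failures; an empty list means every case passed."""
    failures = []
    for args, expected in cases:
        try:
            actual = func(*args)
        except Exception as exc:
            failures.append(f"{args}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{args}: expected {expected!r}, got {actual!r}")
    return failures


# Stand-in for code returned by the model.
generated = '''
def two_sum(nums, target):
    seen = {}
    for i, num in enumerate(nums):
        if target - num in seen:
            return [seen[target - num], i]
        seen[num] = i
    return []
'''

two_sum = load_function(generated, "two_sum")
failures = check_cases(two_sum, [(([2, 7, 11, 15], 9), [0, 1]), (([3, 2, 4], 6), [1, 2])])
print(failures)  # []
```

The failure messages can be fed back to the model as a follow-up prompt, which turns the check into a simple repair loop.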

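As the reasoning ability of small models continues to improve, they may come to play a larger role in edge devices, educational assistance, and competition training.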
