Integrating Phi-3-mini-4k-instruct with a Vue3 Frontend: Building a Real-Time Chat App - 程序员充电站


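I've been building an internal tool recently and needed to add an AI assistant for the team. The requirements were simple: fast responses, multi-turn conversation, and ideally local deployment with no dependence on external services. After trying several options, I found Microsoft's Phi-3-mini-4k-instruct a good fit. At 3.8B parameters it is small, the output quality is good, and above all the hardware requirements are modest enough that an ordinary development machine can run it.

On the frontend, Vue3 is what I'm comfortable with, and the Composition API is flexible to write. Still, wiring a language model to a frontend has its pitfalls, especially for real-time chat. This post walks through integrating Phi-3-mini-4k-instruct into a Vue3 project to build a respectable chat app.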

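1. Why Phi-3-mini-4k-instruct?

First, why this model. You have probably heard of models with tens or hundreds of billions of parameters. They perform well, but they are painful to deploy. Phi-3-mini-4k-instruct has only 3.8B parameters, which makes it a lightweight, but don't underestimate it.

In my experience the model has several practical advantages:

Modest hardware requirements: roughly 2-3GB of VRAM is enough, and it runs smoothly on my MacBook Pro with no dedicated server
Fast responses: on my machine a reply usually arrives within 1-3 seconds, so chatting feels fluid
Good instruction following: it answers the way you ask it to and rarely goes off on its own
Multi-turn conversation: it keeps track of context as long as the earlier messages are sent along with each request, and five or six turns are no problem

It suits cases like ours, where you want to ship an AI feature quickly without wrestling with a large model. It is also open source under the MIT license, so commercial use is fine.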

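2. Environment Setup and Model Deployment

2.1 Installing Ollama

The simplest way to run Phi-3-mini is Ollama, a tool built for running large language models locally. Installation is straightforward.

On Linux, open a terminal and run the install script: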

curl -fsSL https://ollama.com/install.sh | sh

macOS and Windows users can download the installer from the official site. Once installed, start the Ollama service if it is not already running:

ollama serve

By default the service listens on port 11434. The frontend talks to the model through this port.

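2.2 Pulling the Phi-3-mini Model

With the Ollama service running, pull the model:

ollama pull phi3:instruct

This downloads the Phi-3-mini-4k-instruct model, roughly 2.2GB. Download time depends on your network, typically a few minutes to ten-odd minutes.

When the download finishes, run a quick test:

ollama run phi3:instruct "Hello, please introduce yourself"

If the model replies normally, the deployment works.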

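2.3 Creating the Vue3 Project

On the frontend we scaffold a Vite-based Vue3 project, which starts fast and needs little configuration:

npm create vue@latest phi3-chat-app

When prompted, enable TypeScript, Pinia, and Vue Router, and choose the rest as needed. Then install the dependencies:

cd phi3-chat-app
npm install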

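3. Frontend Architecture

A chat app looks simple, but a good experience takes some design. These are the points I found most important.

3.1 Component Structure

    src/
    ├── components/
    │   ├── ChatWindow.vue        # Main chat window
    │   ├── MessageList.vue       # Message list
    │   ├── MessageItem.vue       # Single message
    │   ├── InputArea.vue         # Input area
    │   └── TypingIndicator.vue   # Typing indicator
    ├── stores/
    │   └── chat.ts               # Pinia state management
    ├── services/
    │   └── ollama.ts             # API service layer
    ├── types/
    │   └── chat.ts               # Shared type definitions (section 4.4)
    └── utils/
        └── stream.ts             # Stream-handling helpers

To keep the walkthrough short, section 4.3 implements the message list, input area, and typing indicator inline in ChatWindow.vue rather than as separate components.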

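3.2 State Management Design

Pinia manages the chat state, which mainly consists of:

The message list (user messages and AI replies)
The current conversation status (idle, waiting, generating)
Model configuration (temperature, maximum tokens, and so on)
Conversation history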
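
The three conversation statuses listed above can be modeled as a small state machine. The following is only a sketch of one way to do it: the names ChatStatus, ChatEvent, and nextStatus are hypothetical, and the store in section 4.2 uses a simpler isGenerating flag instead.

```typescript
// Hypothetical names for illustration; not part of the store in section 4.2.
type ChatStatus = 'idle' | 'waiting' | 'generating'
type ChatEvent = 'send' | 'firstChunk' | 'done' | 'error'

// Allowed transitions. Any (status, event) pair not listed here is ignored.
const transitions: Record<ChatStatus, Partial<Record<ChatEvent, ChatStatus>>> = {
  idle: { send: 'waiting' },
  waiting: { firstChunk: 'generating', done: 'idle', error: 'idle' },
  generating: { done: 'idle', error: 'idle' }
}

// Returns the next status, or the current one if the event is not allowed.
function nextStatus(current: ChatStatus, event: ChatEvent): ChatStatus {
  return transitions[current][event] ?? current
}
```

Keeping the transitions in one table makes it harder to end up in an inconsistent state, for example accepting a second send while a reply is still being generated.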

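3.3 Key Features

A real-time chat has a few things it must get right:

Streaming responses: show the reply as the model generates it, piece by piece, instead of waiting for the whole answer
Multi-turn conversation: the model must see the earlier messages of the conversation
Error handling: show a friendly message on network problems or model errors
Performance: the UI must not stutter as messages accumulate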
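
For the error-handling point, one option is to translate low-level failures into messages the user can act on. The sketch below is an assumption-laden illustration: ChatFailure, classifyFailure, and friendlyMessage are hypothetical names, the regular expression relies on the "Ollama API error: <status>" message thrown by the service layer in section 4.1, and treating HTTP 404 as "model not pulled" is an assumption about how Ollama responds.

```typescript
// Hypothetical helpers for illustration; not part of the original code.
type ChatFailure =
  | { kind: 'network' }
  | { kind: 'http'; status: number }
  | { kind: 'unknown' }

function classifyFailure(error: unknown): ChatFailure {
  // fetch() rejects with a TypeError when the request never reaches the server.
  if (error instanceof TypeError) return { kind: 'network' }
  // The service layer throws Error(`Ollama API error: ${status}`) for non-2xx replies.
  if (error instanceof Error) {
    const match = /Ollama API error: (\d{3})/.exec(error.message)
    if (match) return { kind: 'http', status: Number(match[1]) }
  }
  return { kind: 'unknown' }
}

function friendlyMessage(failure: ChatFailure): string {
  switch (failure.kind) {
    case 'network':
      return 'Cannot reach Ollama. Is "ollama serve" running?'
    case 'http':
      // Assumption: 404 usually means the model has not been pulled yet.
      return failure.status === 404
        ? 'Model not found. Try "ollama pull phi3:instruct".'
        : `The model service returned HTTP ${failure.status}.`
    default:
      return 'Something went wrong while generating a reply.'
  }
}
```

In the store, the catch block could then set error.value = friendlyMessage(classifyFailure(err)) instead of showing the raw error message.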

4. Core Implementation

4.1 API Service Layer

Start with the service that talks to Ollama, in src/services/ollama.ts:

    import type { ChatMessage } from '@/types/chat'

    const OLLAMA_API_URL = 'http://localhost:11434/api'

    export interface OllamaChatRequest {
      model: string
      messages: Array<{
        role: 'user' | 'assistant' | 'system'
        content: string
      }>
      stream?: boolean
      options?: {
        temperature?: number
        top_p?: number
        top_k?: number
        num_predict?: number
      }
    }

    export interface OllamaChatResponse {
      model: string
      created_at: string
      message: {
        role: string
        content: string
      }
      done: boolean
    }

    // Only role and content are sent to Ollama, so accept anything that has them
    // (full ChatMessage objects or plain { role, content } pairs).
    export type ChatTurn = Pick<ChatMessage, 'role' | 'content'>

    export class OllamaService {
      private static instance: OllamaService

      public static getInstance(): OllamaService {
        if (!OllamaService.instance) {
          OllamaService.instance = new OllamaService()
        }
        return OllamaService.instance
      }

      // Plain (non-streaming) chat
      async chat(messages: ChatTurn[]): Promise<string> {
        const request: OllamaChatRequest = {
          model: 'phi3:instruct',
          messages: messages.map(msg => ({ role: msg.role, content: msg.content })),
          stream: false,
          options: { temperature: 0.7, num_predict: 512 }
        }

        try {
          const response = await fetch(`${OLLAMA_API_URL}/chat`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(request)
          })
          if (!response.ok) {
            throw new Error(`Ollama API error: ${response.status}`)
          }
          const data: OllamaChatResponse = await response.json()
          return data.message.content
        } catch (error) {
          console.error('Chat error:', error)
          throw error
        }
      }

      // Streaming chat (the core feature)
      async *chatStream(messages: ChatTurn[]): AsyncGenerator<string> {
        const request: OllamaChatRequest = {
          model: 'phi3:instruct',
          messages: messages.map(msg => ({ role: msg.role, content: msg.content })),
          stream: true,
          options: { temperature: 0.7, num_predict: 512 }
        }

        const response = await fetch(`${OLLAMA_API_URL}/chat`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify(request)
        })
        if (!response.ok) {
          throw new Error(`Ollama API error: ${response.status}`)
        }

        const reader = response.body?.getReader()
        if (!reader) {
          throw new Error('Failed to get response reader')
        }

        const decoder = new TextDecoder()
        let buffer = ''

        try {
          while (true) {
            const { done, value } = await reader.read()
            if (done) break

            // The body is newline-delimited JSON; a chunk may end mid-line,
            // so keep the incomplete tail in the buffer for the next read.
            buffer += decoder.decode(value, { stream: true })
            const lines = buffer.split('\n')
            buffer = lines.pop() || ''

            for (const line of lines) {
              if (line.trim() === '') continue
              try {
                const data = JSON.parse(line)
                if (data.message?.content) {
                  yield data.message.content
                }
              } catch (e) {
                console.warn('Failed to parse line:', line)
              }
            }
          }

          // Parse a final line that arrived without a trailing newline
          const tail = buffer + decoder.decode()
          if (tail.trim() !== '') {
            try {
              const data = JSON.parse(tail)
              if (data.message?.content) {
                yield data.message.content
              }
            } catch (e) {
              console.warn('Failed to parse line:', tail)
            }
          }
        } finally {
          reader.releaseLock()
        }
      }

      // Check whether the Ollama service is reachable
      async checkHealth(): Promise<boolean> {
        try {
          const response = await fetch(`${OLLAMA_API_URL}/tags`, {
            method: 'GET',
            headers: { 'Content-Type': 'application/json' }
          })
          return response.ok
        } catch {
          return false
        }
      }
    }

    export const ollamaService = OllamaService.getInstance()
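
The line-buffering step inside chatStream can also be pulled out into a small pure helper, which makes it testable without a running Ollama server. Below is a minimal sketch of what src/utils/stream.ts might contain. The article lists that file but never shows it, so createNdjsonParser and its push/flush methods are hypothetical names, not part of Ollama or the original code.

```typescript
// Hypothetical helper: incremental parser for newline-delimited JSON text chunks.
function createNdjsonParser<T>() {
  let buffer = ''

  // Parse complete lines, skipping blank or malformed ones.
  const parseLines = (lines: string[]): T[] => {
    const out: T[] = []
    for (const line of lines) {
      const trimmed = line.trim()
      if (trimmed === '') continue
      try {
        out.push(JSON.parse(trimmed) as T)
      } catch {
        // Malformed line: ignore it rather than abort the whole stream.
      }
    }
    return out
  }

  return {
    // Feed one decoded text chunk; returns every object completed by it.
    push(chunk: string): T[] {
      buffer += chunk
      const lines = buffer.split('\n')
      buffer = lines.pop() ?? '' // keep the incomplete tail for the next chunk
      return parseLines(lines)
    },
    // Call once at end of stream to parse a last line without a trailing newline.
    flush(): T[] {
      const rest = buffer
      buffer = ''
      return parseLines([rest])
    }
  }
}
```

With such a helper, the read loop reduces to calling push(decoder.decode(value, { stream: true })) for each chunk and flush() once the reader reports done.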

4.2 State Management

Pinia manages the chat state in src/stores/chat.ts:

    import { defineStore } from 'pinia'
    import { ref, computed } from 'vue'
    import type { ChatMessage, ChatSession } from '@/types/chat'
    import { ollamaService } from '@/services/ollama'

    export const useChatStore = defineStore('chat', () => {
      // State
      const messages = ref<ChatMessage[]>([])
      const currentSession = ref<ChatSession | null>(null)
      const isGenerating = ref(false)
      const error = ref<string | null>(null)
      const modelStatus = ref<'connected' | 'disconnected' | 'checking'>('checking')

      // Computed
      const lastMessage = computed(() => {
        return messages.value.length > 0
          ? messages.value[messages.value.length - 1]
          : null
      })
      const hasMessages = computed(() => messages.value.length > 0)

      // Actions
      async function sendMessage(content: string) {
        if (!content.trim() || isGenerating.value) return

        // Add the user message
        const userMessage: ChatMessage = {
          id: Date.now().toString(),
          role: 'user',
          content: content.trim(),
          timestamp: new Date()
        }
        messages.value.push(userMessage)

        // Add an empty AI message as a placeholder
        const aiMessageId = (Date.now() + 1).toString()
        const aiMessage: ChatMessage = {
          id: aiMessageId,
          role: 'assistant',
          content: '',
          timestamp: new Date(),
          isStreaming: true
        }
        messages.value.push(aiMessage)

        isGenerating.value = true
        error.value = null

        try {
          // Build the conversation history (everything except the placeholder)
          const history = messages.value
            .filter(msg => msg.id !== aiMessageId)
            .map(msg => ({ role: msg.role, content: msg.content }))

          // Stream the reply
          const stream = ollamaService.chatStream(history)
          let fullResponse = ''

          for await (const chunk of stream) {
            fullResponse += chunk
            // Update the AI message content
            const index = messages.value.findIndex(msg => msg.id === aiMessageId)
            if (index !== -1) {
              messages.value[index] = {
                ...messages.value[index],
                content: fullResponse
              }
            }
          }

          // Streaming finished
          const index = messages.value.findIndex(msg => msg.id === aiMessageId)
          if (index !== -1) {
            messages.value[index] = {
              ...messages.value[index],
              content: fullResponse,
              isStreaming: false,
              timestamp: new Date()
            }
          }
        } catch (err) {
          error.value = err instanceof Error ? err.message : 'Failed to generate a reply'
          // Remove the failed AI message
          const index = messages.value.findIndex(msg => msg.id === aiMessageId)
          if (index !== -1) {
            messages.value.splice(index, 1)
          }
        } finally {
          isGenerating.value = false
        }
      }

      function clearMessages() {
        messages.value = []
        error.value = null
      }

      function updateMessage(id: string, updates: Partial<ChatMessage>) {
        const index = messages.value.findIndex(msg => msg.id === id)
        if (index !== -1) {
          messages.value[index] = { ...messages.value[index], ...updates }
        }
      }

      async function checkModelStatus() {
        modelStatus.value = 'checking'
        const isHealthy = await ollamaService.checkHealth()
        modelStatus.value = isHealthy ? 'connected' : 'disconnected'
        return isHealthy
      }

      return {
        // State
        messages,
        currentSession,
        isGenerating,
        error,
        modelStatus,
        // Computed
        lastMessage,
        hasMessages,
        // Actions
        sendMessage,
        clearMessages,
        updateMessage,
        checkModelStatus
      }
    })

4.3 Chat Window Component

The main chat component, src/components/ChatWindow.vue:

    <template>
      <div class="chat-window">
        <!-- Status bar -->
        <div class="status-bar">
          <div class="status-indicator" :class="modelStatus">
            <span class="status-dot"></span>
            <span class="status-text">{{ statusText }}</span>
          </div>
          <button
            v-if="hasMessages"
            @click="clearChat"
            class="clear-btn"
            :disabled="isGenerating"
          >
            Clear chat
          </button>
        </div>

        <!-- Error banner -->
        <div v-if="error" class="error-message">
          {{ error }}
          <button @click="dismissError" class="dismiss-btn">×</button>
        </div>

        <!-- Message list -->
        <div class="messages-container" ref="messagesContainer">
          <div
            v-for="message in messages"
            :key="message.id"
            :class="['message', message.role]"
          >
            <div class="message-header">
              <span class="message-role">
                {{ message.role === 'user' ? 'You' : 'AI Assistant' }}
              </span>
              <span class="message-time">
                {{ formatTime(message.timestamp) }}
              </span>
            </div>
            <div class="message-content">
              <div v-if="message.isStreaming" class="streaming-content">
                {{ message.content }}
                <span class="cursor"></span>
              </div>
              <div v-else class="static-content">
                {{ message.content }}
              </div>
            </div>
          </div>

          <!-- Generation indicator: shown until the first chunk arrives.
               The store pushes the streaming placeholder before generation
               starts, so checking isStreaming here would never be true. -->
          <div v-if="isGenerating && !lastMessage?.content" class="typing-indicator">
            <div class="dots">
              <span></span>
              <span></span>
              <span></span>
            </div>
            <span class="text">Thinking...</span>
          </div>
        </div>

        <!-- Input area -->
        <div class="input-area">
          <textarea
            ref="textareaRef"
            v-model="inputText"
            @keydown.enter.exact.prevent="handleSend"
            placeholder="Type a message... (Enter to send, Shift+Enter for a new line)"
            :disabled="isGenerating || modelStatus !== 'connected'"
            rows="3"
            class="message-input"
          ></textarea>
          <button @click="handleSend" :disabled="!canSend" class="send-btn">
            {{ isGenerating ? 'Generating...' : 'Send' }}
          </button>
        </div>
      </div>
    </template>

    <script setup lang="ts">
    import { ref, computed, nextTick, onMounted, onUnmounted, watch } from 'vue'
    import { useChatStore } from '@/stores/chat'

    const chatStore = useChatStore()
    const inputText = ref('')
    const textareaRef = ref<HTMLTextAreaElement>()
    const messagesContainer = ref<HTMLDivElement>()

    // Computed
    const messages = computed(() => chatStore.messages)
    const isGenerating = computed(() => chatStore.isGenerating)
    const error = computed(() => chatStore.error)
    const modelStatus = computed(() => chatStore.modelStatus)
    const lastMessage = computed(() => chatStore.lastMessage)
    const hasMessages = computed(() => chatStore.hasMessages)

    const canSend = computed(() => {
      return inputText.value.trim() !== '' &&
        !isGenerating.value &&
        modelStatus.value === 'connected'
    })

    const statusText = computed(() => {
      switch (modelStatus.value) {
        case 'connected': return 'Model connected'
        case 'disconnected': return 'Model disconnected'
        default: return 'Checking connection...'
      }
    })

    // Methods
    function handleSend() {
      if (!canSend.value) return
      const text = inputText.value.trim()
      if (!text) return

      chatStore.sendMessage(text)
      inputText.value = ''

      // Keep focus in the input
      nextTick(() => {
        textareaRef.value?.focus()
      })
    }

    function clearChat() {
      if (confirm('Clear the conversation?')) {
        chatStore.clearMessages()
      }
    }

    function dismissError() {
      chatStore.error = null
    }

    function formatTime(date: Date) {
      return date.toLocaleTimeString('zh-CN', {
        hour: '2-digit',
        minute: '2-digit'
      })
    }

    // Scroll to the bottom automatically
    async function scrollToBottom() {
      await nextTick()
      if (messagesContainer.value) {
        messagesContainer.value.scrollTop = messagesContainer.value.scrollHeight
      }
    }

    // Scroll whenever the messages change
    watch(messages, scrollToBottom, { deep: true })

    // Check the model status on mount, then re-check periodically while disconnected
    let statusTimer: ReturnType<typeof setInterval> | undefined

    onMounted(() => {
      chatStore.checkModelStatus()
      statusTimer = setInterval(() => {
        if (modelStatus.value !== 'connected') {
          chatStore.checkModelStatus()
        }
      }, 30000)
    })

    // Lifecycle hooks must be registered synchronously during setup.
    // Calling onUnmounted after an await inside onMounted does not register it.
    onUnmounted(() => {
      if (statusTimer) clearInterval(statusTimer)
    })
    </script>

    <style scoped>
    .chat-window { display: flex; flex-direction: column; height: 100%; max-width: 800px; margin: 0 auto; background: #fff; border-radius: 12px; box-shadow: 0 4px 20px rgba(0, 0, 0, 0.1); overflow: hidden; }

    .status-bar { display: flex; justify-content: space-between; align-items: center; padding: 12px 20px; background: #f8f9fa; border-bottom: 1px solid #e9ecef; }
    .status-indicator { display: flex; align-items: center; gap: 8px; font-size: 14px; color: #6c757d; }
    .status-dot { width: 8px; height: 8px; border-radius: 50%; background: #6c757d; }
    .status-indicator.connected .status-dot { background: #28a745; animation: pulse 2s infinite; }
    .status-indicator.disconnected .status-dot { background: #dc3545; }
    @keyframes pulse { 0%, 100% { opacity: 1; } 50% { opacity: 0.5; } }

    .clear-btn { padding: 6px 12px; background: #6c757d; color: white; border: none; border-radius: 6px; cursor: pointer; font-size: 14px; transition: background 0.2s; }
    .clear-btn:hover:not(:disabled) { background: #5a6268; }
    .clear-btn:disabled { opacity: 0.5; cursor: not-allowed; }

    .error-message { display: flex; justify-content: space-between; align-items: center; padding: 12px 20px; background: #f8d7da; color: #721c24; border-bottom: 1px solid #f5c6cb; font-size: 14px; }
    .dismiss-btn { background: none; border: none; color: #721c24; font-size: 20px; cursor: pointer; padding: 0 8px; }

    .messages-container { flex: 1; overflow-y: auto; padding: 20px; background: #f8f9fa; }
    .message { margin-bottom: 20px; animation: slideIn 0.3s ease; }
    @keyframes slideIn { from { opacity: 0; transform: translateY(10px); } to { opacity: 1; transform: translateY(0); } }
    .message-header { display: flex; justify-content: space-between; margin-bottom: 4px; font-size: 12px; color: #6c757d; }
    .message.user .message-role { color: #007bff; }
    .message.assistant .message-role { color: #28a745; }
    .message-content { padding: 12px 16px; border-radius: 12px; line-height: 1.5; word-break: break-word; }
    .message.user .message-content { background: #007bff; color: white; border-top-right-radius: 4px; }
    .message.assistant .message-content { background: white; color: #212529; border: 1px solid #dee2e6; border-top-left-radius: 4px; }

    .streaming-content { position: relative; }
    .cursor { display: inline-block; width: 2px; height: 1em; background: #28a745; margin-left: 2px; animation: blink 1s infinite; }
    @keyframes blink { 0%, 100% { opacity: 1; } 50% { opacity: 0; } }

    .typing-indicator { display: flex; align-items: center; gap: 8px; padding: 12px 16px; background: white; border: 1px solid #dee2e6; border-radius: 12px; border-top-left-radius: 4px; }
    .dots { display: flex; gap: 4px; }
    .dots span { width: 8px; height: 8px; background: #6c757d; border-radius: 50%; animation: bounce 1.4s infinite ease-in-out both; }
    .dots span:nth-child(1) { animation-delay: -0.32s; }
    .dots span:nth-child(2) { animation-delay: -0.16s; }
    @keyframes bounce { 0%, 80%, 100% { transform: scale(0); } 40% { transform: scale(1); } }

    .input-area { display: flex; gap: 12px; padding: 20px; background: #f8f9fa; border-top: 1px solid #e9ecef; }
    .message-input { flex: 1; padding: 12px; border: 1px solid #ced4da; border-radius: 8px; font-size: 14px; resize: none; font-family: inherit; transition: border-color 0.2s; }
    .message-input:focus { outline: none; border-color: #007bff; box-shadow: 0 0 0 3px rgba(0, 123, 255, 0.25); }
    .message-input:disabled { background: #e9ecef; cursor: not-allowed; }

    .send-btn { padding: 12px 24px; background: #007bff; color: white; border: none; border-radius: 8px; cursor: pointer; font-size: 14px; font-weight: 500; transition: background 0.2s; align-self: flex-end; }
    .send-btn:hover:not(:disabled) { background: #0056b3; }
    .send-btn:disabled { background: #6c757d; cursor: not-allowed; }
    </style>

4.4 Type Definitions

Create the type definition file src/types/chat.ts:

    export interface ChatMessage {
      id: string
      role: 'user' | 'assistant' | 'system'
      content: string
      timestamp: Date
      isStreaming?: boolean
    }

    export interface ChatSession {
      id: string
      title: string
      createdAt: Date
      updatedAt: Date
      messageCount: number
    }

    export interface ModelConfig {
      temperature: number
      topP: number
      topK: number
      maxTokens: number
      repeatPenalty: number
    }

    export interface ChatState {
      currentSession: ChatSession | null
      messages: ChatMessage[]
      isGenerating: boolean
      error: string | null
      modelConfig: ModelConfig
    }

5. Advanced Features

5.1 Conversation History Management

In practice users may want to manage several conversation sessions. The store can be extended to support that:

    // Session management, added inside the store in chat.ts
    const sessions = ref<ChatSession[]>([])
    const activeSessionId = ref<string | null>(null)

    // Create a new session
    function createSession(title?: string) {
      const sessionId = Date.now().toString()
      const sessionTitle = title || `Chat ${sessions.value.length + 1}`
      const newSession: ChatSession = {
        id: sessionId,
        title: sessionTitle,
        createdAt: new Date(),
        updatedAt: new Date(),
        messageCount: 0
      }
      sessions.value.unshift(newSession)
      activeSessionId.value = sessionId
      messages.value = []
      return newSession
    }

    // Switch to another session
    function switchSession(sessionId: string) {
      const session = sessions.value.find(s => s.id === sessionId)
      if (session) {
        activeSessionId.value = sessionId
        // Load this session's messages from local storage.
        // ChatStorage is the helper from section 5.2 and must be imported here.
        messages.value = ChatStorage.loadMessages(sessionId)
      }
    }

    // Delete a session
    function deleteSession(sessionId: string) {
      const index = sessions.value.findIndex(s => s.id === sessionId)
      if (index !== -1) {
        sessions.value.splice(index, 1)
        if (activeSessionId.value === sessionId) {
          activeSessionId.value = sessions.value[0]?.id || null
          messages.value = []
        }
      }
    }

5.2 Message Persistence

For a better user experience, the conversation history can be saved locally:

    import type { ChatMessage, ChatSession } from '@/types/chat'

    // Local storage helper
    export class ChatStorage {
      private static readonly SESSIONS_KEY = 'phi3_chat_sessions'
      private static readonly MESSAGES_PREFIX = 'phi3_messages_'

      static saveSessions(sessions: ChatSession[]) {
        try {
          localStorage.setItem(this.SESSIONS_KEY, JSON.stringify(sessions))
        } catch (error) {
          console.error('Failed to save sessions:', error)
        }
      }

      static loadSessions(): ChatSession[] {
        try {
          const data = localStorage.getItem(this.SESSIONS_KEY)
          if (data) {
            const sessions = JSON.parse(data)
            // JSON stores dates as strings; convert them back to Date objects
            return sessions.map((session: any) => ({
              ...session,
              createdAt: new Date(session.createdAt),
              updatedAt: new Date(session.updatedAt)
            }))
          }
        } catch (error) {
          console.error('Failed to load sessions:', error)
        }
        return []
      }

      static saveMessages(sessionId: string, messages: ChatMessage[]) {
        try {
          const key = `${this.MESSAGES_PREFIX}${sessionId}`
          localStorage.setItem(key, JSON.stringify(messages))
        } catch (error) {
          console.error('Failed to save messages:', error)
        }
      }

      static loadMessages(sessionId: string): ChatMessage[] {
        try {
          const key = `${this.MESSAGES_PREFIX}${sessionId}`
          const data = localStorage.getItem(key)
          if (data) {
            const messages = JSON.parse(data)
            return messages.map((msg: any) => ({
              ...msg,
              timestamp: new Date(msg.timestamp)
            }))
          }
        } catch (error) {
          console.error('Failed to load messages:', error)
        }
        return []
      }

      static clearSession(sessionId: string) {
        try {
          const key = `${this.MESSAGES_PREFIX}${sessionId}`
          localStorage.removeItem(key)
        } catch (error) {
          console.error('Failed to clear session:', error)
        }
      }
    }

5.3 Tuning Model Parameters

Let users adjust the model parameters:

    <template>
      <div class="settings-panel">
        <h3>Model Settings</h3>

        <div class="setting-item">
          <label>Temperature</label>
          <div class="slider-container">
            <!-- .number keeps the bound value numeric; a range input yields strings otherwise -->
            <input
              type="range"
              v-model.number="temperature"
              min="0"
              max="2"
              step="0.1"
              class="slider"
            />
            <span class="value">{{ temperature }}</span>
          </div>
          <p class="hint">Higher values are more creative, lower values more stable</p>
        </div>

        <div class="setting-item">
          <label>Maximum generation length</label>
          <div class="slider-container">
            <input
              type="range"
              v-model.number="maxTokens"
              min="64"
              max="2048"
              step="64"
              class="slider"
            />
            <span class="value">{{ maxTokens }} tokens</span>
          </div>
        </div>

        <div class="setting-item">
          <label>Top P</label>
          <div class="slider-container">
            <input
              type="range"
              v-model.number="topP"
              min="0"
              max="1"
              step="0.05"
              class="slider"
            />
            <span class="value">{{ topP }}</span>
          </div>
        </div>

        <button @click="saveSettings" class="save-btn">
          Save settings
        </button>
      </div>
    </template>

    <script setup lang="ts">
    import { ref } from 'vue'
    import { useChatStore } from '@/stores/chat'

    const chatStore = useChatStore()
    const temperature = ref(0.7)
    const maxTokens = ref(512)
    const topP = ref(0.9)

    function saveSettings() {
      // Update the model configuration.
      // Note: updateModelConfig is not part of the store shown in section 4.2;
      // it has to be added there, and the service has to read the stored values
      // instead of its hard-coded temperature and num_predict.
      chatStore.updateModelConfig({
        temperature: temperature.value,
        maxTokens: maxTokens.value,
        topP: topP.value
      })
      // A "saved" notification could be shown here
    }
    </script>
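
The panel above calls updateModelConfig, which the store in section 4.2 does not define, and the service layer hard-codes its options. One way to close that gap is sketched below, under some assumptions: mergeModelConfig and toOllamaOptions are hypothetical helpers, the clamping ranges mirror the slider limits above, and the topK and repeatPenalty defaults are illustrative values rather than anything the article specifies. ModelConfig matches the interface from section 4.4.

```typescript
// Same shape as ModelConfig in src/types/chat.ts (section 4.4).
interface ModelConfig {
  temperature: number
  topP: number
  topK: number
  maxTokens: number
  repeatPenalty: number
}

// Option names as Ollama expects them in the request's "options" field.
interface OllamaOptions {
  temperature?: number
  top_p?: number
  top_k?: number
  num_predict?: number
  repeat_penalty?: number
}

// Illustrative defaults; topK and repeatPenalty are assumptions.
const defaultModelConfig: ModelConfig = {
  temperature: 0.7,
  topP: 0.9,
  topK: 40,
  maxTokens: 512,
  repeatPenalty: 1.1
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value))
}

// Hypothetical helper: apply partial updates and keep values within the slider ranges.
function mergeModelConfig(current: ModelConfig, updates: Partial<ModelConfig>): ModelConfig {
  const merged = { ...current, ...updates }
  return {
    ...merged,
    temperature: clamp(merged.temperature, 0, 2),
    topP: clamp(merged.topP, 0, 1),
    maxTokens: Math.round(clamp(merged.maxTokens, 64, 2048))
  }
}

// Hypothetical helper: translate the camelCase config into Ollama's snake_case options.
function toOllamaOptions(config: ModelConfig): OllamaOptions {
  return {
    temperature: config.temperature,
    top_p: config.topP,
    top_k: config.topK,
    num_predict: config.maxTokens,
    repeat_penalty: config.repeatPenalty
  }
}
```

In the store, updateModelConfig could then be a one-liner that assigns mergeModelConfig(modelConfig.value, updates), and the service could pass toOllamaOptions(config) as the options of each request.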

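6. Deployment and Optimization

6.1 Production Deployment

localhost is fine for development, but production needs more thought:

Reverse proxy: put Nginx in front of the Ollama service
HTTPS: keep the traffic encrypted
Cross-origin configuration: needed if the frontend and backend are deployed separately (Ollama reads its allowed origins from the OLLAMA_ORIGINS environment variable)

Example Nginx configuration:

    server {
        listen 80;
        server_name your-domain.com;

        # Frontend static files
        location / {
            root /var/www/phi3-chat;
            index index.html;
            try_files $uri $uri/ /index.html;
        }

        # Proxy the Ollama API
        location /api/ {
            proxy_pass http://localhost:11434/api/;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection 'upgrade';
            proxy_set_header Host $host;
            proxy_cache_bypass $http_upgrade;

            # Pass streamed chunks through immediately instead of buffering them
            proxy_buffering off;

            # Longer timeouts for slow generations
            proxy_read_timeout 300s;
            proxy_connect_timeout 75s;
        }
    }

This example covers the reverse proxy only; TLS still has to be configured on top of it. With the proxy in place, OLLAMA_API_URL in ollama.ts should point to /api rather than http://localhost:11434/api.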

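6.2 Performance Optimization

Virtual scrolling: render only the visible messages once the list grows long
Image lazy loading: if messages can contain images
Web Worker: move heavy processing off the main thread
Request batching: avoid frequent small requests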
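
The core of virtual scrolling is computing which slice of the list is actually visible. The sketch below shows that calculation under a simplifying assumption: every message has the same fixed height. Real chat messages vary in height, so a production setup would need measured heights or a dedicated virtual-list library. visibleRange and its parameters are hypothetical names.

```typescript
interface VisibleRange {
  start: number // index of the first item to render
  end: number   // index after the last item to render (exclusive)
}

// Hypothetical helper: which items to render for a given scroll position,
// assuming all items share the same fixed height in pixels.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  itemHeight: number,
  itemCount: number,
  overscan = 3 // extra items rendered above and below to avoid flicker
): VisibleRange {
  if (itemCount === 0 || itemHeight <= 0) return { start: 0, end: 0 }
  const first = Math.floor(scrollTop / itemHeight)
  const last = Math.ceil((scrollTop + viewportHeight) / itemHeight)
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(itemCount, last + overscan)
  }
}
```

The component would render only messages.slice(start, end) and offset them with padding or a transform equal to start * itemHeight, so the scrollbar still reflects the full list.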

6.3 Error Monitoring

Add error monitoring and user feedback:

    // Error monitoring service
    class ErrorMonitor {
      static captureError(error: Error, context?: Record<string, any>) {
        console.error('Chat Error:', error, context)

        // Could be sent to an error monitoring platform
        // Sentry.captureException(error, { extra: context })

        // Or saved to a local log
        this.saveToLocalLog(error, context)
      }

      private static saveToLocalLog(error: Error, context?: Record<string, any>) {
        const logEntry = {
          timestamp: new Date().toISOString(),
          error: {
            name: error.name,
            message: error.message,
            stack: error.stack
          },
          context,
          userAgent: navigator.userAgent
        }

        try {
          const logs = JSON.parse(localStorage.getItem('error_logs') || '[]')
          logs.unshift(logEntry)
          // Keep only the 100 most recent entries
          localStorage.setItem('error_logs', JSON.stringify(logs.slice(0, 100)))
        } catch (e) {
          console.error('Failed to save error log:', e)
        }
      }
    }

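7. Summary

Integrating Phi-3-mini-4k-instruct into a Vue3 frontend to build a real-time chat app went fairly smoothly overall. The key is getting streaming responses and state management right. With those two in place, the user experience holds up.

Phi-3-mini is a pleasant surprise. For a 3.8B-parameter model the quality is enough for many practical use cases, and the fact that it runs smoothly on an ordinary development machine makes it especially developer-friendly.

Vue3's Composition API works well for this kind of real-time app, and the reactivity system keeps state management simple. TypeScript's type safety adds confidence while writing the code.

In practice this setup has clear advantages: simple deployment, fast responses, and low resource usage. It also has limitations. The context window is only 4K tokens, so very long conversations need some form of history trimming.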
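
One simple way to stay inside the 4K context is to drop the oldest turns before each request. The sketch below illustrates the idea with hypothetical names (HistoryMessage, roughTokenCount, trimHistory). The token estimate is a crude character-based placeholder, not the model's real tokenizer, so the budget should be chosen conservatively.

```typescript
interface HistoryMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
}

// Placeholder estimate, NOT the model's tokenizer: about one token per 3 characters.
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 3)
}

// Hypothetical helper: keep system messages plus the most recent turns that fit the budget.
function trimHistory(
  messages: HistoryMessage[],
  maxTokens: number,
  estimate: (text: string) => number = roughTokenCount
): HistoryMessage[] {
  const system = messages.filter(m => m.role === 'system')
  const turns = messages.filter(m => m.role !== 'system')

  let budget = maxTokens - system.reduce((sum, m) => sum + estimate(m.content), 0)
  const kept: HistoryMessage[] = []
  let isLatest = true

  // Walk from newest to oldest so the most recent turns survive.
  for (const msg of [...turns].reverse()) {
    const cost = estimate(msg.content)
    // Always keep the latest message, even if it alone exceeds the budget.
    if (!isLatest && cost > budget) break
    budget -= cost
    kept.unshift(msg)
    isLatest = false
  }
  return [...system, ...kept]
}
```

In the store, the history could be passed through something like trimHistory(history, 4096 - 512) before calling chatStream, which leaves room for the 512 tokens the service requests via num_predict.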

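If you want to add a chat assistant to your own project, this setup is worth a try. Start with simple conversations, then add history management, parameter tuning, and other advanced features step by step. Questions and better ideas are welcome.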
