单端口部署多模型最简单解决方案(vllm sglang 等均适用)
2026/6/26 18:40:31
本教程将指导您在本地环境(Windows/Linux/Mac)上部署该项目。
在开始之前,请确保您的电脑安装了以下软件:
1.1 Python 3.9+:
安装教程
1.2 Ollama 安装与配置
1.2.1下载 Ollama 并安装,直接点击Download
1.2.2拉取必要的模型:
ollama pull qwen2.5:3b# 通义千问 3B 模型ollama pull deepseek-r1:8b# 深度求索 8B 模型ollama pull llama3.1:latest# Llama 3.1 模型ollama pull granite3.2-vision:latest# 用于图像识别的模型ollama list-1.2.4验证 Ollama API 是否正常工作:
curlhttp://localhost:11434/api/tags打开终端(CMD 或 PowerShell),克隆文章中提到的 GitHub 仓库
gitclone https://github.com/Elaine-one/SmartChat.git该项目核心依赖包括 streamlit (Web界面), watchdog (监控防护), requests (API调用) 等。
在该项目根目录下有一个 requirements.txt 文件。
pipinstall-r requirements.txt{"api":{"endpoint":"http://localhost:11434/api/chat","max_retries":3,"retry_delay":1,"timeout":120}}"models":{"qwen2.5:3b":{"display_name":"Qwen 2.5-3B","description":"通义千问2.5-3B模型,适合中文对话,轻量高效","max_tokens":4096,"context_window":8192,"priority":1},"deepseek-r1:8b":{"display_name":"DeepSeek 8B","description":"深度求索8B模型,擅长中文理解和生成","max_tokens":4096,"context_window":4096,"priority":2},"llama3.1:latest":{"display_name":"Llama 3.1","description":"Meta最新Llama 3.1模型,多语言能力强","max_tokens":4096,"context_window":8192,"priority":3}}"emotion_detection":{"enabled":true,"keywords":{"positive":["开心","高兴","快乐","满意","感谢","赞","棒","好"],"negative":["失望","生气","伤心","悲伤","沮丧","郁闷","难过","烦"],"neutral":["可以","还行","一般","凑合","ok","OK"]}}一切准备就绪,启动 Streamlit 服务:
streamlit run chatbot.py启动成功后,浏览器会自动打开 http://localhost:8501,您应该能看到聊天界面。
开始对话
创建新对话
切换会话
切换语言