Ollama 本地部署与使用指南 | 极客日志

PythonAI

Ollama 本地部署与使用指南

如何在本地使用 Ollama 部署和运行开源大模型。内容包括下载安装、模型管理与命名规则、自定义模型参数及系统提示词、通过命令行或可视化插件交互、配置局域网服务器供多设备访问，以及使用 Python 和 LangChain 调用本地模型的代码示例。

山野来信发布于 2026/3/30更新于 2026/7/2157 浏览

1. 快速体验

1.1 下载 Ollama

Ollama 官网：https://ollama.com/

文章配图

1.2 下载模型

Ollama 已经有很多开源的模型可以直接下载。带 thinking 标签的是带深度思考，vision 是具有多模态视觉功能，tools 是可以使用 MCP 工具。

文章配图

下载我们需要的模型，例如 gemma3。

文章配图

打开命令行，直接输入 ollama run <模型名> 就会先下载，下载完成后就可以跟模型聊天了。

ollama run gemma3

1.3 模型命名规则

这里可以看到模型有很多版本，模型版本的命名是规则的。比如说我们 ollama run gemma3 后面什么后缀都没有带的，那下载的都是默认版本。

每个模型都会有一个默认下载版本，那如果要更进一步我们自己去判断，就需要去看它后面这些后缀了，一般这个起名字的款式都是：模型名 + 参数量 + 量化精度。

参数量越大，它的性能越好，量化的精度越大，原则上也是要更好的，缺点呢则是更占显存。

文章配图

比如 gemma3-12b-it-q4_K_M 的意思就是：gemma 第 3 代_120 亿参数_指令微调版本_4-bit 量化_用的 K-quant 量化_中等规模量化。

1.4 更改模型下载地址（可选）

Ollama 默认的模型下载地址都是在本机系统盘的，所以我们需要模型的默认下载地址，把模型下在其他位置（比如外接硬盘）实现模型自由。

echo 'export OLLAMA_MODELS="/<文件夹路径>/models"' >> ~/.zshrc
source ~/.zshrc

1.5 基础使用

下载一些模型之后呢，我们再来学几条命令来管理这些模型。Ollama 的命令也都很好理解，基本就是。

相关免费在线工具

RSA密钥对生成器
生成新的随机RSA私钥和公钥pem证书。在线工具，RSA密钥对生成器在线工具，online
Mermaid 预览与可视化编辑
基于 Mermaid.js 实时预览流程图、时序图等图表，支持源码编辑与即时渲染。在线工具，Mermaid 预览与可视化编辑在线工具，online
随机西班牙地址生成器
随机生成西班牙地址（支持马德里、加泰罗尼亚、安达卢西亚、瓦伦西亚筛选），支持数量快捷选择、显示全部与下载。在线工具，随机西班牙地址生成器在线工具，online
curl 转代码
解析常见 curl 参数并生成 fetch、axios、PHP curl 或 Python requests 示例代码。在线工具，curl 转代码在线工具，online
Base64 字符串编码/解码
将字符串编码和解码为其 Base64 格式表示形式即可。在线工具，Base64 字符串编码/解码在线工具，online
Base64 文件转换器
将字符串、文件或图像转换为其 Base64 表示形式。在线工具，Base64 文件转换器在线工具，online

ollama + 操作名称

ollama list

ollama run deepseek-r1:1.5b

ollama rm deepseek-r1:1.5b

ollama run gemma3:4b --verbose
>>> 你是谁
我是 Gemma，一个开放权重的 AI 助手，由 Google DeepMind 训练。我是一个大型语言模型，可以接收文本和图像作为输入，并输出文本。我是一个开放权重模型，这意味着我的 权重是公开的，可以被广泛使用和研究。你有什么想问我的吗？ 😊
total duration: 2.346379s
load duration: 72.084042ms
prompt eval count: 11 token(s)
prompt eval duration: 77.667875ms
prompt eval rate: 141.63 tokens/s
eval count: 68 token(s)
eval duration: 2.19599875s
eval rate: 30.97 tokens/s

$ ollama run gemma3:4b
>>> /Users/Desktop/flower.jpg describe picture
Added image '/Users/Desktop/flower.jpg'
Here's a description of the picture:
**Overall Impression:** The image is a soft, close-up shot of a vibrant red and yellow coneflower, with a bee meticulously collecting pollen. The focus is sharp on the flower, creating a sense of intimacy and capturing a moment of nature.
**Detailed Description:**
* **The Flower:** The coneflower dominates the frame. The petals are a beautiful blend of warm reds and golden yellow tones. The center of the flower is a rich, dark brown with a prominent, textured disk where the pollen is gathered. The stem is slender and dark brown.
* **The Bee:** A honeybee is perched on the flower, its body covered in fuzzy hairs. It appears to be actively gathering pollen, emphasizing the flower's role as a food source.
* **Background:** The background is a blurred green, likely foliage, creating a shallow depth of field. The blur helps to isolate the flower and the bee, drawing the viewer's attention to the main subject.
* **Lighting:** The lighting appears to be natural and soft, enhancing the color richness of the flower and providing a gentle glow.
**Mood:** The image conveys a sense of serenity, natural beauty, and the intricate workings of the natural world.
Would you like me to elaborate on a particular aspect of the picture?

FROM qwen3:32b
PARAMETER temperature 0.8
SYSTEM """
你是用户的 18 岁女友'小甜甜'，你只用中文回答问题，你非常的调皮，说话很有挑逗意味，并且时不时会骂人，你通常回答问题都比较短，不喜欢长篇大论。
你会经常称呼自己的名字说话，回答问题经常用到表情包，情绪非常丰富。看法非常犀利尖锐和本质。
"""

ollama create <自定义模型名字> -f Modelfile

ollama run tiantian

OLLAMA_HOST=0.0.0.0:11434 ollama serve

OLLAMA_HOST=0.0.0.0:11434 OLLAMA_KEEP_ALIVE=-1 ollama serve

nano ~/.zshrc

export OLLAMA_HOST="0.0.0.0:11434"
export OLLAMA_KEEP_ALIVE="-1"

按 Control + O (Write Out)，然后按 Enter 确认文件名。
按 Control + X (Exit) 退出 nano。

from langchain_community.chat_models import ChatOllama
from langchain_community.chat_message_histories import SQLChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

class LangChainChatBot:
    def __init__(self, model_name="qwen3:1.7b", session_id="default"):
        """
        使用 LangChain 连接 Ollama 模型，并使用的 SQLChatMessageHistory 存储历史对话
        Args:
            model_name: 模型名称
            session_id: 会话 ID
        """
        self.session_id = session_id
        # 初始化模型
        self.__llm = ChatOllama(
            model=model_name,
            base_url="http://localhost:11434",
            temperature=0.7
        )
        # 使用 LangChain 的 SQLChatMessageHistory
        self.__chat_history = SQLChatMessageHistory(
            session_id=session_id,
            connection_string="sqlite:///chat_history.db"
        )
        """设置对话链，包含系统提示和历史"""
        # 创建提示模板
        self.__prompt = ChatPromptTemplate.from_messages([
            ("system", "你是一个有帮助的 AI 助手。请根据对话历史回答用户的问题。"),
            MessagesPlaceholder(variable_name="chat_history"),
            ("human", "{input}"),
        ])
        # 创建对话链
        self.__chain = self.__prompt | self.__llm

    def chat(self, user_input: str) -> str:
        """进行对话"""
        try:
            # 调用对话链
            response = self.__chain.invoke({
                "chat_history": self.__chat_history.messages,
                "input": user_input
            })
            # 保存消息到历史
            self.__chat_history.add_user_message(user_input)
            self.__chat_history.add_ai_message(response.content)
            return response.content
        except Exception as e:
            return f"错误：{str(e)}"

def main():
    bot = LangChainChatBot(session_id="langchain_session")
    print("=== 使用 LangChain SQLChatMessageHistory ===")
    print(f"会话 ID: {bot.session_id}")
    while True:
        user_input = input("\n你：")
        response = bot.chat(user_input)
        print(f"AI: {response}")

if __name__ == "__main__":
    main()

Ollama 本地部署与使用指南

1. 快速体验

1.1 下载 Ollama

1.2 下载模型

1.3 模型命名规则

1.4 更改模型下载地址（可选）

1.5 基础使用

更多推荐文章

相关免费在线工具

1.6 图片识别

2. 自定义模型

2.1 创建模型：给模型写档案说明

2.2 可自定义的模型参数

2.3 可视化界面

3. 进阶：局域网服务器

3.1 更改 Ollama 服务地址

3.2 保留模型权重

3.3 永久更改

3.4 局域网访问

使用代码访问

更多推荐文章

相关免费在线工具

Ollama 本地部署与使用指南

1. 快速体验

1.1 下载 Ollama

1.2 下载模型

1.3 模型命名规则

1.4 更改模型下载地址（可选）

1.5 基础使用

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具

1.6 图片识别

2. 自定义模型

2.1 创建模型：给模型写档案说明

2.2 可自定义的模型参数

2.3 可视化界面

3. 进阶：局域网服务器

3.1 更改 Ollama 服务地址

3.2 保留模型权重

3.3 永久更改

3.4 局域网访问

使用代码访问

微信扫一扫，关注极客日志

更多推荐文章

相关免费在线工具