Ollama, an LLM Inference Serving Framework: One-Click Deployment Guide | 极客日志

docker run -d --gpus=all -v /yourworkspaces/Ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

docker ps

docker exec -it ollama ollama run mistral

docker exec -it ollama ollama list

docker exec -it ollama ollama rm <model_name>
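
The model inventory shown by `ollama list` can also be read over HTTP. The following is a minimal sketch, assuming the container started above is reachable at `localhost:11434` and that `GET /api/tags` returns a JSON object with a `models` array whose entries carry a `name` field. The helper names `model_names` and `list_local_models` are illustrative and not part of Ollama.

```python
import json
import urllib.request

# Assumption: the port published by the docker run command above.
OLLAMA_URL = "http://localhost:11434"


def model_names(tags_response):
    """Extract model names from a parsed /api/tags response body."""
    return [m["name"] for m in tags_response.get("models", [])]


def list_local_models(base_url=OLLAMA_URL):
    """Fetch /api/tags from a running Ollama server and return the model names."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return model_names(json.load(resp))

# Example (requires a running server):
#   print(list_local_models())
```

Keeping the parsing in `model_names` separate from the network call makes it possible to check the logic against a saved response without a running server.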

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "who are you?",
  "stream": false
}'

curl http://localhost:11434/api/chat -d '{
  "model": "mistral",
  "messages": [
    {"role": "user", "content": "why is the sky blue?"}
  ],
  "stream": false
}'

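import requests

def chat_with_ollama(prompt):
    """Send one user message to Ollama's /api/chat endpoint and return the parsed JSON reply."""
    url = "http://localhost:11434/api/chat"
    data = {
        "model": "mistral",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return a single JSON object instead of a chunk stream
    }
    response = requests.post(url, json=data, timeout=120)
    response.raise_for_status()  # surface HTTP errors before reading the body
    return response.json()

result = chat_with_ollama("Hello, how are you?")
print(result["message"]["content"])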
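
The requests above all set `"stream": false`. With streaming enabled, the chat endpoint is expected to return newline-delimited JSON, one object per fragment, with the last object marked `"done": true`. The sketch below shows one way to consume that stream using only the standard library. The function names `iter_chat_chunks` and `stream_chat` are hypothetical, and the model name and URL are carried over from the examples above.

```python
import json
import urllib.request


def iter_chat_chunks(lines):
    """Yield the text fragment from each NDJSON line of a streamed /api/chat reply."""
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between objects
        chunk = json.loads(raw)
        yield chunk.get("message", {}).get("content", "")
        if chunk.get("done"):
            break  # the final object carries "done": true


def stream_chat(prompt, model="mistral", url="http://localhost:11434/api/chat"):
    """POST a chat request with streaming enabled and yield text as it arrives."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        # An HTTPResponse object iterates over the body line by line.
        yield from iter_chat_chunks(resp)

# Example (requires a running server):
#   for piece in stream_chat("why is the sky blue?"):
#       print(piece, end="", flush=True)
```

Streaming is mainly useful for interactive front ends, where printing fragments as they arrive feels faster than waiting for the full reply.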

export OLLAMA_HOST=0.0.0.0:11434
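
Setting `OLLAMA_HOST` to `0.0.0.0` makes the server listen on every interface, which is where the network exposure risk comes from. As a rough illustration, the sketch below classifies an `OLLAMA_HOST`-style value as loopback-only or not. The function name `binds_beyond_loopback` is hypothetical, the accepted input forms (`host`, `host:port`, `scheme://host:port`, `[ipv6]:port`) are assumptions, and an unset value is treated as the loopback default.

```python
import ipaddress


def binds_beyond_loopback(host_value):
    """Return True if an OLLAMA_HOST-style value names a non-loopback address."""
    if not host_value:
        return False  # assumption: unset means the loopback default
    # Strip an optional scheme such as http://
    host = host_value.split("://", 1)[-1]
    if host.startswith("["):
        host = host[1:].split("]", 1)[0]  # bracketed IPv6 literal
    elif host.count(":") == 1:
        host = host.split(":", 1)[0]  # drop the :port suffix
    if host == "localhost":
        return False
    if host == "":
        return True  # conservative: an empty host part is treated as exposed
    try:
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        return True  # any other hostname is treated as exposed


# Example:
#   import os
#   if binds_beyond_loopback(os.environ.get("OLLAMA_HOST", "")):
#       print("Ollama may be reachable from other machines; restrict access.")
```

A check like this could run in a deployment script before the service starts, so that a wide-open binding is a deliberate choice rather than an accident.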

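Introduction

I. Docker Deployment

1. Installation command

2. Parameter details

3. Status check

II. Model Management and Running

1. Pulling a model

2. Listing models

3. Removing a model

III. API Testing

1. Generating completions

2. Chat mode

3. Python client call

IV. Integration and Applications

1. Dify platform configuration

V. Security and Optimization Recommendations

1. Environment variable configuration

2. Network exposure risk

VI. Summary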
