Stable Diffusion: Bare-Metal Deployment Pain Points, Plus an In-Depth Look at Three Docker Approaches (Run / Compose / Dockerfile)

I. Introduction: Why Containerize?
Core differences: bare-metal vs. containerized deployment of Stable Diffusion
| Dimension | Traditional bare-metal deployment | Containerized deployment |
|---|---|---|
| Dependency management | Conflicts (Python version, PyTorch version, CUDA toolchain compatibility) must be resolved by hand; setup is tedious and error-prone | Isolated environments: each app has its own runtime, dependencies live only inside the container, no cross-app conflicts |
| System cleanliness | Globally installed packages pollute the system; cleanup is hard and stray files or stale config often linger | Dependencies and config are sealed inside the container; deleting it leaves no residue on the host |
| Environment consistency | "Works on my machine" problems are common; reproducing issues across differently configured team machines is slow and costly | Built from an image, so dev, test, and production are identical; environment drift is eliminated |
| Migration and deployment speed | A new machine means redoing every step (installing dependencies, tuning parameters); migration is slow and expensive | Build the image once, run it anywhere: a new machine only needs to pull the image and start |
| Version control | No unified versioning; rolling back after a dependency update or config change is difficult | Images carry version tags (e.g. v1.0, v2.1), enabling fast, low-risk rollback to any known-good state |
| Pain points, summarized | Frequent dependency conflicts, system pollution, poor reproducibility, costly migration | None of the above, plus resource isolation and strong portability (requires basic container skills) |

II. Docker Run: Quick Start with the Official Image
1. Overview
This is the simplest containerized deployment, well suited to quick validation and first impressions.
2. Steps
- Pull the official image
```shell
docker pull universonic/stable-diffusion-webui:full
```

If the pull fails with connection or timeout errors, the Docker daemon needs a proxy:

1. Edit the daemon's proxy drop-in: `vim /etc/systemd/system/docker.service.d/http-proxy.conf`
2. Fill in the proxy details and save.
3. Restart Docker.
4. Verify that the proxy took effect.
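The drop-in file itself is not shown above; a typical one looks like the following sketch (the proxy address reuses the 192.168.100.237:7890 endpoint from the `docker run` command later in this section, so substitute your own):

```shell
# Create the drop-in directory and the proxy config for the Docker daemon
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
[Service]
Environment="HTTP_PROXY=http://192.168.100.237:7890"
Environment="HTTPS_PROXY=http://192.168.100.237:7890"
Environment="NO_PROXY=localhost,127.0.0.1"
EOF

# Restart Docker so the daemon picks up the drop-in
sudo systemctl daemon-reload
sudo systemctl restart docker

# Verify the proxy is active for the docker service
sudo systemctl show --property=Environment docker
```

Note that this proxies the daemon (for `docker pull`); the `-e HTTP_PROXY=...` flags in the `docker run` command below proxy processes inside the container, which is a separate concern.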
- Prepare local directories for data persistence
```shell
# Create model and output directories
mkdir -p ~/sd-data/models
mkdir -p ~/sd-data/outputs
```

When these directories are bind-mounted, they keep the host's ownership attributes (same UID/GID), while the process inside the container runs as the non-root user sduser, hence "permission denied" errors. The root-cause fix is to make the host directories' UID/GID match sduser's UID/GID inside the container:

```shell
# Format: sudo chown -R <sduser UID>:<sduser GID> <host directory>
sudo chown -R 1000:1000 ~/sd-data/models/
sudo chown -R 1000:1000 ~/sd-data/outputs/
```

- Start the container

```shell
docker run -d \
  --gpus all \
  --restart unless-stopped \
  -p 7860:8080 \
  -e HTTP_PROXY=http://192.168.100.237:7890 \
  -e HTTPS_PROXY=http://192.168.100.237:7890 \
  -e NO_PROXY=localhost,127.0.0.1 \
  -v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs \
  -v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models \
  --name sd-webui \
  --user sduser \
  universonic/stable-diffusion-webui:full
```

3. Parameter reference
| Parameter | Purpose | Details |
|---|---|---|
| `docker run` | Base command | Docker's core command: creates and starts a new container (if the image is not present locally, it is pulled from Docker Hub automatically). |
| `-d` | Run in background | Short for `--detach`: runs the container as a daemon so the terminal is not tied up by its logs (view them with `docker logs <container>`). |
| `--gpus all` | GPU allocation | Gives the container access to all of the host's NVIDIA GPUs (requires the NVIDIA driver plus nvidia-docker2 / the NVIDIA Container Toolkit on the host). This is the key setting that enables GPU acceleration; without it Stable Diffusion falls back to CPU and image generation is extremely slow. To pin specific GPUs, use `--gpus "device=0,1"` (only GPUs 0 and 1 are visible to the container). |
| `--restart unless-stopped` | Restart policy | Controls what happens after the container exits: 1. on abnormal exit (crash, OOM) the container restarts automatically; 2. after a manual `docker stop` it stays stopped; 3. after a host reboot it starts along with the Docker service (suited to long-running apps such as SD WebUI). Other policies: `no` (default), `always`, `on-failure`. |
| `-p 7860:8080` | Port mapping | Short for `--publish`, format "host port:container port". Left side, 7860: the port exposed on the host (access the WebUI at http://<host-ip>:7860). Right side, 8080: the port SD WebUI actually listens on inside the container (it must match the `--port` in the SD launch command, or the UI is unreachable). |
| `-e HTTP_PROXY=http://192.168.100.237:7890` | Env var (HTTP proxy) | Short for `--env`: injects an environment variable into the container. HTTP requests from in-container processes (e.g. SD WebUI downloading models or extensions) are forwarded through the proxy at 192.168.100.237:7890, letting an intranet host reach external resources. |
| `-e HTTPS_PROXY=http://192.168.100.237:7890` | Env var (HTTPS proxy) | Same idea for HTTPS traffic. Note the address still starts with `http://`, because the proxy server itself is addressed over HTTP. |
| `-e NO_PROXY=localhost,127.0.0.1` | Env var (proxy bypass) | Addresses that should not go through the proxy. `localhost` and `127.0.0.1` both refer to the container itself; bypassing them keeps internal SD component traffic off the proxy, reducing latency and proxy load. |
| `-v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs` | Volume mount (outputs) | Short for `--volume`: a two-way mapping between a host directory and a container directory, used here to persist data (it survives container deletion). Left: the host directory for generated images; right: SD WebUI's default output directory. Images written inside the container appear on the host in real time, and host-side changes are visible in the container. |
| `-v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models` | Volume mount (models) | Same logic as the previous `-v`, for model files. Put models (e.g. Stable Diffusion v1.5 or RealVis `.safetensors` / `.ckpt` files) in the host directory and the WebUI reads them directly, with no re-download inside the container, saving space and time. |
| `--user sduser` | Run as user | Runs the container's processes (such as SD WebUI) as the non-root user sduser: 1. security: avoids root inside the container, limiting the damage a compromised process can do to the host; 2. permissions: the UID/GID of the mounted host directories (outputs, models) must match sduser's, or you get "permission denied" (see the ownership fix above). |
4. When to use it
- Quick trials and testing
- One-off, temporary use
- Learning basic Docker operations
5. Saving the image
The official universonic/stable-diffusion-webui:full image re-downloads its dependencies during initialization on every fresh start.
Once that first download completes, save the container as a new image with `docker commit` so later starts skip it.
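A minimal sketch of the commit-and-reuse flow (the snapshot name `sd-webui-new` is the one the Compose examples later in this post use):

```shell
# After the container's first-run dependency install completes, snapshot it:
docker commit sd-webui sd-webui-new

# From then on, start from the snapshot instead of the official image:
docker run -d \
  --gpus all \
  --restart unless-stopped \
  -p 7860:8080 \
  -v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs \
  -v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models \
  --name sd-webui \
  --user sduser \
  sd-webui-new
```

A committed image freezes the downloaded dependencies at that moment; tagging snapshots (e.g. `sd-webui-new:v1`) gives you rollback points.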
III. Docker Compose: Production-Grade Orchestration
1. Overview
Docker Compose lets you declare complex multi-service stacks in one file, which suits production deployments.
2. The Compose file

```yaml
# Compose file version (3.2+ supports the GPU settings used below,
# matching the original docker run command)
version: "3.8"

services:
  # Service name: sd-webui (matches --name in the docker run command)
  sd-webui:
    # The committed image from the previous section
    image: sd-webui-new
    # Run as sduser (matches --user sduser)
    user: sduser
    # Host port 7860 -> container port 8080 (matches -p 7860:8080); TCP by default
    ports:
      - "7860:8080"
    # The two bind mounts from the docker run command, unchanged
    volumes:
      - /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs
      - /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models
    # GPU access (equivalent to --gpus all):
    # runtime: nvidia enables the NVIDIA runtime; deploy.resources reserves GPUs
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]   # reserve GPU resources, equivalent to --gpus all
    # Restart policy (matches --restart unless-stopped)
    restart: unless-stopped
    # Optional: fix the container name (the default would be "<project>_sd-webui_1")
    container_name: sd-webui
```

3. Commands
```shell
# Start all services
docker compose up -d
# Follow service logs
docker compose logs -f
# Stop and remove the services
docker compose down
# Pull a newer image and restart
docker compose pull && docker compose up -d
```

4. Networking advantages
Compose places all services of a project on a shared network, which makes it easy to bolt on companion services such as:
- A reverse proxy (Nginx/Traefik)
- Monitoring (Prometheus/Grafana)
- A file management system
- Database services
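As an illustration (not part of the original post), a hedged sketch of what putting an Nginx reverse proxy in front of sd-webui might look like; the `nginx.conf` it mounts is assumed to proxy requests to `sd-webui:8080` by service name:

```yaml
services:
  sd-webui:
    image: sd-webui-new
    # ...same settings as above; the published port can even be dropped,
    # since only the proxy needs to be reachable from outside

  nginx:
    image: nginx:stable
    ports:
      - "80:80"
    volumes:
      # Hypothetical config file that proxies / to http://sd-webui:8080
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    depends_on:
      - sd-webui
```

Compose services resolve each other by service name on the project's default network, so nginx can reach the WebUI at `sd-webui:8080` without any port published on the host.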
5. When to use it
- Production deployments
- Multi-service architectures
- Team collaboration
- Need for high availability and monitoring
IV. Custom Dockerfile: Fully Customized Deployment
See the linked companion write-up for the full Dockerfile walkthrough.
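The full walkthrough lives in a separate post, but as a rough sketch of the shape such a Dockerfile can take (the base image, repo URL, and launch flags here are assumptions, chosen to be consistent with the CUDA 11.8 xformers fix shown in the troubleshooting section):

```dockerfile
# Sketch only: not the post's exact Dockerfile.
# A CUDA-11.8 PyTorch base keeps the cu118 xformers wheel below compatible.
FROM pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime

# Non-root user with UID/GID 1000 so mounted host dirs can be matched by chown
RUN useradd -m -u 1000 sduser

RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*

WORKDIR /app
RUN git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
WORKDIR /app/stable-diffusion-webui

# Install the pinned Python dependencies
RUN pip install -r requirements_versions.txt

# CUDA-enabled xformers wheel (see the troubleshooting section below)
RUN pip install --upgrade --force-reinstall \
    --index-url https://download.pytorch.org/whl/cu118 \
    --no-deps xformers==0.0.22.post7

USER sduser
EXPOSE 8080
CMD ["python", "launch.py", "--listen", "--port", "8080"]
```

Baking the dependencies into the image at build time is what removes the slow first-run install that the `docker commit` workaround papers over.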
V. Troubleshooting and Common Issues
1. Attaching specific GPUs instead of all of them
1.1 Check how many GPUs the container sees
```shell
# Enter the container
root@gpu-3090-vm09:/home/sdwebui# docker exec -it sd-webui bash
# Check GPUs
sduser@bb429aab044d:~/stable-diffusion-webui$ nvidia-smi
Sun Sep 28 08:34:40 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.153.02             Driver Version: 570.153.02     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0 Off |                  N/A |
| 36%   31C    P8             24W /  350W |    2909MiB /  24576MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off |   00000000:02:00.0 Off |                  N/A |
| 36%   29C    P8             29W /  350W |       3MiB /  24576MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+
```

Both RTX 3090s are visible inside the container.

1.2 Pin a specific GPU
```shell
# Uses only the first RTX 3090
docker run -d \
  --gpus "device=0" \
  --restart unless-stopped \
  -p 7860:8080 \
  -v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs \
  -v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models \
  --name sd-webui \
  --user sduser \
  sd-webui-new
```

The Compose equivalent:

```yaml
version: "3.8"
services:
  sd-webui:
    image: sd-webui-new
    user: sduser
    ports:
      - "7860:8080"
    volumes:
      - /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs
      - /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models
    runtime: nvidia
    restart: unless-stopped
    container_name: sd-webui
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"]   # <-- the key change: use only the GPU with index 0
```

Inside the container, nvidia-smi now shows a single GPU:

```shell
sduser@dec6235c8a3c:~/stable-diffusion-webui$ nvidia-smi
Sun Sep 28 08:39:58 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.153.02             Driver Version: 570.153.02     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0 Off |                  N/A |
| 36%   31C    P8             24W /  350W |    2909MiB /  24576MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+
```

2. Network flakiness (usually connection timeouts)
```
################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################
Running on sduser user
Repo already cloned, using it as install directory
Create and activate python venv
Launching launch.py...
Python 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0]
Version: v1.10.1
Commit hash: 82a973c04367123ae98bd9abdf80d9eda9b910e2
Installing xformers
Traceback (most recent call last):
  File "/app/stable-diffusion-webui/launch.py", line 48, in <module>
    main()
  File "/app/stable-diffusion-webui/launch.py", line 39, in main
    prepare_environment()
  File "/app/stable-diffusion-webui/modules/launch_utils.py", line 402, in prepare_environment
    run_pip(f"install -U -I --no-deps {xformers_package}", "xformers")
  File "/app/stable-diffusion-webui/modules/launch_utils.py", line 116, in run
    raise RuntimeError("\n".join(error_bits))
RuntimeError: Couldn't install xformers.

Command: "/app/stable-diffusion-webui/venv/bin/python" -m pip install -U -I --no-deps xformers==0.0.23.post1 --prefer-binary
Error code: 1
stderr:
WARNING: Retrying ... after connection broken by 'ProxyError('Cannot connect to proxy.', TimeoutError('_ssl.c:990: The handshake operation timed out'))': /simple/xformers/
WARNING: Retrying ... after connection broken by 'ProxyError('Cannot connect to proxy.', ConnectionResetError(104, 'Connection reset by peer'))': /packages/.../xformers-0.0.23.post1-cp310-cp310-manylinux2014_x86_64.whl.metadata
WARNING: Connection timed out while downloading.
error: incomplete-download
x Download failed because not enough bytes were received (0 bytes/213.0 MB)
note: This is an issue with network connectivity, not pip.
hint: Consider using --resume-retries to enable download resumption.
```

As the log itself notes, this is a connectivity problem (an unreliable proxy), not a pip bug; retry once the proxy is stable, or point pip at a closer mirror.

3. Where to download models
1. Models – Hugging Face
2. AIGate
3. Civitai: The Home of Open-Source Generative AI

The committed demo image totals only 12.2 GB and ships with one preloaded model. To use a new model, place the downloaded file in the models directory mounted at /home/sdwebui/sd-data/models.
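For example, a model can be fetched straight into the mounted directory so the WebUI sees it after a checkpoint refresh. The Hugging Face URL below is illustrative (download links change; verify it on the model page first):

```shell
# Download a checkpoint into the mounted models directory.
# URL is illustrative -- confirm it on the model's Hugging Face page.
cd /home/sdwebui/sd-data/models
wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors

# Ownership must still match the in-container sduser (see the permissions fix above)
sudo chown 1000:1000 v1-5-pruned-emaonly.safetensors
```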
4. Out-of-memory crashes
Symptom: the container crashes while generating images, and the logs show OOM (Out of Memory). Capping and reserving memory in Compose keeps one runaway generation from exhausting the host:

```yaml
version: "3.8"
services:
  sd-webui:
    image: sd-webui-new
    user: sduser
    ports:
      - "7860:8080"
    volumes:
      - /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs
      - /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models
    runtime: nvidia
    restart: unless-stopped
    container_name: sd-webui
    deploy:
      resources:
        limits:
          memory: "8G"        # hard cap: the container may use at most 8 GB of RAM
        reservations:
          memory: "4G"        # reserve 4 GB
          devices:            # GPU reservation (same level as the memory reservation)
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"]   # use only the GPU with index 0
```

5. Logs and debugging
```shell
# Follow live logs
docker logs -f sd-webui
# Show the last 100 log lines
docker logs --tail 100 sd-webui
# Get a shell inside the container for debugging
docker exec -it sd-webui bash
# Check the container's resource usage
docker stats sd-webui
```

6. xFormers built without CUDA support
```
  0%|          | 0/20 [00:00<?, ?it/s]
*** Error completing request
*** Arguments: ('task(wan01r2cq08hz6h)', <gradio.routes.Request object at 0x7f43ce887850>, '11111', ...)
Traceback (most recent call last):
  File "/app/modules/call_queue.py", line 74, in f
    res = list(func(*args, **kwargs))
  ...
  File "/app/modules/sd_hijack_optimizations.py", line 497, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))
  File "/opt/conda/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 63, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query : shape=(2, 4096, 8, 40) (torch.float16)
     key   : shape=(2, 4096, 8, 40) (torch.float16)
     value : shape=(2, 4096, 8, 40) (torch.float16)
     attn_bias : <class 'NoneType'>
     p : 0.0
`decoderF`, `[email protected]`, `tritonflashattF`, `cutlassF` and `smallkF` are not supported because:
     xFormers wasn't build with CUDA support
     operator wasn't built - see `python -m xformers.info` for more info
```

The installed xformers wheel was built without CUDA support. The fix is to add a CUDA-compatible xformers build to the Dockerfile:
```dockerfile
# 6. Install all Python dependencies
RUN pip install \
    -i https://pypi.tuna.tsinghua.edu.cn/simple \
    --trusted-host pypi.tuna.tsinghua.edu.cn \
    -r requirements_versions.txt

# 7. [Added] Install xformers built for CUDA 11.8
#    Use xformers' official prebuilt wheels to guarantee CUDA support
RUN pip install --upgrade --force-reinstall \
    --index-url https://download.pytorch.org/whl/cu118 \
    --no-deps \
    xformers==0.0.22.post7
```

Images now generate normally:
```
To create a public link, set `share=True` in `launch()`.
Startup time: 50.9s (prepare environment: 41.7s, import torch: 3.8s, import gradio: 1.2s, setup paths: 1.6s, initialize shared: 0.5s, other imports: 0.5s, load scripts: 0.7s, create ui: 0.7s, gradio launch: 0.1s).
Loading weights [6ce0161689] from /app/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Creating model from config: /app/configs/v1-inference.yaml
Applying attention optimization: xformers... done.
Model loaded in 17.7s (calculate hash: 12.5s, create model: 2.8s, apply weights to model: 1.9s, calculate empty prompt: 0.2s).
100%|██████████| 20/20 [00:01<00:00, 12.15it/s]
Total progress: 100%|██████████| 20/20 [00:01<00:00, 15.26it/s]
Total progress: 100%|██████████| 20/20 [00:01<00:00, 18.42it/s]
```

VI. Conclusion
With these three containerized approaches, you can pick whichever fits your needs; whether for a quick trial or a production rollout, containers give you a stable, reproducible environment.

Recommendations:
- Beginners: start with Docker Run for the fastest path to a working setup.
- Production: use Docker Compose, which is easier to maintain and extend.
- Custom requirements: write your own Dockerfile for full control.

Containerization not only solves the environment-consistency problem; it also lays a solid foundation for later scaling, monitoring, and maintenance.