Stable Diffusion: Host-Deployment Pain Points, Plus an In-Depth Review of Three Docker Approaches (Run / Compose / Dockerfile)


Contents

I. Introduction: Why Containerized Deployment?

II. Docker Run: Quick Start with the Official Image
  1. Overview
  2. Walkthrough
  3. Parameter Reference
  4. When to Use It
  5. Saving the Image

III. Docker Compose: Production-Grade Orchestration
  1. Overview
  2. Compose File
  3. Commands
  4. Networking Advantages
  5. When to Use It

IV. Custom Dockerfile: Highly Customized Deployment

V. Troubleshooting and FAQ
  1. Mounting specific GPUs instead of all of them
    1.1 Checking the GPU count inside the container
    1.2 Pinning a GPU
  2. Network instability, usually caused by connection timeouts
  3. How to download models?
  4. Out-of-memory issues
  5. Viewing logs and debugging
  6. Missing CUDA support (xformers)

VI. Conclusion


I. Introduction: Why Containerized Deployment?

Stable Diffusion: host deployment vs. containerized deployment, key differences

  • Dependency management — Host: you must resolve conflicts by hand (Python version, PyTorch version, CUDA toolchain compatibility); setup is tedious and error-prone. Container: each application runs in an isolated environment; dependencies live only inside the container, avoiding cross-application conflicts.
  • System cleanliness — Host: globally installed packages pollute the system and are hard to clean up later, leaving stale files and dead configuration behind. Container: dependencies and configuration are encapsulated in the container; deleting it leaves no residue on the host.
  • Environment consistency — Host: "works on my machine" problems are common, and reproducing a setup across differently configured team machines is hard and costly to debug. Container: everything is built from an image, so development, test, and production environments are identical, eliminating deployment surprises caused by environment drift.
  • Migration and deployment efficiency — Host: moving to a new machine means redoing every step (installing dependencies, tuning parameters), which is slow and expensive. Container: build the image once and run it anywhere; a new machine only needs to pull the image to start.
  • Version control — Host: no unified versioning, so rolling back to a known-good state after a dependency update or config change is difficult. Container: images carry version tags (e.g. v1.0, v2.1), so rollback to a specific version is fast and low-risk.
  • Bottom line — Host: frequent dependency conflicts, system pollution, poor reproducibility, high migration cost. Container: none of those pain points, plus resource isolation and strong portability (at the cost of learning basic container operations).

II. Docker Run: Quick Start with the Official Image

1. Overview

This is the simplest containerized deployment option, well suited to quick validation and a first hands-on experience.

2. Walkthrough

  • Pull the official image

docker pull universonic/stable-diffusion-webui:full

If the pull fails, the Docker daemon needs a proxy. Open its proxy drop-in file:

vim /etc/systemd/system/docker.service.d/http-proxy.conf

Fill in the proxy settings and save.

Restart Docker.

Check that the proxy took effect.
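The steps above can be sketched concretely. A minimal sketch, assuming the systemd drop-in path named above and reusing the example proxy address 192.168.100.237:7890 that appears later in the `docker run` command (substitute your own proxy):

```shell
# Create the drop-in directory and write the proxy settings
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf <<'EOF'
[Service]
Environment="HTTP_PROXY=http://192.168.100.237:7890"
Environment="HTTPS_PROXY=http://192.168.100.237:7890"
Environment="NO_PROXY=localhost,127.0.0.1"
EOF

# Reload unit files and restart the Docker daemon
sudo systemctl daemon-reload
sudo systemctl restart docker

# Verify the proxy is now part of the daemon's environment
sudo systemctl show --property=Environment docker
```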
  • Prepare local directories for data persistence

# create the model and output directories
mkdir -p ~/sd-data/models
mkdir -p ~/sd-data/outputs

Once mounted into the container, these directories keep the host's ownership attributes (UID/GID carry over), while the container runs as the non-root user sduser, so a mismatch produces "permission denied" errors. The root-cause fix is to make the host directories' UID/GID exactly match sduser's UID/GID inside the container:

# format: sudo chown -R <sduser's UID>:<sduser's GID> <host directory>
sudo chown -R 1000:1000 ~/sd-data/models/
sudo chown -R 1000:1000 ~/sd-data/outputs/
  • Start the container

docker run -d \
  --gpus all \
  --restart unless-stopped \
  -p 7860:8080 \
  -e HTTP_PROXY=http://192.168.100.237:7890 \
  -e HTTPS_PROXY=http://192.168.100.237:7890 \
  -e NO_PROXY=localhost,127.0.0.1 \
  -v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs \
  -v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models \
  --name sd-webui \
  --user sduser \
  universonic/stable-diffusion-webui:full
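Once the container is up, a quick health check looks like this (a sketch; the curl probe only succeeds after the WebUI finishes its first-run initialization):

```shell
docker ps --filter name=sd-webui            # STATUS should read "Up ..."
docker logs --tail 20 sd-webui              # watch initialization progress
curl -sI http://localhost:7860 | head -n 1  # expect an HTTP 200 once the UI is ready
```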

3. Parameter Reference

  • docker run — Docker's core command: it creates and starts a new container (pulling the image from Docker Hub automatically if it is not present locally).
  • -d — run in the background. Full form --detach: the container runs as a daemon, so the terminal is not tied up by container output (view logs with docker logs <container-name>).
  • --gpus all — GPU resource allocation. Gives the container access to all of the host's NVIDIA GPUs (the host needs the NVIDIA driver plus the nvidia-docker2 plugin, today's NVIDIA Container Toolkit, installed first). This is the key setting that enables GPU acceleration for Stable Diffusion; without it the container falls back to CPU and image generation is extremely slow. To restrict access, use e.g. --gpus "device=0,1" (the container may only use GPUs 0 and 1).
  • --restart unless-stopped — restart policy, governing what happens after the container exits: (1) the container restarts automatically after an abnormal exit (crash, OOM); (2) it does not restart after a manual docker stop; (3) it starts along with the Docker service after a host reboot, which suits long-running applications like the SD WebUI. Other options: always (restart on any exit), on-failure (restart only on failed exits), no (the default: never restart).
  • -p 7860:8080 — port mapping. Full form --publish, in "host-port:container-port" format: the left side, 7860, is the port exposed on the host (external clients visit http://<host-ip>:7860); the right side, 8080, is the port the SD WebUI actually listens on inside the container (it must match the --port in the SD launch command, or the UI will be unreachable).
  • -e HTTP_PROXY=http://192.168.100.237:7890 — environment variable (HTTP proxy). Full form --env: injects a variable into the container so that HTTP requests from processes inside it (e.g. the WebUI downloading models or extensions) are forwarded through the proxy at 192.168.100.237:7890, solving the "intranet host cannot reach the internet directly" problem.
  • -e HTTPS_PROXY=http://192.168.100.237:7890 — same idea for HTTPS traffic. Note the value still begins with http://, because the proxy server's own address uses the HTTP protocol.
  • -e NO_PROXY=localhost,127.0.0.1 — addresses that bypass the proxy. localhost and 127.0.0.1 both refer to the container itself; excluding them keeps internal traffic (e.g. SD components talking to each other) off the proxy, reducing latency and proxy load.
  • -v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs — directory mount (outputs). Full form --volume: a two-way mapping between a host directory and a container directory whose main purpose is data persistence (data survives container deletion). Left side: the host directory that stores generated images; right side: the WebUI's default output directory. Images generated in the container sync to the host in real time, and host-side changes sync back into the container.
  • -v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models — directory mount (models), with the same logic as the previous -v: put models on the host in advance (e.g. Stable Diffusion v1.5 or RealVis as .safetensors or .ckpt files) and the WebUI reads them directly, with no re-downloading inside the container, saving both space and time.
  • --user sduser — run as a specific user. The container's processes (the SD WebUI) run as the non-root user sduser: (1) security: avoiding root inside the container limits the damage a compromised process could do to the host; (2) permissions: the host's mounted directories (outputs, models) must have a UID/GID matching sduser's, or you get "permission denied" errors (see the ownership steps above).

4. When to Use It

  • Quick trials and testing
  • One-off, temporary use
  • Learning basic Docker operations

5. Saving the Image

The official universonic/stable-diffusion-webui:full image re-initializes and downloads its dependencies every time a fresh container starts.

Once that download completes, save the container as a new image with docker commit so later starts skip the setup.
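The save-and-reuse step can be sketched as follows (the target image name sd-webui-new matches the one used by the Compose file and later commands in this post):

```shell
# After the first run has finished downloading dependencies,
# freeze the container's current filesystem into a new image:
docker commit sd-webui sd-webui-new

# Subsequent containers start from the committed image and skip the setup
# (remove the old container first if the name is taken: docker rm -f sd-webui):
docker run -d --gpus all -p 7860:8080 \
  -v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs \
  -v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models \
  --name sd-webui --user sduser \
  sd-webui-new
```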

III. Docker Compose: Production-Grade Orchestration

1. Overview

Docker Compose lets you define complex multi-service architectures and is well suited to production deployments.

2. Compose File

# Compose file format version (a recent version is required for the GPU options below)
version: "3.8"

# service list (single-service scenario; the service name is arbitrary,
# kept identical to the container name here)
services:
  # service name: sd-webui (matches --name in the docker run command)
  sd-webui:
    # image: sd-webui-new (the image specified at the end of the original command)
    image: sd-webui-new
    # container user: sduser (matches --user sduser)
    user: sduser
    # port mapping: host 7860 -> container 8080 (identical to -p 7860:8080);
    # format "host-port:container-port", TCP by default
    ports:
      - "7860:8080"
    # directory mounts, matching the two -v flags, same host and container paths
    volumes:
      - /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs
      - /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models
    # GPU configuration, equivalent to --gpus all:
    # runtime: nvidia enables the NVIDIA runtime; deploy.resources reserves GPUs
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]  # request GPU resources, equivalent to --gpus all
    # restart policy (identical to --restart unless-stopped)
    restart: unless-stopped
    # optional: explicit container name (the default would be "<project>_sd-webui_1")
    container_name: sd-webui

3. Commands

# start all services
docker compose up -d

# follow service logs
docker compose logs -f

# stop and remove the services
docker compose down

# pull updated images and restart
docker compose pull && docker compose up -d

4. Networking Advantages

Running under Compose makes it easy to attach companion services on the same network, for example:

  • Reverse proxy (Nginx/Traefik)
  • Monitoring (Prometheus/Grafana)
  • File management
  • Database services
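As an illustration of the first item, a hedged sketch of fronting sd-webui with an Nginx reverse proxy on the same Compose network (the nginx service, its config file, and port 80 are assumptions for illustration, not part of the original deployment):

```yaml
services:
  sd-webui:
    image: sd-webui-new
    # ... same settings as the Compose file above; the public port
    # mapping can be dropped once Nginx fronts the service

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      # hypothetical config that proxies / to http://sd-webui:8080
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
    depends_on:
      - sd-webui
```

Services on the same Compose network reach each other by service name, so the Nginx config can simply use `proxy_pass http://sd-webui:8080;`.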

5. When to Use It

  • Production deployments
  • Multi-service architectures
  • Team development
  • Setups that need high availability and monitoring

IV. Custom Dockerfile: Highly Customized Deployment

⭐ Click to view (the full Dockerfile is linked from the original post)
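As a rough, hedged sketch of the shape such a Dockerfile typically takes: the base image, clone URL, and launch flags below are assumptions, while the PyPI mirror and the pinned CUDA 11.8 xformers wheel echo the fixes shown in the troubleshooting section of this post.

```dockerfile
# Hypothetical outline; the real Dockerfile is linked from the original post.
FROM pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime

# system packages the WebUI needs
RUN apt-get update && apt-get install -y git libgl1 libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

# non-root user, matching the --user sduser convention used in this post
RUN useradd -m -u 1000 sduser \
    && mkdir -p /app && chown sduser:sduser /app
USER sduser
WORKDIR /app

# fetch the WebUI source
RUN git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
WORKDIR /app/stable-diffusion-webui

# Python dependencies from a domestic mirror (as in the troubleshooting section)
RUN pip install --user -i https://pypi.tuna.tsinghua.edu.cn/simple \
    -r requirements_versions.txt

# CUDA 11.8-compatible xformers wheel (see troubleshooting item 6)
RUN pip install --user --index-url https://download.pytorch.org/whl/cu118 \
    --no-deps xformers==0.0.22.post7

EXPOSE 8080
CMD ["python", "launch.py", "--listen", "--port", "8080"]
```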


V. Troubleshooting and FAQ

1. Mounting specific GPUs instead of all of them

1.1 Checking the GPU count inside the container

# enter the container
root@gpu-3090-vm09:/home/sdwebui# docker exec -it sd-webui bash

# check the GPUs
sduser@bb429aab044d:~/stable-diffusion-webui$ nvidia-smi
Sun Sep 28 08:34:40 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.153.02             Driver Version: 570.153.02     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0 Off |                  N/A |
| 36%   31C    P8             24W /  350W |    2909MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off |   00000000:02:00.0 Off |                  N/A |
| 36%   29C    P8             29W /  350W |       3MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

1.2 Pinning a GPU

# docker run: uses only the first RTX 3090
docker run -d \
  --gpus "device=0" \
  --restart unless-stopped \
  -p 7860:8080 \
  -v /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs \
  -v /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models \
  --name sd-webui \
  --user sduser \
  sd-webui-new

# docker compose equivalent
version: "3.8"
services:
  sd-webui:
    image: sd-webui-new
    user: sduser
    ports:
      - "7860:8080"
    volumes:
      - /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs
      - /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models
    runtime: nvidia
    restart: unless-stopped
    container_name: sd-webui
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"]  # <-- key change: use only the GPU with index 0

Inside the container, nvidia-smi now reports a single GPU:

sduser@dec6235c8a3c:~/stable-diffusion-webui$ nvidia-smi
Sun Sep 28 08:39:58 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.153.02             Driver Version: 570.153.02     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:01:00.0 Off |                  N/A |
| 36%   31C    P8             24W /  350W |    2909MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+

2. Network instability, usually caused by connection timeouts

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye), Fedora 34+ and openSUSE Leap 15.4 or newer.
################################################################

################################################################
Running on sduser user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
Create and activate python venv
################################################################

################################################################
Launching launch.py...
################################################################
Python 3.10.12 (main, Feb  4 2025, 14:57:36) [GCC 11.4.0]
Version: v1.10.1
Commit hash: 82a973c04367123ae98bd9abdf80d9eda9b910e2
Installing xformers
Traceback (most recent call last):
  File "/app/stable-diffusion-webui/launch.py", line 48, in <module>
    main()
  File "/app/stable-diffusion-webui/launch.py", line 39, in main
    prepare_environment()
  File "/app/stable-diffusion-webui/modules/launch_utils.py", line 402, in prepare_environment
    run_pip(f"install -U -I --no-deps {xformers_package}", "xformers")
  File "/app/stable-diffusion-webui/modules/launch_utils.py", line 144, in run_pip
    return run(f'"{python}" -m pip {command} --prefer-binary{index_url_line}', desc=f"Installing {desc}", errdesc=f"Couldn't install {desc}", live=live)
  File "/app/stable-diffusion-webui/modules/launch_utils.py", line 116, in run
    raise RuntimeError("\n".join(error_bits))
RuntimeError: Couldn't install xformers.
Command: "/app/stable-diffusion-webui/venv/bin/python" -m pip install -U -I --no-deps xformers==0.0.23.post1 --prefer-binary
Error code: 1
stdout: Collecting xformers==0.0.23.post1
  Downloading xformers-0.0.23.post1-cp310-cp310-manylinux2014_x86_64.whl.metadata (1.0 kB)
  Downloading xformers-0.0.23.post1-cp310-cp310-manylinux2014_x86_64.whl (213.0 MB)
     0.0/213.0 MB ? eta -:--:--
stderr: WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', TimeoutError('_ssl.c:990: The handshake operation timed out'))': /simple/xformers/
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', ConnectionResetError(104, 'Connection reset by peer'))': /packages/f4/89/ce8e936d3e64b3b565c16312dd6446d54f6e485f864130702c6b3b3cbe7c/xformers-0.0.23.post1-cp310-cp310-manylinux2014_x86_64.whl.metadata
WARNING: Connection timed out while downloading.
error: incomplete-download
× Download failed because not enough bytes were received (0 bytes/213.0 MB)
╰─> URL: https://files.pythonhosted.org/packages/f4/89/ce8e936d3e64b3b565c16312dd6446d54f6e485f864130702c6b3b3cbe7c/xformers-0.0.23.post1-cp310-cp310-manylinux2014_x86_64.whl
note: This is an issue with network connectivity, not pip.
hint: Consider using --resume-retries to enable download resumption.
~/stable-diffusion-webui
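When a download breaks like this, simply retrying often succeeds. To make pip itself more tolerant of a flaky proxy, a sketch (PIP_DEFAULT_TIMEOUT and PIP_RETRIES are standard pip environment variables; the mirror is the Tsinghua index used elsewhere in this post, and the venv path matches the traceback above):

```shell
docker exec -it sd-webui bash -c '
  export PIP_DEFAULT_TIMEOUT=120   # seconds per socket operation
  export PIP_RETRIES=10            # retry broken downloads more times
  /app/stable-diffusion-webui/venv/bin/python -m pip install \
    -i https://pypi.tuna.tsinghua.edu.cn/simple \
    --prefer-binary xformers==0.0.23.post1'
```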

3. How to download models?

1. Models – Hugging Face

2. AIGate

3. Civitai: The Home of Open-Source Generative AI

After docker commit, the demo image totals just 12.2 GB and ships with one preset model. To use a new model, place the downloaded file in the model folder mapped to /home/sdwebui/sd-data/models.
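Downloading from Hugging Face on the command line follows its standard `resolve/main` direct-download URL pattern. A sketch, with the org/repo/file placeholders left for you to fill in (the Stable-diffusion subfolder is where the WebUI looks for checkpoints, as the startup log later in this post shows):

```shell
cd /home/sdwebui/sd-data/models/Stable-diffusion
wget "https://huggingface.co/<org>/<repo>/resolve/main/<model>.safetensors"
```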

4. Out-of-memory issues

The container crashes while generating images and the log shows OOM (Out of Memory). Cap and reserve memory in the Compose file:

version: "3.8"
services:
  sd-webui:
    image: sd-webui-new
    user: sduser
    ports:
      - "7860:8080"
    volumes:
      - /home/sdwebui/sd-data/outputs:/app/stable-diffusion-webui/outputs
      - /home/sdwebui/sd-data/models:/app/stable-diffusion-webui/models
    runtime: nvidia
    restart: unless-stopped
    container_name: sd-webui
    deploy:
      resources:
        limits:
          memory: "8G"   # hard cap: the container may use at most 8 GB of RAM
        reservations:
          memory: "4G"   # reserve 4 GB of RAM
          devices:       # GPU reservation (a sibling of the memory reservation)
            - driver: nvidia
              capabilities: [gpu]
              device_ids: ["0"]  # pin to the GPU with index 0
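The memory keys above govern system RAM. If the crash is GPU memory instead ("CUDA out of memory"), AUTOMATIC1111's WebUI offers low-VRAM launch flags; a hedged sketch, assuming the image starts the WebUI through launch.py, which reads the standard COMMANDLINE_ARGS variable:

```yaml
    # added under the sd-webui service
    environment:
      # --medvram trades some speed for lower VRAM usage; --lowvram is more aggressive
      - COMMANDLINE_ARGS=--medvram
```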

5. Viewing logs and debugging

# follow live logs
docker logs -f sd-webui

# show the last 100 lines of logs
docker logs --tail 100 sd-webui

# get a shell inside the container for debugging
docker exec -it sd-webui bash

# check the container's resource usage
docker stats sd-webui

6. Missing CUDA support

---
  0%|          | 0/20 [00:00<?, ?it/s]
*** Error completing request
*** Arguments: ('task(wan01r2cq08hz6h)', <gradio.routes.Request object at 0x7f43ce887850>, '11111', '', [], 1, 1, 7, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', [], 0, 20, 'DPM++ 2M', 'Automatic', False, '', 0.8, -1, False, -1, 0, 0, 0, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
Traceback (most recent call last):
  File "/app/modules/call_queue.py", line 74, in f
    res = list(func(*args, **kwargs))
  File "/app/modules/call_queue.py", line 53, in f
    res = func(*args, **kwargs)
  File "/app/modules/call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "/app/modules/txt2img.py", line 109, in txt2img
    processed = processing.process_images(p)
  File "/app/modules/processing.py", line 847, in process_images
    res = process_images_inner(p)
  File "/app/modules/processing.py", line 988, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "/app/modules/processing.py", line 1346, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "/app/modules/sd_samplers_kdiffusion.py", line 230, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "/app/modules/sd_samplers_common.py", line 272, in launch_sampling
    return func()
  File "/app/modules/sd_samplers_kdiffusion.py", line 230, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/app/repositories/k-diffusion/k_diffusion/sampling.py", line 594, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/modules/sd_samplers_cfg_denoiser.py", line 249, in forward
    x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/repositories/k-diffusion/k_diffusion/external.py", line 112, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/app/repositories/k-diffusion/k_diffusion/external.py", line 138, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/app/modules/sd_hijack_utils.py", line 22, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/app/modules/sd_hijack_utils.py", line 34, in __call__
    return self.__sub_func(self.__orig_func, *args, **kwargs)
  File "/app/modules/sd_hijack_unet.py", line 50, in apply_model
    result = orig_func(self, x_noisy.to(devices.dtype_unet), t.to(devices.dtype_unet), cond, **kwargs)
  File "/app/modules/sd_hijack_utils.py", line 22, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/app/modules/sd_hijack_utils.py", line 36, in __call__
    return self.__orig_func(*args, **kwargs)
  File "/app/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 858, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py", line 1335, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/modules/sd_unet.py", line 91, in UNetModel_forward
    return original_forward(self, x, timesteps, context, *args, **kwargs)
  File "/app/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 797, in forward
    h = module(h, emb, context)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/openaimodel.py", line 84, in forward
    x = layer(x, context)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/modules/sd_hijack_utils.py", line 22, in <lambda>
    setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
  File "/app/modules/sd_hijack_utils.py", line 34, in __call__
    return self.__sub_func(self.__orig_func, *args, **kwargs)
  File "/app/modules/sd_hijack_unet.py", line 96, in spatial_transformer_forward
    x = block(x, context=context[i])
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 269, in forward
    return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
  File "/app/repositories/stable-diffusion-stability-ai/ldm/modules/diffusionmodules/util.py", line 123, in checkpoint
    return func(*inputs)
  File "/app/repositories/stable-diffusion-stability-ai/ldm/modules/attention.py", line 272, in _forward
    x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/app/modules/sd_hijack_optimizations.py", line 497, in xformers_attention_forward
    out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))
  File "/opt/conda/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 223, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/opt/conda/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 321, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/opt/conda/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py", line 337, in _memory_efficient_attention_forward
    op = _dispatch_fw(inp, False)
  File "/opt/conda/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 120, in _dispatch_fw
    return _run_priority_list(
  File "/opt/conda/lib/python3.10/site-packages/xformers/ops/fmha/dispatch.py", line 63, in _run_priority_list
    raise NotImplementedError(msg)
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 4096, 8, 40) (torch.float16)
     key         : shape=(2, 4096, 8, 40) (torch.float16)
     value       : shape=(2, 4096, 8, 40) (torch.float16)
     attn_bias   : <class 'NoneType'>
     p           : 0.0
`decoderF` is not supported because:
    xFormers wasn't build with CUDA support
    attn_bias type is <class 'NoneType'>
    operator wasn't built - see `python -m xformers.info` for more info
`[email protected]` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
    triton is not available
    Only work on pre-MLIR triton for now
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    operator wasn't built - see `python -m xformers.info` for more info
    unsupported embed per head: 40

Add an xformers build that matches the image's CUDA version:

# 6. Install all Python dependencies
RUN pip install \
    -i https://pypi.tuna.tsinghua.edu.cn/simple \
    --trusted-host pypi.tuna.tsinghua.edu.cn \
    -r requirements_versions.txt

# 7. [Added] Install xformers with CUDA 11.8 support
# Use the officially prebuilt wheel to guarantee CUDA support
RUN pip install --upgrade --force-reinstall \
    --index-url https://download.pytorch.org/whl/cu118 \
    --no-deps \
    xformers==0.0.22.post7
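After rebuilding the image, you can confirm the wheel really has CUDA support with the diagnostic the error message itself suggests (use whichever Python interpreter the WebUI runs under in your container):

```shell
docker exec -it sd-webui python -m xformers.info
# look for the memory_efficient_attention operators being reported as available
```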

Images now generate normally:

To create a public link, set `share=True` in `launch()`.
Startup time: 50.9s (prepare environment: 41.7s, import torch: 3.8s, import gradio: 1.2s, setup paths: 1.6s, initialize shared: 0.5s, other imports: 0.5s, load scripts: 0.7s, create ui: 0.7s, gradio launch: 0.1s).
6ce0161689b3853acaa03779ec93eafe75a02f4ced659bee03f50797806fa2fa
Loading weights [6ce0161689] from /app/models/Stable-diffusion/v1-5-pruned-emaonly.safetensors
Creating model from config: /app/configs/v1-inference.yaml
/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:945: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Applying attention optimization: xformers... done.
Model loaded in 17.7s (calculate hash: 12.5s, create model: 2.8s, apply weights to model: 1.9s, calculate empty prompt: 0.2s).
100%|██████████| 20/20 [00:01<00:00, 12.15it/s]
Total progress: 100%|██████████| 20/20 [00:01<00:00, 15.26it/s]
Total progress: 100%|██████████| 20/20 [00:01<00:00, 18.42it/s]

VI. Conclusion

With these three containerization approaches, you can pick whichever best fits your needs. Whether for a quick trial or a production deployment, containers provide a stable, reproducible environment.

Recommendations:

  • Beginners: start with Docker Run to get up and running quickly.
  • Production: use Docker Compose for easier maintenance and scaling.
  • Custom requirements: write your own Dockerfile for full control.

Containerized deployment not only solves environment consistency; it also lays a solid foundation for later scaling, monitoring, and maintenance.
