Fine-Tuning DeepSeek-R1 with LLaMA-Factory: A Beginner's Tutorial | 极客日志


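A walkthrough of the full visual fine-tuning workflow for the DeepSeek-R1 model using LLaMA-Factory. It covers environment setup (Anaconda, CUDA, BitsAndBytes), dataset preparation (merging, cleaning, anonymization, and format conversion), model configuration and training parameters, evaluation and prediction, and model export. After working through it, readers should be able to fine-tune large language models for their own needs with the LLaMA-Factory framework.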

KernelLab · Published 2026/4/5 · Updated 2026/7/20

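LLaMA-Factory Fine-Tuning Basics

LLaMA-Factory Overview

Fine-tuning with LLaMA-Factory has several advantages. First, it simplifies the fine-tuning process for large models, so that users without a deep technical background can still tune and improve a model. It also supports multiple training methods, such as full-parameter tuning and LoRA, as well as alignment approaches such as DPO and PPO, so users can pick the fine-tuning strategy that fits their needs.

LLaMA-Factory is also a one-stop tool: fine-tuning, quantization, and running the model all happen in one place, with no switching between tools and workflows. It supports many popular language models, including LLaMA, BLOOM, Mistral, and Baichuan, which covers a wide range of use cases.

For quantization, LLaMA-Factory can compress a model, reducing the compute and storage needed to run it, so that the model runs smoothly even on less powerful hardware. This makes models more accessible and lowers running costs.

Finally, LLaMA-Factory records training fairly thoroughly: besides plotting the loss curve as training proceeds, it ships with evaluation metrics such as BLEU, which helps users monitor and assess model performance.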

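Downloading LLaMA-Factory

GitHub: LLaMA-Factory

Open the LLaMA-Factory repository page and click the Code button to download the source archive (ZIP format recommended).
After extracting it, note the extraction path.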

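Creating the Anaconda Environment

Software and Hardware Requirements

Create a virtual environment: the official requirement is Python 3.9 or later, with 3.10 recommended.
Open a terminal.
Navigate to the directory you just extracted, for example: cd <project_path>.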
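
To confirm that the active environment meets the version requirement above, the interpreter can be checked from Python itself. This is a minimal sketch: the function name python_version_ok is made up for illustration, and the thresholds are simply the ones quoted above (3.9 minimum, 3.10 recommended).

```python
import sys

MIN_VERSION = (3, 9)           # minimum version stated above
RECOMMENDED_VERSION = (3, 10)  # recommended version stated above


def python_version_ok(version=None):
    """Classify a (major, minor) version; defaults to the running interpreter."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor < MIN_VERSION:
        return "too-old"
    if major_minor < RECOMMENDED_VERSION:
        return "below-recommended"
    return "ok"


print(sys.version.split()[0], "->", python_version_ok())
```

Run it inside the activated environment; a "too-old" result means the environment should be recreated with a newer Python.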

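Installing LLaMA-Factory Dependencies

# Install the pinned dependencies
pip install -r requirements.txt

# Install the project itself with its extras; it is best to run both commands
pip install -e ".[torch,metrics]"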
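
After both install commands finish, it can help to verify that the package and its command-line entry point are actually visible in the current environment. The sketch below is an assumption-laden check, not part of LLaMA-Factory: install_status is a hypothetical helper, the package name llamafactory is inferred from the src/llamafactory path, and llamafactory-cli is the command used to start the web UI.

```python
import importlib.util
import shutil


def install_status():
    """Report whether the package and its CLI entry point can be found."""
    return {
        # Package name inferred from the src/llamafactory directory.
        "package_importable": importlib.util.find_spec("llamafactory") is not None,
        # CLI name as used to launch the web UI.
        "cli_on_path": shutil.which("llamafactory-cli") is not None,
    }


print(install_status())
```

If either value is False, the editable install most likely ran in a different environment than the one currently active.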

Installing CUDA

# Install PyTorch built against CUDA 11.8
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# Remember to enter y to continue the installation
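
Once the conda command completes, a short check shows whether PyTorch can actually see a GPU. This is only a sketch; cuda_report is a hypothetical helper name, and it deliberately returns a result instead of failing when PyTorch is not installed.

```python
import importlib.util


def cuda_report():
    """Return a small dict describing whether PyTorch can use a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False}
    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


print(cuda_report())
```

A cuda_build of None usually means a CPU-only build was installed, in which case the conda command above should be rerun in the correct environment.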

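Installing BitsAndBytes for Quantization

To enable quantized LoRA (QLoRA) on Windows, install a prebuilt bitsandbytes wheel. The prebuilt wheels support CUDA 11.1 through 12.2; choose the release that matches your CUDA version.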

pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.2.post2-py3-none-win_amd64.whl
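
The supported range quoted above (CUDA 11.1 through 12.2) can be turned into a small check before installing the wheel. The helper names below are hypothetical, and the bounds are taken only from the statement above, not from the wheel's own metadata.

```python
SUPPORTED_MIN = (11, 1)  # lower bound stated above
SUPPORTED_MAX = (12, 2)  # upper bound stated above


def parse_major_minor(text):
    """Turn a version string such as '11.8' or '12.1.0' into (major, minor)."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def wheel_supports(cuda_version):
    """Return True if the CUDA version falls inside the stated range."""
    return SUPPORTED_MIN <= parse_major_minor(cuda_version) <= SUPPORTED_MAX


for version in ("11.0", "11.8", "12.2", "12.4"):
    print(version, wheel_supports(version))
```

The CUDA version to pass in is the one PyTorch was built against (11.8 for the conda command used earlier).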

Launching the Visual Fine-Tuning UI

# Launch command
llamafactory-cli webui
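
Before changing any code, it is worth checking whether the web UI is listening at all. The sketch below tries a plain TCP connection; it assumes Gradio's default port 7860, which may differ if a port was configured explicitly, and port_is_open is a hypothetical helper name.

```python
import socket


def port_is_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("web UI reachable:", port_is_open())
```

If this prints False while the launch command is still running, the server is either bound to another port or failed to start; the terminal output of the launch command shows which.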

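If you get an error saying localhost cannot be accessed, Gradio's share option is false and the page cannot respond properly; the fix is to edit interface.py.

Locate interface.py, for example under LLaMA-Factory/src/llamafactory/webui.
Find the run_web_ui() and run_web_demo() functions and change share=gradio_share to share=True. With share=True, Gradio also creates a temporary public link to the UI.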


Before:
share=gradio_share

After:
share=True
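
The same before/after replacement can be applied with a few lines of Python instead of a manual edit. This is a sketch under assumptions: patch_share and patch_file are hypothetical helpers, and the sample line only illustrates what a launch call might look like; the real line in your copy of interface.py may differ.

```python
from pathlib import Path

OLD = "share=gradio_share"
NEW = "share=True"


def patch_share(source):
    """Return (patched_source, number_of_replacements)."""
    return source.replace(OLD, NEW), source.count(OLD)


def patch_file(path):
    """Rewrite the file in place and return how many occurrences were changed."""
    file_path = Path(path)
    patched, count = patch_share(file_path.read_text(encoding="utf-8"))
    if count:
        file_path.write_text(patched, encoding="utf-8")
    return count


# Hypothetical launch line, used only to demonstrate the replacement.
sample = "create_ui().queue().launch(share=gradio_share, inbrowser=True)"
print(patch_share(sample))
```

Calling patch_file on the interface.py path located earlier should report two replacements if both functions pass share=gradio_share; keep a backup of the file before rewriting it.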

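Dataset Preparation

Merging the Required Data

import json
import os

# Folder that holds the raw JSON files (placeholder path).
folder_path = r'<data_path>'
merged_file_name = 'merged_data.json'

# Collect every .json file under the folder. A previous merge result is
# skipped so that re-running the script does not merge the output into itself.
json_files = []
for root, dirs, files in os.walk(folder_path):
    for file in files:
        if file.endswith('.json') and file != merged_file_name:
            json_files.append(os.path.join(root, file))

# Load each file; the content of each file becomes one element of merged_data.
merged_data = []
for file in json_files:
    with open(file, 'r', encoding='utf-8') as f:
        try:
            merged_data.append(json.load(f))
        except json.JSONDecodeError:
            print(f"Error decoding {file}. Skipping.")

# Write the merged result next to the source files.
merged_file_path = os.path.join(folder_path, merged_file_name)
with open(merged_file_path, 'w', encoding='utf-8') as merged_file:
    json.dump(merged_data, merged_file, indent=4, ensure_ascii=False)
print(f"Merged file saved to: {merged_file_path}")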
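
Because the merge script stores each file's content as one element, the result is a list with one entry per source file. The conversion step that follows iterates over it as a list of lists of records with a 'messages' key, so a quick structural check can catch surprises early. The helper below is a hypothetical sketch with made-up sample data.

```python
def count_conversations(merged):
    """Count records that carry a 'messages' list in a list-of-lists structure."""
    total = 0
    for file_content in merged:
        if not isinstance(file_content, list):
            continue  # a file whose top level is not a list is ignored here
        for record in file_content:
            if isinstance(record, dict) and isinstance(record.get("messages"), list):
                total += 1
    return total


# Made-up sample: two "files", the second one containing a record without messages.
sample = [
    [{"messages": [{"role": "user", "content": "hi"}]}],
    [{"messages": []}, {"note": "no messages key"}],
]
print(count_conversations(sample))
```

A count of zero on real data suggests the source files are not lists of records, in which case the nested loop in the conversion script needs adjusting.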

Dataset Preprocessing

import json
import re

# Load the merged file: one entry per source file, each entry being a list
# of records that carry a 'messages' list.
with open('merged_data.json', 'r', encoding='utf-8') as file:
    data = json.load(file)

converted_data = []

# Map source roles to ShareGPT speaker names.
ROLE_MAP = {"user": "human", "assistant": "gpt"}


def clean_data(dataset):
    """Optional helper (not called below): drop empty messages, flatten newlines."""
    cleaned_data = []
    for example in dataset:
        cleaned_messages = []
        for message in example['messages']:
            if not message['content'].strip():
                continue
            message['content'] = message['content'].replace("\n", " ").strip()
            cleaned_messages.append(message)
        if cleaned_messages:
            cleaned_data.append({'messages': cleaned_messages})
    return cleaned_data


def replace_sensitive_info(text):
    """Mask phone numbers, email addresses, and YYYY-MM-DD dates."""
    text = re.sub(r'\d{3}[-]?\d{4}[-]?\d{4}', '[PHONE_NUMBER]', text)
    text = re.sub(r'\S+@\S+', '[EMAIL]', text)
    text = re.sub(r'\d{4}-\d{2}-\d{2}', '[DATE]', text)
    return text


def anonymize_data(dataset):
    """Optional helper (not called below): mask sensitive data in every message."""
    anonymized_data = []
    for example in dataset:
        anonymized_messages = []
        for message in example['messages']:
            if message['role'] == 'user':
                # Replace the literal word "用户" (user) with a generic label.
                message['content'] = message['content'].replace("用户", "用户 X")
            message['content'] = replace_sensitive_info(message['content'])
            anonymized_messages.append(message)
        anonymized_data.append({'messages': anonymized_messages})
    return anonymized_data


# Convert every record to the ShareGPT layout.
for item_list in data:
    for item in item_list:
        if 'messages' not in item:
            continue
        conversation = {"conversations": []}
        for message in item['messages']:
            from_role = ROLE_MAP.get(message['role'])
            if from_role is None:
                # Skip system messages and any role without a mapping.
                continue
            content = replace_sensitive_info(message['content'])
            conversation['conversations'].append({"from": from_role, "value": content})
        converted_data.append(conversation)

with open('converted_data.json', 'w', encoding='utf-8') as file:
    json.dump(converted_data, file, ensure_ascii=False, indent=2)
print("Conversion finished; the result was saved as converted_data.json")
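
To make the effect of the conversion concrete, here is the same role mapping and masking applied to a single made-up record. It is a self-contained sketch: the helper names mask and to_sharegpt and the sample record are illustrative, while the regular expressions and role names are copied from the script above.

```python
import re

ROLE_MAP = {"user": "human", "assistant": "gpt"}


def mask(text):
    """Apply the same three masking rules as the conversion script."""
    text = re.sub(r'\d{3}[-]?\d{4}[-]?\d{4}', '[PHONE_NUMBER]', text)
    text = re.sub(r'\S+@\S+', '[EMAIL]', text)
    text = re.sub(r'\d{4}-\d{2}-\d{2}', '[DATE]', text)
    return text


def to_sharegpt(record):
    """Convert one {'messages': [...]} record to {'conversations': [...]}."""
    turns = []
    for message in record["messages"]:
        from_role = ROLE_MAP.get(message["role"])
        if from_role is None:
            continue  # system and unmapped roles are dropped
        turns.append({"from": from_role, "value": mask(message["content"])})
    return {"conversations": turns}


# Made-up record for illustration.
record = {"messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Mail me at a@b.com on 2024-01-02"},
    {"role": "assistant", "content": "Sure."},
]}
print(to_sharegpt(record))
```

The system message disappears, and the user turn becomes "Mail me at [EMAIL] on [DATE]". Note that the email pattern matches any run of non-space characters around an @, so punctuation attached to an address is masked along with it.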
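
DeepSeek-R1 Visual Fine-Tuning

Dataset Processing

Register the converted file by adding an entry like the following to LLaMA-Factory's data/dataset_info.json:

"converted_data": {
    "file_name": "converted_data.json",
    "formatting": "sharegpt",
    "columns": {
        "messages": "conversations"
    }
},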
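
The entry can also be added programmatically, which avoids stray commas when editing the JSON by hand. This is a sketch only: register_dataset is a hypothetical helper, and the path to dataset_info.json is whatever it is in your checkout.

```python
import json
from pathlib import Path


def register_dataset(info_path, name, file_name):
    """Add or overwrite a ShareGPT-style entry in a dataset_info.json file."""
    path = Path(info_path)
    info = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    info[name] = {
        "file_name": file_name,
        "formatting": "sharegpt",
        "columns": {"messages": "conversations"},
    }
    path.write_text(json.dumps(info, ensure_ascii=False, indent=2), encoding="utf-8")
    return info


# Usage (not executed here):
# register_dataset("data/dataset_info.json", "converted_data", "converted_data.json")
```

Existing entries in the file are preserved; only the key passed as name is added or replaced.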

Pinned dependency versions from requirements.txt:

transformers>=4.41.2,<=4.48.3,!=4.46.*,!=4.47.*,!=4.48.0,!=4.48.1,!=4.48.2;python_version<'3.10'
transformers>=4.41.2,<=4.48.3,!=4.46.*,!=4.47.*,!=4.48.0;python_version>='3.10'
datasets>=2.16.0,<=3.2.0
accelerate>=0.34.0,<=1.2.1
peft>=0.11.1,<=0.12.0
trl>=0.8.6,<=0.9.6
tokenizers>=0.19.0,<=0.21.0
gradio>=4.38.0,<=5.12.0
pandas>=2.0.0
scipy
einops
sentencepiece
tiktoken
protobuf
uvicorn
pydantic
fastapi
sse-starlette
matplotlib>=3.7.0
fire
packaging
pyyaml
numpy<2.0.0
av
librosa
tyro<0.9.0
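
When an install misbehaves, comparing the versions actually installed against the pins above is a quick first step. The sketch below only reports versions and does not evaluate the version constraints; installed_versions is a hypothetical helper, and the package list is a subset chosen for illustration.

```python
from importlib import metadata

# A few of the pinned packages listed above.
PACKAGES = ["transformers", "datasets", "accelerate", "peft", "trl", "gradio", "numpy"]


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Any version outside the pinned range is a likely cause of import or runtime errors and can be corrected by rerunning pip install -r requirements.txt in the active environment.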

