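China's Strength in the Open-Source Wave: A Complete Guide to Local Deployment and Application of the Wenxin Yiyan (ERNIE) Large Model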

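Stage	Goal	Tools / Resources
Environment setup	Create a Python virtual environment and install dependencies	Conda / pip
Model deployment	Load the pretrained ERNIE model and get basic Q&A working	GitCode + Transformers + Gradio
Data preparation	Build a small Chinese Q&A dataset	Self-made or trimmed open-source JSON dataset
Fine-tuning	Fine-tune with LoRA or native fine-tuning	PyTorch + Transformers
Deployment test	Serve the fine-tuned model in a web page	Local Gradio service
Comparison	Compare the original and fine-tuned models	Manual analysis / case review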

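Model	Parameters	Style	Notes
ERNIE-4.5-0.3B-Base-PT	0.3B	Base model	Supports CausalLM fine-tuning
ERNIE-4.5-0.3B-LLaMA-PT	0.3B	LLaMA format	Compatible with LLaMA fine-tuning scripts
ERNIE-4.5-0.3B-Chat-PT	0.3B	Chat style	Trained with instruction data
ERNIE-Speed / ERNIE-Tiny series	Millions to hundreds of millions	Inference / lightweight models	Suited to mobile and edge deployment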

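Item	Configuration
Operating system	Windows 10 / 11
Python version	Python 3.9
Build method	Conda virtual environment + Transformers
GPU support	GPU optional; RTX 30/40 series recommended
Interface	Gradio (web interaction)
Development environment	PyCharm 2025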

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install transformers datasets gradio accelerate
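After the two install commands finish, it can help to confirm that every package is actually visible to the interpreter inside the Conda environment. The snippet below is a minimal sketch that uses only the standard library; the helper name `check_packages` is made up for this illustration and is not part of any of the installed libraries.

```python
import importlib.util

# Packages named in the pip commands above
REQUIRED = ["torch", "transformers", "datasets", "gradio", "accelerate"]


def check_packages(names):
    """Return {package_name: True/False} depending on whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, ok in check_packages(REQUIRED).items():
        print(f"{name:<14}{'OK' if ok else 'MISSING'}")
```

If a package shows up as MISSING, the usual cause is that pip ran outside the activated virtual environment.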

./models/ERNIE-4.5-0.3B-Base-PT/

./ernie4.5-finetuned/checkpoint-750/

import gradio as gr
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer from the base model and the weights from the fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained("./models/ERNIE-4.5-0.3B-Base-PT", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("./ernie4.5-finetuned/checkpoint-750", trust_remote_code=True)
model.eval()
model.to("cuda" if torch.cuda.is_available() else "cpu")


# Inference function
def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=256)
    input_ids = inputs["input_ids"].to(model.device)
    attention_mask = inputs["attention_mask"].to(model.device)
    with torch.no_grad():
        output = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            max_new_tokens=128,
            do_sample=True,
            top_p=0.95,
            temperature=0.9,
            repetition_penalty=1.2,
            eos_token_id=tokenizer.eos_token_id or tokenizer.pad_token_id,
            pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
        )
    # Decode only the newly generated tokens, skipping the prompt
    return tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)


# Gradio page
iface = gr.Interface(
    fn=generate_response,
    inputs=gr.Textbox(lines=2, label="Question"),
    outputs=gr.Textbox(lines=4, label="Model answer"),
    title="ERNIE 4.5 Fine-Tuned Model Test",
)
iface.launch(server_name="0.0.0.0", server_port=7860)
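The script above samples with `temperature=0.9` and `top_p=0.95`. To make those two knobs less abstract, here is a small pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering on a toy list of logits. It only illustrates the idea and is not the code Transformers runs internally; the function name `top_p_filter` and the example logits are invented for this illustration.

```python
import math


def top_p_filter(logits, temperature=0.9, top_p=0.95):
    """Return {token_index: renormalized probability} for the nucleus.

    The nucleus is the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p.
    """
    # Temperature scaling: values below 1 sharpen the distribution
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Walk tokens from most to least likely until top_p mass is covered
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


if __name__ == "__main__":
    # Toy logits for a four-token vocabulary (illustrative values only)
    print(top_p_filter([2.0, 1.5, 0.2, -3.0]))
```

Lowering `top_p` shrinks the candidate set and makes answers more conservative; raising `temperature` flattens the distribution and makes them more varied.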

import json
import random


def split_dataset(json_file, train_ratio=0.8, val_ratio=0.1, test_ratio=0.1, seed=42):
    # Each line of the input file is one JSON record (JSON Lines)
    with open(json_file, 'r', encoding='utf-8') as f:
        data = [json.loads(line) for line in f]
    random.seed(seed)
    random.shuffle(data)
    n = len(data)
    train_end = int(n * train_ratio)
    val_end = int(n * (train_ratio + val_ratio))
    train_data = data[:train_end]
    val_data = data[train_end:val_end]
    # test_ratio is implicit: the test set is whatever remains
    test_data = data[val_end:]
    return train_data, val_data, test_data


def save_jsonl(filename, data):
    with open(filename, 'w', encoding='utf-8') as f:
        for item in data:
            f.write(json.dumps(item, ensure_ascii=False) + '\n')


train_data, val_data, test_data = split_dataset("train_100percent_sample.json")
save_jsonl("train.json", train_data)
save_jsonl("val.json", val_data)
save_jsonl("test.json", test_data)
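The split script calls `json.loads` on every line, so a blank or malformed line stops it with an exception. A light validation pass before splitting can surface such lines early. The sketch below assumes each record carries the `input` and `output` fields that the preprocessing code reads; the helper name `validate_jsonl` and the sample lines are invented for this illustration.

```python
import json


def validate_jsonl(lines, required=("input", "output")):
    """Return (valid_records, problems) for an iterable of JSONL lines."""
    records, problems = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # silently skip blank lines
        try:
            item = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(item, dict):
            problems.append((lineno, "record is not a JSON object"))
            continue
        # A field counts as missing if it is absent, not a string, or empty
        missing = [k for k in required
                   if not isinstance(item.get(k), str) or not item[k].strip()]
        if missing:
            problems.append((lineno, f"missing or empty fields: {missing}"))
        else:
            records.append(item)
    return records, problems


if __name__ == "__main__":
    sample = [
        '{"input": "What is the capital of China?", "output": "Beijing."}',
        '',
        '{"input": "No answer here"}',
        'not json at all',
    ]
    good, bad = validate_jsonl(sample)
    print(len(good), "valid record(s)")
    for lineno, reason in bad:
        print(f"line {lineno}: {reason}")
```

To check a real file, pass the open file handle directly, for example `validate_jsonl(open("train_100percent_sample.json", encoding="utf-8"))`.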

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "./models/ERNIE-4.5-0.3B-Base-PT"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

from datasets import load_dataset

dataset = load_dataset(
    "json",
    data_files={"train": "train.json", "validation": "val.json", "test": "test.json"},
)


def preprocess(example):
    prompt = example["input"]
    response = example["output"]
    prompt_ids = tokenizer(prompt, truncation=True, max_length=256, add_special_tokens=False)
    response_ids = tokenizer(response, truncation=True, max_length=256, add_special_tokens=False)
    input_ids = prompt_ids["input_ids"] + response_ids["input_ids"]
    attention_mask = [1] * len(input_ids)
    # Mask the prompt with -100 so the loss covers only the response tokens
    labels = [-100] * len(prompt_ids["input_ids"]) + response_ids["input_ids"]
    # Pad (or truncate) everything to a fixed length of 512
    pad_len = 512 - len(input_ids)
    if pad_len > 0:
        input_ids += [tokenizer.pad_token_id] * pad_len
        attention_mask += [0] * pad_len
        labels += [-100] * pad_len
    else:
        input_ids = input_ids[:512]
        attention_mask = attention_mask[:512]
        labels = labels[:512]
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }


tokenized_datasets = dataset.map(
    preprocess,
    batched=False,
    remove_columns=dataset["train"].column_names,
)
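The key idea in `preprocess` is the label mask: prompt tokens and padding get the label -100, which the loss ignores, so only response tokens are trained on. The toy sketch below reproduces that layout with made-up token ids and a short maximum length so the result can be read at a glance. The function name `build_example`, the ids, and the pad id of 0 are all invented for this illustration.

```python
IGNORE_INDEX = -100  # label value skipped by the cross-entropy loss


def build_example(prompt_ids, response_ids, pad_id, max_len):
    """Concatenate prompt and response, mask the prompt, then pad to max_len."""
    input_ids = (prompt_ids + response_ids)[:max_len]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_len]
    attention_mask = [1] * len(input_ids)
    pad_len = max_len - len(input_ids)  # never negative after truncation
    return {
        "input_ids": input_ids + [pad_id] * pad_len,
        "attention_mask": attention_mask + [0] * pad_len,
        "labels": labels + [IGNORE_INDEX] * pad_len,
    }


if __name__ == "__main__":
    example = build_example([11, 12, 13], [21, 22], pad_id=0, max_len=8)
    for key, value in example.items():
        print(f"{key:<15}{value}")
```

Only the positions holding 21 and 22 carry real labels, so only the answer contributes to the loss.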

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="/root/autodl-tmp/ernie4.5-QA3",
    per_device_train_batch_size=2,
    num_train_epochs=3,
    save_steps=100,
    logging_steps=10,
    learning_rate=2e-5,
    fp16=True,
    save_total_limit=1,
    # evaluation_strategy="epoch",  # evaluate once per epoch
    logging_dir="./logs",
    report_to="none",  # do not use wandb
)
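The checkpoint directories are named after the optimizer step count, so it is useful to estimate how many steps a run will take before starting it. The sketch below assumes a single device and no gradient accumulation, which matches the arguments above since neither is set. The function name and the dataset size of 500 are made up for illustration, not taken from the article's data.

```python
import math


def total_optimizer_steps(num_examples, per_device_batch_size, num_epochs, num_devices=1):
    """Estimate total training steps, assuming no gradient accumulation."""
    effective_batch = per_device_batch_size * num_devices
    # The last partial batch of each epoch still counts as one step
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * num_epochs


if __name__ == "__main__":
    # Hypothetical dataset size, with batch size 2 and 3 epochs as configured above
    print(total_optimizer_steps(num_examples=500, per_device_batch_size=2, num_epochs=3))
```

With `save_steps=100` and `save_total_limit=1`, a checkpoint is written every 100 steps and older ones are deleted, so only the most recent checkpoint stays on disk.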

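from transformers import DataCollatorForLanguageModeling, Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    # Note: with mlm=False this collator rebuilds labels from input_ids
    # (padding set to -100), so the prompt mask from preprocess is not kept
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    callbacks=[loss_recorder],  # loss-recording callback defined elsewhere in the script
)
trainer.train()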
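The `loss_recorder` callback passed to the Trainer is not shown in the article, and the plotting code later reads `train_loss`, `eval_loss` and `epochs` from an object called `loss_history`. One plausible shape for such a callback is sketched below. The class name `LossHistory` is an assumption, not the author's code; it relies on the `on_log` and `on_evaluate` hooks of `TrainerCallback`, and the import fallback exists only so the class can be tried without Transformers installed.

```python
try:
    from transformers import TrainerCallback
except ImportError:  # fallback so the sketch can run in isolation
    TrainerCallback = object


class LossHistory(TrainerCallback):
    """Collect training and validation loss as the Trainer reports them."""

    def __init__(self):
        self.train_loss = []  # one entry per logging step
        self.eval_loss = []   # one entry per evaluation
        self.epochs = []      # epoch value at each evaluation

    def on_log(self, args, state, control, logs=None, **kwargs):
        # Training logs carry the running loss under the "loss" key
        if logs and "loss" in logs:
            self.train_loss.append(logs["loss"])

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        # Evaluation metrics carry the validation loss under "eval_loss"
        if metrics and "eval_loss" in metrics:
            self.eval_loss.append(metrics["eval_loss"])
            self.epochs.append(state.epoch)


# Usage sketch: loss_recorder = LossHistory(), then Trainer(..., callbacks=[loss_recorder])
```

Validation loss is only recorded if evaluation actually runs, so the evaluation strategy that is commented out in the training arguments would need to be enabled for `eval_loss` to fill up.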

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/root/autodl-tmp/ernie4.5-QA/checkpoint-14750"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
model.eval()
model.to("cuda" if torch.cuda.is_available() else "cpu")


def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=256)
    input_ids = inputs["input_ids"]
    attention_mask = inputs["attention_mask"]
    # Ensure input_ids are 2D
    if input_ids.dim() == 1:
        input_ids = input_ids.unsqueeze(0)
    # Modify attention_mask to be 2D
    if attention_mask.dim() != 2:
        attention_mask = attention_mask.view(input_ids.shape[0], -1)
    input_ids = input_ids.to(model.device)
    attention_mask = attention_mask.to(model.device)
    with torch.no_grad():
        output = model.generate(
            input_ids=input_ids,
            attention_mask=attention_mask,
            max_new_tokens=128,
            do_sample=True,
            top_p=0.95,
            temperature=0.9,
            repetition_penalty=1.2,
            eos_token_id=tokenizer.eos_token_id if tokenizer.eos_token_id is not None else tokenizer.pad_token_id,
            pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
        )
    # Keep only the newly generated tokens, not the prompt
    generated_tokens = output[0][input_ids.shape[1]:]
    response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
    return response.strip()


if __name__ == "__main__":
    print("ERNIE 4.5 fine-tuned model console Q&A. Type exit or an empty line to quit.")
    while True:
        prompt = input("\nEnter a question:\n")
        if not prompt.strip() or prompt.strip().lower() == "exit":
            print("Exited.")
            break
        response = generate_response(prompt)
        print("\nModel answer:\n" + response)

split_dataset = raw_dataset.train_test_split(test_size=0.1, seed=42)
train_val = split_dataset['train']
test = split_dataset['test']

def preprocess(example):
    prompt = example["input"]
    response = example["output"]
    prompt_ids = tokenizer(prompt, truncation=True, max_length=256, add_special_tokens=False)
    response_ids = tokenizer(response, truncation=True, max_length=256, add_special_tokens=False)
    input_ids = prompt_ids["input_ids"] + response_ids["input_ids"]
    attention_mask = [1] * len(input_ids)
    labels = [-100] * len(prompt_ids["input_ids"]) + response_ids["input_ids"]
    pad_len = 512 - len(input_ids)
    if pad_len > 0:
        input_ids += [tokenizer.pad_token_id] * pad_len
        attention_mask += [0] * pad_len
        labels += [-100] * pad_len
    else:
        input_ids = input_ids[:512]
        attention_mask = attention_mask[:512]
        labels = labels[:512]
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }

import matplotlib.pyplot as plt

# loss_history holds the losses recorded by the training callback
plt.figure(figsize=(8, 5))
plt.plot(loss_history.train_loss, label="Train Loss")
plt.plot(loss_history.epochs, loss_history.eval_loss, label="Validation Loss")
plt.xlabel("Steps/Epochs")
plt.ylabel("Loss")
plt.legend()
plt.title("Training and Validation Loss")
plt.savefig("loss_curve.png")
plt.show()

import numpy as np
import sacrebleu
from rouge_score import rouge_scorer

# preds, refs and perplexity are computed earlier in the evaluation script
bleu = sacrebleu.corpus_bleu(preds, [refs])
bleu_score = bleu.score
print(f"BLEU: {bleu_score:.4f}")

scorer = rouge_scorer.RougeScorer(['rougeL'], use_stemmer=True)
rouge_l_scores = [scorer.score(ref, pred)['rougeL'].fmeasure for pred, ref in zip(preds, refs)]
rouge_l = np.mean(rouge_l_scores)
print(f"ROUGE-L: {rouge_l:.4f}")

print("\nEvaluation metrics:")
print(f"Perplexity: {perplexity:.2f}")
print(f"BLEU: {bleu_score:.4f}")
print(f"ROUGE-L: {rouge_l:.4f}")
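ROUGE-L is based on the longest common subsequence (LCS) between a reference and a prediction. The default tokenizer in `rouge_score` appears to be aimed at English text, so for Chinese answers it is worth cross-checking the score with a character-level version. The sketch below is a plain-Python, character-level ROUGE-L F1, offered as an independent sanity check rather than a reimplementation of `rouge_score`; the function names and example sentences are invented for this illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, prediction):
    """Character-level ROUGE-L F1 between two strings."""
    ref, pred = list(reference), list(prediction)
    if not ref or not pred:
        return 0.0
    lcs = lcs_length(ref, pred)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_l_f1("北京是中国的首都", "中国的首都是北京"))
```

Treating each Chinese character as a token avoids any dependence on word segmentation, at the cost of being somewhat more lenient than a word-level score.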

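Table of Contents

1. Introduction

1.1 Significance and Background of Open-Sourcing the Model

1.2 Overview of the Wenxin Yiyan Large Model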

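1.3 Evaluation Goals and Approach

2. The Wenxin Yiyan Large Model

2.1 Open-Source Overview of Wenxin Yiyan

2.2 Technical Overview of the Wenxin Yiyan Large Model

3. In-Depth Analysis of the Wenxin Yiyan Large Model

3.1 Open-Source Strategy and Ecosystem Impact

3.1.1 Release Timeline and Versions

3.2 Model Features and Advantages

4. Deployment in Practice: From Downloading the ERNIE-4.5-0.3B Model on GitCode to a Local Interactive Service

4.1 Environment Setup and Deployment Options

4.2 Download and Installation Steps

4.3 Usage Examples and Interface Notes

Writing the Deployment Test Script

5. Fine-Tuning the Model on a Public QA Dataset

5.1 Data Preparation

5.2 Fine-Tuning Workflow

5.2.1 Configuring the Environment and Installing Dependencies

5.2.2 Loading the Pretrained Model

5.2.3 Dataset Loading and Preprocessing

5.2.4 Configuring Training Parameters

5.2.5 Training and Fine-Tuning the Model

5.3 Testing the Results

5.4 Quantitative Analysis of Evaluation Results

6. Summary

6.1 Value of Open-Sourcing the Model 🚀

6.2 Suggestions for Further Use and Research 📌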

Metric	Score
Perplexity	2.12
BLEU	26.7288
ROUGE-L	0.4076
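Perplexity is the exponential of the mean per-token negative log-likelihood, so the figure in the table can be translated back into an average loss. The sketch below shows both directions with the standard library; the function name and the sample loss values are made up, and it assumes the article's perplexity was computed from the natural-log cross-entropy, as is usual for causal language models.

```python
import math


def perplexity_from_losses(token_losses):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(sum(token_losses) / len(token_losses))


if __name__ == "__main__":
    # Illustrative per-token losses, not values from the article's run
    print(f"{perplexity_from_losses([0.9, 0.7, 0.65]):.2f}")
    # Going the other way: the mean loss implied by a perplexity of 2.12
    print(f"{math.log(2.12):.4f}")
```

Under that assumption, a perplexity of 2.12 corresponds to an average loss of roughly 0.75 per token on the evaluation set.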
