A Practical Guide to AI Large Model Fundamentals and Deep Learning | 极客日志

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# Load the dataset
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()

# Inspect the array shapes
print(f"Training set shape: {train_images.shape}")
print(f"Test set shape: {test_images.shape}")

# Show the first 10 training images with their labels
plt.figure(figsize=(10, 5))
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.title(f"Label: {train_labels[i]}")
    plt.axis('off')
plt.show()

# Normalize: scale pixel values from [0, 255] to [0, 1]
train_images = train_images.astype('float32') / 255.0
test_images = test_images.astype('float32') / 255.0

# Reshape: add a channel dimension, giving (batch, height, width, channels).
# -1 infers the batch size instead of hard-coding 60000 and 10000.
train_images = train_images.reshape((-1, 28, 28, 1))
test_images = test_images.reshape((-1, 28, 28, 1))

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),  # reduce overfitting
    layers.Dense(10, activation='softmax')  # output probabilities for the 10 classes
])

# Print a summary of the network architecture
model.summary()
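The numbers that model.summary() reports can be checked by hand. The sketch below recomputes the output shape and parameter count of each layer in plain Python, assuming the Keras defaults that the model above relies on ('valid' padding and stride 1 for Conv2D, pool stride equal to the pool size). The helper names conv2d, max_pool, and dense are illustrative and are not Keras functions.

```python
# Hand-computed shapes and parameter counts for the Sequential model above.
# Assumes Keras defaults: 'valid' padding and stride 1 for Conv2D, and a
# pool stride equal to the pool size for MaxPooling2D.

def conv2d(shape, filters, kernel=3):
    """Return (output_shape, param_count) for a 'valid', stride-1 convolution."""
    h, w, c = shape
    out = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # weights + biases
    return out, params

def max_pool(shape, pool=2):
    """Return the output shape of max pooling; odd sizes are floored."""
    h, w, c = shape
    return (h // pool, w // pool, c)

def dense(units_in, units_out):
    """Return the parameter count of a fully connected layer."""
    return units_in * units_out + units_out  # weights + biases

shape = (28, 28, 1)
shape, p1 = conv2d(shape, 32)          # -> (26, 26, 32)
shape = max_pool(shape)                # -> (13, 13, 32)
shape, p2 = conv2d(shape, 64)          # -> (11, 11, 64)
shape = max_pool(shape)                # -> (5, 5, 64)
flat = shape[0] * shape[1] * shape[2]  # Flatten -> 1600 features
p3 = dense(flat, 128)
p4 = dense(128, 10)
total_params = p1 + p2 + p3 + p4

print(f"Flattened features: {flat}")
print(f"Params per layer: {p1}, {p2}, {p3}, {p4}")
print(f"Total params: {total_params}")
```

Most of the parameters sit in the first Dense layer, not in the convolutions. Flatten, MaxPooling2D, and Dropout contribute no trainable parameters.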

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(
    train_images, train_labels,
    epochs=5,
    batch_size=64,
    validation_split=0.1,
    verbose=1
)
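The fit() arguments above determine how many weight updates training performs. This short sketch works through the arithmetic, assuming the standard 60,000-image MNIST training set and Keras's behaviour of holding out the last fraction of the samples when validation_split is set.

```python
import math

num_samples = 60000      # size of the MNIST training set
validation_split = 0.1
batch_size = 64
epochs = 5

# With validation_split, the last 10% of the samples are held out for validation.
num_train = int(num_samples * (1 - validation_split))
num_val = num_samples - num_train

# The final batch of each epoch may be smaller than batch_size, hence ceil.
steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * epochs

print(f"Training samples: {num_train}, validation samples: {num_val}")
print(f"Steps per epoch: {steps_per_epoch}, total updates: {total_updates}")
```

The steps-per-epoch value should match the step count shown in the progress bar when verbose=1.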

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=0)
print(f'\nTest accuracy: {test_acc:.4f}')

import numpy as np

# Predict the first test image (slice with [:1] to keep the batch dimension)
predictions = model.predict(test_images[:1])
predicted_class = np.argmax(predictions[0])
print(f"Predicted: {predicted_class}, true label: {test_labels[0]}")
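Each row of predictions holds ten probabilities produced by the final softmax layer, and np.argmax picks the index of the largest one. The plain-Python sketch below illustrates both steps. The raw scores are hypothetical and do not come from the trained model, and the softmax and argmax helpers stand in for what Keras and NumPy do internally.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities; subtract the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value, a plain-Python analogue of np.argmax."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical raw scores for the digits 0-9 (not produced by the trained model).
logits = [0.1, 0.3, 0.2, 4.0, 0.0, 1.2, 0.1, 0.5, 0.4, 0.2]
probs = softmax(logits)
predicted_class = argmax(probs)

print(f"Probabilities sum to {sum(probs):.4f}")
print(f"Predicted class: {predicted_class}")
```

Softmax preserves the ordering of the scores, so taking the argmax of the probabilities gives the same class as taking the argmax of the raw scores.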

# Save the model (the .h5 extension selects the HDF5 format)
model.save('mnist_cnn_model.h5')
print("Model saved")

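I. Basic Concepts of Deep Learning

1.1 Core Principles

1.2 Deep Learning Frameworks

1.3 Classic Model Architectures

II. A Classic Introductory Demo

2.1 A Brief Overview of How Deep Learning Works

2.2 Handwritten Digit Recognition in Practice (MNIST)

Environment Setup

1. Data Preparation

2. Building the Neural Network Model

3. Compiling the Model

4. Training the Model

5. Prediction and Saving

III. Going Further: From CNNs to Large Models

3.1 Transformers and Self-Attention

3.2 Characteristics of Large Models

3.3 Best-Practice Recommendations

IV. Summary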
