2024-09-27


Recognizing MNIST with TensorFlow

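This post classifies the MNIST digits with Softmax, implemented as a simple feed-forward neural network. The goals are to:
(1). Get a feel for what algorithm engineering work involves.
(2). Learn the principles and usage of the TensorFlow framework.
(3). Become familiar with Softmax and simple network models.
(4). Become familiar with the Python tooling around TensorFlow.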


For the full implementation, see: kaggle-digit-rec-tensorflow:ipynb

TensorFlow Version: 2.11.0

The book uses python2.7 and tensorflow 1.14.0, so usage differs in places (the differences are worth noting).

1.1 import

Import the dependencies

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

from scipy.io import loadmat
import matplotlib.pyplot as plt  # for displaying images

import tensorflow as tf

from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

1.2 Downloading the data

With TF 2.x, the data can also be downloaded directly.

# Load the raw data.
# mnist['data'] is a 2-D array of shape 784 x 70000, i.e. 70000 samples, each a 784-dimensional grayscale image.
mnist = loadmat("/kaggle/input/mnist-original/mnist-original.mat")

# mnist['label'] is a 2-D array of shape 1 x 70000.
mnist['label']

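1.3 Inspecting the data

# Check the shape of the data
images = mnist['data']
print(images.shape)
images

# Display a single image; samples are stored as columns, so index the second axis
plt.imshow(images[:, 20000].reshape(28,28))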

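1.4 Preprocessing the data

print(images.shape)
# Transpose so that each row is one sample: (784, 70000) -> (70000, 784)
mnistImages = images.T
print(mnistImages.shape)
# Normalize the pixel values to the range [0, 1]
mnistImages = mnistImages / 255.0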
#images[60000]

labels = mnist['label'].squeeze()
print(labels.shape)

# Inspect one sample and its label
plt.imshow(mnistImages[20100].reshape(28,28))
print(labels[20100])

# One-hot encode the labels, because the categorical_crossentropy loss used below expects one-hot targets
model_labels = tf.keras.utils.to_categorical(labels, num_classes=10)
print(model_labels.shape)
model_labels[30000]
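As a small illustration of what the one-hot step produces, the sketch below reproduces it with plain NumPy on three made-up labels. The `sample_labels` array is hypothetical and not taken from the dataset; it only shows the shape of the result.

```python
import numpy as np

# Hypothetical labels (floats are cast to integers before indexing)
sample_labels = np.array([3.0, 0.0, 9.0])

# Row i of the 10x10 identity matrix is the one-hot vector for class i
one_hot = np.eye(10, dtype=np.float32)[sample_labels.astype(int)]

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # a single 1 at index 3, zeros elsewhere
```

If integer labels are kept instead, Keras also provides the `sparse_categorical_crossentropy` loss, which takes class indices directly.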

Creating the TensorFlow datasets

# Split into a training set (first 60000 samples) and a test set (the rest)
train_images = mnistImages[:60000]
train_labels = model_labels[:60000]
test_images = mnistImages[60000:]
test_labels = model_labels[60000:]

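# Shuffle the training set so each batch mixes classes (the samples in the .mat file may be ordered by label)
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels)).shuffle(60000).batch(32)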
test_dataset = tf.data.Dataset.from_tensor_slices((test_images, test_labels)).batch(32)

print(train_images.shape)
print(train_labels.shape)
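To see what `from_tensor_slices(...).batch(...)` yields, here is a minimal sketch on hypothetical toy data (10 all-zero "images" and a batch size of 4, chosen only to keep the output short). Each element of the dataset is an `(images, labels)` pair, and the last batch is smaller when the sample count is not a multiple of the batch size.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 10 samples with 784 features and one-hot labels for 10 classes
toy_images = np.zeros((10, 784), dtype=np.float32)
toy_labels = np.eye(10, dtype=np.float32)

toy_dataset = tf.data.Dataset.from_tensor_slices((toy_images, toy_labels)).batch(4)

# Collect the number of samples in each batch
batch_shapes = [(x.shape[0], y.shape[0]) for x, y in toy_dataset]
print(batch_shapes)  # [(4, 4), (4, 4), (2, 2)]
```

By the same arithmetic, 60000 training samples with a batch size of 32 give 1875 batches per epoch.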

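1.5 Building and training the model

Build a simple sequential model

# Build a simple sequential model
model = Sequential([
    #Flatten(input_shape=(784,)),  # flattens a 28x28 image into a 784-dimensional vector (unused: the inputs are already flat)
    Dense(128, activation='relu', input_shape=(784,)), # first hidden layer: 128 neurons, ReLU activation
    Dense(64, activation='relu'),  # second hidden layer: 64 neurons, ReLU activation
    Dense(10, activation='softmax') # output layer: 10 neurons (one per class), softmax activation
])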

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # categorical cross-entropy as the loss function
              metrics=['accuracy'])  # track accuracy during training and testing

# Print the model summary
model.summary()
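The parameter counts that `model.summary()` reports can be checked by hand: a Dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A minimal sketch for the 784, 128, 64, 10 layer sizes used here (the helper name `dense_params` is hypothetical):

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

# Input size followed by the units of each Dense layer in the model
layer_sizes = [784, 128, 64, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [100480, 8256, 650]
print(total)      # 109386
```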

# Train the model; the dataset is already batched, so batch_size is not passed to fit()
model.fit(train_dataset, epochs=5)
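For intuition about the loss being minimized, the per-sample categorical cross-entropy is the negative log of the probability assigned to the true class. The sketch below is a plain NumPy version of that formula under simple assumptions (a small clipping constant `eps`, made-up predictions); it is not the exact Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: -sum_k y_true[k] * log(y_pred[k])."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Hypothetical 3-class example: the true class is index 2
y_true = np.array([0.0, 0.0, 1.0])
good_pred = np.array([0.1, 0.2, 0.7])
bad_pred = np.array([0.7, 0.2, 0.1])

good_loss = categorical_crossentropy(y_true, good_pred)
bad_loss = categorical_crossentropy(y_true, bad_pred)
print(good_loss)  # -log(0.7), about 0.357
print(bad_loss)   # -log(0.1), about 2.303
```

The loss grows as the probability assigned to the true class shrinks, which is what pushes the model toward confident, correct predictions.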

1.6 Evaluating the model


# Evaluate the model on the test set
loss, accuracy = model.evaluate(test_dataset)
print(f"Test accuracy: {accuracy}")

data_index = 10108
single_data = mnistImages[data_index]
print(single_data.shape)
print(labels[data_index])
# The model expects a batch of row vectors, shaped like the training data.
prediction = model.predict(single_data.reshape(1,784))
# Print without scientific notation
np.set_printoptions(suppress=True)
result = np.array(prediction[0])
print(result)

# Index of the class with the highest predicted probability
predicted_digit = np.argmax(prediction[0])
# Print the predicted digit
print(f"Predicted digit: {predicted_digit}")
plt.imshow(mnistImages[data_index].reshape(28,28))

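Principles

Training

Background: Softmax regression (a linear multi-class classification model).

Purpose of Softmax: convert a vector of scores into probabilities that each lie between 0 and 1 and sum to 1.

Let x be the features of a single sample, and let W and b be the parameters of the Softmax model.

On samples: here one record is one sample, and it contains one feature vector. (In a business setting, could a sample also be represented by an id?)

In MNIST, x is the input image, a 784-dimensional vector, and W is a matrix
of shape (784,10). b is a 10-dimensional vector, where 10 is the number of classes.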

1.2.1 Softmax

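Softmax regression is a linear multi-class classification model. It is derived directly from the Logistic regression model.
The difference is that Logistic regression is a two-class model, while Softmax regression handles multiple classes.

What is regression?
Regression is an important analysis method in statistics and machine learning. It models the relationship between variables and is typically used to predict the value of a continuous variable...
See: Basic Concepts and Algorithms of Machine Learning

What is logistic regression?
(Important) In a binary classification problem, the logistic function converts the output of a linear combination into a value between 0 and 1 that describes the probability of an event occurring.

Logistic regression formula: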
$ h_\theta(x) = \frac{1}{1 + e^{-\theta^Tx}} $

Softmax regression formula:
$ Y_k = \frac{e^{\text{Logit}_k}}{\sum_{j=1}^{K} e^{\text{Logit}_j}} $
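
To make the link between the two formulas concrete, here is a short derivation, writing the two scores as $\text{Logit}_1$ and $\text{Logit}_2$. With K = 2 classes, dividing the numerator and denominator of the Softmax formula by $e^{\text{Logit}_1}$ gives the logistic function applied to the difference of the two scores:

$ Y_1 = \frac{e^{\text{Logit}_1}}{e^{\text{Logit}_1} + e^{\text{Logit}_2}} = \frac{1}{1 + e^{-(\text{Logit}_1 - \text{Logit}_2)}} $

This matches $h_\theta(x)$ with $\theta^T x = \text{Logit}_1 - \text{Logit}_2$.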

1.3 Classifying MNIST with Softmax

1.3.1 Computing the Logit

(a simple linear model)

The Logit of each class is computed with the following formula:
$ \text{Logit} = z = W^T x + b $


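W is the weight matrix.
x is the input feature vector; it may also be a feature representation produced by earlier network layers.
b is a bias vector that provides a learnable bias term for each class.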
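
The shapes in the Logit formula can be checked with a short NumPy sketch. The parameter values below are hypothetical (small random weights, zero biases); only the shapes (784,10) for W and 10 for b come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: small random weights and zero biases
W = rng.normal(scale=0.01, size=(784, 10))
b = np.zeros(10)

# One sample: a 784-dimensional feature vector
x = rng.random(784)
logits = W.T @ x + b
print(logits.shape)  # (10,)

# A batch of samples stored as rows: X @ W computes the same scores for every row at once
X = rng.random((5, 784))
batch_logits = X @ W + b
print(batch_logits.shape)  # (5, 10)
```

The batched form `X @ W + b` is what a Dense layer computes before its activation is applied.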

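1.3.2 Applying the Softmax function

(Activation function: the Softmax function converts the linear combination into a probability distribution over the classes.)

The Softmax function converts the Logits into a probability for each class.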
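
A minimal NumPy sketch of this step, using made-up Logit values. Subtracting the maximum before exponentiating is a common trick for numerical stability; it does not change the result, because the shift cancels in the ratio.

```python
import numpy as np

def softmax(z):
    """Convert logits into probabilities along the last axis."""
    shifted = z - np.max(z, axis=-1, keepdims=True)  # guard against overflow in exp
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z, axis=-1, keepdims=True)

# Hypothetical logits for a 3-class problem
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)

print(probs)             # three values between 0 and 1
print(probs.sum())       # sums to 1 (up to floating-point rounding)
print(np.argmax(probs))  # 0, the class with the largest logit
```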

1.3.3 Softmax regression in TensorFlow

For the code, see: Kaggle:digit-rec-tensowflow
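
For orientation, plain Softmax regression can be sketched in Keras as a single Dense layer with a softmax activation and no hidden layers. This is an illustrative sketch, not the linked notebook's code: the random `x_fake` and `y_fake` arrays are hypothetical stand-ins for the MNIST data, and the `sgd` optimizer is an arbitrary choice.

```python
import numpy as np
import tensorflow as tf

# Softmax regression: one linear layer (W of shape (784, 10), b of shape (10,)) plus softmax
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Hypothetical stand-in data: 64 random "images" with random one-hot labels
rng = np.random.default_rng(0)
x_fake = rng.random((64, 784)).astype("float32")
y_fake = tf.keras.utils.to_categorical(rng.integers(0, 10, size=64), num_classes=10)

model.fit(x_fake, y_fake, epochs=1, batch_size=32, verbose=0)

# Each row of the output is a probability distribution over the 10 classes
probs = model.predict(x_fake[:5], verbose=0)
print(probs.shape)  # (5, 10)
```

The model has 784 * 10 + 10 = 7850 trainable parameters, which are exactly W and b from the formulas above.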

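todo: try other algorithms, evaluate the results of each, and compare their performance

Summarize the workflow the algorithms have in common? The principles and usage of TensorFlow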