PyTorch Hands-On: Handwritten Digit Recognition with LeNet

In the two posts "MXNet/Gluon Hands-On: Handwritten Digit Recognition with LeNet" and "PyTorch Hands-On: Handwritten Digit Recognition with LeNet", I build the LeNet network with MXNet and with PyTorch, train it on the MNIST dataset for handwritten digit recognition, and compare how the two frameworks feel in practice. This post covers the PyTorch part.

A few PyTorch APIs to know before we start:

  1. torch.Tensor.view
    Returns a new tensor with the same data but a different shape.
    >>> x = x.view(x.size()[0], -1)

  2. torch.max
    Returns the maximum values along a given dimension, together with the positions (indices) where they occur.
    >>> _, predicted = torch.max(outputs.data, 1)

  3. torch.Tensor.item
    Returns the tensor's value as a standard Python number. Only valid for tensors with exactly one element.
    >>> x = torch.tensor([1.0])
    >>> x.item()

  4. import torchvision.transforms as transforms
    This is PyTorch's image preprocessing package; Compose is typically used to chain several transform steps together:
    >>> transforms.Compose([transforms.CenterCrop(10), transforms.ToTensor()])

  5. transforms.ToTensor()
    Converts a PIL.Image or numpy.ndarray of shape (H x W x C) with pixel values in [0, 255] into a torch.FloatTensor of shape (C x H x W) with values in [0.0, 1.0].
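The first three of these can be checked with a tiny sketch (the tensor values are illustrative, not from the tutorial):

```python
import torch

x = torch.arange(6.0)              # tensor([0., 1., 2., 3., 4., 5.])
x = x.view(2, 3)                   # same data, new shape (2, 3)

values, indices = torch.max(x, 1)  # row-wise max values and their positions
# values  -> tensor([2., 5.])
# indices -> tensor([2, 2])

scalar = torch.tensor([1.0]).item()  # a plain Python float, 1.0
```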

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as transforms
import torchvision
import torch.optim as optim
import time

First, we build the LeNet network.

class LeNet(nn.Module):
    def __init__(self):
        super().__init__()
        # conv1: 1x28x28 -> 6x28x28 (padding=2) -> 6x14x14 after pooling
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 6, 5, 1, 2),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )

        # conv2: 6x14x14 -> 16x10x10 -> 16x5x5 after pooling
        self.conv2 = nn.Sequential(
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )

        # classifier: 16*5*5 = 400 flattened features -> 10 classes
        self.fc = nn.Sequential(
            nn.Linear(400, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, 10),
        )

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size()[0], -1)  # flatten to (batch, 400)
        x = self.fc(x)
        return x
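The 400 in `nn.Linear(400, 120)` is not arbitrary: it is the flattened feature size after the two conv/pool stages for a 28×28 MNIST input. A quick pure-Python check using the standard output-size formula, floor((in + 2·pad − kernel) / stride) + 1:

```python
def out_size(size, kernel, stride=1, pad=0):
    # standard conv / max-pool output-size formula
    return (size + 2 * pad - kernel) // stride + 1

size = 28
size = out_size(size, kernel=5, stride=1, pad=2)  # conv1: 28 -> 28
size = out_size(size, kernel=2, stride=2)         # pool1: 28 -> 14
size = out_size(size, kernel=5)                   # conv2: 14 -> 10
size = out_size(size, kernel=2, stride=2)         # pool2: 10 -> 5

flat = 16 * size * size  # 16 channels * 5 * 5 = 400
```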

Set the number of epochs, the batch size, and the learning rate.

epoch_nums = 10
batch_size = 16
lr = 0.01

Load the MNIST dataset.

transform = transforms.ToTensor()

train_set = torchvision.datasets.MNIST(
    root='./data/',
    train=True,
    transform=transform,
    download=True,
)

train_loader = torch.utils.data.DataLoader(
    dataset=train_set,
    batch_size=batch_size,
    shuffle=True,
)

test_set = torchvision.datasets.MNIST(
    root='./data/',
    train=False,
    download=True,
    transform=transform,
)

test_loader = torch.utils.data.DataLoader(
    dataset=test_set,
    batch_size=batch_size,
    shuffle=True,
)
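Each iteration of a DataLoader yields an (images, labels) batch. The shapes involved can be checked without downloading MNIST by batching random tensors of the same shape, as a stand-in for the real dataset:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# stand-in for MNIST: 100 random 1x28x28 "images" with labels in 0..9
fake_set = TensorDataset(torch.rand(100, 1, 28, 28),
                         torch.randint(0, 10, (100,)))
fake_loader = DataLoader(fake_set, batch_size=16, shuffle=True)

X, y = next(iter(fake_loader))
# X: (16, 1, 28, 28), y: (16,) -- the same shapes the training loop sees
```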

Compute on the GPU if one is available, otherwise fall back to the CPU.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device  # device(type='cuda') on a CUDA-capable machine

Instantiate the network, the loss function, and the optimizer.

net = LeNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=lr, momentum=0.9)
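Note that `nn.CrossEntropyLoss` expects raw logits (it applies log-softmax internally) and integer class indices, not one-hot vectors, which is why the network's last layer has no softmax. A minimal check with made-up numbers:

```python
import math
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 0.5, 0.1]])  # one sample, three classes
target = torch.tensor([0])                # true class index

loss = nn.CrossEntropyLoss()(logits, target)

# same value by hand: -log(softmax(logits)[true class])
exps = [math.exp(v) for v in (2.0, 0.5, 0.1)]
by_hand = -math.log(exps[0] / sum(exps))
```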

Start training.

for epoch in range(epoch_nums):
    sum_loss = 0.0
    start_time = time.time()
    for i, data in enumerate(train_loader):
        X, y = data
        X, y = X.to(device), y.to(device)
        # zero the gradients accumulated by the optimizer
        optimizer.zero_grad()

        outputs = net(X)
        loss = criterion(outputs, y)
        loss.backward()
        optimizer.step()
        sum_loss += loss.item()

    with torch.no_grad():
        test_acc = 0.0
        total = 0

        for data in test_loader:
            X, y = data
            X, y = X.to(device), y.to(device)
            outputs = net(X)
            _, predicted = torch.max(outputs.data, 1)
            total += y.shape[0]
            test_acc += (predicted == y).sum()

    print('epoch: %d, train loss: %.03f, test acc: %.03f, time %.1f sec'
          % (epoch + 1, sum_loss / len(train_loader), test_acc.item() / total, time.time() - start_time))
epoch: 1, train loss: 0.246, test acc: 0.975, time 8.7 sec
epoch: 2, train loss: 0.062, test acc: 0.987, time 8.5 sec
epoch: 3, train loss: 0.043, test acc: 0.984, time 8.5 sec
epoch: 4, train loss: 0.035, test acc: 0.986, time 8.7 sec
epoch: 5, train loss: 0.029, test acc: 0.987, time 8.6 sec
epoch: 6, train loss: 0.024, test acc: 0.985, time 8.6 sec
epoch: 7, train loss: 0.020, test acc: 0.988, time 8.7 sec
epoch: 8, train loss: 0.019, test acc: 0.988, time 8.6 sec
epoch: 9, train loss: 0.017, test acc: 0.989, time 8.5 sec
epoch: 10, train loss: 0.015, test acc: 0.986, time 8.4 sec

Save the whole network, or just its parameters.

torch.save(net.state_dict(), './pytorch_lenet_minist_parameter_%d' % epoch_nums)
torch.save(net, './pytorch_lenet_minist_%d' % epoch_nums)
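The difference matters when loading: a saved `state_dict` requires rebuilding the model from its class first, while `torch.save(net, ...)` pickles the whole object. A self-contained round-trip sketch (using a small `nn.Linear` and a temporary file instead of the LeNet checkpoint above):

```python
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), 'params.pt')
torch.save(model.state_dict(), path)  # save parameters only

restored = nn.Linear(4, 2)            # must rebuild the architecture first
restored.load_state_dict(torch.load(path))
# restored now has identical weights and biases to model
```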