PyTorch and CNN Image Classification
PyTorch is an open-source Python machine learning library based on Torch, used for applications such as natural language processing. It is developed primarily by Facebook's artificial intelligence group. It provides strong GPU acceleration and supports dynamic neural networks, something that, at the time, many mainstream frameworks such as TensorFlow did not support. PyTorch offers two high-level features:
1. Tensor computation with strong GPU acceleration (similar to NumPy)
2. Deep neural networks built on an automatic differentiation (autograd) system.
Besides Facebook, organizations such as Twitter, CMU, and Salesforce have adopted PyTorch.
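As a quick illustration of these two features, the minimal sketch below (independent of the CIFAR-10 example that follows) multiplies two tensors on the GPU when one is available and uses autograd to compute a gradient:

import torch

# tensor computation, optionally on the GPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
a = torch.randn(3, 3, device=device)
b = torch.randn(3, 3, device=device)
print(a @ b)                  # matrix multiplication, NumPy-style

# automatic differentiation
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x            # y = x^2 + 3x
y.backward()                  # compute dy/dx
print(x.grad)                 # tensor(7.) since dy/dx = 2x + 3 = 7 at x = 2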
This article uses the CIFAR-10 dataset for image classification. The dataset consists of small color images divided into ten classes. Some example images are shown below:

Checking whether a GPU is available
The images in the dataset are 32x32x3. It is best to use a GPU to speed up training.
import torch
import numpy as np

# check if CUDA (GPU) is available
train_on_gpu = torch.cuda.is_available()

if not train_on_gpu:
    print('CUDA is not available.')
else:
    print('CUDA is available!')
Result:
CUDA is available!
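An equivalent, device-agnostic pattern is to build a torch.device object once and move the model and tensors to it with .to(device); a minimal sketch:

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)   # prints "cuda" or "cpu"
# later: model.to(device); data, target = data.to(device), target.to(device)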
Loading the data
Downloading the dataset may take a while; please be patient. Load the training and test data, split the training data into training and validation sets, and then create a DataLoader for each of the three datasets.
from torchvision import datasets
import torchvision.transforms as transforms
from torch.utils.data.sampler import SubsetRandomSampler

# number of subprocesses to use for data loading
num_workers = 0
# load 16 images per batch
batch_size = 16
# percentage of training set to use as validation
valid_size = 0.2

# convert data to torch.FloatTensor and normalize it
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# choose the training and test datasets
train_data = datasets.CIFAR10('data', train=True,
                              download=True, transform=transform)
test_data = datasets.CIFAR10('data', train=False,
                             download=True, transform=transform)

# obtain training indices that will be used for validation
num_train = len(train_data)
indices = list(range(num_train))
np.random.shuffle(indices)
split = int(np.floor(valid_size * num_train))
train_idx, valid_idx = indices[split:], indices[:split]

# define samplers for obtaining training and validation batches
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)

# prepare data loaders (combine dataset and sampler)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           sampler=train_sampler, num_workers=num_workers)
valid_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size,
                                           sampler=valid_sampler, num_workers=num_workers)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size,
                                          num_workers=num_workers)

# the 10 image classes
classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']
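As an aside, the same 80/20 split can also be written with torch.utils.data.random_split instead of shuffled indices plus SubsetRandomSampler; a hedged alternative sketch (the *_alt loader names are made up and not used later in this article):

from torch.utils.data import DataLoader, random_split

n_valid = int(len(train_data) * valid_size)        # 20% of the training set
n_train = len(train_data) - n_valid
train_subset, valid_subset = random_split(train_data, [n_train, n_valid])

train_loader_alt = DataLoader(train_subset, batch_size=batch_size,
                              shuffle=True, num_workers=num_workers)
valid_loader_alt = DataLoader(valid_subset, batch_size=batch_size,
                              num_workers=num_workers)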
Viewing a batch of training samples
import matplotlib.pyplot as plt
%matplotlib inline

# helper function to un-normalize and display an image
def imshow(img):
    img = img / 2 + 0.5  # unnormalize
    plt.imshow(np.transpose(img, (1, 2, 0)))  # convert from Tensor image

# obtain one batch of training images
dataiter = iter(train_loader)
images, labels = next(dataiter)
images = images.numpy()  # convert images to numpy for display

# plot the images, using the class names as titles
fig = plt.figure(figsize=(25, 4))
# display 16 images
for idx in np.arange(16):
    ax = fig.add_subplot(2, 16 // 2, idx + 1, xticks=[], yticks=[])
    imshow(images[idx])
    ax.set_title(classes[labels[idx]])
Result:

Viewing more detail in a single image
The images have been normalized here. The red, green, and blue (RGB) color channels can be viewed as three separate grayscale images.
rgb_img = np.squeeze(images[3])
channels = ['red channel', 'green channel', 'blue channel']

fig = plt.figure(figsize=(36, 36))
for idx in np.arange(rgb_img.shape[0]):
    ax = fig.add_subplot(1, 3, idx + 1)
    img = rgb_img[idx]
    ax.imshow(img, cmap='gray')
    ax.set_title(channels[idx])
    width, height = img.shape
    thresh = img.max() / 2.5
    for x in range(width):
        for y in range(height):
            val = round(img[x][y], 2) if img[x][y] != 0 else 0
            ax.annotate(str(val), xy=(y, x),
                        horizontalalignment='center',
                        verticalalignment='center', size=8,
                        color='white' if img[x][y] < thresh else 'black')
Result:

Defining the structure of the convolutional neural network
Here we define the structure of the CNN. It includes the following:
- Convolutional layers: these can be thought of as applying several filters to the image (the operation commonly called convolution) to extract image features.
- In PyTorch we usually define a convolutional layer with nn.Conv2d, specifying the following arguments (a small shape-checking sketch follows after this list):
nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0)

(Figure: convolution with a 3x3 window and stride 1)
  - in_channels is the input depth. For a grayscale image the depth is 1.
  - out_channels is the output depth, i.e. the number of filtered images you want to obtain.
  - kernel_size is the size of the convolution kernel (usually 3, meaning a 3x3 kernel).
  - stride and padding have default values, but you should set them according to the spatial (x, y) size you want the output to have.
- Pooling layers: max pooling is used here, which takes the maximum pixel value inside a window of a given size.
  - In a 2x2 window, the maximum of the four values is kept.
  - Because max pooling is good at picking out salient features such as image edges, it suits image classification tasks.
  - A max pooling layer usually follows a convolutional layer and shrinks the x-y dimensions of its input.
- The usual linear + dropout layers help avoid overfitting and produce the 10-class output.
(Figure: a neural network with two convolutional layers)
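To make the shape bookkeeping concrete before defining the full network, the small sketch below pushes a dummy CIFAR-10-sized batch through one Conv2d + MaxPool2d pair (the layer sizes match the first stage of the network defined later):

import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)           # one fake 32x32 RGB image
conv = nn.Conv2d(3, 16, 3, padding=1)   # 3x3 kernel; padding=1 keeps 32x32
pool = nn.MaxPool2d(2, 2)               # 2x2 max pooling halves height/width

print(conv(x).shape)         # torch.Size([1, 16, 32, 32])
print(pool(conv(x)).shape)   # torch.Size([1, 16, 16, 16])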
Output size of a convolutional layer
To compute the output size of a given convolutional layer, we can do the following calculation.
Assume the input size is (H, W), the filter size is (FH, FW), the output size is (OH, OW), the padding is P, and the stride is S. The output size is then given by the formula below.
OH = (H + 2P - FH) / S + 1
OW = (W + 2P - FW) / S + 1
Example: with input size (H=7, W=7), filter size (FH=3, FW=3), padding P=0, and stride S=1, the output size is (OH=5, OW=5). With S=2, the output size becomes (OH=3, OW=3).
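The formula can be checked in a couple of lines of Python (conv_output_size is a hypothetical helper written only to verify the numbers above):

def conv_output_size(H, W, FH, FW, P=0, S=1):
    OH = (H + 2 * P - FH) // S + 1
    OW = (W + 2 * P - FW) // S + 1
    return OH, OW

print(conv_output_size(7, 7, 3, 3, P=0, S=1))   # (5, 5)
print(conv_output_size(7, 7, 3, 3, P=0, S=2))   # (3, 3)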
import torch.nn as nn
import torch.nn.functional as F

# define the CNN architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # convolutional layer (sees a 32x32x3 image tensor)
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        # convolutional layer (sees a 16x16x16 tensor)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        # convolutional layer (sees an 8x8x32 tensor)
        self.conv3 = nn.Conv2d(32, 64, 3, padding=1)
        # max pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # linear layer (64 * 4 * 4 -> 500)
        self.fc1 = nn.Linear(64 * 4 * 4, 500)
        # linear layer (500 -> 10)
        self.fc2 = nn.Linear(500, 10)
        # dropout layer (p=0.3)
        self.dropout = nn.Dropout(0.3)

    def forward(self, x):
        # add sequence of convolutional and max pooling layers
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        # flatten image input
        x = x.view(-1, 64 * 4 * 4)
        # add dropout layer
        x = self.dropout(x)
        # add 1st hidden layer, with relu activation function
        x = F.relu(self.fc1(x))
        # add dropout layer
        x = self.dropout(x)
        # add final output layer (raw class scores)
        x = self.fc2(x)
        return x

# create a complete CNN
model = Net()
print(model)

# move the model to the GPU if available
if train_on_gpu:
    model.cuda()
Result:
Net(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=1024, out_features=500, bias=True)
  (fc2): Linear(in_features=500, out_features=10, bias=True)
  (dropout): Dropout(p=0.3, inplace=False)
)
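As a quick sanity check, you can also count the trainable parameters of this model (most of them sit in fc1); a minimal sketch:

n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print('Trainable parameters:', n_params)   # 541,094 for the network above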
Choosing a loss function and an optimizer
import torch.optim as optim
# use cross-entropy loss
criterion = nn.CrossEntropyLoss()
# use stochastic gradient descent with learning rate lr=0.01
optimizer = optim.SGD(model.parameters(), lr=0.01)
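Plain SGD works here, but SGD with momentum or Adam are common alternatives that often converge faster; a hedged sketch of the variants (these are assumptions and were not used to produce the results reported below):

# alternative optimizers, shown for reference only
optimizer_sgd_momentum = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer_adam = optim.Adam(model.parameters(), lr=0.001)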
Training the convolutional neural network model
Note how the training and validation losses decrease over time; if the validation loss keeps increasing, that indicates possible overfitting. (In fact, in the example below, if n_epochs is set to 40 you can observe overfitting!)
# number of epochs to train the model
n_epochs = 30

valid_loss_min = np.Inf  # track change in validation loss

for epoch in range(1, n_epochs + 1):

    # keep track of training and validation loss
    train_loss = 0.0
    valid_loss = 0.0

    ###################
    # train the model #
    ###################
    model.train()
    for data, target in train_loader:
        # move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
        # update training loss
        train_loss += loss.item() * data.size(0)

    ######################
    # validate the model #
    ######################
    model.eval()
    for data, target in valid_loader:
        # move tensors to GPU if CUDA is available
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # update average validation loss
        valid_loss += loss.item() * data.size(0)

    # calculate average losses
    train_loss = train_loss / len(train_loader.sampler)
    valid_loss = valid_loss / len(valid_loader.sampler)

    # print training and validation losses
    print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
        epoch, train_loss, valid_loss))

    # save the model if the validation loss has decreased
    if valid_loss <= valid_loss_min:
        print('Validation loss decreased ({:.6f} --> {:.6f}). Saving model ...'.format(
            valid_loss_min,
            valid_loss))
        torch.save(model.state_dict(), 'model_cifar.pt')
        valid_loss_min = valid_loss
Result:
Epoch: 1    Training Loss: 2.065666    Validation Loss: 1.706993
Validation loss decreased (inf --> 1.706993). Saving model ...
Epoch: 2    Training Loss: 1.609919    Validation Loss: 1.451288
Validation loss decreased (1.706993 --> 1.451288). Saving model ...
Epoch: 3    Training Loss: 1.426175    Validation Loss: 1.294594
Validation loss decreased (1.451288 --> 1.294594). Saving model ...
Epoch: 4    Training Loss: 1.307891    Validation Loss: 1.182497
Validation loss decreased (1.294594 --> 1.182497). Saving model ...
Epoch: 5    Training Loss: 1.200655    Validation Loss: 1.118825
Validation loss decreased (1.182497 --> 1.118825). Saving model ...
Epoch: 6    Training Loss: 1.115498    Validation Loss: 1.041203
Validation loss decreased (1.118825 --> 1.041203). Saving model ...
Epoch: 7    Training Loss: 1.047874    Validation Loss: 1.020686
Validation loss decreased (1.041203 --> 1.020686). Saving model ...
Epoch: 8    Training Loss: 0.991542    Validation Loss: 0.936289
Validation loss decreased (1.020686 --> 0.936289). Saving model ...
Epoch: 9    Training Loss: 0.942437    Validation Loss: 0.892730
Validation loss decreased (0.936289 --> 0.892730). Saving model ...
Epoch: 10    Training Loss: 0.894279    Validation Loss: 0.875833
Validation loss decreased (0.892730 --> 0.875833). Saving model ...
Epoch: 11    Training Loss: 0.859178    Validation Loss: 0.838847
Validation loss decreased (0.875833 --> 0.838847). Saving model ...
Epoch: 12    Training Loss: 0.822664    Validation Loss: 0.823634
Validation loss decreased (0.838847 --> 0.823634). Saving model ...
Epoch: 13    Training Loss: 0.787049    Validation Loss: 0.802566
Validation loss decreased (0.823634 --> 0.802566). Saving model ...
Epoch: 14    Training Loss: 0.749585    Validation Loss: 0.785852
Validation loss decreased (0.802566 --> 0.785852). Saving model ...
Epoch: 15    Training Loss: 0.721540    Validation Loss: 0.772729
Validation loss decreased (0.785852 --> 0.772729). Saving model ...
Epoch: 16    Training Loss: 0.689508    Validation Loss: 0.768470
Validation loss decreased (0.772729 --> 0.768470). Saving model ...
Epoch: 17    Training Loss: 0.662432    Validation Loss: 0.758518
Validation loss decreased (0.768470 --> 0.758518). Saving model ...
Epoch: 18    Training Loss: 0.632324    Validation Loss: 0.750859
Validation loss decreased (0.758518 --> 0.750859). Saving model ...
Epoch: 19    Training Loss: 0.616094    Validation Loss: 0.729692
Validation loss decreased (0.750859 --> 0.729692). Saving model ...
Epoch: 20    Training Loss: 0.588593    Validation Loss: 0.729085
Validation loss decreased (0.729692 --> 0.729085). Saving model ...
Epoch: 21    Training Loss: 0.571516    Validation Loss: 0.734009
Epoch: 22    Training Loss: 0.545541    Validation Loss: 0.721433
Validation loss decreased (0.729085 --> 0.721433). Saving model ...
Epoch: 23    Training Loss: 0.523696    Validation Loss: 0.720512
Validation loss decreased (0.721433 --> 0.720512). Saving model ...
Epoch: 24    Training Loss: 0.508577    Validation Loss: 0.728457
Epoch: 25    Training Loss: 0.483033    Validation Loss: 0.722556
Epoch: 26    Training Loss: 0.469563    Validation Loss: 0.742352
Epoch: 27    Training Loss: 0.449316    Validation Loss: 0.726019
Epoch: 28    Training Loss: 0.442354    Validation Loss: 0.713364
Validation loss decreased (0.720512 --> 0.713364). Saving model ...
Epoch: 29    Training Loss: 0.421807    Validation Loss: 0.718615
Epoch: 30    Training Loss: 0.404595    Validation Loss: 0.729914
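To see the overfitting trend more clearly, you can record both losses per epoch and plot them. The sketch below assumes two hypothetical lists, train_losses and valid_losses, that would have to be appended to inside the training loop above (e.g. train_losses.append(train_loss) after each epoch):

import matplotlib.pyplot as plt

# train_losses / valid_losses are assumed to have been collected per epoch
plt.plot(train_losses, label='training loss')
plt.plot(valid_losses, label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()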
Loading the model
model.load_state_dict(torch.load('model_cifar.pt'))
Result:
<All keys matched successfully>
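model_cifar.pt only stores the model weights. If you also want to be able to resume training later, a common pattern is to save the optimizer state and current epoch alongside the weights; a hedged sketch with an assumed filename checkpoint.pt:

# saving (e.g. inside the training loop, whenever the validation loss improves)
torch.save({'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'valid_loss_min': valid_loss_min},
           'checkpoint.pt')

# loading
checkpoint = torch.load('checkpoint.pt')
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])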
Testing the trained network
Test your trained model on the test data! A "good" result would be a CNN that reaches roughly 70% accuracy on these test images.
# track test loss
test_loss = 0.0
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))

model.eval()
# iterate over test data
for data, target in test_loader:
    # move tensors to GPU if CUDA is available
    if train_on_gpu:
        data, target = data.cuda(), target.cuda()
    # forward pass: compute predicted outputs by passing inputs to the model
    output = model(data)
    # calculate the batch loss
    loss = criterion(output, target)
    # update test loss
    test_loss += loss.item() * data.size(0)
    # convert output probabilities to predicted class
    _, pred = torch.max(output, 1)
    # compare predictions to true label
    correct_tensor = pred.eq(target.data.view_as(pred))
    correct = np.squeeze(correct_tensor.numpy()) if not train_on_gpu else np.squeeze(correct_tensor.cpu().numpy())
    # calculate test accuracy for each object class
    for i in range(batch_size):
        label = target.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1

# average test loss
test_loss = test_loss / len(test_loader.dataset)
print('Test Loss: {:.6f}\n'.format(test_loss))

for i in range(10):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
            classes[i], 100 * class_correct[i] / class_total[i],
            np.sum(class_correct[i]), np.sum(class_total[i])))
    else:
        print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))

print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))
Result:
Test Loss: 0.708721

Test Accuracy of airplane: 82% (826/1000)
Test Accuracy of automobile: 81% (818/1000)
Test Accuracy of bird: 65% (659/1000)
Test Accuracy of cat: 59% (590/1000)
Test Accuracy of deer: 75% (757/1000)
Test Accuracy of dog: 56% (565/1000)
Test Accuracy of frog: 81% (812/1000)
Test Accuracy of horse: 82% (823/1000)
Test Accuracy of ship: 86% (866/1000)
Test Accuracy of truck: 84% (848/1000)

Test Accuracy (Overall): 75% (7564/10000)
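Beyond per-class accuracy, a confusion matrix shows which classes get mistaken for each other (cat vs. dog is a typical pair); a minimal sketch that re-runs the test loader:

confusion = torch.zeros(10, 10, dtype=torch.long)
model.eval()
with torch.no_grad():
    for data, target in test_loader:
        if train_on_gpu:
            data, target = data.cuda(), target.cuda()
        preds = model(data).argmax(dim=1)
        for t, p in zip(target.cpu(), preds.cpu()):
            confusion[t, p] += 1   # rows: true class, columns: predicted class

print(confusion)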
Displaying results for a batch of test samples
# obtain one batch of test images
dataiter = iter(test_loader)
images, labels = next(dataiter)
images.numpy()

# move model inputs to cuda, if GPU is available
if train_on_gpu:
    images = images.cuda()

# get sample outputs
output = model(images)
# convert output probabilities to predicted class
_, preds_tensor = torch.max(output, 1)
preds = np.squeeze(preds_tensor.numpy()) if not train_on_gpu else np.squeeze(preds_tensor.cpu().numpy())

# plot the images in the batch, along with predicted and true labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(16):
    ax = fig.add_subplot(2, 16 // 2, idx + 1, xticks=[], yticks=[])
    imshow(images.cpu()[idx])
    ax.set_title("{} ({})".format(classes[preds[idx]], classes[labels[idx]]),
                 color=("green" if preds[idx] == labels[idx].item() else "red"))
Result:

References:
《吴恩达深度学习笔记》 (notes from Andrew Ng's deep learning course)
《深度学习入门:基于Python的理论与实现》 (Deep Learning from Scratch)
https://pytorch.org/docs/stable/nn.html#
https://github.com/udacity/deep-learning-v2-pytorch