Image Classification (CIFAR-10) on Kaggle

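So far, we have been using Gluon's data package to obtain image datasets directly in tensor format. In practice, however, image datasets usually exist as image files. In this section, we start from the original image files and organize, read, and convert them to tensor format step by step. We have already run an experiment on the CIFAR-10 dataset, which is an important dataset in computer vision. Now we apply what we learned in the previous sections to take part in the Kaggle competition that addresses CIFAR-10 image classification.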

The competition's web address is https://www.kaggle.com/c/cifar-10

Fig. 1 shows the information on the competition's webpage. To submit results, first register an account on the Kaggle website.

Fig. 1 CIFAR-10 image classification competition webpage information. The dataset for the competition can be accessed by clicking the “Data” tab.

First, import the packages or modules required for the competition.

import collections
from d2l import mxnet as d2l
import math
from mxnet import autograd, gluon, init, npx
from mxnet.gluon import nn
import os
import pandas as pd
import shutil
import time

npx.set_np()

1. Obtaining and Organizing the Dataset

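The competition data is divided into a training set and a testing set. The training set contains 50,000 images. The testing set contains 300,000 images, of which 10,000 are used for scoring; the other 290,000 non-scoring images are included to prevent participants from labeling the testing set by hand and submitting the labels. Images in both datasets are PNG files with a height and width of 32 pixels and three color channels (RGB). The images cover 10 categories: airplanes, cars, birds, cats, deer, dogs, frogs, horses, boats, and trucks. The upper-left corner of Fig. 1 shows some images of airplanes, cars, and birds from the dataset.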

1.1. Downloading the Dataset

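After logging in to Kaggle, click the "Data" tab on the CIFAR-10 image classification competition webpage shown in Fig. 1 and download the dataset by clicking the "Download All" button. After unzipping the downloaded file in ../data, and then unzipping train.7z and test.7z inside it, you will find the entire dataset in the following paths: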

../data/cifar-10/train/[1-50000].png
../data/cifar-10/test/[1-300000].png
../data/cifar-10/trainLabels.csv
../data/cifar-10/sampleSubmission.csv

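Here the train and test folders contain the training and testing images, respectively; trainLabels.csv holds the labels for the training images, and sampleSubmission.csv is a sample submission. To make it easier to get started, we provide a small-scale sample of the dataset: it contains the first 1000 training images and 5 random testing images. To use the full dataset of the Kaggle competition, set the demo variable below to False.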

#@save
d2l.DATA_HUB['cifar10_tiny'] = (d2l.DATA_URL + 'kaggle_cifar10_tiny.zip',
                                '2068874e4b9a9f0fb07ebe0ad2b29754449ccacd')

# If you use the full dataset downloaded for the Kaggle competition, set the
# demo variable to False
demo = True

if demo:
    data_dir = d2l.download_extract('cifar10_tiny')
else:
    data_dir = '../data/cifar-10/'

1.2. Organizing the Dataset

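We need to organize the dataset to make model training and testing easier. Let us first read the labels from the csv file. The following function returns a dictionary that maps each filename without its extension to its label.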

#@save
def read_csv_labels(fname):
    """Read fname to return a name to label dictionary."""
    with open(fname, 'r') as f:
        # Skip the file header line (column name)
        lines = f.readlines()[1:]
    tokens = [l.rstrip().split(',') for l in lines]
    return dict(((name, label) for name, label in tokens))

labels = read_csv_labels(os.path.join(data_dir, 'trainLabels.csv'))
print('# training examples:', len(labels))
print('# classes:', len(set(labels.values())))

# training examples: 1000
# classes: 10
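As a cross-check on read_csv_labels, the same mapping can be built with Python's csv module. The following is a minimal sketch rather than the function used in this section: the helper name read_csv_labels_dictreader is hypothetical, the rows are made up for illustration, and the header is assumed to name its columns id and label.

```python
import csv
import io


def read_csv_labels_dictreader(f):
    """Return a dict mapping image id (str) to label from an open CSV file.

    Assumes a header row with columns named 'id' and 'label'.
    """
    reader = csv.DictReader(f)
    return {row['id']: row['label'] for row in reader}


# Made-up rows for illustration; a real run would pass open(path, newline='')
sample = io.StringIO("id,label\n1,frog\n2,truck\n3,truck\n")
labels_demo = read_csv_labels_dictreader(sample)
print('# training examples:', len(labels_demo))
print('# classes:', len(set(labels_demo.values())))
```

Reading by column name keeps working if the columns are reordered, at the cost of depending on the header text.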

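Next, we define the reorg_train_valid function to split a validation set off the original training set. The argument valid_ratio is the ratio of the number of examples in the validation set to the number of examples in the original training set. Specifically, let n be the number of images in the class with the fewest examples and r be the ratio; we then use max(⌊nr⌋, 1) images from each class as the validation set. Take valid_ratio=0.1 as an example. Since the original training set has 50,000 images, 45,000 images are used for training when tuning hyperparameters and are stored in the path "train_valid_test/train", while the other 5,000 images are stored as the validation set in the path "train_valid_test/valid". After the data is organized, images of the same class are placed in the same folder so that we can read them later.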

#@save
def copyfile(filename, target_dir):
    """Copy a file into a target directory."""
    d2l.mkdir_if_not_exist(target_dir)
    shutil.copy(filename, target_dir)

#@save
def reorg_train_valid(data_dir, labels, valid_ratio):
    # The number of examples of the class with the least examples in the
    # training dataset
    n = collections.Counter(labels.values()).most_common()[-1][1]
    # The number of examples per class for the validation set
    n_valid_per_label = max(1, math.floor(n * valid_ratio))
    label_count = {}
    for train_file in os.listdir(os.path.join(data_dir, 'train')):
        label = labels[train_file.split('.')[0]]
        fname = os.path.join(data_dir, 'train', train_file)
        # Copy to train_valid_test/train_valid with a subfolder per class
        copyfile(fname, os.path.join(data_dir, 'train_valid_test',
                                     'train_valid', label))
        if label not in label_count or label_count[label] < n_valid_per_label:
            # Copy to train_valid_test/valid
            copyfile(fname, os.path.join(data_dir, 'train_valid_test',
                                         'valid', label))
            label_count[label] = label_count.get(label, 0) + 1
        else:
            # Copy to train_valid_test/train
            copyfile(fname, os.path.join(data_dir, 'train_valid_test',
                                         'train', label))
    return n_valid_per_label
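The per-class validation count computed inside reorg_train_valid can be checked with a small worked example. This is only a sketch: the standalone helper n_valid_per_label is introduced here for illustration, and the full training set is assumed to be balanced at 5,000 images per class.

```python
import math


def n_valid_per_label(n_smallest_class, valid_ratio):
    """Validation images taken from each class: max(floor(n * r), 1)."""
    return max(1, math.floor(n_smallest_class * valid_ratio))


# Full training set, assumed balanced: 10 classes with 5,000 images each
per_class = n_valid_per_label(5000, 0.1)
num_valid = per_class * 10
num_train = 50000 - num_valid
print(per_class, num_valid, num_train)

# A very small class still contributes at least one validation image
tiny_class = n_valid_per_label(8, 0.1)
print(tiny_class)
```

The max with 1 matters mostly for small samples such as the demo dataset, where the floor could otherwise round a class down to zero validation images.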

The reorg_test function below organizes the testing set so that it is easy to read during prediction.

#@save
def reorg_test(data_dir):
    for test_file in os.listdir(os.path.join(data_dir, 'test')):
        copyfile(os.path.join(data_dir, 'test', test_file),
                 os.path.join(data_dir, 'train_valid_test', 'test',
                              'unknown'))

Finally, we use one function to call the previously defined read_csv_labels, reorg_train_valid, and reorg_test functions.

def reorg_cifar10_data(data_dir, valid_ratio):
    labels = read_csv_labels(os.path.join(data_dir, 'trainLabels.csv'))
    reorg_train_valid(data_dir, labels, valid_ratio)
    reorg_test(data_dir)

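We set the batch size to 1 only for the demo dataset. For actual training and testing, use the complete dataset of the Kaggle competition and set batch_size to a larger integer, such as 128. We use 10% of the training examples as the validation set for tuning hyperparameters.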

batch_size = 1 if demo else 128
valid_ratio = 0.1
reorg_cifar10_data(data_dir, valid_ratio)
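After reorganizing, it can be useful to confirm how many files ended up in each split and class folder. The helper below is a hypothetical sketch (count_images is not part of d2l) that assumes the split/label/file layout described above; it builds a tiny stand-in tree so that it runs without the real dataset.

```python
import os
import tempfile


def count_images(root):
    """Return {split: {label: number of files}} for a split/label/file tree."""
    counts = {}
    for split in sorted(os.listdir(root)):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            continue
        counts[split] = {
            label: len(os.listdir(os.path.join(split_dir, label)))
            for label in sorted(os.listdir(split_dir))}
    return counts


# Build a tiny stand-in tree so the sketch runs without the real dataset
with tempfile.TemporaryDirectory() as tmp:
    for split, label, k in [('train', 'cat', 3), ('valid', 'cat', 1),
                            ('test', 'unknown', 2)]:
        folder = os.path.join(tmp, split, label)
        os.makedirs(folder)
        for i in range(k):
            open(os.path.join(folder, '%d.png' % i), 'w').close()
    demo_counts = count_images(tmp)

print(demo_counts)
```

On the real data, the equivalent call would be count_images(os.path.join(data_dir, 'train_valid_test')).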

2. Image Augmentation

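To cope with overfitting, we use image augmentation. For example, adding transforms.RandomFlipLeftRight() flips the images at random. We can also normalize the three RGB channels of color images with transforms.Normalize(). Below, we list some of these operations; you can choose to use or modify them as needed.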

transform_train = gluon.data.vision.transforms.Compose([
    # Magnify the image to a square of 40 pixels in both height and width
    gluon.data.vision.transforms.Resize(40),
    # Randomly crop a square image of 40 pixels in both height and width to
    # produce a small square of 0.64 to 1 times the area of the original
    # image, and then shrink it to a square of 32 pixels in both height and
    # width
    gluon.data.vision.transforms.RandomResizedCrop(32, scale=(0.64, 1.0),
                                                   ratio=(1.0, 1.0)),
    gluon.data.vision.transforms.RandomFlipLeftRight(),
    gluon.data.vision.transforms.ToTensor(),
    # Normalize each channel of the image
    gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
                                           [0.2023, 0.1994, 0.2010])])
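The scale range (0.64, 1.0) is easier to read as a side length. The arithmetic below is a small worked example, assuming the crop is taken from the 40 x 40 image produced by Resize(40) and that the fixed aspect ratio of 1 makes every crop a square.

```python
import math

resized_side = 40          # side length after Resize(40)
scale = (0.64, 1.0)        # fraction of the resized area kept by the crop

# With the aspect ratio fixed at 1, the crop is a square whose side is
# sqrt(scale * area)
min_side = math.sqrt(scale[0] * resized_side ** 2)
max_side = math.sqrt(scale[1] * resized_side ** 2)
print(min_side, max_side)
```

So the smallest crop is about 32 pixels on a side, the same as the original image size, and the largest is the whole 40-pixel square; either way the result is resized to 32 x 32.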

To ensure that the output is deterministic during testing, we only normalize the images.

transform_test = gluon.data.vision.transforms.Compose([
    gluon.data.vision.transforms.ToTensor(),
    gluon.data.vision.transforms.Normalize([0.4914, 0.4822, 0.4465],
                                           [0.2023, 0.1994, 0.2010])])
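To make the effect of the ToTensor and Normalize steps concrete, here is a pure-Python sketch applied to a single pixel. It assumes ToTensor rescales 8-bit values from [0, 255] to [0, 1] and that Normalize then computes (x - mean) / std per channel; the helper normalize_pixel is hypothetical and only mirrors that arithmetic.

```python
mean = [0.4914, 0.4822, 0.4465]
std = [0.2023, 0.1994, 0.2010]


def normalize_pixel(rgb_uint8):
    """Scale an (R, G, B) pixel from [0, 255] to [0, 1], then standardize."""
    scaled = [v / 255 for v in rgb_uint8]
    return [(v - m) / s for v, m, s in zip(scaled, mean, std)]


# A mid-gray pixel maps to small positive values in every channel
gray = normalize_pixel((128, 128, 128))
# A black pixel maps to -mean / std in every channel
black = normalize_pixel((0, 0, 0))
print(gray)
print(black)
```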

3. Reading the Dataset

Next, we create ImageFolderDataset instances to read the organized dataset of original image files, where each example includes an image and its label.

train_ds, valid_ds, train_valid_ds, test_ds = [
    gluon.data.vision.ImageFolderDataset(
        os.path.join(data_dir, 'train_valid_test', folder))
    for folder in ['train', 'valid', 'train_valid', 'test']]

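We specify the image augmentation operations defined above in DataLoader. During training, we use the validation set only to evaluate the model, so its output must be deterministic. For prediction, we train the model on the combined training and validation sets to make full use of all labeled data.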

train_iter, train_valid_iter = [gluon.data.DataLoader(
    dataset.transform_first(transform_train), batch_size, shuffle=True,
    last_batch='keep') for dataset in (train_ds, train_valid_ds)]

valid_iter, test_iter = [gluon.data.DataLoader(
    dataset.transform_first(transform_test), batch_size, shuffle=False,
    last_batch='keep') for dataset in (valid_ds, test_ds)]
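With last_batch='keep', the final batch is retained even when it is smaller than batch_size, so no example is dropped. The sketch below works out the batch counts for the full 45,000-image training split at batch size 128; num_batches is a hypothetical helper, and it covers only the 'keep' and 'discard' policies, not 'rollover'.

```python
import math


def num_batches(num_examples, batch_size, last_batch='keep'):
    """Batches per epoch for the 'keep' and 'discard' policies."""
    if last_batch == 'keep':
        return math.ceil(num_examples / batch_size)
    if last_batch == 'discard':
        return num_examples // batch_size
    raise ValueError('unsupported policy in this sketch: %s' % last_batch)


kept = num_batches(45000, 128)                  # full training split
discarded = num_batches(45000, 128, 'discard')
remainder = 45000 % 128                         # size of the final batch
print(kept, discarded, remainder)
```

Keeping the last batch matters most for the testing set, where every image needs a prediction in the submission file.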

4. Defining the Model

Here, we build the residual blocks on the HybridBlock class to improve execution efficiency.

class Residual(nn.HybridBlock):
    def __init__(self, num_channels, use_1x1conv=False, strides=1, **kwargs):
        super(Residual, self).__init__(**kwargs)
        self.conv1 = nn.Conv2D(num_channels, kernel_size=3, padding=1,
                               strides=strides)
        self.conv2 = nn.Conv2D(num_channels, kernel_size=3, padding=1)
        if use_1x1conv:
            # 1x1 convolution on the shortcut so that its shape matches Y
            self.conv3 = nn.Conv2D(num_channels, kernel_size=1,
                                   strides=strides)
        else:
            self.conv3 = None
        self.bn1 = nn.BatchNorm()
        self.bn2 = nn.BatchNorm()

    def hybrid_forward(self, F, X):
        Y = F.npx.relu(self.bn1(self.conv1(X)))
        Y = self.bn2(self.conv2(Y))
        if self.conv3:
            X = self.conv3(X)
        # Add the shortcut before the final activation
        return F.npx.relu(Y + X)

Next, we define the ResNet-18 model.

def resnet18(num_classes):
    net = nn.HybridSequential()
    net.add(nn.Conv2D(64, kernel_size=3, strides=1, padding=1),
            nn.BatchNorm(), nn.Activation('relu'))

    def resnet_block(num_channels, num_residuals, first_block=False):
        blk = nn.HybridSequential()
        for i in range(num_residuals):
            if i == 0 and not first_block:
                blk.add(Residual(num_channels, use_1x1conv=True, strides=2))
            else:
                blk.add(Residual(num_channels))
        return blk

    net.add(resnet_block(64, 2, first_block=True),
            resnet_block(128, 2),
            resnet_block(256, 2),
            resnet_block(512, 2))
    net.add(nn.GlobalAvgPool2D(), nn.Dense(num_classes))
    return net
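To see how a 32 x 32 input moves through this network, the sketch below traces the spatial size with the usual convolution output formula, floor((n + 2p - k) / s) + 1. It is plain arithmetic, not MXNet code: conv_out is a hypothetical helper, and only the first convolution of each stage is traced, since the remaining 3 x 3 convolutions use stride 1 and padding 1 and leave the size unchanged.

```python
def conv_out(size, kernel, stride, padding):
    """Output height/width of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32
size = conv_out(size, kernel=3, stride=1, padding=1)   # stem convolution
sizes = [size]
# The first stage keeps the resolution; the next three halve it (strides=2)
for stride in (1, 2, 2, 2):
    size = conv_out(size, kernel=3, stride=stride, padding=1)
    sizes.append(size)
print(sizes)

# The 1x1 shortcut convolution with stride 2 produces a matching size
shortcut = conv_out(32, kernel=1, stride=2, padding=0)
print(shortcut)

# Weighted layers, not counting the 1x1 shortcut convolutions:
# stem conv + 4 stages * 2 residual blocks * 2 convs + dense
num_layers = 1 + 4 * 2 * 2 + 1
print(num_layers)
```

The final 4 x 4 feature map is reduced to a single value per channel by the global average pooling layer before the dense output layer.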

The CIFAR-10 image classification challenge uses 10 categories. Before training begins, we apply Xavier random initialization to the model.

def get_net(ctx):
    num_classes = 10
    net = resnet18(num_classes)
    net.initialize(ctx=ctx, init=init.Xavier())
    return net

loss = gluon.loss.SoftmaxCrossEntropyLoss()
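A useful reference point when reading the loss values printed during training is the loss of a model that predicts all 10 classes with equal probability, which is ln(10), roughly 2.30. The pure-Python sketch below computes softmax cross-entropy for a single example to confirm this; it illustrates the formula only and is not the Gluon implementation.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one example, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Equal logits give a uniform prediction over 10 classes: loss = ln(10)
uniform_loss = softmax_cross_entropy([0.0] * 10, label=3)
print(uniform_loss, math.log(10))
```

An average training loss well above this value together with an accuracy near 0.1 suggests the model has not yet improved on random guessing.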

5. Defining the Training Functions

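We select the model and tune hyperparameters according to the model's performance on the validation set. Next, we define the model training function train. We record the training time of each epoch, which helps us compare the time costs of different models.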

def train(net, train_iter, valid_iter, num_epochs, lr, wd, ctx, lr_period,
          lr_decay):
    trainer = gluon.Trainer(net.collect_params(), 'sgd',
                            {'learning_rate': lr, 'momentum': 0.9, 'wd': wd})
    for epoch in range(num_epochs):
        train_l_sum, train_acc_sum, n, start = 0.0, 0.0, 0, time.time()
        # Decay the learning rate once every lr_period epochs
        if epoch > 0 and epoch % lr_period == 0:
            trainer.set_learning_rate(trainer.learning_rate * lr_decay)
        for X, y in train_iter:
            y = y.astype('float32').as_in_ctx(ctx)
            with autograd.record():
                y_hat = net(X.as_in_ctx(ctx))
                l = loss(y_hat, y).sum()
            l.backward()
            trainer.step(batch_size)
            train_l_sum += float(l)
            train_acc_sum += float((y_hat.argmax(axis=1) == y).sum())
            n += y.size
        time_s = "time %.2f sec" % (time.time() - start)
        if valid_iter is not None:
            valid_acc = d2l.evaluate_accuracy_gpu(net, valid_iter)
            epoch_s = ("epoch %d, loss %f, train acc %f, valid acc %f, "
                       % (epoch + 1, train_l_sum / n, train_acc_sum / n,
                          valid_acc))
        else:
            epoch_s = ("epoch %d, loss %f, train acc %f, " %
                       (epoch + 1, train_l_sum / n, train_acc_sum / n))
        print(epoch_s + time_s + ', lr ' + str(trainer.learning_rate))

6. Training and Validating the Model

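Now we can train and validate the model. The following hyperparameters can be tuned; for example, we can increase the number of epochs. Because lr_period and lr_decay are set to 80 and 0.1 respectively, the learning rate of the optimization algorithm is multiplied by 0.1 after every 80 epochs. For simplicity, we train for only one epoch here.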

ctx, num_epochs, lr, wd = d2l.try_gpu(), 1, 0.1, 5e-4
lr_period, lr_decay, net = 80, 0.1, get_net(ctx)
net.hybridize()
train(net, train_iter, valid_iter, num_epochs, lr, wd, ctx, lr_period,
      lr_decay)

epoch 1, loss 2.859060, train acc 0.100000, valid acc 0.100000, time 9.51 sec, lr 0.1
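The step schedule applied by train can be written in closed form: the learning rate used in 0-indexed epoch e is lr * lr_decay ** (e // lr_period). The helper lr_at_epoch below is a hypothetical sketch of that formula with this section's hyperparameters as defaults; it is handy for planning longer runs.

```python
def lr_at_epoch(epoch, lr=0.1, lr_period=80, lr_decay=0.1):
    """Learning rate used in a 0-indexed epoch under the step schedule."""
    return lr * lr_decay ** (epoch // lr_period)


for epoch in (0, 79, 80, 160):
    print(epoch, lr_at_epoch(epoch))
```

With one training epoch, as in the run above, the schedule never triggers and the learning rate stays at 0.1.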

7. Classifying the Testing Set and Submitting Results on Kaggle

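After obtaining a satisfactory model design and hyperparameters, we use all of the training data (including the validation set) to retrain the model and classify the testing set.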

net, preds = get_net(ctx), []
net.hybridize()
train(net, train_valid_iter, None, num_epochs, lr, wd, ctx, lr_period,
      lr_decay)

for X, _ in test_iter:
    y_hat = net(X.as_in_ctx(ctx))
    preds.extend(y_hat.argmax(axis=1).astype(int).asnumpy())
# Sort the ids as strings so that they follow the order of the predictions
sorted_ids = list(range(1, len(test_ds) + 1))
sorted_ids.sort(key=lambda x: str(x))
df = pd.DataFrame({'id': sorted_ids, 'label': preds})
# Map class indices back to class names
df['label'] = df['label'].apply(lambda x: train_valid_ds.synsets[x])
df.to_csv('submission.csv', index=False)

epoch 1, loss 2.873863, train acc 0.106000, time 9.55 sec, lr 0.1

After executing the above code, we will get a submission.csv file. The format of this file meets the Kaggle competition requirements.
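The ids are sorted as strings, not as numbers, when building the submission. A likely reason, stated here as an assumption and not confirmed by the text, is that the dataset reads the test files in lexicographic filename order, so 10.png comes before 2.png. The sketch below uses a hypothetical test set of 12 files to show that the string-sorted ids match that file order.

```python
num_test = 12  # hypothetical tiny test set with files 1.png ... 12.png

# Ids sorted the same way as in the submission code
sorted_ids = sorted(range(1, num_test + 1), key=str)
print(sorted_ids)

# Ids recovered from filenames sorted lexicographically
file_order = [int(name.split('.')[0])
              for name in sorted('%d.png' % i for i in range(1, num_test + 1))]
print(file_order == sorted_ids)
```

If the ids were left in numeric order, each prediction would be paired with the wrong image id in submission.csv.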

8. Summary

We can create an ImageFolderDataset instance to read the dataset containing the original image files.
We can use convolutional neural networks, image augmentation, and hybrid programming to take part in an image classification competition.
