PyTorch resources

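torchvision is a library of convenience tools for image handling, distributed separately from PyTorch.

A detailed introduction to torchvision is at: https://pypi.org/project/torchvision/

torchvision mainly consists of the following packages:

vision.datasets : several commonly used vision datasets that can be downloaded and loaded. The main advanced use here is reading the source to see how to write your own Dataset subclass.
vision.models : popular models such as AlexNet, VGG, ResNet and DenseNet, together with pretrained parameters.
vision.transforms : common image operations such as random cropping, rotation, data type conversion, image to tensor, numpy array to tensor, and tensor to image.
vision.utils : saves a tensor of shape (3 x H x W) to disk as an image; given a mini-batch of images, it can also produce an image grid.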

Installation

Anaconda:

conda install torchvision -c pytorch

pip:

pip install torchvision

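This package is indispensable for image processing alongside PyTorch,
so for anyone who will be using torch, an all-in-one Anaconda installation is the first choice. Life is short, after all.
(anaconda + vscode + pytorch works very well and is worth recommending!)

The following is translated from: https://pytorch.org/docs/master/torchvision/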

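Datasets: torchvision.datasets

The package includes the following datasets: MNIST, COCO (captions and detection), LSUN, CIFAR10 and CIFAR100, STL10, SVHN, ImageFolder, Imagenet-12 and PhotoTour.

Every dataset exposes two methods, __getitem__ and __len__, and all of them are subclasses of torch.utils.data.Dataset. When you implement your own Dataset, you therefore need to implement at least these two methods.

Consequently, they can all be passed to torch.utils.data.DataLoader, which can load samples in parallel using multiple worker processes (Python multiprocessing).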

For example:

torch.utils.data.DataLoader(coco_cap, batch_size=args.batchSize, shuffle=True, num_workers=args.nThreads)
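To make the two required methods concrete, here is a minimal sketch of a custom Dataset fed to a DataLoader. The class name SquaresDataset and its contents are hypothetical and only serve as illustration. num_workers is set to 0 so the example loads in the main process; a positive value would start worker processes instead.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i * i)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Number of samples; DataLoader uses it to decide how many batches exist.
        return self.n

    def __getitem__(self, index):
        # Return one (input, target) pair as tensors.
        x = torch.tensor([float(index)])
        y = torch.tensor([float(index * index)])
        return x, y


dataset = SquaresDataset(10)
# num_workers=0 loads in the main process; a positive value starts worker processes.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)

batches = [(x.shape, y.shape) for x, y in loader]
print(len(dataset), len(loader), batches)
```

With 10 samples and a batch size of 4, the loader yields three batches, the last one holding the remaining two samples.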

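The constructors of the individual datasets differ slightly, but they all accept the following arguments:

transform - a function that takes an image and returns a transformed image
Common choices are ToTensor, RandomCrop, etc. They can be chained with transforms.Compose (see the Transforms section below).
target_transform - a function that transforms the target. For example, it can take a caption string and return an encoded tensor (a tensor of word indices).

Because every dataset takes similar arguments, learning one of them makes the rest easy to pick up.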

MNIST

dset.MNIST(root, train=True, transform=None, target_transform=None, download=False)

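root: root directory of the data, containing processed/training.pt and processed/test.pt

train: True = use the training set, False = use the test set.

transform: a transform applied to the input image

target_transform: a transform applied to the target (the class label)

download: whether to download the MNIST dataset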

COCO

This requires the COCO API to be installed

Captions:

dset.CocoCaptions(root="dir where images are", annFile="json annotation file", [transform, target_transform])

Example:

import torchvision.datasets as dset
import torchvision.transforms as transforms

cap = dset.CocoCaptions(root = 'dir where images are',
                        annFile = 'json annotation file',
                        transform=transforms.ToTensor())

print('Number of samples: ', len(cap))
img, target = cap[3] # load 4th sample

print("Image Size: ", img.size())
print(target)

Output:

Number of samples: 82783
Image Size: (3L, 427L, 640L)
[u'A plane emitting smoke stream flying over a mountain.',
u'A plane darts across a bright blue sky behind a mountain covered in snow',
u'A plane leaves a contrail above the snowy mountain top.',
u'A mountain that has a plane flying overheard in the distance.',
u'A mountain view with a plume of smoke in the background']

Detection:

dset.CocoDetection(root="dir where images are", annFile="json annotation file", [transform, target_transform])

LSUN

dset.LSUN(db_path, classes='train', [transform, target_transform])

db_path = root directory for the database files
classes =
'train' - all categories, training set
'val' - all categories, validation set
'test' - all categories, test set
['bedroom_train', 'church_train', …] : a list of categories to load

CIFAR

dset.CIFAR10(root, train=True, transform=None, target_transform=None, download=False)

dset.CIFAR100(root, train=True, transform=None, target_transform=None, download=False)

root : root directory of dataset where there is folder cifar-10-batches-py
train : True = Training set, False = Test set
download : True = downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, does not do anything.

STL10

dset.STL10(root, split='train', transform=None, target_transform=None, download=False)

root : root directory of dataset where there is folder stl10_binary
split : 'train' = Training set, 'test' = Test set, 'unlabeled' = Unlabeled set, 'train+unlabeled' = Training + Unlabeled set (missing label marked as -1)
download : True = downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, does not do anything.

SVHN

dset.SVHN(root, split='train', transform=None, target_transform=None, download=False)

root : root directory of dataset where there is folder SVHN
split : 'train' = Training set, 'test' = Test set, 'extra' = Extra training set
download : True = downloads the dataset from the internet and puts it in root directory. If dataset is already downloaded, does not do anything.

ImageFolder

A generic data loader where the images are arranged in this way:

root/dog/xxx.png
root/dog/xxy.png
root/dog/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/asd932_.png

dset.ImageFolder(root="root folder path", [transform, target_transform])

ImageFolder has the following members:

self.classes - the list of class names
self.class_to_idx - a dict mapping each class name to its integer label, e.g. {'cat': 0, 'dog': 1}
self.imgs - a list of (image path, class-index) tuples

Imagenet-12

This is simply implemented with an ImageFolder dataset.

The data is preprocessed as described here

Here is an example.

PhotoTour

Learning Local Image Descriptors Data http://phototour.cs.washington.edu/patches/default.htm

import torchvision.datasets as dset
import torchvision.transforms as transforms

dataset = dset.PhotoTour(root = 'dir where images are',
                         name = 'name of the dataset to load',
                         transform=transforms.ToTensor())

print('Loaded PhotoTour: {} with {} images.'
      .format(dataset.name, len(dataset.data)))

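Models

The models subpackage contains definitions for the following model architectures: AlexNet, VGG, ResNet, SqueezeNet and DenseNet.

Each architecture may come in several variants; ResNet, for example, is available with 18, 34, 50, 101 and 152 layers.

One benefit of these mature models is that their source ships with torchvision, so you can study how well-designed models are built. To find the install location, run print(torchvision.models.__file__), which prints something like 'd:\\Anaconda3\\lib\\site-packages\\torchvision\\models\\__init__.py'.

You can construct a model with randomly initialized parameters: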

import torchvision.models as models

resnet18 = models.resnet18()
alexnet = models.alexnet()
vgg16 = models.vgg16()
squeezenet = models.squeezenet1_0()

Pretrained parameters are provided for ResNet, as well as SqueezeNet 1.0 and 1.1 and AlexNet, through the PyTorch model zoo. To load them, pass pretrained=True to the constructor:

import torchvision.models as models

resnet18 = models.resnet18(pretrained=True)
alexnet = models.alexnet(pretrained=True)
squeezenet = models.squeezenet1_0(pretrained=True)

All pretrained models expect input normalized in the same way: mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are at least 224.

The images must be loaded into the range [0, 1] and then normalized using mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225].

A related example is the imagenet example here <https://github.com/pytorch/examples/blob/42e5b996718797e45c46a25c55b031e6768f8440/imagenet/main.py#L89-L101>
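The normalization described above is plain per-channel arithmetic. The sketch below applies it by hand to a random stand-in batch using broadcasting. It is an illustration of the formula, not the code used in the linked example.

```python
import torch

# Reshape to (1, 3, 1, 1) so the values broadcast over batch, height and width.
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)

# Stand-in mini-batch: 2 RGB images with values already in [0, 1].
batch = torch.rand(2, 3, 224, 224)

# Broadcasting applies (value - mean) / std separately to each channel.
normalized = (batch - mean) / std

# A pixel equal to the channel mean maps to exactly 0.
mean_image = mean.expand(1, 3, 224, 224)
check = (mean_image - mean) / std

print(normalized.shape, check.abs().max().item())
```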

Transforms

Transforms are common image transformations. They can be chained together with transforms.Compose:

`transforms.Compose`

You can combine several transforms, for example:

transform = transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean = [ 0.485, 0.456, 0.406 ],
                          std = [ 0.229, 0.224, 0.225 ]),
])

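Transforms on PIL.Image

`Scale(size, interpolation=Image.BILINEAR)`

Rescales the input PIL.Image to the given size, where size is the length of the shorter edge. Later torchvision releases call this transform Resize.

For example, if height > width, the image is rescaled to (size * height / width, size).
- size: length of the shorter edge of the image
- interpolation: Default: PIL.Image.BILINEAR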
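The shorter-edge rule can be worked through with a small helper. The function name scaled_size is hypothetical and is not part of torchvision. It returns (width, height), and truncating to an integer is an assumption made for this sketch.

```python
def scaled_size(width, height, size):
    """Return (new_width, new_height) with the shorter edge equal to size."""
    if width <= height:
        # Width is the shorter edge: pin it to size and scale the height.
        return size, int(size * height / width)
    # Height is the shorter edge: pin it to size and scale the width.
    return int(size * width / height), size


print(scaled_size(640, 427, 256))  # landscape image -> (383, 256)
print(scaled_size(427, 640, 256))  # portrait image  -> (256, 383)
```

The aspect ratio is preserved in both cases; only the shorter edge is fixed to the requested size.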

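`CenterCrop(size)` - crops the image at the center to the given size

Crops the given PIL.Image at the center to the given size. size can be a tuple (target_height, target_width) or an integer, in which case the target is a square of shape (size, size).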

`RandomCrop(size, padding=0)`

Crops the given PIL.Image at a random location to have a region of the given size. size can be a tuple (target_height, target_width) or an integer, in which case the target will be of a square shape (size, size). If padding is non-zero, then the image is first zero-padded on each side with padding pixels.

`RandomHorizontalFlip()`

Randomly flips the given PIL.Image horizontally with a probability of 0.5.

`RandomSizedCrop(size, interpolation=Image.BILINEAR)`

Randomly crops the given PIL.Image to a random size of (0.08 to 1.0) of the original size and a random aspect ratio of 3/4 to 4/3 of the original aspect ratio. Later torchvision releases call this transform RandomResizedCrop.

This is popularly used to train the Inception networks.
- size: size of the smaller edge
- interpolation: Default: PIL.Image.BILINEAR

`Pad(padding, fill=0)`

Pads the given image on each side with padding number of pixels, and the padding pixels are filled with pixel value fill. If a 5x5 image is padded with padding=1 then it becomes 7x7.

Transforms on torch.*Tensor

`Normalize(mean, std)`

Given mean: (R, G, B) and std: (R, G, B), will normalize each channel of the torch.*Tensor, i.e. channel = (channel - mean) / std

Conversion transforms

ToTensor() - Converts a PIL.Image (RGB) or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0]
ToPILImage() - Converts a torch.*Tensor of range [0, 1] and shape C x H x W or numpy ndarray of dtype=uint8, range[0, 255] and shape H x W x C to a PIL.Image of range [0, 255]

Generic transforms

`Lambda(lambda)`

Given a Python lambda, applies it to the input img and returns it. For example:

transforms.Lambda(lambda x: x.add(10))

Utility functions

make_grid(tensor, nrow=8, padding=2, normalize=False, range=None, scale_each=False)

Given a 4D mini-batch Tensor of shape (B x C x H x W), or a list of images all of the same size, makes a grid of images

normalize=True will shift the image to the range (0, 1), by subtracting the minimum and dividing by the maximum pixel value.

if range=(min, max) where min and max are numbers, then these numbers are used to normalize the image.

scale_each=True will scale each image in the batch of images separately rather than computing the (min, max) over all images.

Example usage is given in this notebook <https://gist.github.com/anonymous/bf16430f7750c023141c562f3e9f2a91>

save_image(tensor, filename, nrow=8, padding=2, normalize=False, range=None, scale_each=False)

Saves a given Tensor into an image file.

If given a mini-batch tensor, will save the tensor as a grid of images.

All options after filename are passed through to make_grid. Refer to its documentation for more details.

This is a handy way to write out a tiled mosaic of images.

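I did not expect this article to get so many views, so I am considering an update.

Image backend: reading and processing images requires an image library. torchvision now supports several image loading libraries and can switch between them.

The relevant functions are:

torchvision.get_image_backend() # get the current image backend

torchvision.set_image_backend(backend) # change the image backend

# backend (string) - name of the image backend, one of {'PIL', 'accimage'}. The accimage package uses the Intel IPP library. It is faster than PIL but does not support as many image operations.

This is a back-end detail that is rarely needed in everyday use; it is enough to know that it exists.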
