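Overview

For typical deep-learning image tasks in PyTorch, images are first read through a Dataset class and a DataLoader class, and a transform is applied as each image is read. That transform usually includes ToTensor(), which converts the PIL or OpenCV image data read inside the Dataset's __getitem__() method into a torch.FloatTensor. The details are as follows: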

PIL and OpenCV data formats

PIL(RGB)

PIL (Python Imaging Library) is the most basic image-processing library in Python. Typical usage:

from PIL import Image
import numpy as np

image = Image.open('test.jpg')  # the image is 400x300 (width x height)
print(type(image))  # out: <class 'PIL.JpegImagePlugin.JpegImageFile'>
print(image.size)  # out: (400, 300)
print(image.mode)  # out: RGB
print(image.getpixel((0, 0)))  # out: (143, 198, 201)

# resize takes (w, h)
image = image.resize((200, 100), Image.NEAREST)
print(image.size)  # out: (200, 100)

'''
Code notes

Note that image is a :class:`~PIL.Image.Image` object. It has many attributes: for example, its size is (w, h) and its mode is RGB. It also has many methods: for example, getpixel((x, y)) returns the pixel at a given position as the values of its three channels, where x is at most w-1 and y is at most h-1.

The resize method scales the image. Its parameters are:

resize(self, size, resample=0) method of PIL.Image.Image instance
    Returns a resized copy of this image.

    :param size: The requested size in pixels, as a 2-tuple:
       (width, height).

    Note that size is (w, h), the same order as image.size.

    :param resample: An optional resampling filter.  This can be
       one of :py:attr:`PIL.Image.NEAREST`, :py:attr:`PIL.Image.BOX`,
       :py:attr:`PIL.Image.BILINEAR`, :py:attr:`PIL.Image.HAMMING`,
       :py:attr:`PIL.Image.BICUBIC` or :py:attr:`PIL.Image.LANCZOS`.
       If omitted, or if the image has mode "1" or "P", it is
       set :py:attr:`PIL.Image.NEAREST`.
       See: :ref:`concept-filters`.

    Note on the filters: in the version documented here the default is NEAREST (nearest neighbour, commonly used for segmentation labels). Classification pipelines commonly use BILINEAR or BICUBIC.

    :returns: An :py:class:`~PIL.Image.Image` object.
'''

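image = np.array(image, dtype=np.float32)  # np.array(image) alone gives uint8
print(image.shape)  # out: (100, 200, 3)

# Note that w and h have swapped places: the array is (h, w, c).
# An ndarray is indexed as row x column x channel, so the row count is the height and the column count is the width.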
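The two points made in the listing above, the (w, h) versus (h, w, c) ordering and the choice of resampling filter, can be checked with a short sketch. It assumes Pillow and NumPy are installed and uses synthetic images instead of test.jpg; the 4x4 label mask is a made-up example.

```python
import numpy as np
from PIL import Image

# Synthetic 400x300 (width x height) RGB image; no file on disk is needed.
image = Image.new("RGB", (400, 300), color=(143, 198, 201))
array = np.asarray(image)

print(image.size)   # (width, height)
print(array.shape)  # (height, width, channels)

# getpixel takes (x, y); the ndarray is indexed as [row, column], i.e. [y, x].
x, y = 399, 299
assert image.getpixel((x, y)) == tuple(int(v) for v in array[y, x])

# A small label mask with two class ids, as in a segmentation task.
labels = np.zeros((4, 4), dtype=np.uint8)
labels[:, 2:] = 5
mask = Image.fromarray(labels)  # single-channel image, mode "L"

nearest = np.asarray(mask.resize((8, 8), Image.NEAREST))
bilinear = np.asarray(mask.resize((8, 8), Image.BILINEAR))

# NEAREST only copies existing pixels, so no new class ids can appear.
print(np.unique(nearest))
# BILINEAR blends neighbouring pixels and may produce values between 0 and 5.
print(np.unique(bilinear))
```

This is why nearest-neighbour resampling is the usual choice for label masks, while smoother filters are preferred for the input images themselves.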

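OpenCV (Python) (BGR)

OpenCV is a powerful image-processing library with a broader scope than PIL. It is widely used, performs well, and has plenty of example code available. Common operations: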

import cv2
import numpy as np

image = cv2.imread('test.jpg')
print(type(image))  # out: <class 'numpy.ndarray'>
print(image.dtype)  # out: uint8
print(image.shape)  # out: (300, 400, 3), i.e. (h, w, c), similar to skimage
print(image)  # channel order is BGR

'''
array([
        [ [201, 198, 143 (dim=3)],[201, 198, 143],... (w=400)],
        [ [201, 198, 143],[201, 198, 143],... ],
        ...(h=300)
      ], dtype=uint8)
'''

# dsize is (w, h)
image = cv2.resize(image, (100, 200), interpolation=cv2.INTER_LINEAR)
print(image.dtype)  # out: uint8
print(image.shape)  # out: (200, 100, 3)

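'''
Note carefully: this differs from skimage.

resize(src, dsize[, dst[, fx[, fy[, interpolation]]]])

The keyword arguments are dst, fx, fy and interpolation.
dst is the resized output image.
dsize is (w, h), whereas image.shape is (h, w, c).
fx and fy are the scale factors along the x and y directions.
interpolation is the interpolation method used when resizing. Four common choices:
cv2.INTER_AREA: resampling using pixel area relation. When shrinking an image it avoids moiré patterns; when enlarging, it behaves like cv2.INTER_NEAREST.
cv2.INTER_CUBIC: bicubic interpolation
cv2.INTER_LINEAR: bilinear interpolation
cv2.INTER_NEAREST: nearest-neighbour interpolation

[See this blog post for details](http://www.tuicool.com/articles/rq6fIn)
'''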

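'''
cv2.imread(filename, flags=None):

flags:
cv2.IMREAD_COLOR (1): Loads a color image. Any transparency of image will be neglected. It is the default flag. This gives a regular 3-channel image.
cv2.IMREAD_GRAYSCALE (0): Loads image in grayscale mode. This gives a single-channel grayscale image.
cv2.IMREAD_UNCHANGED (-1): Loads image as such including alpha channel. This gives a 4-channel image when the file has an alpha channel.

Note: the default is cv2.IMREAD_COLOR, so cv2.imread('gray.png') returns a 3-channel image whose three channels hold identical values, even though the file itself is grayscale.
'''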

In summary, a PIL image converted to numpy.ndarray has layout (h, w, c) with channels in RGB order.

After cv2.imread(), OpenCV data is a numpy.ndarray with layout (h, w, c) and channels in BGR order.
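Because the two libraries differ only in channel order, an OpenCV array can be brought into PIL's RGB order by reversing the last axis. The sketch below uses NumPy only; cv2.cvtColor(image, cv2.COLOR_BGR2RGB) is the equivalent OpenCV call. The single pixel is made up from the RGB value printed in the PIL example.

```python
import numpy as np

# One pixel in OpenCV's BGR order, i.e. the RGB value (143, 198, 201) reversed.
bgr = np.array([[[201, 198, 143]]], dtype=np.uint8)

# Reversing the last axis swaps B and R; the result is a view with a negative stride.
rgb = bgr[..., ::-1]

# Copy into contiguous memory before handing the array to code that rejects
# negative strides (torch.from_numpy is one such case).
rgb = np.ascontiguousarray(rgb)
print(rgb[0, 0])  # [143 198 201]
```

Doing this conversion once, right after cv2.imread(), keeps the rest of the pipeline consistent with models trained on RGB input.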

torchvision.transforms.ToTensor()

torchvision.transforms.transforms.py:61

class ToTensor(object):
    """Convert a ``PIL Image`` or ``numpy.ndarray`` to tensor.

    Converts a PIL Image or numpy.ndarray (H x W x C) in the range
    [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].
    """

    def __call__(self, pic):
        """
        Args:
            pic (PIL Image or numpy.ndarray): Image to be converted to tensor.

        Returns:
            Tensor: Converted image.
        """
        return F.to_tensor(pic)

    def __repr__(self):
        return self.__class__.__name__ + '()'

torchvision.transforms.functional.py:32

def to_tensor(pic):
    """Convert a ``PIL Image`` or ``numpy.ndarray`` to tensor.

    See ``ToTensor`` for more details.

    Args:
        pic (PIL Image or numpy.ndarray): Image to be converted to tensor.

    Returns:
        Tensor: Converted image.
    """
    if not(_is_pil_image(pic) or _is_numpy_image(pic)):
        raise TypeError('pic should be PIL Image or ndarray. Got {}'.format(type(pic)))

    if isinstance(pic, np.ndarray):
        # handle numpy array
        img = torch.from_numpy(pic.transpose((2, 0, 1)))
        # backward compatibility
        if isinstance(img, torch.ByteTensor):
            return img.float().div(255)
        else:
            return img

    if accimage is not None and isinstance(pic, accimage.Image):
        nppic = np.zeros([pic.channels, pic.height, pic.width], dtype=np.float32)
        pic.copyto(nppic)
        return torch.from_numpy(nppic)

    # handle PIL Image
    if pic.mode == 'I':
        img = torch.from_numpy(np.array(pic, np.int32, copy=False))
    elif pic.mode == 'I;16':
        img = torch.from_numpy(np.array(pic, np.int16, copy=False))
    elif pic.mode == 'F':
        img = torch.from_numpy(np.array(pic, np.float32, copy=False))
    elif pic.mode == '1':
        img = 255 * torch.from_numpy(np.array(pic, np.uint8, copy=False))
    else:
        img = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
    # PIL image mode: L, P, I, F, RGB, YCbCr, RGBA, CMYK
    if pic.mode == 'YCbCr':
        nchannel = 3
    elif pic.mode == 'I;16':
        nchannel = 1
    else:
        nchannel = len(pic.mode)
    img = img.view(pic.size[1], pic.size[0], nchannel)
    # put it from HWC to CHW format
    # yikes, this transpose takes 80% of the loading time/CPU
    img = img.transpose(0, 1).transpose(0, 2).contiguous()
    if isinstance(img, torch.ByteTensor):
        return img.float().div(255)
    else:
        return img

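As the to_tensor() function shows, it accepts a PIL Image or a numpy.ndarray and first rearranges it from HWC to CHW layout. If the resulting data is uint8 (a torch.ByteTensor), it is then converted to float and every pixel is divided by 255. Data of any other dtype is returned without rescaling, so an array created with dtype=np.float32, as in the PIL example above, keeps its 0-255 value range.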
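The dtype-dependent behaviour of the ndarray branch can be mirrored in plain NumPy. The function to_chw_float below is a hypothetical stand-in written for illustration, not part of torchvision; it follows the logic of the source quoted above under the assumption that the input is a 3-dimensional H x W x C array.

```python
import numpy as np


def to_chw_float(array):
    """Hypothetical NumPy-only mirror of the ndarray branch of to_tensor()."""
    chw = array.transpose((2, 0, 1))  # HWC -> CHW
    if chw.dtype == np.uint8:
        # Only uint8 input is converted to float and scaled into [0.0, 1.0].
        return chw.astype(np.float32) / 255.0
    # Any other dtype is returned with its values unchanged.
    return chw


pixels_u8 = np.full((100, 200, 3), 255, dtype=np.uint8)
pixels_f32 = pixels_u8.astype(np.float32)

scaled = to_chw_float(pixels_u8)
unscaled = to_chw_float(pixels_f32)

print(scaled.shape, scaled.max())      # (3, 100, 200) 1.0
print(unscaled.shape, unscaled.max())  # (3, 100, 200) 255.0
```

In practice this means an image should stay uint8 until it reaches ToTensor(), or be divided by 255 by hand if it has already been cast to float.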
