tflearn 数据集太大无法加载进内存问题?——使用image_preloader 或者是 hdf5 dataset to deal with that issue
tflearn 数据集太大无法加载进内存问题?
|
Hi, all! I know that in pure tensorflow following snippet can be wrapped by for loop to train on several batches: batch_gen = generator(data) |
That is a good idea! While this get implemented, you can use image_preloader or hdf5 dataset to deal with that issue.
Image PreLoader
tflearn.data_utils.image_preloader (target_path, image_shape, mode='file', normalize=True, grayscale=False, categorical_labels=True, files_extension=None, filter_channel=False)
Create a python array (Preloader) that loads images on the fly (from disk or url). There is two ways to provide image samples 'folder' or 'file', see the specifications below.
'folder' mode: Load images from disk, given a root folder. This folder should be arranged as follow:
ROOT_FOLDER -> SUBFOLDER_0 (CLASS 0) -> CLASS0_IMG1.jpg -> CLASS0_IMG2.jpg -> ...-> SUBFOLDER_1 (CLASS 1) -> CLASS1_IMG1.jpg -> ...-> ...
Note that if sub-folders are not integers from 0 to n_classes, an id will be assigned to each sub-folder following alphabetical order.
'file' mode: A plain text file listing every image path and class id. This file should be formatted as follow:
/path/to/img1 class_id
/path/to/img2 class_id
/path/to/img3 class_id
Note that load images on the fly and convert is time inefficient, so you can instead use build_hdf5_image_dataset to build a HDF5 dataset that enable fast retrieval (this function takes similar arguments).
Examples
# Load path/class_id image file:
dataset_file = 'my_dataset.txt'
# Build the preloader array, resize images to 128x128
from tflearn.data_utils import image_preloader
X, Y = image_preloader(dataset_file, image_shape=(128, 128), mode='file', categorical_labels=True, normalize=True)
# Build neural network and train
network = ...
model = DNN(network, ...)
model.fit(X, Y)
Arguments
- target_path:
str. Path of root folder or images plain text file. - image_shape:
tuple (height, width). The images shape. Images that doesn't match that shape will be resized. - mode:
strin ['file', 'folder']. The data source mode. 'folder' accepts a root folder with each of his sub-folder representing a class containing the images to classify. 'file' accepts a single plain text file that contains every image path with their class id. Default: 'folder'. - categorical_labels:
bool. If True, labels are converted to binary vectors. - normalize:
bool. If True, normalize all pictures by dividing every image array by 255. - grayscale:
bool. If true, images are converted to grayscale. - files_extension:
list of str. A list of allowed image file extension, for example ['.jpg', '.jpeg', '.png']. If None, all files are allowed. - filter_channel:
bool. If true, images which the channel is not 3 should be filter.
Returns
(X, Y): with X the images array and Y the labels array.
参考:https://github.com/tflearn/tflearn/issues/555
I try preloader, but seems have bugs. Code as below:
`from future import division, print_function, absolute_import
import tflearn
n = 5
train_dataset_file = '/home/lfwin/imagenet-data/raw-data/train_10c'
test_dataset_file = '/home/lfwin/imagenet-data/raw-data/validation_10c/'
from tflearn.data_utils import image_preloader
X, Y = image_preloader(train_dataset_file, image_shape=(299, 299, 3), mode='folder',
categorical_labels=True, normalize=True)
(testX, testY) = image_preloader(test_dataset_file, image_shape=(299, 299, 3), mode='folder',
categorical_labels=True, normalize=True)
net = tflearn.input_data(shape=[None, 299, 299, 3])
net = tflearn.conv_2d(net, 16, 3, regularizer='L2', weight_decay=0.0001)
net = tflearn.residual_block(net, n, 16)
net = tflearn.residual_block(net, 1, 32, downsample=True)
net = tflearn.residual_block(net, n-1, 32)
net = tflearn.residual_block(net, 1, 64, downsample=True)
net = tflearn.residual_block(net, n-1, 64, downsample=True)
net = tflearn.batch_normalization(net)
net = tflearn.activation(net, 'relu')
net = tflearn.global_avg_pool(net)
net = tflearn.fully_connected(net, 20, activation='softmax')
mom = tflearn.Momentum(0.1, lr_decay=0.1, decay_step=32000, staircase=True)
net = tflearn.regression(net, optimizer=mom,
loss='categorical_crossentropy')
model = tflearn.DNN(net, checkpoint_path='model_resnet_cifar10',
max_checkpoints=10, tensorboard_verbose=0,
clip_gradients=0.)
model.fit(X, Y, n_epoch=1, validation_set=(testX, testY),
snapshot_epoch=False, snapshot_step=500,
show_metric=True, batch_size=16, shuffle=True,
run_id='resnet_imagenet')
`
During debugging, following bugs appeared:
Momentum | epoch: 000 | loss: -717286.50000 - acc: -30021.9395 -- iter: 00032/26000
...
run_id=run_id)
File "/home/lfwin/hello/tflearn/tflearn/helpers/trainer.py", line 289, in fit
show_metric)
File "/home/lfwin/hello/tflearn/tflearn/helpers/trainer.py", line 706, in _train
feed_batch)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 372, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 619, in _run
np_val = np.array(subfeed_val, dtype=subfeed_dtype)
ValueError: could not broadcast input array from shape (299,299,3) into shape (299,299)
1 epoch is 000 always, loss is Nan after 1st step.
2 broadcasting error from shape (299,299,3) into shape (299,299)
|
I thought posting this might help you somehow: I came |
I ran into this also and solved this by using tflearn.reshape(net, new_shape=[-1, 300, 300, 1]) after input_data. My problem was that grayscale=True with image_preloader caused (300, 300) shape so conv_2d wasn't my friend anymore and I didn't find any way to use normal np.reshape with image_preloader instance. Now everything gets jammed nicely into the right shape.
tflearn 数据集太大无法加载进内存问题?——使用image_preloader 或者是 hdf5 dataset to deal with that issue的更多相关文章
- 以最省内存的方式把大图片加载到内存及获取Exif信息和获取屏幕高度和宽度的新方法
我们在加载图片时经常会遇到内存溢出的问题,图片太大,我们加载图片时,一般都是用的如下一般方法(加载本地图片): /** * 不作处理,去加载图片的方法,碰到比较大的图片会内存溢出 */ private ...
- WPF 大数据加载过程中的等待效果——圆圈转动
大家肯定遇到过或将要遇到加载大数据的时候,如果出现长时间的空白等待,一般人的概念会是:难道卡死了? 作为一个懂技术的挨踢技术,即使你明知道数据量太大正在加载,但是假如看不到任何动静,自己觉得还是一种很 ...
- asp.net中TreeView的大数据加载速度优化
由于数据量太大,加载树时间很长,所以进行了优化 前台 .aspx <asp:Panel ID="Panel2" runat="server" Height ...
- 解决父类加载iframe,src参数过大导致加载失败
原文:解决父类加载iframe,src参数过大导致加载失败 <iframe src="*******.do?param=****" id="leftFrame&qu ...
- [WP8.1UI控件编程]Windows Phone大数据量网络图片列表的异步加载和内存优化
11.2.4 大数据量网络图片列表的异步加载和内存优化 虚拟化技术可以让Windows Phone上的大数据量列表不必担心会一次性加载所有的数据,保证了UI的流程性.对于虚拟化的技术,我们不仅仅只是依 ...
- [Aaronyang] 写给自己的WPF4.5 笔记6[三巴掌-大数据加载与WPF4.5 验证体系详解 2/3]
我要做回自己--Aaronyang的博客(www.ayjs.net) 博客摘要: Virtualizing虚拟化DEMO 和 大数据加载的思路及相关知识 WPF数据提供者的使用ObjectDataPr ...
- 【ESXI6.0】 ESXI6.0安装时无法安装网卡驱动的解决方法及将网卡驱动加载进ISO
http://blog.163.com/xifanliang@yeah/blog/static/115078488201571584321787/ 若安装时提示如下图所示 之后安装无法完成,会提示没有 ...
- Unity动态加载和内存管理(三合一)
原址:http://game.ceeger.com/forum/read.php?tid=4394#info 最近一直在和这些内容纠缠,把心得和大家共享一下: Unity里有两种动态加载机制:一是Re ...
- [转]全面理解Unity加载和内存管理
[转]全面理解Unity加载和内存管理 最近一直在和这些内容纠缠,把心得和大家共享一下: Unity里有两种动态加载机制:一是Resources.Load,一是通过AssetBundle,其实两者本质 ...
随机推荐
- MongoDb 出现配置服务不同步的处理
主要片方法就是用正常的配置文件的数据覆盖有问题的就行. 引用: http://dba.stackexchange.com/questions/48232/mongodb-config-servers- ...
- Unity -- 入门教程三
进入这个页面,按编译器版本进行下载,我用的是2010,所以要下载这个. 安装就不用我教了,下面开始看我是如何导入Unity VS的. 点击Import之后我们会发现并没有发生什么,但是接下来我们按一下 ...
- iOS开发 绘图详解
Quartz概述 Quartz是Mac OS X的Darwin核心之上的绘图层,有时候也认为是CoreGraphics.共有两种部分组成 Quartz Compositor,合成视窗系统,管理和合 ...
- 手动安装pip
apt-get instal pip 成功之后,有根据pip的提示,进行了升级,升级之后,pip就出问题了 为了解决上面问题,手动安装pip,依次执行下面命令 1 2 3 4 5 [root@min ...
- C# 将long类型写入二进制文件用bw.Write(num);将其读出用long num= br.ReadInt64();
理由: 因为long类型是 System.Int64 (长整型,占 8 字节,表示 64 位整数,范围大约 -(10 的 19) 次方 到 10 的 19 次方) 而long BinaryReader ...
- 简化动态MERGE的SQL计算
MSSQL.ORACLE等数据库支持MERGE语句更新表.但表结构未知时,因为缺乏集合类数据.用存储过程获得表结构再动态拼出SQL很麻烦,代码会有几十行之多:相同原因,用Java等高级语言实现也不简单 ...
- 《图论》——广度优先遍历算法(BFS)
十大算法之广度优先遍历: 本文以实例形式讲述了基于Java的图的广度优先遍历算法实现方法,详细方法例如以下: 用邻接矩阵存储图方法: 1.确定图的顶点个数和边的个数 2.输入顶点信息存储在一维数组ve ...
- MySQL数据导入与导出
http://blog.chinaunix.net/uid-23354495-id-3188029.html mysql备份脚本之select into outfile
- systemd、upstart和system V
http://blog.csdn.net/kumu_linux/article/details/7653802 systemd是Linux下的一种init软件,由Lennart Poettering ...
- Android对apk源代码的改动--反编译+源代码改动+又一次打包+签名【附HelloWorld的改动实例】
最近遇到了须要改动apk源代码的问题,于是上网查了下相关资料.编写了HelloWorld进行改动看看可行性,经过实验证明此方案可行,而且后来也成功用这种方法对目标apk进行了改动,仅仅只是须要改动的部 ...