https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py 改编版
#!/usr/bin/env python
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Process the ImageNet Challenge bounding boxes for TensorFlow model training. Associate the ImageNet 2012 Challenge validation data set with labels. The raw ImageNet validation data set is expected to reside in JPEG files
located in the following directory structure. data_dir/ILSVRC2012_val_00000001.JPEG
data_dir/ILSVRC2012_val_00000002.JPEG
...
data_dir/ILSVRC2012_val_00050000.JPEG This script moves the files into a directory structure like such:
data_dir/n01440764/ILSVRC2012_val_00000293.JPEG
data_dir/n01440764/ILSVRC2012_val_00000543.JPEG
...
where 'n01440764' is the unique synset label associated with
these images. This directory reorganization requires a mapping from validation image
number (i.e. suffix of the original file) to the associated label. This
is provided in the ImageNet development kit via a Matlab file. In order to make life easier and divorce ourselves from Matlab, we instead
supply a custom text file that provides this mapping for us. Sample usage:
./preprocess_imagenet_validation_data.py ILSVRC2012_img_val \
imagenet_2012_validation_synset_labels.txt
""" from __future__ import absolute_import
from __future__ import division
from __future__ import print_function import os
import sys from six.moves import xrange # pylint: disable=redefined-builtin if __name__ == '__main__':
if len(sys.argv) < 3: # sys.argv返回脚本本身的名字及给定脚本的参数.
print('Invalid usage\n'
'usage: preprocess_imagenet_validation_data.py '
'<validation data dir> <validation labels file>')
sys.exit(-1) # System.exit(-1)是指所有程序(方法,类等)停止,系统停止运行。
data_dir = sys.argv[1]
validation_labels_file = sys.argv[2] # Read in the 50000 synsets associated with the validation data set.
# imagenet_2012_validation_synset_labels.txt 这个文件中有50000行类别,有重复,与50000图片是一一对应的
labels = [l.strip() for l in open(validation_labels_file).readlines()] # strip() 方法用于移除字符串头尾指定的字符(默认为空格或换行符)。
unique_labels = set(labels) # set() 函数创建一个无序不重复元素集,可进行关系测试,删除重复数据,还可以计算交集、差集、并集等。 # Make all sub-directories in the validation data dir.
for label in unique_labels:
labeled_data_dir = os.path.join(data_dir, label)
if not os.path.exists(labeled_data_dir):
os.makedirs(labeled_data_dir) # Move all of the image to the appropriate sub-directory.
for i in xrange(len(labels)): # xrange() 函数用法与 range 完全相同,所不同的是生成的不是一个数组,而是一个生成器。
basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)
original_filename = os.path.join(data_dir, basename)
if not os.path.exists(original_filename):
#print('Failed to find: ' % original_filename)
continue
#sys.exit(-1)
new_filename = os.path.join(data_dir, labels[i], basename)
os.rename(original_filename, new_filename)
82行的代码一加进去,就出错:
TypeError: not all arguments converted during string formatting
过程中还出现了以下错误:
Organizing the validation data into sub-directories.
Traceback (most recent call last):
File "F:/datasets/preprocess_imagenet_validation_data.py", line 86, in <module>
os.rename(original_filename, new_filename)
PermissionError: [WinError 32] ▒▒һ▒▒▒▒▒▒▒▒▒▒ʹ▒ô▒▒ļ▒▒▒▒▒▒▒▒▒▒▒▒ʡ▒: 'F:/ILSVRC2012_img_val/ILSVRC2012_val_00032304.JPEG' -> 'F:/ILSVRC2012_img_val/n02109961\\ILSVRC2012_val_00032304.JPEG'
可能是不能够一次性重命名太多文件,反正我重新运行了
./download_and_convert_imagenet.sh /f/ILSVRC2012_img_val_varified
preprocess_imagenet_validation_data.py这个程序可以继续重命名文件。
https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py 改编版的更多相关文章
- https://github.com/chenghuige/tensorflow-exp/blob/master/examples/sparse-tensor-classification/
https://github.com/chenghuige/tensorflow-exp/blob/master/examples/sparse-tensor-classification/ ...
- 结对项目https://github.com/bxoing1994/test/blob/master/源代码
所选项目名称:文本替换 结对人:曲承玉 github地址 :https://github.com/bxoing1994/test/blob/master/源代码 结对人github地址:ht ...
- https://github.com/python/cpython/blob/master/Doc/library/contextlib.rst 被同一个线程多次获取的同步基元组件
# -*- coding: utf-8 -*- import time from threading import Lock, RLock from datetime import datetime ...
- https://github.com/golang/crypto/blob/master/bcrypt/bcrypt.go
https://github.com/golang/crypto/blob/master/bcrypt/bcrypt.go
- https://github.com/PyMySQL/PyMySQL/blob/master/pymysql/connections.py
# Python implementation of the MySQL client-server protocol # http://dev.mysql.com/doc/internals/en/ ...
- 用swoole实现mysql的连接池--摘自https://github.com/153734009/doc/blob/master/php/mysql_pool.php
<?php $serv = new swoole_server("0.0.0.0", 9508); $serv->set(['worker_num'=>1 ...
- GC 的认识(转) https://github.com/qcrao/Go-Questions/blob/master/GC/GC.md#1-什么是-gc有什么作用
1. 什么是 GC,有什么作用? GC,全称 Garbage Collection,即垃圾回收,是一种自动内存管理的机制. 当程序向操作系统申请的内存不再需要时,垃圾回收主动将其回收并供其他代码进行内 ...
- tensorflow models flags 初步使用
参考官方仓库:https://github.com/tensorflow/models/tree/master/official/utils/flags 测试Demo代码如下: from absl i ...
- Ubuntu18.04下安装、测试tensorflow/models Tensorflow Object Detection API 笔记
参考:https://www.jianshu.com/p/1ed2d9ce6a88 安装 安装conda+tensorflow库 下载protoc linux x64版,https://github. ...
随机推荐
- APK优化工具zipalign的详细介绍和使用
什么是Zipalign? Zipalign是一个android平台上整理APK文件的工具,它首次被引入是在Android 1.6版本的SDK软件开发工具包中.它能够对打包的Android应用 ...
- adb获得安卓系统版本及截屏
[时间:2017-09] [状态:Open] [关键词:adb, android,系统版本,截屏,screencap] 本文主要是我遇到的android命令行用法的一个简单总结 系统版本 获取系统版本 ...
- windows下telnet命令不好用解决方案;
1.按网上说的windows下打开功能,telnet客户端打钩是必须的: 2.如果还不行,找到telnet.exe:然后把此路径添加到环境变量path下即可: ps:telnet的退出:quit命令:
- 异类查询要求为连接设置ANSI_NULLS和ANSI_WARNINGS选项
在查询分析器中,先输入两句 set ansi_nulls on set ansi_warnings on 执行然后再 Create Proc 存储过程 ...
- 【经验总结】- IDEA无法显示Project目录怎么办
1. 关闭IDEA 2.然后删除项目文件夹下的.idea文件夹 3.重新用IDEA工具打开项目 4.再次点击如下图即可搞定
- 3.贝叶斯网络表示(The Bayesian Network Representation)
对于一个n随机变量的联合分布,一般需要2**n-1个参数来表示这个分布.但是,我们可以通过随机变量之间的独立性,减少参数的个数. naive Beyes model: Bayesian Network ...
- python内建时间模块 time和datetime
时间模块 UTC(Coordinated Universal Time,世界协调时)亦即格林威治天文时间,世界标准时间.在中国为UTC+8.DST(Daylight Saving Time)即夏令时. ...
- 给你的app添加桌面widget
首先,什么是桌面widget,桌面widget是一种桌面插件,如下图: 这种类型的控件叫做widget,一般长按桌面会弹出一个界面让你选择控件,选择完了拖到桌面就能使用了. 下面我们为这个app来添加 ...
- Cesium简单使用
CesiumJS是一个基于javascript的浏览器器3d地图引擎 下载 https://cesiumjs.org/downloads/ 下载的Cesium-1.56.1,解压后的结构为 1.设置W ...
- timer计算两个方法执行时间
>>> from timeit import Timer >>> Timer("temp = x; x = y; y = temp", &quo ...