#!/usr/bin/env python
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Process the ImageNet Challenge bounding boxes for TensorFlow model training.

Associate the ImageNet 2012 Challenge validation data set with labels.

The raw ImageNet validation data set is expected to reside in JPEG files
located in the following directory structure.

 data_dir/ILSVRC2012_val_00000001.JPEG
 data_dir/ILSVRC2012_val_00000002.JPEG
 ...
 data_dir/ILSVRC2012_val_00050000.JPEG

This script moves the files into a directory structure like such:
 data_dir/n01440764/ILSVRC2012_val_00000293.JPEG
 data_dir/n01440764/ILSVRC2012_val_00000543.JPEG
 ...
where 'n01440764' is the unique synset label associated with
these images.

This directory reorganization requires a mapping from validation image
number (i.e. suffix of the original file) to the associated label. This
is provided in the ImageNet development kit via a Matlab file.

In order to make life easier and divorce ourselves from Matlab, we instead
supply a custom text file that provides this mapping for us.

Sample usage:
  ./preprocess_imagenet_validation_data.py ILSVRC2012_img_val \
  imagenet_2012_validation_synset_labels.txt
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import sys

from six.moves import xrange  # pylint: disable=redefined-builtin

if __name__ == '__main__':
  if len(sys.argv) < 3:  # sys.argv holds the script name followed by its arguments.
    print('Invalid usage\n'
          'usage: preprocess_imagenet_validation_data.py '
          '<validation data dir> <validation labels file>')
    sys.exit(-1)  # Stop the interpreter with a non-zero exit status.
  data_dir = sys.argv[1]
  validation_labels_file = sys.argv[2]

  # Read in the 50000 synsets associated with the validation data set.
  # imagenet_2012_validation_synset_labels.txt has 50000 lines, one label per
  # validation image, in image order; the same label appears on many lines.
  # strip() removes leading and trailing whitespace, including the newline.
  with open(validation_labels_file) as f:
    labels = [l.strip() for l in f]
  # set() keeps one copy of each label, in no particular order.
  unique_labels = set(labels)

  # Make all sub-directories in the validation data dir.
  for label in unique_labels:
    labeled_data_dir = os.path.join(data_dir, label)
    if not os.path.exists(labeled_data_dir):
      os.makedirs(labeled_data_dir)

  # Move all of the images to the appropriate sub-directory.
  # xrange() is used like range() but produces its values lazily instead of
  # building a list; under Python 3, six.moves.xrange is simply range.
  for i in xrange(len(labels)):
    basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)
    original_filename = os.path.join(data_dir, basename)
    if not os.path.exists(original_filename):
      # Skip missing files (for example, files already moved by an earlier
      # run). The print below is left disabled; its format string originally
      # lacked the %s placeholder, which raises TypeError.
      # print('Failed to find: %s' % original_filename)
      # sys.exit(-1)
      continue
    new_filename = os.path.join(data_dir, labels[i], basename)
    os.rename(original_filename, new_filename)
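
To make the file-name construction in the loop easier to follow, here is a minimal sketch of the same format string in isolation. The helper name `validation_basename` is hypothetical and not part of the script; it only wraps the expression used above.

```python
def validation_basename(index):
    """Return the validation file name for a 1-based image index.

    Hypothetical helper for illustration; the script builds the name inline.
    """
    # '%.5d' pads the integer with leading zeros to at least five digits,
    # so together with the literal '000' the number is eight digits wide.
    return 'ILSVRC2012_val_000%.5d.JPEG' % index


print(validation_basename(293))    # ILSVRC2012_val_00000293.JPEG
print(validation_basename(50000))  # ILSVRC2012_val_00050000.JPEG
```

The loop index `i` starts at 0 while the file numbers start at 1, which is why the script formats `i + 1`.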

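As soon as the code on line 82 of the script (the 'Failed to find' print statement) is enabled, it fails with:

TypeError: not all arguments converted during string formatting

The format string 'Failed to find: ' contains no %s placeholder, so the % operator has nowhere to put original_filename.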
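
The error can be reproduced outside the script. The snippet below is a small sketch with a made-up path; it shows the failing expression next to a version that includes the `%s` placeholder.

```python
# Illustrative path only; any string triggers the same behaviour.
original_filename = 'data_dir/ILSVRC2012_val_00000001.JPEG'

try:
    # No %s in the format string, so the argument cannot be converted.
    message = 'Failed to find: ' % original_filename
except TypeError as err:
    message = 'TypeError: %s' % err

# With a placeholder the same expression formats as intended.
fixed = 'Failed to find: %s' % original_filename

print(message)
print(fixed)
```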

The following error also came up during the run:

Organizing the validation data into sub-directories.
Traceback (most recent call last):
  File "F:/datasets/preprocess_imagenet_validation_data.py", line 86, in <module>
    os.rename(original_filename, new_filename)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'F:/ILSVRC2012_img_val/ILSVRC2012_val_00032304.JPEG' -> 'F:/ILSVRC2012_img_val/n02109961\\ILSVRC2012_val_00032304.JPEG'

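I suspected that too many files could not be renamed in one go, although WinError 32 means that another process had the file open at that moment. Either way, I re-ran

./download_and_convert_imagenet.sh /f/ILSVRC2012_img_val_varified

and preprocess_imagenet_validation_data.py carried on renaming the remaining files.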
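
Instead of re-running the whole script by hand, one option is to retry the rename a few times when the file is temporarily held by another process. The following is only a sketch under assumptions: `rename_with_retry`, `attempts`, and `delay` are hypothetical names and values, not part of the original script, and it requires Python 3 for `PermissionError`.

```python
import os
import tempfile
import time


def rename_with_retry(src, dst, attempts=5, delay=0.5):
    """Rename src to dst, retrying while another process holds the file.

    Hypothetical helper; attempts and delay are illustrative defaults.
    Returns True on success, False if every attempt raised PermissionError.
    """
    for attempt in range(attempts):
        try:
            os.rename(src, dst)
            return True
        except PermissionError:
            # Give the other process a moment to release the file,
            # unless this was the last attempt.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


# Small demonstration in a throwaway directory.
demo_dir = tempfile.mkdtemp()
src = os.path.join(demo_dir, 'ILSVRC2012_val_00000001.JPEG')
dst_dir = os.path.join(demo_dir, 'n01440764')
os.makedirs(dst_dir)
open(src, 'w').close()

moved = rename_with_retry(src, os.path.join(dst_dir, os.path.basename(src)))
print(moved)
```

In the script, the call to `os.rename(original_filename, new_filename)` could be swapped for this helper; files that still fail would be picked up by a later run, since missing source files are skipped.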

Adapted from https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py
