https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py 改编版
#!/usr/bin/env python
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Process the ImageNet Challenge bounding boxes for TensorFlow model training. Associate the ImageNet 2012 Challenge validation data set with labels. The raw ImageNet validation data set is expected to reside in JPEG files
located in the following directory structure. data_dir/ILSVRC2012_val_00000001.JPEG
data_dir/ILSVRC2012_val_00000002.JPEG
...
data_dir/ILSVRC2012_val_00050000.JPEG This script moves the files into a directory structure like such:
data_dir/n01440764/ILSVRC2012_val_00000293.JPEG
data_dir/n01440764/ILSVRC2012_val_00000543.JPEG
...
where 'n01440764' is the unique synset label associated with
these images. This directory reorganization requires a mapping from validation image
number (i.e. suffix of the original file) to the associated label. This
is provided in the ImageNet development kit via a Matlab file. In order to make life easier and divorce ourselves from Matlab, we instead
supply a custom text file that provides this mapping for us. Sample usage:
./preprocess_imagenet_validation_data.py ILSVRC2012_img_val \
imagenet_2012_validation_synset_labels.txt
""" from __future__ import absolute_import
from __future__ import division
from __future__ import print_function import os
import sys from six.moves import xrange # pylint: disable=redefined-builtin if __name__ == '__main__':
if len(sys.argv) < 3: # sys.argv返回脚本本身的名字及给定脚本的参数.
print('Invalid usage\n'
'usage: preprocess_imagenet_validation_data.py '
'<validation data dir> <validation labels file>')
sys.exit(-1) # System.exit(-1)是指所有程序(方法,类等)停止,系统停止运行。
data_dir = sys.argv[1]
validation_labels_file = sys.argv[2] # Read in the 50000 synsets associated with the validation data set.
# imagenet_2012_validation_synset_labels.txt 这个文件中有50000行类别,有重复,与50000图片是一一对应的
labels = [l.strip() for l in open(validation_labels_file).readlines()] # strip() 方法用于移除字符串头尾指定的字符(默认为空格或换行符)。
unique_labels = set(labels) # set() 函数创建一个无序不重复元素集,可进行关系测试,删除重复数据,还可以计算交集、差集、并集等。 # Make all sub-directories in the validation data dir.
for label in unique_labels:
labeled_data_dir = os.path.join(data_dir, label)
if not os.path.exists(labeled_data_dir):
os.makedirs(labeled_data_dir) # Move all of the image to the appropriate sub-directory.
for i in xrange(len(labels)): # xrange() 函数用法与 range 完全相同,所不同的是生成的不是一个数组,而是一个生成器。
basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)
original_filename = os.path.join(data_dir, basename)
if not os.path.exists(original_filename):
#print('Failed to find: ' % original_filename)
continue
#sys.exit(-1)
new_filename = os.path.join(data_dir, labels[i], basename)
os.rename(original_filename, new_filename)
82行的代码一加进去,就出错:
TypeError: not all arguments converted during string formatting
过程中还出现了以下错误:
Organizing the validation data into sub-directories.
Traceback (most recent call last):
File "F:/datasets/preprocess_imagenet_validation_data.py", line 86, in <module>
os.rename(original_filename, new_filename)
PermissionError: [WinError 32] ▒▒һ▒▒▒▒▒▒▒▒▒▒ʹ▒ô▒▒ļ▒▒▒▒▒▒▒▒▒▒▒▒ʡ▒: 'F:/ILSVRC2012_img_val/ILSVRC2012_val_00032304.JPEG' -> 'F:/ILSVRC2012_img_val/n02109961\\ILSVRC2012_val_00032304.JPEG'
可能是不能够一次性重命名太多文件,反正我重新运行了
./download_and_convert_imagenet.sh /f/ILSVRC2012_img_val_varified
preprocess_imagenet_validation_data.py这个程序可以继续重命名文件。
https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py 改编版的更多相关文章
- https://github.com/chenghuige/tensorflow-exp/blob/master/examples/sparse-tensor-classification/
https://github.com/chenghuige/tensorflow-exp/blob/master/examples/sparse-tensor-classification/ ...
- 结对项目https://github.com/bxoing1994/test/blob/master/源代码
所选项目名称:文本替换 结对人:曲承玉 github地址 :https://github.com/bxoing1994/test/blob/master/源代码 结对人github地址:ht ...
- https://github.com/python/cpython/blob/master/Doc/library/contextlib.rst 被同一个线程多次获取的同步基元组件
# -*- coding: utf-8 -*- import time from threading import Lock, RLock from datetime import datetime ...
- https://github.com/golang/crypto/blob/master/bcrypt/bcrypt.go
https://github.com/golang/crypto/blob/master/bcrypt/bcrypt.go
- https://github.com/PyMySQL/PyMySQL/blob/master/pymysql/connections.py
# Python implementation of the MySQL client-server protocol # http://dev.mysql.com/doc/internals/en/ ...
- 用swoole实现mysql的连接池--摘自https://github.com/153734009/doc/blob/master/php/mysql_pool.php
<?php $serv = new swoole_server("0.0.0.0", 9508); $serv->set(['worker_num'=>1 ...
- GC 的认识(转) https://github.com/qcrao/Go-Questions/blob/master/GC/GC.md#1-什么是-gc有什么作用
1. 什么是 GC,有什么作用? GC,全称 Garbage Collection,即垃圾回收,是一种自动内存管理的机制. 当程序向操作系统申请的内存不再需要时,垃圾回收主动将其回收并供其他代码进行内 ...
- tensorflow models flags 初步使用
参考官方仓库:https://github.com/tensorflow/models/tree/master/official/utils/flags 测试Demo代码如下: from absl i ...
- Ubuntu18.04下安装、测试tensorflow/models Tensorflow Object Detection API 笔记
参考:https://www.jianshu.com/p/1ed2d9ce6a88 安装 安装conda+tensorflow库 下载protoc linux x64版,https://github. ...
随机推荐
- Python 简单入门指北(二)
Python 简单入门指北(二) 2 函数 2.1 函数是一等公民 一等公民指的是 Python 的函数能够动态创建,能赋值给别的变量,能作为参传给函数,也能作为函数的返回值.总而言之,函数和普通变量 ...
- C语言 · 空白格式化
标题:空白格式化 “空白格式化”具体做法是:去掉所有首尾空白:中间的多个空白替换为一个空格.所谓空白指的是:空格.制表符.回车符. 填空为:*p_to<*p_from: #include< ...
- Servlet 2.0 && Servlet 3.0 新特性
概念:透传. Callback 在异步线程中是如何使用的.?? Servlet 2.0 && Servlet 3.0 新特性 Servlet 2.0 && Servle ...
- mui 浏览器一样自动缩放
<!doctype html> <html> <head> <meta charset="UTF-8"> <title> ...
- TextView 链接显示及跳转
当文字中出现URL.E-mail.电话号码等的时候,我们为TextView设置链接.总结起来,一共有4种方法来为TextView实现链接.我们一一举例介绍: 1. 在xml里添加android:aut ...
- 02Spark的左连接
两个文件,一个是用户的数据,一个是交易的数据. 用户: 交易: 流程如下: 分为以下几个步骤: (1)分别读取user文件和transform文件,并转为两个RDD. * (2)对上面两个RDD执行m ...
- 01Spark的TopN问题
和hadoop的目的一样,给你数据,然后取TopN.数据如下: 取出数据在排名前十的数据. 代码如下: package com.test.book; import java.util.ArrayLis ...
- c++中常用的泛型算法
std中定义了很好几种顺序容器,它们自身也提供了一些操作,但是还有很多算法,容器本身没有提供. 而在algorithm头文件中,提供了许多算法,适用了大多数顺序容器.与c++11相比,很多函数在 c+ ...
- 第四百一十五节,python常用排序算法学习
第四百一十五节,python常用排序算法学习 常用排序 名称 复杂度 说明 备注 冒泡排序Bubble Sort O(N*N) 将待排序的元素看作是竖着排列的“气泡”,较小的元素比较轻,从而要往上浮 ...
- SSH配置struts校验发生No result defined for action actions.AdminLoginAction and result input
配置struts校验发生No result defined for action actions.AdminLoginAction and result input,但是登录,success.jsp, ...