Adapted version of https://github.com/tensorflow/models/blob/master/research/slim/datasets/preprocess_imagenet_validation_data.py
#!/usr/bin/env python
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Process the ImageNet Challenge bounding boxes for TensorFlow model training.

Associate the ImageNet 2012 Challenge validation data set with labels.

The raw ImageNet validation data set is expected to reside in JPEG files
located in the following directory structure.

 data_dir/ILSVRC2012_val_00000001.JPEG
 data_dir/ILSVRC2012_val_00000002.JPEG
 ...
 data_dir/ILSVRC2012_val_00050000.JPEG

This script moves the files into a directory structure like such:
 data_dir/n01440764/ILSVRC2012_val_00000293.JPEG
 data_dir/n01440764/ILSVRC2012_val_00000543.JPEG
 ...
where 'n01440764' is the unique synset label associated with
these images.

This directory reorganization requires a mapping from validation image
number (i.e. suffix of the original file) to the associated label. This
is provided in the ImageNet development kit via a Matlab file.

In order to make life easier and divorce ourselves from Matlab, we instead
supply a custom text file that provides this mapping for us.

Sample usage:
  ./preprocess_imagenet_validation_data.py ILSVRC2012_img_val \
  imagenet_2012_validation_synset_labels.txt
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import sys

from six.moves import xrange  # pylint: disable=redefined-builtin


if __name__ == '__main__':
  if len(sys.argv) < 3:  # sys.argv holds the script name followed by its arguments.
    print('Invalid usage\n'
          'usage: preprocess_imagenet_validation_data.py '
          '<validation data dir> <validation labels file>')
    sys.exit(-1)  # Raises SystemExit, so the script stops with a non-zero exit status.
  data_dir = sys.argv[1]
  validation_labels_file = sys.argv[2]

  # Read in the 50000 synsets associated with the validation data set.
  # imagenet_2012_validation_synset_labels.txt has 50000 lines, one synset
  # label per line. Labels repeat, and the n-th line belongs to the n-th
  # validation image.
  # strip() removes leading and trailing whitespace, including the newline.
  labels = [l.strip() for l in open(validation_labels_file).readlines()]
  # set() builds an unordered collection of unique elements, which drops the
  # repeated labels.
  unique_labels = set(labels)

  # Make all sub-directories in the validation data dir.
  for label in unique_labels:
    labeled_data_dir = os.path.join(data_dir, label)
    if not os.path.exists(labeled_data_dir):
      os.makedirs(labeled_data_dir)

  # Move all of the images to the appropriate sub-directory.
  # xrange comes from six.moves: a lazy sequence on Python 2 (no list is
  # built), and the built-in range on Python 3.
  for i in xrange(len(labels)):
    basename = 'ILSVRC2012_val_000%.5d.JPEG' % (i + 1)
    original_filename = os.path.join(data_dir, basename)
    if not os.path.exists(original_filename):
      #print('Failed to find: ' % original_filename)
      continue
      #sys.exit(-1)
    new_filename = os.path.join(data_dir, labels[i], basename)
    os.rename(original_filename, new_filename)
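The pairing between label lines and image files is purely positional: the label at 0-based index i belongs to the image numbered i + 1. A minimal sketch of that mapping follows. The helper name validation_basename is hypothetical, and the three-entry label list is made up for illustration; it is not the real ordering of the 50000-line labels file.

```python
import os


def validation_basename(index):
    """Return the validation file name for a 0-based label index."""
    # '%.5d' pads the image number with zeros to at least five digits.
    return 'ILSVRC2012_val_000%.5d.JPEG' % (index + 1)


# Illustrative labels only; the real file has 50000 lines.
sample_labels = ['n01440764', 'n02109961', 'n01440764']

# Relative destination of each image: <synset label>/<original file name>.
targets = [os.path.join(label, validation_basename(i))
           for i, label in enumerate(sample_labels)]

for target in targets:
    print(target)
```

Repeated labels simply map several images into the same sub-directory, which is why the script only needs one directory per unique label.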
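Uncommenting the print call on line 82 of my copy of the script (the line that reports 'Failed to find') immediately raises:
TypeError: not all arguments converted during string formatting
The format string 'Failed to find: ' contains no %s placeholder, so the % operator has nowhere to put original_filename. Writing 'Failed to find: %s' % original_filename avoids the error.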
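The TypeError is easy to reproduce in isolation. The sketch below triggers it and then shows the same message built with a placeholder; the file name is just a sample value.

```python
filename = 'ILSVRC2012_val_00000001.JPEG'

# No placeholder in the format string, so the argument cannot be consumed.
try:
    message = 'Failed to find: ' % filename
except TypeError as err:
    error_text = str(err)

# With a %s placeholder the argument is substituted as intended.
fixed_message = 'Failed to find: %s' % filename

print(error_text)
print(fixed_message)
```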
The following error also came up during the run:
Organizing the validation data into sub-directories.
Traceback (most recent call last):
  File "F:/datasets/preprocess_imagenet_validation_data.py", line 86, in <module>
    os.rename(original_filename, new_filename)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'F:/ILSVRC2012_img_val/ILSVRC2012_val_00032304.JPEG' -> 'F:/ILSVRC2012_img_val/n02109961\\ILSVRC2012_val_00032304.JPEG'
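Perhaps too many files cannot be renamed in one go, although the message itself only says that another process had the file open at that moment. Either way, I simply reran
./download_and_convert_imagenet.sh /f/ILSVRC2012_img_val_varified
and preprocess_imagenet_validation_data.py picked up where it left off: existing sub-directories are left alone and images that were already moved are skipped, so only the remaining files get renamed.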
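Rerunning by hand works, but if the lock is only transient (an assumption; the traceback does not say which process held the file), the rename could also be retried in place. A minimal sketch of that idea, not part of the original script: rename_with_retry is a hypothetical helper, and the attempt count and delay are arbitrary values.

```python
import os
import time


def rename_with_retry(src, dst, attempts=5, delay=0.5):
    """Rename src to dst, retrying while the file is temporarily locked."""
    for attempt in range(attempts):
        try:
            os.rename(src, dst)
            return True
        except PermissionError:
            # Out of attempts: let the caller see the original error.
            if attempt == attempts - 1:
                raise
            # Give the other process a moment to release the file.
            time.sleep(delay)
    # Only reached when attempts is 0, i.e. nothing was tried.
    return False
```

In the loop above, the call os.rename(original_filename, new_filename) would become rename_with_retry(original_filename, new_filename). PermissionError only exists on Python 3, which matches the traceback shown here.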