About TFRecord

Writing a TFRecord

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API
from tqdm import tqdm

# data_train and the train_* arrays are assumed to be loaded earlier in the script.
print("convert data into tfrecord:train\n")
out_file_train = "/home/huadong.wang/bo.yan/fudan_mtl/data/ace2005/bn_nw.train.tfrecord"
writer = tf.python_io.TFRecordWriter(out_file_train)
for i in tqdm(range(len(data_train))):
    # Id arrays are stored as raw bytes; scalar fields are stored as int64.
    # tobytes() replaces the deprecated ndarray.tostring().
    record = tf.train.Example(features=tf.train.Features(feature={
        'word_ids': tf.train.Feature(bytes_list=tf.train.BytesList(value=[train_x[i].tobytes()])),
        'et_ids1': tf.train.Feature(bytes_list=tf.train.BytesList(value=[train_et1[i].tobytes()])),
        'et_ids2': tf.train.Feature(bytes_list=tf.train.BytesList(value=[train_et2[i].tobytes()])),
        'position_ids1': tf.train.Feature(bytes_list=tf.train.BytesList(value=[train_p1[i].tobytes()])),
        # Assumed fix: the original wrote train_p1 here as well, which looks like a
        # copy-paste slip; train_p2 is presumed to be the second position array.
        'position_ids2': tf.train.Feature(bytes_list=tf.train.BytesList(value=[train_p2[i].tobytes()])),
        'chunks': tf.train.Feature(bytes_list=tf.train.BytesList(value=[train_chunks[i].tobytes()])),
        'spath_ids': tf.train.Feature(bytes_list=tf.train.BytesList(value=[train_spath[i].tobytes()])),
        'seq_len': tf.train.Feature(int64_list=tf.train.Int64List(value=[train_x_len[i]])),
        # The label is stored as a class index, not as a one-hot vector.
        'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[np.argmax(train_relation[i])])),
        'task': tf.train.Feature(int64_list=tf.train.Int64List(value=[np.int64(0)]))
    }))
    writer.write(record.SerializeToString())
writer.close()
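The writer stores each id array as raw bytes, which keeps neither the dtype nor the shape. The following is a minimal NumPy-only sketch of what that means when reading the data back. The array contents and the int64 dtype are assumptions made for illustration, not values from the original data.

```python
import numpy as np

# Illustrative id array; int64 is an assumed dtype.
ids = np.array([3, 1, 4, 1, 5], dtype=np.int64)

# This byte string is what ends up inside BytesList(value=[...]).
raw = ids.tobytes()

# Decoding with the dtype used at write time recovers the original values.
restored = np.frombuffer(raw, dtype=np.int64)

# Decoding with a different dtype silently yields a different number of elements.
wrong = np.frombuffer(raw, dtype=np.int32)

print(len(raw), restored.tolist(), wrong.size)
```

For this reason the reader has to know the dtype the arrays had at write time, and any multi-dimensional array has to be reshaped explicitly after decoding.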

Parsing a TFRecord

def _parse_tfexample(serialized_example):
  '''Parse a serialized tf.train.SequenceExample into tensors.

  context features : label, task, seq_len
  sequence features: word_ids, et_ids1, et_ids2, position_ids1,
                     position_ids2, chunks, spath_ids

  Note: this parser expects records built as tf.train.SequenceExample with
  int64 feature lists. Records written as tf.train.Example with bytes
  features must instead be read with tf.parse_single_example followed by
  tf.decode_raw.
  '''
  context_features = {'label': tf.FixedLenFeature([], tf.int64),
                      'task': tf.FixedLenFeature([], tf.int64),
                      'seq_len': tf.FixedLenFeature([], tf.int64)}
  sequence_features = {'word_ids': tf.FixedLenSequenceFeature([], tf.int64),
                       'et_ids1': tf.FixedLenSequenceFeature([], tf.int64),
                       'et_ids2': tf.FixedLenSequenceFeature([], tf.int64),
                       'position_ids1': tf.FixedLenSequenceFeature([], tf.int64),
                       'position_ids2': tf.FixedLenSequenceFeature([], tf.int64),
                       'chunks': tf.FixedLenSequenceFeature([], tf.int64),
                       'spath_ids': tf.FixedLenSequenceFeature([], tf.int64)}
  context_dict, sequence_dict = tf.parse_single_sequence_example(
      serialized_example,
      context_features=context_features,
      sequence_features=sequence_features)
  # Pack all per-token id sequences plus the sentence length into one tuple.
  sentence = (sequence_dict['word_ids'], sequence_dict['et_ids1'], sequence_dict['et_ids2'],
              sequence_dict['position_ids1'], sequence_dict['position_ids2'],
              sequence_dict['chunks'], sequence_dict['spath_ids'], context_dict['seq_len'])
  label = context_dict['label']
  task = context_dict['task']
  return task, label, sentence
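Records written as a plain tf.train.Example with bytes features need a different parser from the SequenceExample one. The sketch below shows one way this could look, assuming TensorFlow 2.x with eager execution and assuming the id arrays were int64 when written. The names `parse_bytes_example`, `make_example`, `BYTES_KEYS` and `INT_KEYS` are hypothetical and are not part of the original code.

```python
import numpy as np
import tensorflow as tf  # assumes TensorFlow 2.x (eager execution)

# Hypothetical key lists mirroring the feature names used by the writer.
BYTES_KEYS = ['word_ids', 'et_ids1', 'et_ids2', 'position_ids1',
              'position_ids2', 'chunks', 'spath_ids']
INT_KEYS = ['seq_len', 'label', 'task']

def parse_bytes_example(serialized_example):
    """Parse a tf.train.Example whose id arrays were stored as raw bytes."""
    spec = {k: tf.io.FixedLenFeature([], tf.string) for k in BYTES_KEYS}
    spec.update({k: tf.io.FixedLenFeature([], tf.int64) for k in INT_KEYS})
    parsed = tf.io.parse_single_example(serialized_example, spec)
    # decode_raw must use the dtype the arrays had at write time (int64 assumed).
    ids = [tf.io.decode_raw(parsed[k], tf.int64) for k in BYTES_KEYS]
    sentence = tuple(ids) + (parsed['seq_len'],)
    return parsed['task'], parsed['label'], sentence

def make_example(ids, seq_len, label, task):
    """Build one toy record with the same layout; every id field reuses `ids`."""
    feature = {k: tf.train.Feature(bytes_list=tf.train.BytesList(value=[ids.tobytes()]))
               for k in BYTES_KEYS}
    for key, value in zip(INT_KEYS, (seq_len, label, task)):
        feature[key] = tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
    return tf.train.Example(features=tf.train.Features(feature=feature))

# Round trip on a single in-memory record.
toy_ids = np.array([4, 8, 15, 0], dtype=np.int64)
serialized = make_example(toy_ids, seq_len=3, label=2, task=0).SerializeToString()
task, label, sentence = parse_bytes_example(serialized)
print(sentence[0].numpy(), int(label), int(task))
```

The returned tuple has the same layout as the one produced by `_parse_tfexample`, so downstream code that unpacks `sentence` into eight fields would not need to change.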

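# DATASETS, OUT_DIR and util are defined elsewhere in the project; os must be imported.
def read_tfrecord(epoch, batch_size):
  '''Yield a (train_data, test_data) pair for every dataset in DATASETS.'''
  for dataset in DATASETS:
    train_record_file = os.path.join(OUT_DIR, dataset+'.train.tfrecord')
    test_record_file = os.path.join(OUT_DIR, dataset+'.test.tfrecord')
    # Training data is shuffled; test data keeps its original order.
    train_data = util.read_tfrecord(train_record_file,
                                    epoch,
                                    batch_size,
                                    _parse_tfexample,
                                    shuffle=True)
    test_data = util.read_tfrecord(test_record_file,
                                   epoch,
                                   batch_size,
                                   _parse_tfexample,
                                   shuffle=False)
    yield train_data, test_data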
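The body of `util.read_tfrecord` is not shown in this post. As a rough idea of what such a helper might do, here is a sketch built on `tf.data`, assuming TensorFlow 2.2 or later with eager execution. The function `read_tfrecord_sketch`, the toy parser `_parse_demo`, the shuffle buffer size and the demo records are all hypothetical and only serve to make the example self-contained.

```python
import os
import tempfile
import tensorflow as tf  # assumes TensorFlow 2.2+ (eager execution)

def read_tfrecord_sketch(filename, epoch, batch_size, parse_func, shuffle=True):
    """Hypothetical stand-in for util.read_tfrecord, built on tf.data."""
    dataset = tf.data.TFRecordDataset([filename]).map(parse_func)
    if shuffle:
        dataset = dataset.shuffle(buffer_size=1000)  # arbitrary buffer size
    dataset = dataset.repeat(epoch)
    # padded_batch pads variable-length sequences to the longest one in each batch.
    return dataset.padded_batch(batch_size)

def _parse_demo(serialized):
    # Toy parser: one variable-length int64 list plus a scalar label.
    parsed = tf.io.parse_single_example(serialized, {
        'ids': tf.io.VarLenFeature(tf.int64),
        'label': tf.io.FixedLenFeature([], tf.int64)})
    return tf.sparse.to_dense(parsed['ids']), parsed['label']

# Write three toy records of different lengths, then read them back as one batch.
path = os.path.join(tempfile.mkdtemp(), 'demo.tfrecord')
with tf.io.TFRecordWriter(path) as writer:
    for n in (1, 2, 3):
        example = tf.train.Example(features=tf.train.Features(feature={
            'ids': tf.train.Feature(
                int64_list=tf.train.Int64List(value=list(range(1, n + 1)))),
            'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[n]))}))
        writer.write(example.SerializeToString())

batches = list(read_tfrecord_sketch(path, epoch=1, batch_size=3,
                                    parse_func=_parse_demo, shuffle=False))
ids, labels = batches[0]
print(ids.numpy(), labels.numpy())
```

Padding matters here because sentences have different lengths, which is also why the record keeps a separate `seq_len` field: it lets the model tell real tokens from padding.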

Using it in the model:

  def build_task_graph(self, data):
    task_label, labels, sentence = data
    # sentence = tf.nn.embedding_lookup(self.word_embed, sentence)
    # Unpack the tuple built by _parse_tfexample.
    word_ids, et_ids1, et_ids2, position_ids1, position_ids2, chunks, spath_ids, seq_len = sentence
    # sentence = word_ids
    self.word_ids = word_ids
    self.position_ids1 = position_ids1
    self.position_ids2 = position_ids2
    self.et_ids1 = et_ids1
    self.et_ids2 = et_ids2
    self.chunks_ids = chunks
    self.spath_ids = spath_ids
    self.seq_len = seq_len
    sentence = self.add_embedding_layers()
