Implementing a Siamese Network in TensorFlow (with Code)

Reposted from: https://blog.csdn.net/qq1483661204/article/details/79039702

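The Siamese network paper: Learning a Similarity Metric Discriminatively, with Application to Face Verification.
This post explains the Siamese network and implements it in TensorFlow on the MNIST dataset. A Siamese network differs from other networks in two ways. First, it takes two inputs. Second, the label it is trained on is not a class label but whether the two inputs belong to the same class: in the paper, 0 if they do and 1 if they do not. The paper uses this network for face verification. The network architecture is shown below: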

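As the figure shows, the network has two inputs, $x_1$ and $x_2$. The left and right branches have the same structure and share their weights, and they produce two outputs, $G_W(x_1)$ and $G_W(x_2)$. The idea is easy to grasp: when the two inputs belong to the same class, we want the Euclidean distance between the outputs to be small; when they do not, we want it to be large. With the architecture in place, the next step is to define the loss function, which is the critical part. From the reasoning above, the loss should have the following properties:
(1) For a same-class pair, the smaller the distance, the smaller the loss; the larger the distance, the larger the loss.
(2) For a different-class pair, the larger the distance, the smaller the loss.
In other words, we minimize the distance between samples of the same class and maximize the distance between samples of different classes.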
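The paper then defines the loss function as follows.
First comes the distance, which uses the L2 norm:

$$E_W(x_1, x_2) = \lVert G_W(x_1) - G_W(x_2) \rVert_2$$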
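
The distance can be illustrated with a small NumPy sketch. It is not the network from this post: `g_w` is a hypothetical one-layer stand-in for the shared mapping $G_W$, and the weight matrix `W` and the random inputs are assumptions made only for illustration. The point it shows is that both inputs pass through the same weights before the Euclidean distance is taken.

```python
import numpy as np

def g_w(x, W):
    # Hypothetical stand-in for the shared mapping G_W: one linear layer + tanh.
    return np.tanh(x @ W)

def energy(x1, x2, W):
    # E_W(x1, x2) = ||G_W(x1) - G_W(x2)||_2, one value per row (per pair).
    diff = g_w(x1, W) - g_w(x2, W)   # the same W is used for both inputs
    return np.sqrt(np.sum(diff ** 2, axis=1))

rng = np.random.default_rng(0)
W = 0.05 * rng.normal(size=(784, 2))   # shared weights: 784 pixels -> 2-D embedding
x1 = rng.random((4, 784))              # four flattened 28x28 "images"
x2 = rng.random((4, 784))

print(energy(x1, x1, W))   # identical inputs give zero distance
print(energy(x1, x2, W))   # distinct inputs give non-negative distances
```

Because the two branches share `W`, the distance is symmetric in its arguments and is zero whenever the two inputs are identical.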

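This distance is simply the Euclidean distance. Given the distance and the relationship between loss and distance described above, how can those properties be guaranteed? The paper proposes a loss of the following form, where $Y = 0$ for a same-class pair and $Y = 1$ for a different-class pair:

$$L(W, Y, x_1, x_2) = (1 - Y)\, L_G\big(E_W(x_1, x_2)\big) + Y\, L_I\big(E_W(x_1, x_2)\big)$$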

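Here $E_W$ is the distance, and $L_G$ and $L_I$ are the partial losses for same-class and different-class pairs, switched on and off by the pair label in a way that resembles the two terms of cross-entropy. For the loss to satisfy the properties above, $L_G$ must be monotonically increasing and $L_I$ monotonically decreasing. A further condition is that the distance between same-class images must be smaller than the distance between different-class images.
The paper states the remaining conditions on the loss.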

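The authors also give a proof. The final loss function is:

$$L(W, Y, x_1, x_2) = (1 - Y)\,\frac{2}{Q}\,E_W^2 + Y\, 2Q\, e^{-\frac{2.77}{Q} E_W}$$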

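Q is a constant, and this loss satisfies the properties listed above. Note that in the paper's formula a same-class pair is labeled 0 and a different-class pair is labeled 1. The code in this post uses the opposite convention: the label is computed as ys_1 == ys_2, so y = 1 marks a same-class pair, and y and 1 - y swap roles accordingly. My TensorFlow implementation of the loss is as follows: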

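def siamese_loss(out1, out2, y, Q=5):
    # y is 1 for a same-class pair and 0 for a different-class pair.
    Q = tf.constant(Q, name="Q", dtype=tf.float32)
    # Euclidean distance between the two embeddings. The small constant keeps
    # the gradient of sqrt finite when the two embeddings coincide.
    E_w = tf.sqrt(tf.reduce_sum(tf.square(out1 - out2), 1) + 1e-6)
    # Same-class term: grows with the distance.
    pos = tf.multiply(tf.multiply(y, 2 / Q), tf.square(E_w))
    # Different-class term: decays with the distance.
    neg = tf.multiply(tf.multiply(1 - y, 2 * Q), tf.exp(-2.77 / Q * E_w))
    loss = pos + neg
    loss = tf.reduce_mean(loss)
    return loss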
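
As a quick sanity check of the two properties, the same formula can be evaluated in plain NumPy for a few distances. This is only a sketch: `siamese_loss_np` is a hypothetical helper that is not part of the original post, and it follows the label convention of the code above (y = 1 for a same-class pair).

```python
import numpy as np

def siamese_loss_np(e_w, y, Q=5.0):
    # Per-pair loss; y = 1 marks a same-class pair, y = 0 a different-class pair.
    e_w = np.asarray(e_w, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    pos = y * (2.0 / Q) * e_w ** 2                          # same-class term
    neg = (1.0 - y) * (2.0 * Q) * np.exp(-2.77 / Q * e_w)   # different-class term
    return pos + neg

distances = np.array([0.0, 1.0, 2.0, 4.0])
same = siamese_loss_np(distances, np.ones(4))    # all pairs same-class
diff = siamese_loss_np(distances, np.zeros(4))   # all pairs different-class

print(same)   # grows as the distance grows
print(diff)   # starts at 2 * Q = 10 and shrinks as the distance grows
```

For same-class pairs the loss rises with the distance, and for different-class pairs it falls, which is exactly what properties (1) and (2) require.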

The complete script, which includes this loss function, is as follows:

import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Written against the TensorFlow 1.x graph API (placeholders and sessions).
tf.reset_default_graph()

mnist = input_data.read_data_sets('./data/mnist', one_hot=True)
print(mnist.validation.num_examples)
print(mnist.train.num_examples)
print(mnist.test.num_examples)

def siamese_loss(out1, out2, y, Q=5):
    # y is 1 for a same-class pair and 0 for a different-class pair.
    Q = tf.constant(Q, name="Q", dtype=tf.float32)
    # Euclidean distance between the two embeddings. The small constant keeps
    # the gradient of sqrt finite when the two embeddings coincide.
    E_w = tf.sqrt(tf.reduce_sum(tf.square(out1 - out2), 1) + 1e-6)
    pos = tf.multiply(tf.multiply(y, 2 / Q), tf.square(E_w))
    neg = tf.multiply(tf.multiply(1 - y, 2 * Q), tf.exp(-2.77 / Q * E_w))
    loss = pos + neg
    loss = tf.reduce_mean(loss)
    return loss

def siamese(inputs, keep_prob):
    # tf.get_variable, unlike tf.Variable, honors scope.reuse_variables(),
    # so the second call to this function reuses the weights of the first.
    w_init = tf.truncated_normal_initializer(mean=0.0, stddev=0.05)
    b_init = tf.zeros_initializer()
    # 28x28x1 -> 28x28x32
    with tf.name_scope('conv1'):
        w1 = tf.get_variable('w1', shape=[3, 3, 1, 32], initializer=w_init)
        b1 = tf.get_variable('b1', shape=[32], initializer=b_init)
        conv1 = tf.nn.conv2d(inputs, w1, strides=[1, 1, 1, 1], padding='SAME', name='conv1')
    with tf.name_scope('relu1'):
        relu1 = tf.nn.relu(tf.add(conv1, b1), name='relu1')
    # 28x28x32 -> 14x14x64 (stride 2)
    with tf.name_scope('conv2'):
        w2 = tf.get_variable('w2', shape=[3, 3, 32, 64], initializer=w_init)
        b2 = tf.get_variable('b2', shape=[64], initializer=b_init)
        conv2 = tf.nn.conv2d(relu1, w2, strides=[1, 2, 2, 1], padding='SAME', name='conv2')
    with tf.name_scope('relu2'):
        relu2 = tf.nn.relu(conv2 + b2, name='relu2')
    # 14x14x64 -> 7x7x128 (stride 2)
    with tf.name_scope('conv3'):
        w3 = tf.get_variable('w3', shape=[3, 3, 64, 128], initializer=w_init)
        b3 = tf.get_variable('b3', shape=[128], initializer=b_init)
        conv3 = tf.nn.conv2d(relu2, w3, strides=[1, 2, 2, 1], padding='SAME')
    with tf.name_scope('relu3'):
        relu3 = tf.nn.relu(conv3 + b3, name='relu3')
    with tf.name_scope('fc1'):
        x_flat = tf.reshape(relu3, shape=[-1, 7 * 7 * 128])
        w_fc1 = tf.get_variable('w_fc1', shape=[7 * 7 * 128, 1024], initializer=w_init)
        b_fc1 = tf.get_variable('b_fc1', shape=[1024], initializer=b_init)
        fc1 = tf.add(tf.matmul(x_flat, w_fc1), b_fc1)
    with tf.name_scope('relu_fc1'):
        relu_fc1 = tf.nn.relu(fc1, name='relu_fc1')
    with tf.name_scope('drop_1'):
        drop_1 = tf.nn.dropout(relu_fc1, keep_prob=keep_prob, name='drop_1')
    with tf.name_scope('bn_fc1'):
        # training is left at its default (False), as in the original code.
        bn_fc1 = tf.layers.batch_normalization(drop_1, name='bn_fc1')
    with tf.name_scope('fc2'):
        w_fc2 = tf.get_variable('w_fc2', shape=[1024, 512], initializer=w_init)
        b_fc2 = tf.get_variable('b_fc2', shape=[512], initializer=b_init)
        fc2 = tf.add(tf.matmul(bn_fc1, w_fc2), b_fc2)
    with tf.name_scope('relu_fc2'):
        relu_fc2 = tf.nn.relu(fc2, name='relu_fc2')
    with tf.name_scope('drop_2'):
        drop_2 = tf.nn.dropout(relu_fc2, keep_prob=keep_prob, name='drop_2')
    with tf.name_scope('bn_fc2'):
        bn_fc2 = tf.layers.batch_normalization(drop_2, name='bn_fc2')
    # Final layer: a 2-D embedding for each input image.
    with tf.name_scope('fc3'):
        w_fc3 = tf.get_variable('w_fc3', shape=[512, 2], initializer=w_init)
        b_fc3 = tf.get_variable('b_fc3', shape=[2], initializer=b_init)
        fc3 = tf.add(tf.matmul(bn_fc2, w_fc3), b_fc3)
    return fc3

lr = 0.01
iterations = 20000
batch_size = 64

with tf.variable_scope('input_x1'):
    x1 = tf.placeholder(tf.float32, shape=[None, 784])
    x_input_1 = tf.reshape(x1, [-1, 28, 28, 1])
with tf.variable_scope('input_x2'):
    x2 = tf.placeholder(tf.float32, shape=[None, 784])
    x_input_2 = tf.reshape(x2, [-1, 28, 28, 1])
with tf.variable_scope('y'):
    # One label per pair: 1.0 for same class, 0.0 for different class.
    y = tf.placeholder(tf.float32, shape=[None])
with tf.name_scope('keep_prob'):
    keep_prob = tf.placeholder(tf.float32)

# Build the two branches; the second one reuses the variables of the first.
with tf.variable_scope('siamese') as scope:
    out1 = siamese(x_input_1, keep_prob)
    scope.reuse_variables()
    out2 = siamese(x_input_2, keep_prob)

with tf.variable_scope('metrics'):
    loss = siamese_loss(out1, out2, y)
    optimizer = tf.train.AdamOptimizer(lr).minimize(loss)

loss_summary = tf.summary.scalar('loss', loss)
merged_summary = tf.summary.merge_all()

with tf.Session() as sess:
    writer = tf.summary.FileWriter('./graph/siamese', sess.graph)
    sess.run(tf.global_variables_initializer())
    for itera in range(iterations):
        # Draw two independent batches and pair them up element by element.
        xs_1, ys_1 = mnist.train.next_batch(batch_size)
        ys_1 = np.argmax(ys_1, axis=1)
        xs_2, ys_2 = mnist.train.next_batch(batch_size)
        ys_2 = np.argmax(ys_2, axis=1)
        # 1.0 where the two images share a digit class, 0.0 otherwise.
        y_s = np.array(ys_1 == ys_2, dtype=np.float32)
        _, train_loss, summ = sess.run([optimizer, loss, merged_summary],
                                       feed_dict={x1: xs_1, x2: xs_2, y: y_s, keep_prob: 0.6})
        writer.add_summary(summ, itera)
        if itera % 1000 == 1:
            print('iter {},train loss {}'.format(itera, train_loss))
    # Embed the test set; dropout is disabled (keep_prob=1.0) at inference.
    embed = sess.run(out1, feed_dict={x1: mnist.test.images, keep_prob: 1.0})
    test_img = mnist.test.images.reshape([-1, 28, 28, 1])
    writer.close()
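
The training loop above pairs two independently drawn batches, so with ten roughly balanced digit classes only about one pair in ten is a same-class pair. One possible alternative, which is not part of the original post, is to sample pairs so that half are same-class and half are different-class. The sketch below assumes integer class labels; `sample_balanced_pairs` is a hypothetical helper name, and the synthetic `labels` array stands in for the MNIST training labels.

```python
import numpy as np

def sample_balanced_pairs(labels, n_pairs, rng):
    # Returns (idx_a, idx_b, y): even-numbered pairs are same-class (y = 1),
    # odd-numbered pairs are different-class (y = 0). Needs at least 2 classes.
    labels = np.asarray(labels)
    classes = np.unique(labels)
    by_class = {c: np.flatnonzero(labels == c) for c in classes}
    idx_a, idx_b, y = [], [], []
    for k in range(n_pairs):
        c = rng.choice(classes)
        a = rng.choice(by_class[c])
        if k % 2 == 0:
            b = rng.choice(by_class[c])                 # may pick the same sample
            y.append(1.0)
        else:
            other = rng.choice(classes[classes != c])   # any class except c
            b = rng.choice(by_class[other])
            y.append(0.0)
        idx_a.append(a)
        idx_b.append(b)
    return np.array(idx_a), np.array(idx_b), np.array(y, dtype=np.float32)

labels = np.tile(np.arange(10), 100)   # synthetic labels: 100 samples per digit
idx_a, idx_b, y = sample_balanced_pairs(labels, 64, np.random.default_rng(0))
print(y.mean())                         # fraction of same-class pairs
```

The two index arrays could then select rows of the training images in place of the two next_batch calls, with y fed to the loss unchanged.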
