[CVPR 2017] Semantic Autoencoder for Zero-Shot Learning: Paper Notes

http://openaccess.thecvf.com/content_cvpr_2017/papers/Kodirov_Semantic_Autoencoder_for_CVPR_2017_paper.pdf

Semantic Autoencoder for Zero-Shot Learning, Elyor Kodirov, Tao Xiang, Shaogang Gong, Queen Mary University of London, UK, {e.kodirov, t.xiang, s.gong}@qmul.ac.uk

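Highlights

Improves the performance of zero-shot learning through dual learning, i.e. learning the forward and reverse mappings together (similar in spirit to CycleGAN)
The structure is very simple and can be solved directly, so it is very fast
Applies effectively to another related task (supervised clustering), which demonstrates its generalization ability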

Method

Linear autoencoder

Model Formulation

Setting the derivative of the objective with respect to W to zero gives a well-known Sylvester equation, which can be solved efficiently by the Bartels-Stewart algorithm (MATLAB's sylvester function).
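The closed-form solve can be sketched in a few lines. This is a minimal sketch, not the authors' code. It assumes the objective is ||X - W^T S||_F^2 + λ||WX - S||_F^2 with samples stored as columns, so the optimality condition is A W + W B = C with A = S S^T, B = λ X X^T, C = (1 + λ) S X^T. The function name sae_fit and the toy data are hypothetical; scipy.linalg.solve_sylvester stands in for MATLAB's sylvester.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def sae_fit(X, S, lam):
    """Solve A W + W B = C for the tied-weight linear autoencoder.

    X: (d, N) visual features, one sample per column (assumed layout).
    S: (k, N) semantic vectors, one sample per column.
    Returns W with shape (k, d).
    """
    A = S @ S.T                    # (k, k)
    B = lam * (X @ X.T)            # (d, d)
    C = (1.0 + lam) * (S @ X.T)    # (k, d)
    return solve_sylvester(A, B, C)


# Toy data, purely illustrative.
rng = np.random.default_rng(0)
d, k, N, lam = 20, 5, 200, 0.5
X = rng.standard_normal((d, N))
S = rng.standard_normal((k, d)) @ X

W = sae_fit(X, S, lam)

# Check that W satisfies the Sylvester equation up to numerical error.
C = (1.0 + lam) * (S @ X.T)
residual = (S @ S.T) @ W + W @ (lam * (X @ X.T)) - C
rel_residual = np.linalg.norm(residual) / np.linalg.norm(C)
```

Because only one small linear system is solved, there is no iterative training loop, which is consistent with the speed claim in the highlights.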

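Zero-shot learning: with the learned projection, there are two ways to classify at test time:

Project a test sample x_i of unknown class into the semantic (attribute) space with W to obtain s_i, then compare distances in the semantic space and assign the label of the nearest class prototype (a class with no training samples)
Project the semantic prototypes S of all classes without training data into the feature space X with W^T, then compare the unknown sample x_i with the projected class centers and assign the label of the nearest one
The two procedures give essentially the same accuracy.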
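The two test-time procedures can be sketched as nearest-prototype searches in the two spaces. This is a minimal sketch: the function names are hypothetical, Euclidean distance is an assumption (the notes only say "distance"), and the toy W with orthonormal rows is constructed only so that both directions are easy to check.

```python
import numpy as np


def nearest_prototype(queries, prototypes):
    """Index of the closest prototype (Euclidean) for each query row."""
    d2 = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def predict_encoder(W, X_test, S_unseen):
    """Feature -> semantic: compare W x_i with the unseen class prototypes."""
    S_pred = W @ X_test                       # (k, n)
    return nearest_prototype(S_pred.T, S_unseen.T)


def predict_decoder(W, X_test, S_unseen):
    """Semantic -> feature: compare x_i with the projected prototypes W^T s."""
    X_proto = W.T @ S_unseen                  # (d, c)
    return nearest_prototype(X_test.T, X_proto.T)


# Toy setup (illustrative only): k = 4 attributes, d = 10 features, c = 3 classes.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((10, 4)))
W = Q.T                                       # (4, 10), orthonormal rows
S_unseen = np.array([[1, 0, 0],
                     [0, 1, 0],
                     [0, 0, 1],
                     [1, 1, 1]], dtype=float)  # one prototype per column
labels = np.array([0, 1, 2, 1, 0])
X_test = W.T @ S_unseen[:, labels] + 0.01 * rng.standard_normal((10, 5))

pred_enc = predict_encoder(W, X_test, S_unseen)
pred_dec = predict_decoder(W, X_test, S_unseen)
```

On real data the two directions need not agree sample by sample; the notes only state that their overall accuracy is about the same.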

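Supervised clustering: in this problem the semantic space is the class label space (one-hot class labels). All test data are projected into the training class label space and then clustered with k-means.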
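A minimal sketch of this pipeline follows, under several assumptions: samples are stored as columns, the projection is fitted with the Sylvester solve (A = S S^T, B = λ X X^T, C = (1 + λ) S X^T), and a small hand-written Lloyd's k-means stands in for whatever k-means implementation the paper used. All names and the toy data are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def sae_fit(X, S, lam):
    """Tied-weight linear autoencoder: solve A W + W B = C (assumed form)."""
    return solve_sylvester(S @ S.T, lam * (X @ X.T), (1.0 + lam) * (S @ X.T))


def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm on the rows of Z; returns cluster indices."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):           # keep old center if cluster is empty
                centers[j] = Z[assign == j].mean(axis=0)
    return assign


rng = np.random.default_rng(0)
d, n_train_classes, n_test_classes = 6, 3, 2

# Training data: labelled samples, with one-hot labels as the semantic space.
train_means = 5.0 * rng.standard_normal((n_train_classes, d))
y_train = np.repeat(np.arange(n_train_classes), 30)
X_train = (train_means[y_train] + rng.standard_normal((90, d))).T   # (d, 90)
S_train = np.eye(n_train_classes)[y_train].T                        # (3, 90)
W = sae_fit(X_train, S_train, lam=1.0)

# Test data: samples from new groups, projected into the training label space.
test_means = 5.0 * rng.standard_normal((n_test_classes, d))
y_test = np.repeat(np.arange(n_test_classes), 20)
X_test = (test_means[y_test] + rng.standard_normal((40, d))).T      # (d, 40)
Z = (W @ X_test).T                                                  # (40, 3)
clusters = kmeans(Z, k=n_test_classes)
```

The point of the design is that k-means runs in the learned label space rather than on the raw features, so the supervision from the training classes shapes the metric used for clustering.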

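Relation to existing models: existing zero-shot learning models typically learn a projection from the feature space to the semantic space.

Alternatively, [54] projects the attributes into the feature space, so the learning objective becomes a regression in the reverse direction.

The algorithm in the paper combines the two. Moreover, because the weights are tied (W* = W^T), W cannot become too large under dual learning (otherwise multiplying x by two matrices with large norms could not recover the original input), so the regularization term can be dropped.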
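The three objectives compared here can be written side by side. This is a hedged reconstruction that assumes linear ridge-regression forms, with X the feature matrix (one sample per column), S the matching semantic matrix and λ a weighting coefficient; the notation is an assumption, not a quotation from the paper.

```latex
\begin{align}
  % Forward model: feature space -> semantic space, explicit regularizer
  &\min_{W}\; \lVert W X - S \rVert_F^2 + \lambda \lVert W \rVert_F^2 \\
  % Reverse model in the style of [54]: semantic space -> feature space
  &\min_{W}\; \lVert X - W^{\top} S \rVert_F^2 + \lambda \lVert W \rVert_F^2 \\
  % Combination with tied weights W^* = W^{\top} and no explicit regularizer
  &\min_{W}\; \lVert X - W^{\top} S \rVert_F^2 + \lambda \lVert W X - S \rVert_F^2
\end{align}
```

In the third line λ no longer weights a norm penalty; it balances the encoder term against the decoder term, which is why the explicit regularizer can be omitted.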

Experiments

Zero-shot learning

Datasets: Semantic word vector representation is used for large-scale datasets (ImNet-1 and ImNet-2). We train a skip-gram text model on a corpus of 4.6M Wikipedia documents to obtain the word2vec [38, 37] word vectors.

Features: extracted with GoogLeNet for all datasets except ImNet-1, which uses AlexNet

Results:

Our SAE model achieves the best results on all 6 datasets.
On the small-scale datasets, the gap between our model’s results and those of the strongest competitor ranges from 3.5% to 6.5%.
On the large-scale datasets, the gaps are even bigger: on the largest, ImNet-2, our model improves over the state-of-the-art SS-Voc [22] by 8.8%.
Both the encoder and decoder projection functions in our SAE model (SAE (W) and SAE (W^T) respectively) can be used for effective ZSL.

The encoder projection function seems to be slightly better overall.

Generalized zero-shot learning: this setting measures how well a zero-shot learning method can trade off between recognising data from seen classes and data from unseen classes.

The test set is built by holding out 20% of the data samples from the seen classes and mixing them with the samples from the unseen classes.
On AwA, our model is slightly worse than SynC^struct [13].
However, on the more challenging CUB dataset, our method significantly outperforms the competitors.
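The hold-out protocol described above can be sketched as follows. This is a minimal sketch under assumptions: the 20% is drawn uniformly at random over all seen-class samples (the notes do not say whether the split is stratified per class), and the function name and toy arrays are hypothetical.

```python
import numpy as np


def generalized_zsl_split(X_seen, y_seen, X_unseen, y_unseen,
                          holdout_frac=0.2, seed=0):
    """Hold out a fraction of seen-class samples and mix them with unseen ones.

    Samples are rows here. Returns the training set, the mixed test set and a
    boolean mask that marks which test samples come from seen classes.
    """
    rng = np.random.default_rng(seed)
    n_seen = len(X_seen)
    perm = rng.permutation(n_seen)
    n_hold = int(round(holdout_frac * n_seen))
    hold_idx, train_idx = perm[:n_hold], perm[n_hold:]

    X_test = np.concatenate([X_seen[hold_idx], X_unseen], axis=0)
    y_test = np.concatenate([y_seen[hold_idx], y_unseen], axis=0)
    from_seen = np.concatenate([np.ones(n_hold, dtype=bool),
                                np.zeros(len(X_unseen), dtype=bool)])
    return X_seen[train_idx], y_seen[train_idx], X_test, y_test, from_seen


# Toy data: 100 seen-class samples (labels 0..4), 30 unseen-class samples (labels 5..7).
rng = np.random.default_rng(1)
X_seen = rng.standard_normal((100, 8))
y_seen = rng.integers(0, 5, size=100)
X_unseen = rng.standard_normal((30, 8))
y_unseen = rng.integers(5, 8, size=30)

X_tr, y_tr, X_te, y_te, from_seen = generalized_zsl_split(
    X_seen, y_seen, X_unseen, y_unseen)
```

At test time the label search space then covers both seen and unseen classes, which is what makes this setting harder than the standard one.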

Clustering

Datasets: a synthetic dataset and Oxford Flowers-17 (848 images)

Results:

On computational cost, our model (93s) is more expensive than MLCA (39s) but much cheaper than all others (hours to days).
Our model achieves the best clustering accuracy.
