More articles related to "Paper Reading: FPN"

FPN Paper: Feature Pyramid Networks for Object Detection Published: 2017 Authors: (Facebook AI Research) Tsung-Yi Lin, Piotr Dollar, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie Venue: CVPR Paper link: paper link Code: click here Feature Pyramid Networks (FPN) is a relatively…
For my very first post I'll write a paper reading. Switching between Chinese and English while writing in markdown + vim is a hassle, so I got lazy and wrote some parts entirely in English. Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras Abstract Optimization objectives: intrinsic/extrinsic parameters of all keyframes all selected pixels' depth Inte…
<Exploiting Relevance Feedback in Knowledge Graph Search> Publication: KDD 2015 Authors: Yu Su, Shengqi Yang, etc. Affiliation: UCSB... 1. Short description: This paper formulates the novel graph rel…
Perceptual Generative Adversarial Networks for Small Object Detection 2017-07-11 19:47:46 CVPR 2017 This paper uses a GAN to handle small object detection, which is a very hard problem in general object detection. As shown in the followin…
In Defense of the Triplet Loss for Person Re-Identification  2017-07-02  14:04:20   This blog comes from: http://blog.csdn.net/shuzfan/article/details/70069822 Paper:  https://arxiv.org/abs/1703.07737 Github: https://github.com/VisualComputingInstitu…
Link of the Paper: https://arxiv.org/abs/1706.03762 Motivation: The inherently sequential nature of Recurrent Models precludes parallelization within training examples. Attention mechanisms have become an integral part of compelling sequence modeling…
Link of the Paper: https://arxiv.org/abs/1705.03122 Motivation: Compared to recurrent layers, convolutions create representations for fixed-size contexts; however, the effective context size of the network can easily be made larger by stacking severa…
Link of the Paper: https://arxiv.org/pdf/1412.6632.pdf Main Points: The authors propose a multimodal Recurrent Neural Network ( AlexNet/VGGNet + a multimodal layer + RNNs ). Their work has two major differences from these methods. Firstly, they inco…
Link of the Paper: https://arxiv.org/abs/1412.2306 Main Points: An Alignment Model: Convolutional Neural Networks over image regions ( An image -> RCNN -> Top 19 detected locations in addition to the whole image -> the representations based on th…
Link of the Paper: https://ieeexplore.ieee.org/document/7298856/ A Related Paper: Learning a Recurrent Visual Representation for Image Caption Generation (Link of the Paper: https://arxiv.org/abs/1411.5654) Main Points: A bi-directional mapping m…
Link of the Paper: https://arxiv.org/abs/1411.4555 Main Points: A generative model ( NIC, GoogLeNet + LSTM ) based on a deep recurrent architecture: the model is trained to maximize the likelihood P(S|I) of the target description sentence given the tr…
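As a rough sketch of the objective this entry refers to (it is not part of the original excerpt): with I an image, S = (S_0, ..., S_N) its description sentence, and θ the model parameters, the maximum-likelihood criterion can be written as below. The symbol names are an assumption chosen to match the P(S|I) notation used in the entry.

```latex
\theta^{*} = \arg\max_{\theta} \sum_{(I, S)} \log p(S \mid I; \theta),
\qquad
\log p(S \mid I) = \sum_{t=0}^{N} \log p(S_t \mid I, S_0, \ldots, S_{t-1})
```

The second identity is just the chain rule applied over the words of the sentence, which is what allows a recurrent decoder such as an LSTM to model one term per time step.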
Link of the Paper: https://arxiv.org/abs/1411.4389 Main Points: A novel Recurrent Convolutional Architecture ( CNN + LSTM ): both Spatially and Temporally Deep. The recurrent long-term models are directly connected to modern visual convnet models and…
Link of the Paper: http://papers.nips.cc/paper/4470-im2text-describing-images-using-1-million-captioned-photographs.pdf Main Points: A large novel data set containing images from the web with associated captions written by people, filtered so that th…
Link of the Paper: https://arxiv.org/pdf/1409.3215.pdf Main Points: Encoder-Decoder Model: Input sequence -> A vector of a fixed dimensionality -> Target sequence. A multilayered  LSTM: The LSTM did not have difficulty on long sentences. Deep LSTMs…
1. Neuroaesthetics in fashion: modeling the perception of fashionability, Edgar Simo-Serra, Sanja Fidler, Francesc Moreno-Noguer, Raquel Urtasun, in CVPR 2015. Goal: learn and predict how fashionable a person looks in a photograph, and suggest subtle…
1. Sketch me that shoe, Qian Yu, Feng Liu, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales, Chen Change Loy, in CVPR 2016. A unique characteristic of sketches in the context of image retrieval is that they offer inherently fine-grained visual descript…
Link of the Paper: https://arxiv.org/abs/1805.09019 Innovations: The authors propose a CNN + CNN framework for image captioning. There are four modules in the framework: vision module ( VGG-16 ), which is adopted to "watch" images; language modu…
Link of the Paper: https://arxiv.org/abs/1806.06422 Innovations: The authors propose a novel learning based discriminative evaluation metric that is directly trained to distinguish between human and machine-generated captions. They train an automatic…
Link of the Paper: https://arxiv.org/abs/1711.09151 Motivation: LSTM units are complex and inherently sequential across time. Convolutional networks have shown advantages on machine translation and conditional image generation. Innovation: The author…
Link of the Paper: https://arxiv.org/pdf/1504.06692.pdf Innovations: The authors propose the Novel Visual Concept learning from Sentences ( NVCS ) task. In this task, methods need to learn novel concepts from sentence descriptions of a few images. Th…
Link of the Paper: https://arxiv.org/pdf/1502.03044.pdf Main Points: Encoder-Decoder Framework: Encoder uses a convolutional neural network to extract a set of feature vectors which the authors refer to as annotation vectors. The extractor produces L…
Link of the Paper: https://arxiv.org/abs/1609.06647 A Related Paper: Show and Tell: A Neural Image Caption Generator (Link of the Paper: https://arxiv.org/abs/1411.4555) Main Points ( Improvements Over the CVPR2015 Model ): Image Model Improveme…
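Learning while Reading: not limited to any particular book, only by the breadth of knowledge. This series collects the highlights of what I studied and read over a week, which often come from more than one book. "We classify nature, organize it into concepts, and sort it as we do largely because we are parties to an agreement that holds throughout our speech community and is codified in the form of language. Unless we subscribe to the organization and classification of linguistic data that this agreement decrees, we cannot talk at all." (Benjamin Lee Whorf) Learning and Asking: Why choose object orientation? Machine language, assembly language, and procedural languages, through layer upon layer of…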
What has been done: This paper proposes a novel Deep Supervised Hashing method to learn compact similarity-preserving binary codes for the huge body of image data. Data sets: CIFAR-10: 60,000 32×32 images belonging to 10 mutually exclusive categories (6…
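Notes on Relation Networks for Object Detection. Foreword: for background on this paper, see my two earlier posts (<On Object Detection> and <On Attention Mechanisms>). Abstract: All state-of-the-art object detection systems still rely on recognizing object instances individually, without exploiting their relations during learning. (Background) This work proposes an object relation module. It processes a set of objects simultaneously through interaction between their appearance features and geometry, thereby modeling the relations between them. It is lightweight and in-place. The relation module here is…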
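Paper: Object Recognition from Scale-Invariant Features Source: http://www.cs.ubc.ca/~lowe/papers/iccv99.pdf SIFT, the Scale-Invariant Feature Transform, was proposed by David Lowe. It is the best-known and most widely used feature in CV. In image object recognition applications, image features are often required to be robust, that is, not easily affected by translation, rotation, scaling, illumination, or affine transformation. The SIFT operator has…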
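Paper: word2vec Parameter Learning Explained Published: 2016 Author: Xin Rong Paper link: paper link To demystify Word2vec, I had to go back and review the relevant Word2vec material. This English paper by Xin Rong is the Word2vec reference most people recommend first. It is theoretically complete, moves from simple to advanced, and gets straight to the point, offering both high-level intuition and detailed derivations. Let's work through the paper together. Since the word vector representations learned by the word2vec model…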
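Paper: Scale-Aware Trident Networks for Object Detection Published: 2019 Authors: (University of Chinese Academy of Sciences) Yuntao Chen, (TuSimple) Naiyan Wang Venue: ICCV Paper link: paper link Code: code link Trident Networks: The main problem this paper sets out to solve is scale variation, the thorniest issue in object detection. Using a very simple and clean approach, on the standard COCO b…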
Paper: DetNet: A Backbone network for Object Detection Published: 2018 Authors: (Face++) Chao Peng, Gang Yu, (Tsinghua University) Zeming Li Venue: ECCV Paper link: paper link DetNet: CNN-based object detectors fall into two categories: one-stage detectors, such as YOLO, SSD, and RetinaNet, and two-stage detectors, typically Faster-RCNN, R-FC…
Inside-Outside Net (ION) Paper: Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks Published: 2016 Authors: Sean Bell (Cornell University), C. Lawrence Zitnick (Microsoft Research), Kavita Bala (Cornell University), Ross Girshick (Microsoft Research) Paper link: paper link This paper…