This is a fairly short arXiv paper; I read it because the words semi-supervised and text classification in its title caught my eye. Having read it, the amount of work is actually modest, but the idea is quite nice. Most semi-supervised methods apply small perturbations to the input vector or to its representation. This works well in computer vision, but it is not suitable for discrete text. To apply such methods to text input, the paper splits the neural network \(M\) as \(M = U \circ F\), where \(F\) is frozen (freeze) and used for feature extraction and for adding dropout-based noise, while \(U\) can be any semi-supervised algorithm. The paper also gradually unfreezes \(F\) (…
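A minimal sketch of this split, assuming a PyTorch setup (not the paper's actual code): a frozen feature extractor `F` whose dropout is kept active to inject noise, and a trainable head `U` on top of which any semi-supervised objective can be applied. All class names, sizes, and the Pi-model-style consistency loss below are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class FrozenFeatureExtractor(nn.Module):
    """F: a frozen encoder whose only stochasticity comes from dropout."""
    def __init__(self, encoder: nn.Module, p: float = 0.3):
        super().__init__()
        self.encoder = encoder
        self.dropout = nn.Dropout(p)
        for param in self.encoder.parameters():
            param.requires_grad = False       # freeze F: only U gets trained

    def forward(self, x):
        self.dropout.train()                  # keep dropout active so repeated
        return self.dropout(self.encoder(x))  # passes give noisy views of x

# Placeholder encoder standing in for a pretrained text feature extractor.
encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())
F = FrozenFeatureExtractor(encoder, p=0.3)
U = nn.Linear(128, 2)                         # U: any trainable (semi-supervised) head

x = torch.randn(8, 300)                       # a batch of fixed text representations
p1 = torch.softmax(U(F(x)), dim=-1)           # two stochastic views of the same batch
p2 = torch.softmax(U(F(x)), dim=-1)
consistency_loss = nn.functional.mse_loss(p1, p2.detach())  # unlabeled-data consistency term
```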
Paper link Abstract Open-text semantic parsers are designed to interpret any statement in natural language by inferring a corresponding meaning representation (MR – a formal representation of its sense). …
by Zhi-Hua Zhou (Nanjing University) Abstract Supervised learning techniques build predictive models by learning from a large number of training examples, where each training example has a label giving its ground-truth output. Although current techniques have achieved great success, it is noteworthy that for many tasks it is difficult to obtain strong supervision such as full ground-truth labels, due to the high cost of the data-labeling process. Thus, machine learning techniques that can work with weak supervision are desirable. This article surveys some research progress on weakly supervised learning, focusing on three types of weak supervision: incomplete supervision, where only a subset of the training samples are labeled; inexact supervision, where the training samples have only coarse-grained labels; and inaccurate supervision, where the given labels are not always the ground truth. Keywords: machine learning, weakly supervised learning, supervised learning…
Ref: Combining CNN and RNN for spoken language identification Ref: Convolutional Methods for Text [1] CONVOLUTIONAL, LONG SHORT-TERM MEMORY, FULLY CONNECTED DEEP NEURAL NETWORKS [2] Efficient Character-level Document Classification by Combining Convo…
Paper link: https://aclweb.org/anthology/P18-1031 Summary of the paper The paper studies training tricks for the whole process of pretraining a LM on a general corpus and then transferring the resulting model to text classification. These tricks all take the learning rate as their starting point. There are three main ones: (1) discriminative fine-tuning (where "discriminative" means fine-tune each layer with d…
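Discriminative fine-tuning amounts to giving each layer its own learning rate. A minimal sketch, assuming PyTorch optimizer parameter groups; the layer stack is a placeholder, and the factor of 2.6 for shrinking the rate toward lower layers follows the paper's suggested heuristic:

```python
import torch.nn as nn
from torch.optim import SGD

# Placeholder stack of "layers"; in practice these would be the LM's layers.
model = nn.ModuleList([nn.Linear(128, 128) for _ in range(4)])

base_lr, decay = 1e-3, 2.6   # top layer trains at base_lr, layer k below it at base_lr / decay**k
param_groups = [
    {"params": layer.parameters(), "lr": base_lr / (decay ** k)}
    for k, layer in enumerate(reversed(list(model)))
]
optimizer = SGD(param_groups, lr=base_lr)

for group in optimizer.param_groups:
    print(group["lr"])        # 1e-3, ~3.8e-4, ~1.5e-4, ~5.7e-5
```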
Text Classification For the purpose of extrinsic evaluation of word embeddings, especially on downstream tasks. Some concepts are drawn from the Fudan University NLP group (复旦大学NLP组). Statistics-Based Methods Logistic Regression From a statistical perspective, text classification can be described as fo…
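A minimal sketch of the statistics-based view, assuming a standard scikit-learn pipeline in which TF-IDF features feed a logistic regression classifier; the tiny corpus below is made up purely for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus: label 1 = positive, 0 = negative.
texts = ["great movie, loved it", "terrible plot and acting",
         "wonderful performance", "boring and too long"]
labels = [1, 0, 1, 0]

# TF-IDF features -> logistic regression over the vocabulary weights.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["loved the performance"]))  # expected: [1]
```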
Machine Learning Algorithms Study Notes, by Gao Xuesong (高雪松, @雪松Cedro, Microsoft MVP). This series consists of study notes for Andrew Ng's machine learning course CS 229 at Stanford. Overview of the Machine Learning Algorithms Study Notes series    2 Supervised Learning    2.1 Perceptron Learning Algorithm (PLA)    2.1.1 PLA --…
ECCV-2010 Tutorial: Feature Learning for Image Classification Organizers Kai Yu (NEC Laboratories America, kyu@sv.nec-labs.com), Andrew Ng (Stanford University, ang@cs.stanford.edu) Place & Time: Creta Maris Hotel, Crete, Greece, 9:00 – 13:00, Septem…
Supervised Learning In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output. Supervised learning problems are categorized…
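A minimal illustration of that input-output relationship, assuming scikit-learn: we fit a model on examples whose correct outputs are already known, then predict on a new input (the data below are made up):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Known input-output pairs: y is roughly 2*x + 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.1, 4.9, 7.2, 9.0])

model = LinearRegression().fit(X, y)       # learn the relationship from labeled data
print(model.predict(np.array([[5.0]])))    # predict the output for an unseen input
```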
An open-source project on GitHub with extremely clear documentation. Github - https://github.com/dennybritz/cnn-text-classification-tf Original post - http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/ In this post we will implement a model similar to Kim Yoon's Convolut…
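As a rough sketch of the idea behind that model (not the repository's actual code), a Kim-style text CNN applies 1-D convolutions of several filter widths over word embeddings, max-pools each feature map over time, and classifies the concatenated features. A minimal Keras version, with all sizes chosen for illustration:

```python
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, seq_len, num_classes = 10000, 128, 100, 2

inputs = layers.Input(shape=(seq_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim)(inputs)

# One branch per filter width, each max-pooled over the sequence dimension.
pooled = []
for width in (3, 4, 5):
    conv = layers.Conv1D(100, width, activation="relu")(x)
    pooled.append(layers.GlobalMaxPooling1D()(conv))

features = layers.Concatenate()(pooled)
features = layers.Dropout(0.5)(features)
outputs = layers.Dense(num_classes, activation="softmax")(features)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```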