Algorithm:

Refrence from one ICML15 paper: Word Mover's Distance.

1. First use Google's word2vec tool to get distributed word representing aka. word vectors.

2. Then use earth mover's distance as similarity measure metric.

3. Solve the EMD problem as transportation problem by Hungarian Algorithm.


Outcome:

Result looks not bad, but still have ways to improve the precision.

For example: use n-gram to keep a little bit sentence structure.

Distributed Sentence Similarity Base on Word Mover's Distance的更多相关文章

  1. 唐诗掠影:基于词移距离(Word Mover's Distance)的唐诗诗句匹配实践

    词移距离(Word Mover's Distance)是在词向量的基础上发展而来的用来衡量文档相似性的度量.   词移距离的具体介绍参考http://blog.csdn.net/qrlhl/artic ...

  2. [LeetCode] Sentence Similarity II 句子相似度之二

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  3. 734. Sentence Similarity 有字典数组的相似句子

    [抄题]: Given two sentences words1, words2 (each represented as an array of strings), and a list of si ...

  4. [LeetCode] 737. Sentence Similarity II 句子相似度之二

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  5. LeetCode 734. Sentence Similarity

    原题链接在这里:https://leetcode.com/problems/sentence-similarity/ 题目: Given two sentences words1, words2 (e ...

  6. [LeetCode] 737. Sentence Similarity II 句子相似度 II

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  7. [LeetCode] 734. Sentence Similarity 句子相似度

    Given two sentences words1, words2 (each represented as an array of strings), and a list of similar ...

  8. Comparing Sentence Similarity Methods

    Reference:Comparing Sentence Similarity Methods,知乎.

  9. 论文阅读笔记: Multi-Perspective Sentence Similarity Modeling with Convolution Neural Networks

    论文概况 Multi-Perspective Sentence Similarity Modeling with Convolution Neural Networks是处理比较两个句子相似度的问题, ...

随机推荐

  1. windows和linux共享文件

    一篇文章: 环境:主机操作系统 是Windows XP ,虚拟机 是Ubuntu 9.10,虚拟机是VirtualBox 3.08. 1. 安装增强功能包(Guest Additions) 安装好Ub ...

  2. 车牌识别LPR(三)-- LPR系统整体结构

    第三篇:系统的整体架构 LPR系统大体上可由图像采集系统,图像处理系统,数据库管理系统三个子系统组成.它综合了通讯.信息.控制.传感.计算机等各种先进技术,构成一个智能电子系统. 图像采集系统:图像采 ...

  3. android sqlite支持的数据类型

    Sqlite3支持的数据类型 :NULL.INTEGER.REAL.TEXT.BLOB 但实际上,sqlite3也接受如下的数据类型:    smallint 16 位元的整数.    interge ...

  4. vi 编辑内容中查找字符位置

    [root@localhost gdm]# vi /etc/X11/gdm/gdm.conf # You can also use the gdm-restart and gdm-safe-resta ...

  5. 面试题_1_to_16_多线程、并发及线程的基础问题

    多线程.并发及线程的基础问题 1)Java 中能创建 volatile 数组吗?能,Java 中可以创建 volatile 类型数组,不过只是一个指向数组的引用,而不是整个数组.我的意思是,如果改变引 ...

  6. Awesome Algorithms

    Awesome Algorithms A curated list of awesome places to learn and/or practice algorithms. Inspired by ...

  7. Jenkins iOS – Git, xcodebuild, TestFlight

    Introduction with Jenkins iOS If you are new to continuous integration for mobile platforms then you ...

  8. Android wakelock机制

      Wake Lock是一种锁的机制, 只要有人拿着这个锁,系统就无法进入休眠,可以被用户态程序和内核获得. 这个锁可以是有超时的或者是没有超时的,超时的锁会在时间过去以后自动解锁. 如果没有锁了或者 ...

  9. java 语言里 遍历 collection 的方式

    我来简单说一下吧,一般有2种方法来遍历collection中的元素,以HashSet为例子HashSet hs=new HashSet();hs.add("hello");hs.a ...

  10. UVa 11752 (素数筛选 快速幂) The Super Powers

    首先有个关键性的结论就是一个数的合数幂就是超级幂. 最小的合数是4,所以枚举底数的上限是pow(2^64, 1/4) = 2^16 = 65536 对于底数base,指数的上限就是ceil(64*lo ...