http://www.cs.princeton.edu/~blei/topicmodeling.html

Topic models are a suite of algorithms that uncover the hidden thematic structure in document collections. These algorithms help us develop new ways to search, browse and summarize large archives of texts.

Below, you will find links to introductory materials, corpus browsers based on topic models, and open source software (from my research group) for topic modeling.

Introductory materials

Corpus browsers based on topic models

The structure uncovered by topic models can be used to explore an otherwise unorganized collection. The following are browsers of large collections of documents, built with topic models.

Also see Sean Gerrish's discipline browser for an interesting application of topic modeling at JSTOR.

To build your own browsers, see Allison Chaney's excellent Topic Model Visualization Engine (TMVE). For example, here is a browser of 100,000 Wikipedia articles that uses TMVE.

Topic modeling software

Our research group has released many open-source software packages for topic modeling. Please post questions, comments, and suggestions about this code to the topic models mailing list.

Link | Model/Algorithm | Language | Author | Notes
lda-c | Latent Dirichlet allocation | C | D. Blei | Implements variational inference for LDA.
class-slda | Supervised topic models for classification | C++ | C. Wang | Implements supervised topic models with a categorical response.
lda | R package for Gibbs sampling in many models | R | J. Chang | Implements many models and is fast. Supports LDA, RTMs (for networked documents), MMSB (for network data), and sLDA (with a continuous response).
online lda | Online inference for LDA | Python | M. Hoffman | Fits topic models to massive data. The demo downloads random Wikipedia articles and fits a topic model to them.
online hdp | Online inference for the HDP | Python | C. Wang | Fits hierarchical Dirichlet process topic models to massive data. The algorithm determines the number of topics.
tmve (online) | Topic Model Visualization Engine | Python | A. Chaney | A package for creating corpus browsers. See, for example, Wikipedia.
ctr | Collaborative modeling for recommendation | C++ | C. Wang | Implements variational inference for collaborative topic models. These models recommend items to users based on item content and other users' ratings.
dtm | Dynamic topic models and the influence model | C++ | S. Gerrish | Implements topics that change over time and a model of how individual documents predict that change.
hdp | Hierarchical Dirichlet processes | C++ | C. Wang | Topic models where the data determine the number of topics. Implements Gibbs sampling.
ctm-c | Correlated topic models | C | D. Blei | Implements variational inference for the CTM.
diln | Discrete infinite logistic normal | C | J. Paisley | Implements the discrete infinite logistic normal, a Bayesian nonparametric topic model that finds correlated topics.
hlda | Hierarchical latent Dirichlet allocation | C | D. Blei | Implements a topic model that finds a hierarchy of topics. The structure of the hierarchy is determined by the data.
turbotopics | Turbo topics | Python | D. Blei | Turbo topics find significant multiword phrases in topics.
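
Several of the packages listed above fit LDA, and the R lda package does so by Gibbs sampling. To give a rough feel for what such a sampler does, here is a minimal sketch of collapsed Gibbs sampling for LDA in plain Python. It is not the code from any of the packages above: the function name lda_gibbs, the hyperparameters alpha and beta, their default values, and the toy corpus are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch only).

    docs is a list of documents, each a list of integer word ids.
    Returns per-document topic counts and per-topic word counts.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalized conditional probability of each topic
                # for this token, given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


# Hypothetical toy corpus: two documents about biology, two about finance.
vocab = ["gene", "dna", "cell", "stock", "market", "trade"]
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3, 4], [4, 5, 3, 5]]

doc_topic, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))
for k, counts in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda w: -counts[w])[:3]
    print("topic", k, [vocab[w] for w in top])
```

The sketch keeps only count tables and resamples one token at a time, which is easy to follow but slow. For real collections, the packages in the table are the better starting point.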
