http://www.cs.princeton.edu/~blei/topicmodeling.html

Topic models are a suite of algorithms that uncover the hidden thematic structure in document collections. These algorithms help us develop new ways to search, browse and summarize large archives of texts.

Below, you will find links to introductory materials, corpus browsers based on topic models, and open source software (from my research group) for topic modeling.

Introductory materials

Corpus browsers based on topic models

The structure uncovered by topic models can be used to explore an otherwise unorganized collection. The following are browsers of large collections of documents, built with topic models.

Also see Sean Gerrish's discipline browser for an interesting application of topic modeling at JSTOR.

To build your own browsers, see Allison Chaney's excellent Topic Model Visualization Engine(TMVE). For example, here is a browser of 100,000 Wikipedia articles that uses TMVE.

Topic modeling software

Our research group has released many open-source software packages for topic modeling. Please post questions, comments, and suggestions about this code to the topic models mailing list.

Link Model/Algorithm Language Author Notes
lda-c Latent Dirichlet allocation C D. Blei This implements variational inference for LDA.
class-slda Supervised topic models for classifiation C++ C. Wang Implements supervised topic models with a categorical response.
lda R package for Gibbs sampling in many models R J. Chang Implements many models and is fast . Supports LDA, RTMs (for networked documents), MMSB (for network data), and sLDA (with a continuous response).
online lda Online inference for LDA Python M. Hoffman Fits topic models to massive data. The demo downloads random Wikipedia articles and fits a topic model to them.
online hdp Online inference for the HDP Python C. Wang Fits hierarchical Dirichlet process topic models to massive data. The algorithm determines the number of topics.
tmve(online) Topic Model Visualization Engine Python A. Chaney A package for creating corpus browsers. See, for example,Wikipedia.
ctr Collaborative modeling for recommendation C++ C. Wang Implements variational inference for a collaborative topic models. These models recommend items to users based on item content and other users' ratings.
dtm Dynamic topic models and the influence model C++ S. Gerrish This implements topics that change over time and a model of how individual documents predict that change.
hdp Hierarchical Dirichlet processes C++ C. Wang Topic models where the data determine the number of topics. This implements Gibbs sampling.
ctm-c Correlated topic models C D. Blei This implements variational inference for the CTM.
diln Discrete infinite logistic normal C J. Paisley This implements the discrete infinite logistic normal, a Bayesian nonparametric topic model that finds correlated topics.
hlda Hierarchical latent Dirichlet allocation C D. Blei This implements a topic model that finds a hierarchy of topics. The structure of the hierarchy is determined by the data.
turbotopics Turbo topics Python D. Blei Turbo topics find significant multiword phrases in topics.

Topic modeling【经典模型】的更多相关文章

  1. 用GibbsLDA做Topic Modeling

    http://weblab.com.cityu.edu.hk/blog/luheng/2011/06/24/%E7%94%A8gibbslda%E5%81%9Atopic-modeling/#comm ...

  2. 论文《Entity Linking with Effective Acronym Expansion, Instance Selection and Topic Modeling》

    Entity Linking with Effective Acronym Expansion, Instance Selection and Topic Modeling 一.主要贡献 1. pro ...

  3. 【Keras篇】---利用keras改写VGG16经典模型在手写数字识别体中的应用

    一.前述 VGG16是由16层神经网络构成的经典模型,包括多层卷积,多层全连接层,一般我们改写的时候卷积层基本不动,全连接层从后面几层依次向前改写,因为先改参数较小的. 二.具体 1.因为本文中代码需 ...

  4. 【神经网络篇】--基于数据集cifa10的经典模型实例

    一.前述 本文分享一篇基于数据集cifa10的经典模型架构和代码. 二.代码 import tensorflow as tf import numpy as np import math import ...

  5. 【BZOJ 3232】圈地游戏 二分+SPFA判环/最小割经典模型

    最小割经典模型指的是“一堆元素进行选取,对于某个元素的取舍有代价或价值,对于某些对元素,选取后会有额外代价或价值”的经典最小割模型,建立倒三角进行最小割.这个二分是显然的,一开始我也是想到了最小割的那 ...

  6. 大话CNN经典模型:VGGNet

       2014年,牛津大学计算机视觉组(Visual Geometry Group)和Google DeepMind公司的研究员一起研发出了新的深度卷积神经网络:VGGNet,并取得了ILSVRC20 ...

  7. 大话CNN经典模型:AlexNet

    2012年,Alex Krizhevsky.Ilya Sutskever在多伦多大学Geoff Hinton的实验室设计出了一个深层的卷积神经网络AlexNet,夺得了2012年ImageNet LS ...

  8. 大话CNN经典模型:LeNet

        近几年来,卷积神经网络(Convolutional Neural Networks,简称CNN)在图像识别中取得了非常成功的应用,成为深度学习的一大亮点.CNN发展至今,已经有很多变种,其中有 ...

  9. 【思维题 经典模型】cf632F. Magic Matrix

    非常妙的经典模型转化啊…… You're given a matrix A of size n × n. Let's call the matrix with nonnegative elements ...

随机推荐

  1. struts2逻辑视图类型汇总与解释(转)

    在struts2框架中,当action处理完之后,就应该向用户返回结果信息,该任务被分为两部分:结果类型和结果本身. 结果类型提供了返回给用户信息类型的实现细节.结果类型通常在Struts2中就已预定 ...

  2. win2003从组策略关闭端口(445/135/137/138/139/3389等)教程

    一些恶劣的病毒会从某些端口入侵计算机,因此关闭某些用不到的而又具有高风险的端口就显得很有必要,是服务器管理员要做的基本的安全防范.本文将介绍win2003系统在组策略关闭某一个端口的教程,文章以关闭4 ...

  3. 微型 Python Web 框架: Bottle

    微型 Python Web 框架: Bottle 在 19/09/11 07:04 PM 由 COSTONY 发表 Bottle 是一个非常小巧但高效的微型 Python Web 框架,它被设计为仅仅 ...

  4. Switch能否使用String做参数

    在Java语言中Swith可以使用参数类型有:Only convertible int values, strings or enum variables are permitted 可以自动转换为整 ...

  5. block的基本使用

    block用来保存一段代码 block的标志:^ block跟函数很像: 1. 可以保存代码 2. 有返回值 3. 有形参 4. 调用方式一样 定义bolock变量 例1: void (^myBloc ...

  6. 【HDU】4632 Palindrome subsequence(回文子串的个数)

    思路:设dp[i][j] 为i到j内回文子串的个数.先枚举所有字符串区间.再依据容斥原理. 那么状态转移方程为   dp[i][j] = dp[i][j-1] + dp[i+1][j] - dp[i+ ...

  7. 监听文本框输入oninput和onpropertychange事件

    前端页面开发的很多情况下都需要实时监听文本框输入,比如腾讯微博编写140字的微博时输入框动态显示还可以输入的字数.过去一般都使用onchange/onkeyup/onkeypress/onkeydow ...

  8. 一个苹果证书如何多次使用——导出p12文件[多台电脑使用]

    为什么要导出.p12文件 当我们用大于三个mac设备开发应用时,想要申请新的证书,如果在我们的证书里,包含了3个发布证书,2个开发证书,可以发现再也申请不了开发证书和发布证书了(一般在我们的证书界面中 ...

  9. Java之父职场路

    Java之父——詹姆斯·高斯林出生于加拿大,是一位计算机编程天才.在卡内基·梅隆大学攻读计算机博士学位时,他编写了多处理器版本的Unix操作系统,是JAVA编程语言的创始人.1991年,在Sun公司工 ...

  10. int 和 Integer 有什么区别

    原文地址:https://blog.csdn.net/chenliguan/article/details/53888018 1 int与Integer的基本使用对比 (1)Integer是int的包 ...