http://www.cs.princeton.edu/~blei/topicmodeling.html

Topic models are a suite of algorithms that uncover the hidden thematic structure in document collections. These algorithms help us develop new ways to search, browse and summarize large archives of texts.

Below, you will find links to introductory materials, corpus browsers based on topic models, and open source software (from my research group) for topic modeling.

Introductory materials

Corpus browsers based on topic models

The structure uncovered by topic models can be used to explore an otherwise unorganized collection. The following are browsers of large collections of documents, built with topic models.

Also see Sean Gerrish's discipline browser for an interesting application of topic modeling at JSTOR.

To build your own browsers, see Allison Chaney's excellent Topic Model Visualization Engine (TMVE). For example, here is a browser of 100,000 Wikipedia articles that uses TMVE.

Topic modeling software

Our research group has released many open-source software packages for topic modeling. Please post questions, comments, and suggestions about this code to the topic models mailing list.

Link | Model/Algorithm | Language | Author | Notes
lda-c | Latent Dirichlet allocation | C | D. Blei | This implements variational inference for LDA.
class-slda | Supervised topic models for classification | C++ | C. Wang | Implements supervised topic models with a categorical response.
lda | R package for Gibbs sampling in many models | R | J. Chang | Implements many models and is fast. Supports LDA, RTMs (for networked documents), MMSB (for network data), and sLDA (with a continuous response).
online lda | Online inference for LDA | Python | M. Hoffman | Fits topic models to massive data. The demo downloads random Wikipedia articles and fits a topic model to them.
online hdp | Online inference for the HDP | Python | C. Wang | Fits hierarchical Dirichlet process topic models to massive data. The algorithm determines the number of topics.
tmve (online) | Topic Model Visualization Engine | Python | A. Chaney | A package for creating corpus browsers. See, for example, Wikipedia.
ctr | Collaborative modeling for recommendation | C++ | C. Wang | Implements variational inference for collaborative topic models. These models recommend items to users based on item content and other users' ratings.
dtm | Dynamic topic models and the influence model | C++ | S. Gerrish | This implements topics that change over time and a model of how individual documents predict that change.
hdp | Hierarchical Dirichlet processes | C++ | C. Wang | Topic models where the data determine the number of topics. This implements Gibbs sampling.
ctm-c | Correlated topic models | C | D. Blei | This implements variational inference for the CTM.
diln | Discrete infinite logistic normal | C | J. Paisley | This implements the discrete infinite logistic normal, a Bayesian nonparametric topic model that finds correlated topics.
hlda | Hierarchical latent Dirichlet allocation | C | D. Blei | This implements a topic model that finds a hierarchy of topics. The structure of the hierarchy is determined by the data.
turbotopics | Turbo topics | Python | D. Blei | Turbo topics find significant multiword phrases in topics.
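
Several packages above fit LDA by Gibbs sampling. As a rough illustration of what that involves, here is a minimal sketch of a collapsed Gibbs sampler for LDA in plain Python. It is not the code of any package listed here; the function name `lda_gibbs`, its parameters, and the default hyperparameters `alpha` and `beta` are assumptions chosen for this example, and documents are assumed to be lists of integer word ids.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (doc_topic, topic_word) count matrices.
    """
    rng = random.Random(seed)
    topics = range(num_topics)
    doc_topic = [[0] * num_topics for _ in docs]        # tokens in doc d with topic k
    topic_word = [[0] * vocab_size for _ in topics]     # tokens of word w with topic k
    topic_total = [0] * num_topics                      # tokens with topic k
    assign = []                                         # topic of every token

    # Start from a uniformly random topic assignment.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in topics
                ]

                # Resample the topic and restore the counts.
                k = rng.choices(topics, weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


if __name__ == "__main__":
    # Toy corpus: words 0 and 1 co-occur, as do words 2 and 3.
    toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0], [3, 2, 3, 3]]
    doc_topic, topic_word = lda_gibbs(toy_docs, num_topics=2, vocab_size=4)
    print(doc_topic)
    print(topic_word)
```

The returned counts can be normalized (after adding `alpha` or `beta`) to estimate per-document topic proportions and per-topic word distributions. For real corpora, the packages in the table are the better choice.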
