Topic Models

Some resources on topic models:

http://en.wikipedia.org/wiki/Topic_model

In machine learning and natural language processing, a topic model is a type of statistical model for discovering the abstract “topics” that occur in a collection of documents. An early topic model was probabilistic latent semantic indexing (PLSI), created by Thomas Hofmann in 1999.[1] Latent Dirichlet allocation (LDA), perhaps the most common topic model currently in use, is a generalization of PLSI developed by David Blei, Andrew Ng, and Michael Jordan in 2002, allowing documents to have a mixture of topics.[2] Other topic models are generally extensions on LDA, such as Pachinko allocation, which improves on LDA by modeling correlations between topics in addition to the word correlations which constitute topics. Although topic models were first described and implemented in the context of natural language processing, they have applications in other fields such as bioinformatics.

http://en.wikipedia.org/wiki/Latent_Dirichlet_allocation

In statistics, latent Dirichlet allocation (LDA) is a generative model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. For example, if observations are words collected into documents, it posits that each document is a mixture of a small number of topics and that each word’s creation is attributable to one of the document’s topics. LDA is an example of a topic model.
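
To make the generative story above concrete, here is a minimal Python sketch of how LDA imagines a single document being produced: draw a topic mixture for the document, then for each word pick a topic from that mixture and a word from that topic. The toy vocabulary, the two hand-written topics, the value of alpha, and all function names are my own assumptions for illustration; they do not come from the Wikipedia article or from any library. It only simulates the generative side and does not fit a model to data.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a probability vector from a Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Return an index drawn with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(topics, vocab, doc_length, alpha, rng):
    """Generate one document following the LDA generative story.

    topics: list of per-topic word distributions over vocab (assumed given).
    alpha:  symmetric Dirichlet parameter for the document's topic mixture.
    """
    # Each document gets its own mixture over topics.
    theta = sample_dirichlet([alpha] * len(topics), rng)
    assignments, words = [], []
    for _ in range(doc_length):
        z = sample_categorical(theta, rng)      # topic for this word
        w = sample_categorical(topics[z], rng)  # word drawn from that topic
        assignments.append(z)
        words.append(vocab[w])
    return theta, assignments, words


# Hypothetical toy vocabulary and topics, for illustration only.
VOCAB = ["gene", "dna", "cell", "data", "model", "learning"]
TOPICS = [
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],  # a "biology"-like topic
    [0.0, 0.0, 0.0, 0.3, 0.3, 0.4],  # a "machine learning"-like topic
]

rng = random.Random(0)
theta, assignments, words = generate_document(TOPICS, VOCAB, 10, 0.5, rng)
print("topic mixture:", theta)
print("(topic, word) pairs:", list(zip(assignments, words)))
```

A small alpha (below 1) tends to produce documents dominated by one topic, which matches the "small number of topics" assumption in the description. Fitting LDA to real text runs this story in reverse, inferring the topics and mixtures from the observed words.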

David M. Blei’s page on topic models:

http://www.cs.princeton.edu/~blei/topicmodeling.html

A general introduction to topic modeling.
A long tutorial about topic modeling given at KDD-2011, with slides.
Slides from a talk on dynamic and correlated topic models applied to the journal Science, along with a video of the talk.
A more technical review paper about this field.
A bibliography of topic modeling papers and software, maintained by David Mimno.

The topic models mailing list is a good forum for discussing topic modeling.

In R, topicmodels and lda are two packages for LDA analysis.

Some resources I compiled on Slideshare based on the above:

Topicmodels

Topic models

Blei ngjordan2003

Lda

Canini09a

Blei lafferty2009

Blei2011

Modeling science

I guess a topic model on topic model literature would be a fine example of a “meme”.

Happy 2012.

Author: Ajay Ohri

http://about.me/ajayohri
