JSS launches special edition for GUI for #Rstats

I love GUIs (graphical user interfaces)- they might be TCL/TK based or GTK based or even QT based. As a researcher they help me with faster coding, as a consultant they help with faster transition of projects from startup to handover stage  and as an R  instructor helps me get people to learn R faster.

I wish Python had some GUIs though ;)

 

from the open access journal of statistical software-

JSS Special Volume 49: Graphical User Interfaces for R

Graphical User Interfaces for R
Pedro M. Valero-Mora, Ruben Ledesma
Vol. 49, Issue 1, Jun 2012
Submitted 2012-06-03, Accepted 2012-06-03
Integrated Degradation Models in R Using iDEMO
Ya-Shan Cheng, Chien-Yu Peng
Vol. 49, Issue 2, Jun 2012
Submitted 2010-12-31, Accepted 2011-06-29
Glotaran: A Java-Based Graphical User Interface for the R Package TIMP
Joris J. Snellenburg, Sergey Laptenok, Ralf Seger, Katharine M. Mullen, Ivo H. M. van Stokkum
Vol. 49, Issue 3, Jun 2012
Submitted 2011-01-20, Accepted 2011-09-16
A Graphical User Interface for R in a Rich Client Platform for Ecological Modeling
Marcel Austenfeld, Wolfram Beyschlag
Vol. 49, Issue 4, Jun 2012
Submitted 2011-01-05, Accepted 2012-02-20
Closing the Gap between Methodologists and End-Users: R as a Computational Back-End
Byron C. Wallace, Issa J. Dahabreh, Thomas A. Trikalinos, Joseph Lau, Paul Trow, Christopher H. Schmid
Vol. 49, Issue 5, Jun 2012
Submitted 2010-11-01, Accepted 2012-12-20
tourrGui: A gWidgets GUI for the Tour to Explore High-Dimensional Data Using Low-Dimensional Projections
Bei Huang, Dianne Cook, Hadley Wickham
Vol. 49, Issue 6, Jun 2012
Submitted 2011-01-20, Accepted 2012-04-16
The RcmdrPlugin.survival Package: Extending the R Commander Interface to Survival Analysis
John Fox, Marilia S. Carvalho
Vol. 49, Issue 7, Jun 2012
Submitted 2010-12-26, Accepted 2011-12-28
Deducer: A Data Analysis GUI for R
Ian Fellows
Vol. 49, Issue 8, Jun 2012
Submitted 2011-02-28, Accepted 2011-09-08
RKWard: A Comprehensive Graphical User Interface and Integrated Development Environment for Statistical Analysis with R
Stefan Rödiger, Thomas Friedrichsmeier, Prasenjit Kapat, Meik Michalke
Vol. 49, Issue 9, Jun 2012
Submitted 2010-12-28, Accepted 2011-05-06
gWidgetsWWW: Creating Interactive Web Pages within R
John Verzani
Vol. 49, Issue 10, Jun 2012
Submitted 2010-12-17, Accepted 2011-05-11
Oscars and Interfaces
Antony Unwin
Vol. 49, Issue 11, Jun 2012
Submitted 2010-12-08, Accepted 2011-07-15

Text Mining Barack Obama using R #rstats

  • We copy and paste President Barack Obama’s “Yes We Can” speech in a text document and read it in. For a word cloud we need a dataframe with two columns, one with words and the the other with frequency.We read in the transcript from “http://politics.nuvvo.com/lesson/4678-transcript-of-obamas-speech-yes-we-can” and paste in the file located in the local directory- C:/Users/KUs/Desktop/new. Note tm is a powerful package and will read ALL the text documents within the particular folder

> library(tm)

>library(wordcloud)

> txt2=”C:/Users/KUs/Desktop/new”

> b=Corpus(DirSource(txt2), readerControl = list(language = “eng”))

> b<- tm_map(b, tolower) #Changes case to lower case

> b<- tm_map(b, stripWhitespace) #Strips White Space

> b <- tm_map(b, removePunctuation) #Removes Punctuation

> tdm <- TermDocumentMatrix(b)

> m1 <- as.matrix(tdm)

> v1<- sort(rowSums(m1),decreasing=TRUE)

> d1<- data.frame(word = names(v1),freq=v1)

> wordcloud(d1$word,d1$freq)

Now it seems we need to remove some of the very commonly occuring words like “the” and “and”. We are not using the standard stopwords in english (the tm package provides that see Chapter 13 Text Mining case studies), as the words “we” and “can” are also included .

> b <- tm_map(b, removeWords, c(“and”,”the”)) # We remove the words “and” and “the”.> tdm <- TermDocumentMatrix(b)

> m1 <- as.matrix(tdm)

> v1<- sort(rowSums(m1),decreasing=TRUE)

> d1<- data.frame(word = names(v1),freq=v1)

> wordcloud(d1$word,d1$freq)

But let’s see how the wordcloud changes if we remove all English Stopwords.

> b <- tm_map(b, removeWords, stopwords(“english”)) #Removes Stop Words

and rerunning the code after this

> tdm <- TermDocumentMatrix(b)

> m1 <- as.matrix(tdm)

> v1<- sort(rowSums(m1),decreasing=TRUE)

> d1<- data.frame(word = names(v1),freq=v1)

> wordcloud(d1$word,d1$freq)

 

and you can draw your own conclusions from the content of this famous speech based on your political preferences.

Politicians can give interesting speeches but they may be full of simple sounding words…..

 

Citation-

1. Ingo Feinerer (2012). tm: Text Mining Package. R package version0.5-7.1.

Ingo Feinerer, Kurt Hornik, and David Meyer (2008). Text Mining
Infrastructure in R. Journal of Statistical Software 25/5. URL:
http://www.jstatsoft.org/v25/i05/

2. Ian Fellows (2012). wordcloud: Word Clouds. R package version 2.0.

http://CRAN.R-project.org/package=wordcloud

3. You can see more than 100 of Obama’s speeches at http://obamaspeeches.com/

Quote- numbers dont lie, people do.

.

Topic Models in R- search documents for similarity by frequency

Zombie-process
Image via Wikipedia

From the marvelous lovely Journal of Statistical Software, ignored by mainstream corporatia, but beloved to academia. here is one more interesting and very timely paper.

Can be used to grade stdudents homework, catch terrorists as in plagiarists , search engine spam linkers. Enjoy!