R and SAS- Together again at PAWS

Two of my favorite speakers (though maybe not favorites of each other) speak at PAWS: Anne Milley from SAS and David Smith from REvolution Computing. Also speaking is a great author and writer, Stephen Baker of The Numerati (that mathematical equivalent of The Godfather). More events at the link below.

Hmmmm- I hope they attend each other’s sessions just to keep up, but is that asking too much?

Citation-http://www.predictiveanalyticsworld.com/dc/2009/agenda.php#day1-22

7:30pm-10:00pm
useR Meeting
Room: Magnolia
– Sponsored by  Please join the group at www.meetup.com/R-users-DC/

R is an open source programming language for statistical computing, data analysis, and graphical visualization. R has an estimated one million users worldwide, and its user base is growing. While most commonly used within academia, in fields such as computational biology and applied statistics, it is gaining currency in commercial areas such as quantitative finance and business intelligence.

Among R’s strengths as a language are its powerful built-in tools for inferential statistics, its compact modeling syntax, its data visualization capabilities, and its ease of connectivity with persistent data stores (from databases to flatfiles).

In addition, R’s open source nature and its extensibility via add-on “packages” allow it to keep up with the leading edge in academic research.

For all its strengths, though, R has an admittedly steep learning curve; the first steps towards learning and using R can be challenging.

This DC R Users Group is dedicated to bringing together area practitioners of R to exchange knowledge, inspire new users, and spur the adoption of R for innovative research and commercial applications.


Wednesday October 21, 2009

8:00am-9:00am
Registration & Continental Breakfast


9:00am-9:50am
Keynote
Room: Magnolia
Opportunities and Pitfalls:
What the World Does and Doesn’t Want from Predictive Analytics

Mathematicians and statisticians are churning through mountains of data in their efforts to model and predict human behavior. The goal is to optimize every function possible, from sales and marketing to the enterprise itself. These Numerati are guided by the two dominant models of the late 20th century, the modeling of financial markets and of industrial systems. How do humans fit into these systems? And what will their response be when the analytic systems appear to misunderstand them or invade their privacy?

Stephen Baker joins PAW to directly address the Numerati. In his keynote presentation, Mr. Baker will guide us toward the untapped goldmines where predictive analytics will be embraced and thrive, and teach us to anticipate and maneuver around two central pitfalls: Consumer misperception of us, and our inadvertent mistreatment of them.

Moderator: Eric Siegel, Program Chair, Predictive Analytics World

Speaker: Stephen Baker, BusinessWeek – author, The Numerati


9:50am-10:10am
Platinum Sponsor Presentation
Room: Magnolia
Strength in Numbers: ACE!

As more organizations are beginning their analytical journey or reinvigorating their existing efforts, Analytic Centers of Excellence (ACEs) are helping them along the way. The interest in ACEs is growing across industries as organizations seek better ways to tap into their analytic infrastructure (most importantly, scarce high-end analytic expertise) to improve results. We will highlight valuable best practices for achieving greater analytic bandwidth, realizing more and better evidence-based decisions.

Moderator: Eric Siegel, Program Chair, Predictive Analytics World

Speaker: Anne Milley, Senior Director of Tech. Product Marketing, SAS

Red R- A new beginning

Check out an interesting new interface to R.

Note: I haven’t tested it yet, but I plan to do so shortly, as I am using Ubuntu 9 almost exclusively nowadays.

R fans who are not quite overjoyed with the wonderful beauty and charm of the traditional R GUI may want to give it a try.

Citation-

http://code.google.com/p/r-orange/

Note- This website does not assume responsibility for any software glitches, as R comes with no warranty- unlike other software that comes loaded with both a warranty and then bug-fix patches.


Turning the Internet into a Super Computer (Updated)

If you are a fan of distributed computing or parallel computation- you may notice a strange thing.

All parallel computation basically involves tying a lot of desktops and servers together- which is very profitable for the companies that make them.

However the biggest and most idle source of processing power is the Internet- millions of web servers.


So is it possible to create a parallel computer which ties in the web servers? Imagine if we take the R Web package, add it to the R-Hadoop Streaming package AND SNOW Phase 3, and create a WordPress plugin for the web.
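To make the idea a little more concrete, here is a minimal sketch in Python of the basic split, compute, and combine pattern that any such parallel computer would rely on. It is only an illustration under loud assumptions: local threads stand in for the web servers, the function names (`split_into_chunks`, `worker_sum_of_squares`, `distributed_sum_of_squares`) are hypothetical, and nothing here uses the R packages mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a list into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one leftover element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def worker_sum_of_squares(chunk):
    """Stand-in for the job one (hypothetical) web server would run."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(values, n_workers=4):
    """Farm chunks out to workers, then combine their partial results."""
    chunks = split_into_chunks(values, n_workers)
    # Assumption: threads on one machine play the role of remote hosts.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(distributed_sum_of_squares(list(range(1, 101))))
```

In a real version each chunk would travel over HTTP to an idle web server and the partial results would come back the same way, so the hard parts (trust, latency, failures) are exactly what this sketch leaves out.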

Would it work? Could it work? Or is this a crazy thought?

To add some more fun- add in bit torrenting, BUT on web servers, through the WordPress plugin or a JavaScript plugin. OR a cloud operating system (call it K.U.S.H).

Are you having fun yet?

So if we have this resource, then it basically helps reduce the digital divide a bit.

Think about it.
Ajay- Note I am working on this project in my spare time and I call it K.U.S.H

Kinematically User Shared Hosts (KUSH is also my son). Thus if you pursue it, prior ..*.art is claimed.

PPS: I was kidding about the prior *art; some people didn’t get the *.

3 billion Asians and Africans stay without a computer because someone has a patent on each and every thing that goes with it.

Art Credit- “http://www.sgeier.net/fractals/fractals/02/Eye%20of%20the%20Storm.jpg”

Interview: Professor John Fox, Creator of R Commander

Here is an interview with Prof John Fox, creator of the very popular R language based GUI, Rcmdr.

Ajay- Describe your career in science from your high school days to the science books you have written. What do you think can be done to increase interest in science among young people?

John Fox- I’m a sociologist and social statistician, so I don’t have a career in science, as that term is generally understood. I was interested in science as a child, however: I attended a science high school in New York City (Brooklyn Tech), and when I began university in 1964 at New York’s City College, I started in engineering. I moved subsequently through majors in philosophy and psychology, before finishing in sociology — had I not graduated in 1968 I probably would have moved on to something else. I took a statistics course during my last year as an undergraduate and found it fascinating. I enrolled in the sociology graduate program at the University of Michigan, where I specialized in social psychology and demography, and finished with a PhD in 1972 when I was 24 years old. I became interested in computers during my first year in graduate school, where I initially learned to program in Fortran. I also took quite a few courses in statistics and math.

I haven’t written any science books, but I have written and edited a number of books on social statistics, including, most recently, Applied Regression Analysis and Generalized Linear Models, Second Edition (Sage, 2008).

I’m afraid that I don’t know how to interest young people in science. Science seemed intrinsically interesting to me when I was young, and still does.

Ajay- What prompted you to create R Commander? How would you describe R Commander as a tool, say for users of other languages who want to learn R but are afraid of the syntax?

John- I originally programmed the R Commander so that I could use R to teach introductory statistics courses to sociology undergraduates. I previously taught this course with Minitab or SPSS, which were programs that I never used for my own work. I waited for someone to come up with a simple, portable, easily installed point-and-click interface to R, but nothing appeared on the horizon, and so I decided to give it a try myself.

I suppose that the R Commander can ease users into writing commands, inasmuch as the commands are displayed, but I suspect that most users don’t look at them. I think that serious prospective users of R should be encouraged to use the command-line interface along with a script editor of some sort. I wouldn’t exaggerate the difficulty of learning R: I came to R — actually S then — after having programmed in perhaps a dozen other languages, most recently at that point Lisp, and found the S language particularly easy to pick up.

Ajay- I particularly like the R Cmdr plug-ins. Is it possible for anyone to extend R Commander with a customized plug-in package?

John- That’s the basic idea, though the plug-in author has to be able to program in R and must learn a little Tcl/Tk.

Ajay- Have you thought of using the R Commander GUI on Amazon EC2, thus making R high performance computing available on demand (similar to Zementis model deployment using Amazon EC2)? What are your views on the future of statistical computing?

John- I’m not sure whether or how an interface like the Rcmdr, which is Tcl/Tk-based, can be adapted to cloud computing. I also don’t feel qualified to predict the future of statistical computing.

I think that R is where the action is for the near future.

Ajay- What are the best ways of using R Commander as a teaching tool? (I noticed the help is a bit outdated.)

John- Is the help a bit outdated? My intention is that the R Commander should be largely self-explanatory. Most people know how to use point-and-click interfaces. In the basic courses for which it is principally designed, my goals are to teach the essential ideas of statistical reasoning and some skills in data analysis. In this kind of course, statistical software should facilitate the basic goals of the course.

As I said, for serious data analysis, I believe that it’s a good idea to encourage use of the command-line interface.

Ajay- What are your views on R being recognized by SAS Institute for its IML product? Do you think there can be a middle way for open source and proprietary software to coexist?

John- I imagine that R is a challenge for producers of proprietary software like SAS, partly because R development moves more quickly, but also because R is giving away something that SAS and other vendors of proprietary statistical software are selling. For example, I once used SAS quite a bit but don’t anymore. I also have the sense that for some time SAS has directed its energies more toward business uses of its software than toward purely statistical applications.

Ajay- Do people in the R Core team recognize the importance of a GUI? What does the rest of the R community feel? What has the feedback from users been to you? Any plans for corporate sponsors for R Commander? (Rattle, an R language data mining GUI, has a version called Rstat at http://www.informationbuilders.com/products/webfocus/predictivemodeling.html while the free version and code are at rattle.togaware.com.)

John- I feel that the R Commander GUI has been generally positively received, both by members of R Core who have said something about it to me and by others in the R community. Of course, a nice feature of the R package system is that people can simply ignore packages in which they have no interest. I noticed recently that a Journal of Statistical Software paper that I wrote several years ago on the Rcmdr package has been downloaded nearly 35,000 times.

Because I wouldn’t expect many students using the Rcmdr package in a course to read that paper, I expect that the package is being used fairly widely.

Ajay- What does John Fox do for fun or as a hobby?

John- I’m tempted to say that much of my work is fun — particularly doing research, writing programs, and writing papers and books. I used to be quite a serious photographer, but I haven’t done that in years, and the technology of photography has changed a great deal. I run and swim for exercise, but that’s not really fun. I like to read and to travel, but who doesn’t?

Biography-

Prof John Fox is a giant in his chosen fields and has edited or authored 13 books and written chapters for 12 more. He has also published almost 49 journal articles. He is also editor in chief of the R News newsletter. You can read more about Dr Fox at http://socserv.mcmaster.ca/jfox/

On R Cmdr-

R Cmdr has substantially lowered the barrier for people wanting to learn R- they begin with the GUI and then later transition to customization using the command line. It is so simple in its design that even undergraduates have started basic data analysis with R Cmdr after just one class. You can read more on it here at http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/Getting-Started-with-the-Rcmdr.pdf

Presenting R

Here is a short presentation I made for fellow students at work.

It is generally at a beginner’s level or for people having trouble transitioning to R.

And if you want to see the video presentation, you can watch it here on UTK’s lecture capture mechanism:

Title: R Help Session
Speaker: A. Ohri
Description: Session for R Beginners
URL: http://vcweb.bus.utk.edu/20090911-103113-cap403/

In addition, here is a link for the handout:

https://docs.google.com/present/edit?id=0AdYMMvghK2ytZGN2c3MzNThfODA2ZzY4N2I5bno&hl=en&invite=CKvforMH

SAS to launch SAS/IML with R (updated)

Updated- Tammi Kay George has given an extensive answer to the questions posed –

While SAS Institute is going ahead with plans to launch SAS/IML with R integration, the questions that should be in the minds of people familiar with R and SAS are-

1) Will SAS Institute share revenue with package developers?

2) Would R Core stick to a strategy of favouring REvolution Computing over other R application providers, or the R community to an anti-SAS strategy?

3) Would academics and students be better off or more confused than before?

4) Will this help restrict R’s brand perception to just a matrix level program, or will this help R gain more enterprise acceptance?

5) What about legal questions of GPL source code sharing?

So many questions. So little time.

Screenshot- Existing SAS/IML Studio Page

Citation-http://www.sas.com/technologies/analytics/statistics/iml/index.html#section=3


Movie Review- Inglourious Basterds

When a Knoxville-born director creates a movie with Brad Pitt speaking in an East Tennessee accent, Orange country citizens can’t help but catch the movie. If you need escapist fare to charge them ole grey cells with a good ol’ fashioned American movie- this is the one.

I could wax eloquent on the direction and deft artistry of Quentin Tarantino, or the acting of Christoph Waltz as Hans Landa- or even the breakthrough performance of Brad Pitt, who plays the leader of a Nazi-hunting commando unit. For the first time- you see Brad’s character overshadowing his adorable mannerisms. Watch how he rolls those Southern R’s.

But there is plenty of that- and you need to watch it yourself.

As the director himself puts it- A Basterd’s work is never done.


This is just a loong weekend post..

If you liked the concept of data mining and entertainment- well I actually got inspired by the entertaining writing of the prominent R blogger J D Long of Cerebral Mastication

http://www.cerebralmastication.com/