Ok, I promised a weekly cartoon on Friday, but it’s Saturday. Last week we spoofed Larry Ellison, Jim Goodnight and Bill Gates: people who generated billions in taxes for the economy but would be regarded as evil by some open source guys, though they may have created more jobs for more families than the whole Federal Reserve Bank did in 2008-10. Jobs are necessary for families. Period.
In Part 2, we see that open source is actually older than Stallman (yes, people are older than Stallman). In fact, open source has been around for far longer than even Jim Goodnight’s current age, which can be revealed by using proc goodnight options=all.
Sorry, the words went, we can’t offer you a contract
The cheque is in the mail, said another
I will send the contract shortly, was a third’s refrain
Not now, maybe next year, decade or century
Writers, unite
Nothing to lose,
but your editors and creditors
So once again,
going back to the broken worn laptop,
hammering away keys, to ham away the stoic egoistic grief
You are in the wrong country, color, class,
Just when you thought you got the hang of the game,
The game flipped, from rugby to basketball,
but not quite cricket.
You have been hanging out with the rich kids again,
with the richness of your thoughts to compensate,
for the inadequacy of your pocket.
Time to come back,
Dear writer,
It is time to write.
Here is a short list of resources and material I put together as starting points for R and Cloud Computing. It’s a bit messy, but overall it should serve quite comprehensively.
Cloud computing is a commonly used expression to imply a generational change in computing from desktop-servers to remote and massive computing connections and shared computers, enabled by high bandwidth across the internet.
As per the National Institute of Standards and Technology Definition,
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
Rweb is developed and maintained by Jeff Banfield. The Rweb Home Page provides access to all three versions of Rweb—a simple text entry form that returns output and graphs, a more sophisticated JavaScript version that provides a multiple window environment, and a set of point and click modules that are useful for introductory statistics courses and require no knowledge of the R language. All of the Rweb versions can analyze Web accessible datasets if a URL is provided.
The paper “Rweb: Web-based Statistical Analysis”, providing a detailed explanation of the different versions of Rweb and an overview of how Rweb works, was published in the Journal of Statistical Software (http://www.jstatsoft.org/v04/i01/).
Ulf Bartel has developed R-Online, a simple on-line programming environment for R which intends to make the first steps in statistical programming with R (especially with time series) as easy as possible. There is no need for a local installation since the only requirement for the user is a JavaScript capable browser. See http://osvisions.com/r-online/ for more information.
Rcgi is a CGI WWW interface to R by MJ Ray. It had the ability to use “embedded code”: you could mix user input and code, allowing the HTML author to do anything from load in data sets to enter most of the commands for users without writing CGI scripts. Graphical output was possible in PostScript or GIF formats and the executed code was presented to the user for revision. However, it is not clear if the project is still active.
Currently, a modified version of Rcgi by Mai Zhou (actually, two versions: one with (bitmap) graphics and one without) as well as the original code are available from http://www.ms.uky.edu/~statweb/.
David Firth has written CGIwithR, an R add-on package available from CRAN. It provides some simple extensions to R to facilitate running R scripts through the CGI interface to a web server, and allows submission of data using both GET and POST methods. It is easily installed using Apache under Linux and in principle should run on any platform that supports R and a web server provided that the installer has the necessary security permissions. David’s paper “CGIwithR: Facilities for Processing Web Forms Using R” was published in the Journal of Statistical Software (http://www.jstatsoft.org/v08/i10/). The package is now maintained by Duncan Temple Lang and has a web page at http://www.omegahat.org/CGIwithR/.
Rpad, developed and actively maintained by Tom Short, provides a sophisticated environment which combines some of the features of the previous approaches with quite a bit of JavaScript, allowing for a GUI-like behavior (with sortable tables, clickable graphics, editable output, etc.).
Jeff Horner is working on the R/Apache Integration Project which embeds the R interpreter inside Apache 2 (and beyond). A tutorial and presentation are available from the project web page at http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RApacheProject.
Rserve is a project actively developed by Simon Urbanek. It implements a TCP/IP server which allows other programs to use facilities of R. Clients are available from the web site for Java and C++ (and could be written for other languages that support TCP/IP sockets).
OpenStatServer is being developed by a team led by Greg Warnes; it aims “to provide clean access to computational modules defined in a variety of computational environments (R, SAS, Matlab, etc) via a single well-defined client interface” and to turn computational services into web services.
Two projects use PHP to provide a web interface to R. R_PHP_Online by Steve Chen (though it is unclear if this project is still active) is somewhat similar to the above Rcgi and Rweb. R-php is actively developed by Alfredo Pontillo and Angelo Mineo and provides both a web interface to R and a set of pre-specified analyses that need no R code input.
webbioc is “an integrated web interface for doing microarray analysis using several of the Bioconductor packages” and is designed to be installed at local sites as a shared computing resource.
Rwui is a web application to create user-friendly web interfaces for R scripts. All code for the web interface is created automatically. There is no need for the user to do any extra scripting or learn any new scripting techniques. Rwui can also be found at http://rwui.cryst.bbk.ac.uk.
Finally, the R.rsp package by Henrik Bengtsson introduces “R Server Pages”. Analogous to Java Server Pages, an R server page is typically HTML with embedded R code that gets evaluated when the page is requested. The package includes an internal cross-platform HTTP server implemented in Tcl, so provides a good framework for including web-based user interfaces in packages. The approach is similar to the use of the brew package with Rapache with the advantage of cross-platform support and easy installation.
Remote access to R/Bioconductor on EBI’s 64-bit Linux Cluster
Start the workbench by downloading the package for your operating system (Macintosh or Windows), or via Java Web Start, and you will get access to an instance of R running on one of EBI’s powerful machines. You can install additional packages, upload your own data, work with graphics and collaborate with colleagues, all as if you are running R locally, but unlimited by your machine’s memory, processor or data storage capacity.
Most up-to-date R version built for multicore CPUs
Access to all Bioconductor packages
Access to our computing infrastructure
Fast access to data stored in EBI’s repositories (e.g., public microarray data in ArrayExpress)
Using R Google Docs http://www.omegahat.org/RGoogleDocs/run.pdf
It uses the XML and RCurl packages and illustrates that it is relatively quick and easy
to use their primitives to interact with Web services.
Amazon’s EC2 is a type of cloud that provides on-demand computing infrastructure through what are called Amazon Machine Images, or AMIs. In general, these types of cloud provide several benefits:
Simple and convenient to use. An AMI contains your applications, libraries, data and all associated configuration settings. You simply access it. You don’t need to configure it. This applies not only to applications like R, but also can include any third-party data that you require.
On-demand availability. AMIs are available over the Internet whenever you need them. You can configure the AMIs yourself without involving the service provider. You don’t need to order any hardware and set it up.
Elastic access. With elastic access, you can rapidly provision and access the additional resources you need. Again, no human intervention from the service provider is required. This type of elastic capacity can be used to handle surge requirements when you might need many machines for a short time in order to complete a computation.
Pay per use. The cost of 1 AMI for 100 hours and 100 AMIs for 1 hour is the same. With pay per use pricing, which is sometimes called utility pricing, you simply pay for the resources that you use.
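The pay per use point is easy to check with a little arithmetic. Here is a minimal sketch in R, assuming a flat hourly rate per instance; the rate of 0.10 and the function name `usage_cost` are made up for illustration, and real EC2 pricing depends on instance type and region.

```r
# Hypothetical flat rate per instance-hour (an assumption, not a real price).
rate_per_instance_hour <- 0.10

# Cost under utility pricing: you pay for instance-hours actually used.
usage_cost <- function(instances, hours, rate = rate_per_instance_hour) {
  instances * hours * rate
}

one_machine_long    <- usage_cost(instances = 1,   hours = 100)
many_machines_short <- usage_cost(instances = 100, hours = 1)

# Both runs consume 100 instance-hours, so the bill is identical.
print(c(one_machine_long, many_machines_short))
```

Under this assumption the only thing that changes between the two runs is how long you wait for the answer, which is the whole appeal of elastic capacity for surge workloads.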
# This example requires that you have previously created a bucket named data_language on your Google Storage and have uploaded a CSV file named language_id.txt (your data) into this bucket – see for details
library(predictionapirwrapper)
Elastic-R is a new portal built using the Biocep-R platform. It enables statisticians, computational scientists, financial analysts, educators and students to use cloud resources seamlessly; to work with R engines and use their full capabilities from within simple browsers; to collaborate, share and reuse functions, algorithms, user interfaces, R sessions, servers; and to perform elastic distributed computing with any number of virtual machines to solve computationally intensive problems.
Also see Karim Chine’s http://biocep-distrib.r-forge.r-project.org/
R for Salesforce.com
At the point of writing this, there seem to be zero R based apps on Salesforce.com. This could be a big opportunity for developers, as both Apex and R have similar structures. Developers could write free code in R and charge for their translated version in Apex on Salesforce.com.
Force.com and Salesforce have many (1009) apps at http://sites.force.com/appexchange/home for cloud computing for
businesses, but very few forecasting and statistical simulation apps.
These are like iPhone apps except meant for business purposes. (I am
unaware if any university is offering salesforce.com integration,
though Google Apps and Amazon related research seems to be on.)
Personal Note- Mentioning SAS in an email to an R list is a big no-no in terms of getting a response and love. Same for being careless about which R help list to email (like R devel or R packages or R help).
Tableau was named by Software Magazine as the fastest growing software company in the $10 million to $30 million range in the world, and the second fastest growing software company worldwide overall. The ranking stems from the publication’s 28th annual Software 500 ranking of the world’s largest software service providers.
“We’re growing fast because the market is starving for easy-to-use products that deliver rapid-fire business intelligence to everyone. Our customers want ways to unlock their databases and produce engaging reports and dashboards,” said Christian Chabot, CEO and co-founder of Tableau.
Put together an Academy-Award winning professor from the nation’s most prestigious university, a savvy business leader with a passion for data, and a brilliant computer scientist. Add in one of the most challenging problems in software – making databases and spreadsheets understandable to ordinary people. You have just recreated the fundamental ingredients for Tableau.
The catalyst? A Department of Defense (DOD) project aimed at increasing people’s ability to analyze information and brought to famed Stanford professor, Pat Hanrahan. A founding member of Pixar and later its chief architect for RenderMan, Pat invented the technology that changed the world of animated film. If you know Buzz and Woody of “Toy Story”, you have Pat to thank.
Under Pat’s leadership, a team of Stanford Ph.D.s got together just down the hall from the Google folks. Pat and Chris Stolte, the brilliant computer scientist, realized that data visualization could produce large gains in people’s ability to understand information. Rather than analyzing data in text form and then creating visualizations of those findings, Pat and Chris invented a technology called VizQL™ by which visualization is part of the journey and not just the destination. Fast analytics and visualization for everyone was born.
While satisfying the DOD project, Pat and Chris met Christian Chabot, a former data analyst who turned into Jello when he saw what had been invented. The three formed a company and spun out of Stanford like so many before them (Yahoo, Google, VMWare, SUN). With Christian on board as CEO, Tableau rapidly hit one success after another: its first customer (now Tableau’s VP, Operations, Tom Walker), an OEM deal with Hyperion (now Oracle), funding from New Enterprise Associates, a PC Magazine award for “Product of the Year” just one year after launch, and now over 50,000 people in 50+ countries benefiting from the breakthrough.
And now a demo I ran on the Kaggle contest data (it is a CSV dataset with 95,000 rows).
I found Tableau works extremely well at pivoting data and visualizing it, almost like Excel on steroids. Download the free version here (I don’t know about an academic program (see links below), but the software is not expensive at all).
The Professional Edition is a visual analysis and reporting solution for data stored in MS SQL Server, MS Analysis Services, Oracle, IBM DB2, Netezza, Hyperion Essbase, Teradata, Vertica, MySQL, PostgreSQL, Firebird, Excel, MS Access or Text Files. Available via download.
Tableau Server enables users of Tableau Desktop Professional to publish workbooks and visualizations to a server where users with web browsers can access and interact with the results. Available via download.
* Price is per Named User and includes one year of maintenance (upgrades and support). Products are made available as a download immediately after purchase. You may revisit the download site at any time during your current maintenance period to access the latest releases.
Prologue– In April 2009 I was as happy as I could be. Or should have been. I was working from my house on my own sleep-wake schedule, had a bouncy 1 year old son, an adoring girlfriend turned wife, a nice in-the-money 3 bedroom suburban apartment, and my startup/website was doing great, pulling in almost 4000 USD per month, which was really huge given that I was parked in Delhi, India. And I had just been selected for a fully paid up grad school in the United States. Yes sir, life could not have been much better.
But the reality was I was extremely unhappy- or as unhappy as a person could be without being crazy about it. I was addicted to the always-on rush of the internet, working without a break on my writing and my job/contracts. I was ignoring my wife’s demands for more time as childish, and the rest of my family as interfering. Even my son seemed a time-crawl at times, so he spent more time with his nanny than me. I was hooked- and the drug was electronic, unblinking and always on. When I was not working, I was playing games on Facebook, tweeting like a teenager, or playing paid strategy games. I had been having mild arguments for the past several weeks with my wife, but I dismissed those concerns as feminine posturing. I mean, I had known her for eight years now- four before marriage and four after that. Any demands from her for more time or even to help out with the work at home meant time away from my computer or my business or even my sleep. After the high of fourteen go-go hours at work I needed some pills at the end of each day to sleep.
The wife could wait. The job, the money and the networking could not.
And then she walked out on me. With the kid. And the nanny. And with my credit card.
I was furious. How could this happen to me? In vain I raged against her, created scenes at her house to get my son back. She gave my credit card back. But it would take much more time than I realized to get my life back.
At work- I slowly began burning out. My pill fueled sleep was not refreshing and gave rise to hesitant and erratic behavior. Once again I blamed my client and co-workers. They were the stupid ones- me- I was the creative genius. I lost their respect and then their friendship- eventually losing my monthly income. I now had a big mortgage on my house and no income to support it. And no family to fill the big house either.
Too proud to admit whose fault it was, I packed my bags to start life afresh in school in America.
My parents were supportive, especially my father. I had often been distant with him as he too had a demanding job as a police officer. I rationalized my workaholism on the ground that I was doing it for my family and making good money, unlike my father who had just spent thirty years on a middle level pay and much more work. My father had endured those complaints silently and just as silently he helped me through basic therapy to help me reach a medical condition fit enough to travel.
In America-
I resolved to start life anew in the United States. At first it went well. I swam in the housing pool and walked along the beautiful green campus. My immigrant energy was good enough for me to start impressing my classmates and my teachers. I used the opportunities available in the US to travel to conferences in New York and Las Vegas. And I partied like a bachelor in both these places. I had nothing else left to lose, or so I thought. Beer bars were my salvation and I was redeemed there, at least for the evening. But then a familiar pattern from the past emerged- I could not focus long enough on my studies. My medication (this time prescribed by a doctor) increased but I steadily drank as well. I tried distracting myself with sports- especially with the university football team which played every Saturday. Caught up in the weekly ritual, I hoped it would give a good outlet to the hurt I felt inside. It was on one such Saturday that I came across my Baptist friends.
I was hitch-hiking my way to the football stadium and steadily grew flustered as there was a steady stream of cars hurrying to their tailgates without sparing a thought for me. After walking in vain for almost half an hour under the burning sun, I threw my hands up, looked up at the sky, and silently implored heaven to give me a break.
Enter Jesus
And then a minor miracle happened. I got a ride from 4 guys in a car. For a colored guy to get a ride is a minor miracle in East Tennessee. It turned out Brendan, Brian, Brett and James were members of the local Crown College– a private Baptist college, and they did not mind crowding in the back seat so I could have a comfortable ride. When Brendan mentioned that he and James were thinking of being Pastors after graduation, I asked them if we could study the Bible together. We promised to meet next Saturday and that was it.
The Bible studies were quite different from my childhood readings of the Bible while studying at a Catholic school in India. My relationship with God was that of a teenager with his Dad- I prayed only when I needed help or money or both.
These Bible studies were more like group discussions based on what we were doing, or where we wanted to be, all revolving around the particular quote or passage which was being studied. Contrary to my expectations, studying the Bible was as natural for me as studying the Gita- and the study was much more rational and logical than I thought it would be.
After a couple of months of these I felt confident enough to go to their local Baptist Church. In the meantime I had expanded my weekly Bible studies to two- every Saturday with the awesome foursome Crownies and every Wednesday at Moe’s, a burrito eating place, with Austin, a University student I had met while rock climbing in the campus gymnasium.
The Change
After a couple of months doing Bible Study I felt confident enough to go to Church, with Austin and my friends. While I alternated between the Baptist and the Presbyterian church, I eventually settled for the Redeemer church as that was closest to me. In the meantime, Austin got engaged, graduated and became busy holding two jobs. It was at a Christmas party that I first met Michael, who had just started working with Bridges, a campus ministry.
Doing Michael’s Bible study was an evolution for me spiritually, as I could now not only read the Bible but start learning to apply its teachings in my daily life. While the Bible is clearly a religious book for Jews and Christians, the application of the works in your daily life is very common sense. People reading the Bible are happier than people not reading it, and happiness is what we strive for in our mortal life.
My relationships with people in my personal life improved, as I started learning the value of tolerance and forgiveness. I started drinking much much less, my health and disposition became much better. For the first time in 32 years of my life I could even say I was blessed to be with friends rather than be a habitual loner.
And God worked his miracles in my professional and personal life as well. While still struggling for money, I was earning a steady income from my website business, and I started preparing to reconcile with my son, my wife and my parents in India.
Happy Beginning.
I am now in India, have a delightful 3 year old son, a caring wife, a small startup. This morning I got up and started designing the logo of my new firm. I talk better with my parents than I have in many decades.
I still follow Jesus but I don’t go to church. Jesus was all about love, but churches can sometimes talk about hating gays, terrorists, foreigners etc.
I realized that believing in God has nothing to do with following or unfollowing a particular religion.
Are life’s struggles over? Am I never going to be in trouble or trying times again? I don’t think so- if anything, being more spiritually aware means a greater understanding of right and wrong and a bigger task to walk the straight and narrow path. I call this my Happy Beginning- and I wish you the same. Whatever religion you may have faith in, forgiveness and belief in a higher, more forgiving, more merciful God can only help you achieve calm and happiness.
You can go to a pub /not believe in anything or choose to go to a church (or a temple /synagogue/ mosque).
Chances are people who believe in the latter are going to be happier. And if you add to this a merciful and forgiving attitude towards others (like God has for you) – there is no limit to where your Happy Beginning can take you.
I did get divorced and lost custody of my son. I also wrote books and became more successful in my life. Above all, learning to accept God without caring too much for the baggage that preachers impose in the name of organized religion helps calm me down.
I am currently playing with/trying out RApache- one more excellent R product from Vanderbilt’s excellent Dept of Biostatistics and its prodigious coder Jeff Horner.
I really liked the virtual machine idea- you can download a virtual image of Rapache and play with it- .vmx is easy to create and great to share-
Basically using R Apache (with an EC2 on backend) can help you create customized dashboards, BI apps, etc all using R’s graphical and statistical capabilities.
Rapache embeds the R interpreter inside the Apache 2 web server. By doing this, Rapache realizes the full potential of R and its facilities over the web. R programmers configure Apache by mapping Uniform Resource Locators (URLs) to either R scripts or R functions. The R code relies on CGI variables to read a client request and R’s input/output facilities to write the response.
One advantage to Rapache’s architecture is robust multi-process management by Apache. In contrast to Rserve and RSOAP, Rapache is a pre-fork server utilizing HTTP as the communications protocol. Another advantage is a clear separation, a loose coupling, of R code from client code. With Rserve and RSOAP, the client must send data and R commands to be executed on the server. With Rapache the only client requirement is the ability to communicate via HTTP. Additionally, Rapache gains significant authentication, authorization, and encryption mechanisms by virtue of being embedded in Apache.
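To make the "map a URL to an R function" idea concrete, here is a small simulation in plain R. It is only a sketch of the dispatch pattern and does not use any Rapache API: the names `routes` and `dispatch`, and the shape of the request list, are hypothetical. In the real thing Apache does the routing and the R code reads the CGI variables it is handed.

```r
# Plain-R simulation of URL-to-function dispatch.
# `routes`, `dispatch` and the request list are hypothetical names,
# not part of Rapache.
routes <- list(
  "/summary" = function(req) {
    # Query parameter x arrives as text, e.g. "1,2,3,4"
    x <- as.numeric(strsplit(req$query$x, ",")[[1]])
    sprintf("mean=%.2f n=%d", mean(x), length(x))
  },
  "/hello" = function(req) paste0("Hello, ", req$query$name)
)

dispatch <- function(path, query = list()) {
  handler <- routes[[path]]
  if (is.null(handler)) {
    return(list(status = 404L, body = "Not found"))
  }
  # The client only supplied a path and parameters; all R code stays server side.
  list(status = 200L, body = handler(list(path = path, query = query)))
}

res <- dispatch("/summary", query = list(x = "1,2,3,4"))
print(res$body)
```

Note how the client never sends R commands, only a path and some parameters. That is the loose coupling described above, and it is what separates this style from Rserve-like setups.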
Existing Demos of Architecture based on R Apache-
You can download version 1.1.10 of rApache now. There
are only two significant changes and you don’t have to edit your
apache config or change any code (just recompile rApache and
reinstall):
1) Error reporting should be more informative, both when you
accidentally introduce errors in the Apache config, and when your code
introduces warnings and errors from web requests.
I’ve struggled with this one for awhile, not really knowing what
strategy would be best. Basically, rApache hooks into the R I/O layer
at such a low level that it’s hard to capture all warnings and errors
as they occur and introduce them to the user in a sane manner. In
prior releases, when ROutputErrors was in effect (either the apache
directive or the R function) one would typically see a bunch of grey
boxes with a red outline with a title of RApache Warning/Error!!!.
Unfortunately those grey boxes could contain empty lines, one line of
error, or a few that relate to the lines in previously displayed
boxes. Really a big uninformative mess.
The new approach is to print just one warning box with the title
“Oops!!! <b>rApache</b> has something to tell you. View source and
read the HTML comments at the end.” and then as the title implies you
can read the HTML comment located at the end of the file… after the
closing html. That way, you’re actually reading how R would present
the warnings and errors to you as if you executed the code at the R
command prompt. And if you don’t use ROutputErrors, the warning/error
messages are printed in the Apache log file, just as they were before,
but nicer 😉
2) Code dispatching has changed so please let me know if I’ve
introduced any strange behavior.
This was necessary to enhance error reporting. Prior to this release,
rApache would use R’s C API exclusively to build up the call to your
code that is then passed to R’s evaluation engine. The advantage to
this approach is that it’s much more efficient as there is no parsing
involved, however all information about parse errors, files which
produced errors, etc. were lost. The new approach uses R’s built-in
parse function to build up the call and then passes it off to R. A
slight overhead, but it should be negligible. So, if you feel that
this approach is too slow OR I’ve introduced bugs or strange behavior,
please let me know.
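The two changes in the release note can be imitated in plain R, which may help if you want a feel for the trade-off. This is only a sketch under my own assumptions: `run_handler` is a hypothetical name, nothing here is rApache code, and rApache itself does this work at the C level rather than in R. The sketch parses the code from text (so syntax errors keep their position details), collects warnings and errors while evaluating, and appends them as one HTML comment after the closing html tag.

```r
# Plain-R imitation of "parse, evaluate, report at the end of the page".
# `run_handler` is a hypothetical helper, not an rApache function.
run_handler <- function(code_text) {
  notes <- character(0)
  note <- function(prefix, cond) {
    notes <<- c(notes, paste0(prefix, ": ", conditionMessage(cond)))
  }
  env <- new.env()
  body <- withCallingHandlers(
    tryCatch({
      # parse() reports where a syntax error occurred; a call built by
      # hand would never see the source text at all.
      exprs <- parse(text = code_text)
      value <- NULL
      for (e in exprs) value <- eval(e, envir = env)
      as.character(value)
    }, error = function(e) {
      note("Error", e)
      ""
    }),
    warning = function(w) {
      note("Warning", w)
      invokeRestart("muffleWarning")  # keep evaluating after a warning
    }
  )
  page <- paste0("<html><body>", paste(body, collapse = ""), "</body></html>")
  if (length(notes) > 0) {
    # "--" is not allowed inside an HTML comment, so break it up.
    safe <- gsub("--", "- -", paste(notes, collapse = "\n"), fixed = TRUE)
    page <- paste0(page, "\n<!--\n", safe, "\n-->")
  }
  page
}

page   <- run_handler('warning("input was truncated"); "<p>done</p>"')
broken <- run_handler("sqrt(16")
cat(page, "\n")
```

The first call still renders its output and tucks the warning into the trailing comment. The second call fails at the parse step, and the comment carries the parser's own message.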
FUTURE PLANS
I’m gaining more experience building Debian/Ubuntu packages each day,
so hopefully by some time in 2011 you can rely on binary releases for
these distributions and not install rApache from source! Fingers
crossed!
Development on the rApache 1.1 branch will be winding down (save bug
fix releases) as I transition to the 1.2 branch. This will involve
taking out a small chunk of code that defines the rApache development
environment (all the CGI variables and the functions such as
setHeader, setCookie, etc) and placing it in its own R package…
unnamed as of yet. This is to facilitate my development of the ralite
R package, a small single user cross-platform web server.
The goal for ralite is to speed up development of R web applications,
take out a bit of friction in the development process by not having to
run the full rApache server. Plus it would allow users to develop in
the rApache environment while on Windows and later deploy on more
capable server environments. The secondary goal for ralite is its use
in other web server environments (nginx and IIS come to mind) as a
persistent per-client process.
And finally, wiki.rapache.net will be the new www.rapache.net once I
translate the manual over… any day now.