Top R Interviews

 


 

Here is a list of the top R-related interviews I have done (in random order):

1) John Fox, creator of R Commander

https://decisionstats.com/2009/09/14/interview-professor-john-fox-creator-r-commander/

2) Dr Graham Williams, Creator of Rattle

https://decisionstats.com/2009/01/13/interview-dr-graham-williams/

3) David Smith, back when he was Community Director of what was then Revolution Computing.

https://decisionstats.com/2009/05/29/interview-david-smith-revolution-computing/

and his second interview

https://decisionstats.com/2010/08/03/q-a-with-david-smith-revolution-analytics/

4) Richard Schultz, the first CEO of Revolution Computing (now Revolution Analytics)

https://decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/

5) Bob Muenchen, author of R for SAS and SPSS Users and R for Stata Users

https://decisionstats.com/2010/06/29/interview-r-for-stata-users/

https://decisionstats.com/2008/10/16/r-for-sas-and-spss-users/

6) Karim Chine, creator of Biocep, cloud computing for R

https://decisionstats.com/2009/06/21/interview-karim-chine-biocep-cloud-computing-with-r/

7) Paul van Eikeren, Inference for R, the first enterprise package to use R from within MS Office.

https://decisionstats.com/2009/06/04/inference-for-r/

8) Hadley Wickham, creator of ggplot2 and R author

https://decisionstats.com/2010/01/12/interview-hadley-wickham-r-project-data-visualization-guru/

That’s a lot of R interviews. I need to balance them out a bit, I guess.

Unbreakable Oracle Linux and Unshakable LibreOffice


Oracle announced Unbreakable Oracle Linux (the first time I have seen the word Unbreakable used in a formal software name). Hats off to good ol’ Larry’s chutzpah. It is also quite a fast form of Linux for enterprises, as the stats say at http://www.oracle.com/us/technologies/linux/ubreakable-enterprise-kernel-linux-173350.html

LibreOffice is a new fork of OpenOffice, created by people who want to ensure OpenOffice remains free. It consists of efforts from everybody except Apple, Microsoft and Oracle (http://www.documentfoundation.org/supporters/), and it’s a new kind of workable office productivity suite, determined to remain free. I have used it. It is a bit shaky, but I really liked the new design and will willingly test it (and auto-submit bugs). It will be interesting to see the reaction of enterprise vendors like SAS, IBM, Dell, HP (and Lenovo), as their support will be critical to both Unbreakable Oracle Linux and Unshakable LibreOffice.

See more here: http://www.documentfoundation.org/download/

Interesting Interview with Quentin G, AsterData

Here is an interesting interview with Quentin G, CEO of AsterData. Marketing trumpeting aside, the insights on the what’s-next vision thing are quite good.

Source: http://www.arnoldit.com/search-wizards-speak/aster-data.html

As you look down the road, what are the three major challenges you see for vendors who keep trying to solve big data and other “now” problems with old tools?

Old tools and traditional architectures cannot scale effectively to handle massive data volumes that reach 100’s of terabytes nor can they effectively process large data volumes in a high performance manner. Further, they are restricted to what SQL querying allows. The three challenges I have noted are:

First, performance, specifically, poor performance on large data volumes and heavy workloads: The pre-existing systems rely on storing data in a traditional DBMS or data warehouse and then extracting a sample of data to a separate processing tier. This greatly restricts data insights and analytics as only a sample of data is analyzed and understood.  As more data is stored in these systems they suffer from performance degradation as more users try to access the system concurrently. Additionally moving masses of data out of the traditional DBMS to a separate processing tier adds latency and slows down analytics and response times. This pre-existing architecture greatly limits performance especially as data sizes grow.

Second, limited analytics: Pre-existing systems rely mostly on SQL for data querying and analysis. SQL poses several limitations and is not suited for ad hoc querying, deep data exploration and a range of other analytics. MapReduce overcomes the limitations of SQL and SQL-MapReduce in particular opens up a new class of analytics that cannot be achieved with SQL alone.

And, third, limitations of types of data that can be stored and analyzed: Traditional systems are not designed for non-relational or unstructured data. New solutions such as Aster Data’s are designed from the ground up to handle both relational and non-relational data. Organizations want to store and process a range of data types and do this in a single platform. New solutions allow for different data types to be handled in a single platform whereas pre-existing architectures and solutions are specialized around a single data type or format – this restricts the diversity of analytics that can be performed on these systems.

Read the whole interview at http://www.arnoldit.com/search-wizards-speak/aster-data.html

Speaking of which, there is a new webinar by Merv Adrian (interviewed on Decisionstats) and Colin White:

 

http://now.eloqua.com/es.asp?s=1015&e=1862&elq=9ec9b73872e849b88d2943cca920acda

And from CrunchBase, the famous AOL website, a profile of AsterData’s money flow, which kind of hints at an IPO two years from now:

http://www.crunchbase.com/company/aster-data-systems

Cloudera and Aster Data partner up

Basically, this makes it easier for data to move between the two systems: Hadoop (Cloudera) and Aster’s analytics (MapReduce/SQL).

From the press release: http://goo.gl/vgsr

today announced an agreement that unites Cloudera Distribution for Hadoop (CDH) with Aster Data nCluster. The integration enables customers to leverage MPP platforms for large-scale data processing, management and analytics across structured and unstructured formats to analyze massive amounts of data for deeper business insight.

Cloudera is building a massively parallel, two-way connector for high-speed movement of data between CDH and Aster Data nCluster. The connector will be supported as part of Cloudera Enterprise.

Learning Hadoop

Curious about learning Hadoop, a hot resume skill?

Try:

http://www.cloudera.com/hadoop-training/#certification

Cloudera Certification for Hadoop

Cloudera Certification establishes you as a trusted and valuable resource for those working with Hadoop. Whether your company is just looking into the technology or your customers are asking for help, Cloudera Certification demonstrates your ability to solve problems using Hadoop.

  • Consultants, developers and technical leaders can use Cloudera Certification to demonstrate their experience with Hadoop.
  • Employers can use Cloudera Certification to identify candidates for new jobs or internal promotions, as well as ensure team members share a common knowledge base.
  • Customers can reduce risk by relying on contractors and suppliers who retain current Cloudera Certification for their personnel.

If you’d like to obtain Cloudera Certification for Developers or Administrators, see:

http://www.cloudera.com/hadoop-training/#certification

Why Cloud?

Here are some reasons why cloud computing is very helpful to small business owners like me, and can be very helpful to even bigger players.

1) Infrastructure overhead becomes zero

– I need NOT invest in secure power backups (like a big battery for power outages, a real issue in India), data disaster management (read RAID), or software licensing compliance.

All this is done for me by infrastructure providers like Google and Amazon.

For simple office productivity, I type on Google Docs, which auto-saves my data, writing on the cloud. I need not back up; Google does it for me. Ditto for presentations and spreadsheets. Amazon gets me the latest Windows software installed whenever I log on, so I need not be bothered by software contracts (read bug fixes and patches) any more.

2) Renting hardware by the hour: A small business owner cannot invest too much in computing hardware (or software). Pay-as-you-use makes sense for them. I could never afford an 8-core desktop with 25 GB of RAM, but I sure can rent one and use it to bid for heavier data projects that I would have had to let go in the past.

3) Renting software by the hour: You may have bought your last PC for all time

An example: a Windows micro instance costs you 3 cents per hour on Amazon. If you take a mathematical look at upgrading your PC to the latest Windows and buying more and more upgraded desktops just to keep up, those costs would exceed 3 cents per hour. For Unix, it is 2 cents per hour, and that software (like Red Hat Linux and Ubuntu) has become increasingly friendly by design, even for non-techie users.
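To make that "mathematical look" concrete, here is a rough sketch in Python. The two hourly rates are the ones quoted above; the desktop price and the usage pattern are purely my assumptions for illustration, and a micro instance is of course not the same machine as a desktop, so treat the numbers as a back-of-the-envelope comparison only.

```python
# Hourly rates quoted in the post (USD per hour for an Amazon micro instance).
WINDOWS_RATE = 0.03
LINUX_RATE = 0.02

# Hypothetical figures, NOT from the post: price of a desktop upgrade
# and a typical usage pattern for a small business owner.
ASSUMED_PC_PRICE = 600.0
ASSUMED_HOURS_PER_DAY = 8
ASSUMED_WORKING_DAYS = 250


def rental_cost(rate_per_hour, hours_per_day, days):
    """Total cost of renting an instance for the given usage pattern."""
    return rate_per_hour * hours_per_day * days


def break_even_hours(purchase_price, rate_per_hour):
    """Number of rented hours that cost as much as buying outright."""
    return purchase_price / rate_per_hour


if __name__ == "__main__":
    yearly = rental_cost(WINDOWS_RATE, ASSUMED_HOURS_PER_DAY, ASSUMED_WORKING_DAYS)
    hours = break_even_hours(ASSUMED_PC_PRICE, WINDOWS_RATE)
    print(f"Yearly Windows rental at assumed usage: {yearly:.2f} USD")
    print(f"Rented hours equal to the assumed PC price: {hours:.0f}")
```

Change the assumed price and hours to match your own situation; the point is only that the break-even is easy to check before the next hardware upgrade.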

Some other software companies, especially in enterprise software, plan to offer or already offer paid machine images that basically add their software layer on top of the OS, so you can rent software by the hour.

It does not make sense for customers to effectively subsidize golf tournaments, rock concerts and conference networks with their own money when they can rent software by the hour and switch to pay-per-use.

People in analytics, especially SME consultants, academics, students and cost-conscious customers, would love to see a world where they could, say, run SAS Enterprise Miner for 10 dollars an hour for two hours to build a data mining model on 25 GB of RAM, rather than hurt their pockets and profitability with annual license models. Ditto for SPSS, JMP, KXEN, Revolution R, Oracle Data Mining (already available on Amazon), SAP (??), WPS (on cloud ????). It’s the economy, stupid.

Corporates have realized that cutting down on hardware and software expenses is preferable to cutting down people. Would you rather fire people in your own team to buy that big HP or Dell or IBM server (effectively subsidizing jobs in those companies)? If you had to choose between an annual license renewal for your analytics software and renting software by the hour and using those savings for better benefits for your employees, which makes more business sense for you to invest in?

Goodbye annual license fees.  Welcome brave new world.

Blog Update

Some changes at Decisionstats:

1) We are back at Decisionstats.com, and Decisionstats.wordpress.com will point to that as well. The SEO effects will be interesting, and so will the Instant Pagerank or LinkRank or whatever Coffee/Percolator they use in Cali to index the site.

2) AsterData is no longer a sponsor, but Predictive Analytics Conference is. Welcome PAWS! I have been a blog partner to PAWS ever since it began, and it’s a great marketing fit. Expect to see a lot of exclusive content and interviews from great speakers at PAWS.

3) The Feedblitz newsletter (now at 404 subscribers) is now a weekly subscription that sends one big email rather than lots of emails through the week. This is because my blogging frequency is moving up as I collect material for a new book on business analytics that I will probably release in 2011 (if all goes well, touchwood). The LinkedIn group will get a weekly update announcement. If you are connected to Decisionstats on Analyticbridge, I will soon try to find a way to update the whole post automatically using RSS and Ning.com. Or not. Depends.

4) R continues to be a bigger focus. So will SPSS and maybe JMP. Newer software, or older software that changes more rapidly, will get more coverage. Generally a particular piece of software is covered if it has newer features, or an interesting techie conference, or it gets sued.

5) I will write a poem or post a video about once a week, at random, to prove geeks and nerds and analysts can have fun (much more fun actually, don’t we?)

Thanks for reading this. Sept 2010 was the best ever for Decisionstats.com: we crossed 15,000+ visitors, and thanks for that again! I promise to bore you less and less as we grow old together on the blog 😉