Bring it on Bing

A few notes on Bing


  • The design is better (read: newer). Google still thinks design is something they studied and forgot in semester 1 of engineering, but the iPod-like design is cool.
  • I like the preview link feature: just hover the mouse to get a sleek preview of where the search result goes. I think it saves A LOT of time.
  • Surprisingly, there are more results than on Google, and in a different order.
  • The image results were again different from Google's, but I liked the image options in the left margin.
  • Google's results are still more pertinent (but not by much) on the first page, but Bing’s archive seemed fresher (for example, it caught my LinkedIn profile's changed URL while Google gave an error).

Overall summary: it is NEW and DIFFERENT and GOOD. Good enough to add to the toolbar, but not great enough to leave 8-year-old habits of Googling it, unless the Google guys really bung it up.

Citation- http://bing.com


KXEN Webinar on Automation

Here is a webinar from KXEN on automation. Having seen the product in action multiple times, I find it is always a wow moment when you see KXEN build a model in 5 minutes flat from thousands of variables and tens of thousands of rows. If you have not seen the latest version of KXEN in action, do take out 60 minutes to see this.

From http://www.kxen.com/index.php?option=com_content&task=view&id=546&Itemid=985

KXEN’s Automation Revolutionizes Modeling Productivity

  • Date: June 9, 2009
  • Time: 9:00 am Pacific/12:00 noon Eastern
  • Duration: 60 minutes

Register Now!

You have already recognized improved marketing performance by investing in a campaign management solution and data mining tools. Why might you be interested in KXEN, the leader in data mining automation?

If you are like many businesses these days, you would like to be able to do more with less. You have a limited analytical team and more modeling requirements than ever.

With KXEN, our customers are able to produce models in 1/10th to 1/100th of the time of traditional data mining tools, while not sacrificing model accuracy or robustness.

What makes KXEN different? In this webinar, you will learn how and why KXEN is unique. And why your business might want to select KXEN as your data mining solution.

What will be demonstrated

This presentation will show you how KXEN automates the data mining process including:

Data Preparation
Variable Selection
Model Building
Model Validation
Scoring Code Generation

Who should attend

Statisticians, data miners, data analysts, business analysts and marketing executives who want to increase the productivity of their analytics team.


Disclaimer- I am a consultant to KXEN on social media

Interview David Smith REvolution Computing

Here is an interview with REvolution Computing’s Director of Community, David Smith.

“Our development team spent more than six months making R work on 64-bit Windows (and optimizing it for speed), which we released as REvolution R Enterprise bundled with ParallelR.” David Smith

Ajay -Tell us about your journey in science. In particular tell us what attracted you to R and the open source movement.

David- I got my start in science in 1990 working with CSIRO (the government science organization in Australia) after I completed my degree in mathematics and computer science. Seeing the diversity of projects the statisticians there worked on really opened my eyes to statistics as the way of objectively answering questions about science.

That’s also when I was first introduced to the S language, the forerunner of R. I was hooked immediately; it was just so natural for doing the work I had to do. I also had the benefit of a wonderful mentor, Professor Bill Venables, who at the time was teaching S to CSIRO scientists at remote stations around Australia. He brought me along on his travels as an assistant. I learned a lot about the practice of statistical computing helping those scientists solve their problems (and got to visit some great parts of Australia, too).

Ajay- How do you think we should help bring more students to the fields of mathematics and science?

David- For me, statistics is the practical application of mathematics to the real world of messy data, complex problems and difficult conclusions. And in recent years, lots of statistical problems have broken out of geeky science applications to become truly mainstream, even sexy. In our new information society, graduating statisticians have a bright future ahead of them which I think will inevitably draw more students to the field.

Ajay- Your blog at REvolution Computing is one of the best technical corporate blogs, in particular the monthly round-up of new packages, R events and product launches, all written in a lucid style. Are there any plans for a REvolution Computing community or network as well, instead of just the blog?

David- Yes, definitely. We recently hired Danese Cooper as our Open Source Diva to help us in this area. Danese has a wealth of experience building open-source communities, such as for Java at Sun. We’ll be announcing some new community initiatives this summer. In the meantime, of course, we’ll continue with the Revolutions blog, which has proven to be a great vehicle for getting the word out about R to a community that hasn’t heard about it before. Thanks for the kind words about the blog, by the way — it’s been a lot of fun to write. It will be a continuing part of our community strategy, and I even plan to expand the roster of authors in the future, too. (If you’re an aspiring R blogger, please get in touch!)

Ajay- I kind of get confused about what exactly 32-bit and 64-bit computing mean in terms of hardware and software. What is the deal there? How do enterprise solutions from REvolution take care of 64-bit computing? How exactly do parallel computing and optimized math libraries in REvolution R help, compared to other flavors of R?

David– Fundamentally, 64-bit systems allow you to process larger data sets with R — as long as you have a version of R compiled to take advantage of the increased memory available. (I wrote about some of the technical details behind this recently on the blog.)  One of the really exciting trends I’ve noticed over the past 6 months is that R is being applied to larger and more complex problems in areas like predictive analytics and social networking data, so being able to process the largest data sets is key.

One common misperception is that 64-bit systems are inherently faster than their 32-bit equivalents, but this isn’t generally the case. To speed up large problems, the best approach is to break the problem down into smaller components and run them in parallel on multiple machines. We created the ParallelR suite of packages to make it easy to break down such problems in R and run them on a multiprocessor workstation, a local cluster or grid, or even cloud computing systems like Amazon’s EC2.

“While the core R team produces versions of R for 64-bit Linux systems, they don’t make one for Windows. Our development team spent more than six months making R work on 64-bit Windows (and optimizing it for speed), which we released as REvolution R Enterprise bundled with ParallelR. We’re excited by the scale of the applications our subscribers are already tackling with a combination of 64-bit and parallel computing.”
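
A rough sketch of the two points in this answer, in plain R: checking whether the running R build is 64-bit, and splitting independent pieces of work across parallel workers. It assumes base R's parallel package as a stand-in for ParallelR (whose API is not shown in the interview); the bootstrap task, the worker count of 2 and all variable names are made up for illustration.

```r
# Base R's 'parallel' package is used here as a stand-in for ParallelR
# (an assumption; ParallelR's own functions are not shown in the interview).
library(parallel)

# A 64-bit build of R uses 8-byte pointers, a 32-bit build 4-byte pointers.
bits <- .Machine$sizeof.pointer * 8
message("This R build is ", bits, "-bit")

# Hypothetical workload: bootstrap the mean of a sample.
# Each replicate is independent, so the work splits cleanly across workers.
set.seed(42)
x <- rnorm(1000)
boot_mean <- function(i, data) mean(sample(data, replace = TRUE))

# Start a small local cluster; 2 workers is an arbitrary choice.
cl <- makeCluster(2)
clusterSetRNGStream(cl, iseed = 123)  # reproducible random streams per worker

# Run the 200 replicates in parallel, passing the data to every worker.
reps <- parLapply(cl, 1:200, boot_mean, data = x)
stopCluster(cl)

boot_means <- unlist(reps)
message("Replicates computed: ", length(boot_means))
```

On a multi-core workstation, raising the worker count is the only change this sketch needs; running the same idea on a cluster or on EC2 would need more setup than is shown here.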

Ajay- Command line is oh so commanding. Please describe any plans to support or help any R GUI like Rattle or R Commander. Do you think REvolution R can get more users if it does help a GUI?

David- Right now we’re focusing on making R easier to use for programmers by creating a new GUI for programming and debugging R code. We heard feedback from some clients who were concerned about training their programmers in R without a modern development environment available. So we’re addressing that by improving R to make the “standard” features programmers expect (like step debugging and variable inspection) work in R and integrating it with the standard environment for programmers on Windows, Visual Studio.

In my opinion R’s strength lies in its combination of high-quality statistical algorithms with a language ideal for applying them, so “hiding” the language behind a general-purpose GUI negates that strength a bit. On the other hand it would be nice to have an open-source “user-friendly” tool for desktop statistical analysis, so I’m glad others are working to extend R in that area.

Ajay- Companies like SAS are investing in SaaS and cloud computing. Zementis offers scored models on the cloud through PMML. Any views on just building the model or analytics on the cloud itself?

David- To me, cloud computing is a cost-effective way of dynamically scaling hardware to the problem at hand. Not everyone has access to a 20-machine cluster for high-performance computing, and even those that do can’t instantly convert it to a cluster of 100 or 1000 machines to satisfy a sudden spike in demand. REvolution R Enterprise with ParallelR is unique in that it provides a platform for creating sophisticated data analysis applications distributed in the cloud, quickly and easily.

Using clouds for building models is a no-brainer for parallel-computing problems: I recently wrote about how parallel backtesting for financial trading can easily be deployed on Amazon EC2, for example. PMML is a great way of deploying static models, but one of the big advantages of cloud computing is that it makes it possible to update your model much more frequently, to keep your predictions in tune with the latest source data.

Ajay- What are the major alliances that REvolution has in the industry?

David- We have a number of industry partners. Microsoft and Intel, in particular, provide financial and technical support allowing us to really strengthen and optimize R on Windows, a platform that has been somewhat underserved by the open-source community. With Sybase, we’ve been working on combining REvolution R and Sybase Rap to produce some exciting advances in financial risk analytics. Similarly, we’ve been doing work with Vhayu’s Velocity database to provide high-performance data extraction. On the life sciences front, Pfizer is not only a valued client but in many ways a partner who has helped us “road-test” commercial grade R deployment with great success.

Ajay- What are the major R packages that REvolution supports and optimizes and how exactly do they work/help?

David- REvolution R works with all the R packages: in fact, we provide a mirror of CRAN so our subscribers have access to the truly amazing breadth and depth of analytic and graphical methods available in third-party R packages. Those packages that perform intensive mathematical calculations automatically benefit from the optimized math libraries that we incorporate in REvolution R Enterprise. In the future, we plan to work with authors of some key packages to provide further improvements, in particular to make packages work with ParallelR to reduce computation times in multiprocessor or cloud computing environments.

Ajay- Are you planning to lay off people during the recession? Does REvolution Computing offer internships to college graduates? What do people at REvolution Computing do to have fun?

David- On the contrary, we’ve been hiring recently. We don’t have an intern program in place just yet, though. For me, it’s been a really fun place to work. Working for an open-source company has a different vibe than the commercial software companies I’ve worked for before. The most fun for me has been meeting with R users around the country and sharing stories about how R is really making a difference in so many different venues — over a few beers of course!


David Smith
Director of Community

David has a long history with the statistical community. After graduating with a degree in Statistics from the University of Adelaide, South Australia, David spent four years researching statistical methodology at Lancaster University (United Kingdom), where he also developed a number of packages for the S-PLUS statistical modeling environment. David continued his association with S-PLUS at Insightful (now TIBCO Spotfire) where for more than eight years he oversaw the product management of S-PLUS and other statistical and data mining products. David is the co-author (with Bill Venables) of the tutorial manual, An Introduction to R, and one of the originating developers of ESS: Emacs Speaks Statistics. Prior to joining REvolution, David was Vice President, Product Management at Zynchros, Inc.

Ajay- To know more about David Smith and REvolution Computing, do visit http://www.revolution-computing.com and

http://www.blog.revolution-computing.com
Also see the interview with Richard Schultz, CEO of REvolution Computing, here.

http://www.decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/

More R please

Some R news

0 The R Foundation Website. I guess the http://www.r-project.org team is busy prettifying the site before the annual R users conference kicks in. (I was told it has the aesthetic visual appeal of a dead cat splattered on the autobahn: a very HTML 4.0 kind of retro look.)

I can't believe the R site and R core honchos find the following image the prettiest image to represent the graphical abilities of R

The R core site has tremendous functionality and demand, though I wonder if they can just put up some ads and get some funding or a two-way research tie-up with Google. Google uses R extensively, can help with online methods as well, and is listed as a supporting organization at http://www.r-project.org/foundation/memberlist.html

The R archives are a collection of emails and that's not documentation at all – but

1 The REvolution R website, and particularly David Smith’s blog, is a great way to stay updated on R news at http://blog.revolution-computing.com/

I have covered REvolution R before, and they are truly impressive.

http://www.decisionstats.com/2009/01/31/interviewrichard-schultz-ceo-revolution-computing/

It seems the domain name revolutioncomputing.com was squatted (by NC?), so that's why the hyphenated web name. It is a very lucid website, though I do request them to put up more videos/podcasts, and a Tweet this button would be great :))

and another more techie post here

http://blog.revolution-computing.com/2009/05/verifying-zipfs-powerdistribution-law-for-cities.html

Another great source is Twitter: it seems that R users on Twitter use the hashtag #rstats to search for R news and code. That should help R bloggers and, at a later date, users.

Click here for checking it out

http://search.twitter.com/search?q=%23rstats

2 Some more R forums and sites

Forum for R Enterprise Users http://www.revolution-computing.com/forum

An R tips site http://onertipaday.blogspot.com/

The R Journal ( yes there is a journal for all hard working R fans) http://journal.r-project.org/

R on LinkedIn http://www.linkedin.com/groups?about=&gid=77616

and the Analytic Bridge community group for R

http://www.analyticbridge.com/group/rprojectandotherfreesoftwaretools

3 Here is a terrific post by Robert Grossman at http://blog.rgrossman.com/2009/05/17/running-r-on-amazons-ec2/

I liked the way he built the case for using R on Amazon EC2 (a business case, not a use case) and then proceeded to a step-by-step tutorial: a simple and powerful blog post. I hope R comes out with a standardized online R doc like that, a single-point searchable archive for code, something like the SAS online doc (which remains free for WPS users 😉). But the way the web is evolving, it seems the present mishmash method will continue.

From his post, the main steps to use R on a pre-configured AMI:

Set up.
The set up needs to be done just once.

1. Set up an Amazon Web Services (AWS) account by going to:

aws.amazon.com.

If you already have an Amazon account for buying books and other items from Amazon, then you can use this account also for AWS.
2. Login to the AWS console
3. Create a “key-pair” by clicking on the link “Key Pairs” in the Configuration section of the Navigation Menu on the left hand side of the AWS console page.
4. Click on the “Create Key Pair” button, about a quarter of the way down the page.
5. Name the key pair and save it to a working directory, say /home/rlg/work.

Launching the AMI. These steps are done whenever you want to launch a new AMI.

1. Login to the AWS console. Click on the Amazon EC2 tab.
2. Click the “AMIs” button under the “Images and Instances” section of the left navigation menu of the AWS console.
3. Enter “opendatagroup” in the search box and select the AMI labeled “opendatagroup/r-timeseries.manifest.xml”, which is AMI instance “ami-ea846283”.
4. Enter the number of instances to launch (1), the name of the key pair that you have previously created, and select “web server” for the security group. Click the launch button to launch the AMI. Be sure to terminate the AMI when you are done.
5. Wait until the status of the AMI is “running.” This usually takes about 5 minutes.

Accessing the AMI.

1. Get the public IP address of the new AMI. The easiest way to do this is to select the AMI by checking the box. This provides some additional information about the AMI at the bottom of the window. You can copy the IP address there.
2. Open a console window and cd to your working directory which contains the key-pair that you previously downloaded.
3. Type the command:
ssh -i testkp.pem -X root@ec2-67-202-44-197.compute-1.amazonaws.com

Here we assume that the name of the key-pair you created is “testkp.pem.” The flag “-X” starts a session that supports X11. If you don’t have X11 on your machine, you can still login and use R but the graphics in the example below won’t be displayed on your computer.

Using R on the AMI.

1. Change your directory and start R

#cd examples
#R
2. Test R by entering a R expression, such as:

> mean(1:100)
[1] 50.5
>
3. From within R, you can also source one of the example scripts to see some time series computations:

> source('NYSE.r')
4. After a minute or so, you should see a graph on your screen. After the graph is finished being drawn, you should see a prompt:

CR to continue

Enter a carriage return and you should see another graph. You will need to enter a carriage return 8 times to complete the script (you can also choose to break out of the script if you get bored with all the graphs).
5. When you are done, exit your R session with a control-D. Exit your ssh session with an “exit” and terminate your AMI from the Amazon AWS console. You can also choose to leave your AMI running (it is only a few dollars a day).

Acknowledgements: Steve Vejcik from Open Data Group wrote the R scripts and configured the AMI.
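
The X11 caveat in the tutorial can be checked from inside R before sourcing NYSE.r. This is a minimal sketch, assuming a stock R build; the PDF fallback and the file name plots.pdf are my own additions for illustration and are not part of Robert Grossman's tutorial.

```r
# Ask R whether an X11 graphics device is available in this session.
# Over "ssh -X" this also checks that the forwarded display can be reached.
has_x11 <- unname(capabilities("X11"))

if (isTRUE(has_x11)) {
  message("X11 is available: plots should appear on your screen.")
} else {
  # Hypothetical fallback: write a test plot to a PDF file instead.
  out_file <- file.path(tempdir(), "plots.pdf")
  pdf(out_file)
  plot(1:10, main = "Test plot written to file")
  dev.off()
  message("No X11 display: test plot written to ", out_file)
}
```

If the second branch runs, copying the PDF back to your own machine (with scp, for example) is one way to look at the output.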

Ajay- Terrific R companies, blogs, tweets, research and sites, but do let me know your feedback. Just un-other R day.

White Riders

Here is a nice company started by a fellow batchmate from the Indian Institute of Management, Kaustubh Mishra. It is called White Riders, and it is a relative pioneer in adventure travel. Note that these bikers are well-behaved MBAs, imparting team-building management lessons along the way. I caught up with Kaustubh long enough for him to tell me why he chose the adventure travel business.

Ajay- What has been the story of your career, and what message would you like to send to young people aspiring for MBAs or just starting their careers?

Kaustubh- My first job was as a peon with SPCA, handling paperwork, dishes, etc. My Father wanted to see me getting a bicycle from my own money and that is why it happened. Thanks to Papa, I learnt some important lessons while serving people. During graduation I was doing odd jobs like a faculty at a computer institute, freelance programmer, etc.

The first experience of a large organization came at Bharti Telecom, where I did my summers. It was a market research project and I remember sleeping in an interviewee’s cabin during a survey. After my PGDM from IIML, I got into Tech Pacific, then ICICI, and then ABN AMRO. Please visit my LinkedIn profile for more details.

My message to people doing their MBA is simple: an MBA is not the end, it is just a via media for you to get into a good career. Get into an MBA because YOU want to do it and not because everyone else is doing it. There are so many career options in front of you, so follow your heart.

For people starting their careers, just 7 words – realize the power within & follow your dreams.

Ajay- Why did you create a startup? Why did you name it White Collar Company (there was an ad for a business school reunion which had the same name)? What is your vision for White Collar Company?
Kaustubh- When I was doing my job, I was always overachieving targets, but after some time a rut sets in. I also realized that complete freedom and maximum returns for my efforts were absent. There were so many things and ideas simmering inside me, but I could not do anything about them from inside a job. To do all that, I had to venture out on my own, and venture I did. So the biggest reason I started my own company was to put my ideas into practice.

White Collar is a name generally associated with knowledge. I first wanted to name it ‘white’ but the name, domain name, trademarks etc were not available. White denotes knowledge. Our goddess of knowledge and learning ‘Saraswati’ is dressed in white. As all my ventures are essentially about knowledge and learning, so white collar. And White Collar Biker sounds cool and very oxymoronish.

I see White Collar Company to be known as the cradle of new ideas, innovation and creativity in the field of knowledge. A university is next in some years.

Ajay- What are the key lessons you have learnt in this short period? Name some companies in the United States that are similar to your company. What do you think is the market potential of this segment?
Kaustubh- We are in 3 industries: adventure tourism, corporate training and HR advisory. While in the first and the last there are people doing nearly the same thing (I would not say exactly, because we do have our USPs), in corporate training White Collar Company is the only company in the world conducting management training through motorcycles.

With innovation and RoI being extremely important in training, the market potential is huge. In adventure tourism also the potential is great, as we are waking up to it. In consultancy, as we operate in the SME space, the potential again is very large.

It has been too short a period for big lessons, but I have been applying what I learnt in my previous jobs, like vendor management, marketing channel management, etc. But yes, I learnt the art of hard bargaining and negotiation during this short period.

Ajay- Is an MBA (IIM or otherwise) necessary for success? Comments please.
Kaustubh- Ajay, your question here says success. Before answering this question, I would first differentiate between the 2 successes we are talking about. Success in corporate life is different from success as an entrepreneur.

For being successful as a corporate executive, MBA to a certain extent is good. It gives you certain kind of thought processes and also a platform for future success.

However, if we talk of a successful entrepreneur, I personally do not think an MBA will matter much. In fact I often talk of the ‘1st of the month’ syndrome: this is the comfort of getting a handsome amount deposited as your salary every month. When you get into that comfort zone, it becomes very hard to come out. The larger the amount, the harder it gets. For a successful entrepreneur, perseverance, self-belief, the ability to trust and the ability to take risks are very important. I doubt if any MBA is going to give you that. The very same thought processes and ways of thinking that help you succeed in corporate life need to be challenged as an entrepreneur.

Ajay- What's your vision for your website? Which website is a good analogy for it? Why should anyone visit the website?

Kaustubh- I am not a technical person, but having said that, I see my website as the focal point of my business. I built my website myself using widgets, etc., and going forward all my business will happen from the site. By 2010, we will put a strong CRM and PRM on the website, thus enabling all business processes to be routed through the website. Like I said, I am not a techie, but I think Web 2.0, the participative nature of the internet and cloud computing are going to help me save and optimize. We already have an online chat built into the site, so any customer can come and get more details about our programs.

Going forward, customers will be able to do bookings themselves on the site. Vendors will be able to log in and do all necessary business through the website, and we plan to implement SFA for our employees. I believe this answers the vision and why anyone should visit my site.

Ajay- What is your favorite incident in this short period of your startup? What were the key lessons? Are you seeking venture capital funds?

Kaustubh- I thought the typical customer profile would be young males, so I was delighted when a female became our first customer. We have tweaked our marketing strategy and positioning after that.

At this stage my baby is too young and fragile. If I give her crutches to walk, she will never be able to stand up herself and be counted. So while we will go for external funding at some point of time, that time is not now. With our kind of business model, right now we are not ready for the interference of a venture capitalist.


So if you always wanted to travel to India and have an adventure as well, contact Kaustubh at http://www.wccindia.com/rider/R_kaustubh.html and he will show you how to be a White Rider too.


Read more about his company here – http://www.wccindia.com/rider/whywhite.html

Creating Online Communities


Some time back I asked the question: how much do you think it would cost to have the top 100 bloggers on the SAS language on the same page, in a manner that the RSS feeds get updated on their own? The answer is here:

Wordframe. I had covered this software before in comparison to Ning.com, and it compared favorably.

This is a small startup, based in Eastern Europe and very hard working. They allegedly wanted to become open source and had plans to create third-party applications when I checked with them in January, but this may be on hold for a new product launch.

Would sas.com pay a $1000 setup fee and a $200 monthly fee for getting the top 100 SAS bloggers on their sascommunity.org website?

Would Oracle pay a $1000 setup fee and a $200 monthly fee for getting the top 100 Oracle bloggers on a website sponsored by them?

How much would Aster Data pay for, say, 100 bloggers about Hadoop (ahem, assuming there are 100 people who CAN blog about Hadoop, a bit like Einstein’s 5 people in the world who could understand his theory of relativity)?

Check this site out.
