A classic paper by Donald E. Knuth (creator of TeX) on the information complexity of songs can help music listeners with an interest in analytics. The paper dates from 1977 but is pertinent even today.
We should all ask China to free Tibet because of the following reasons-
10 Reasons to Free Tibet
1) Replace a system of governance which is giving 12% GDP growth with a 1000 year old belief that one old guy is really a reincarnation of GOD
2) Because it is a romantic idea
3) The average Tibetan is much better off economically than people in most other countries in Asia and Africa. Still freedom is messy- Donald Rumsfeld.
4) So we can sell beer, Facebook ads, and Internet pornography to Tibetans, who do not currently have that liberty
5) So we can explore that area for mining and minerals
6) Damn it. We need one more ally for the free world. So we can invade more non free countries.
7) Tibetan girls are hot.
8) The Dalai Lama is cool, and he does not charge by the hour, unlike other yoga gurus.
9) We need to encircle China just like we did in the 19th century with the Opium Wars
10) So artists like Ai Wei Wei can blog freely
1 Reason not to Free Tibet
1) Tibetans want to be free. If we give them democracy- they will be disappointed to know that the bullets just get replaced by the pepper spray. How silly is that? The desire to be free- when there is no such thing as free anymore.
(This was an article in sarcasm, meant as a literary piece and not a pseudo-intellectual political article. I have no training in politics. For details see http://en.wikipedia.org/wiki/Sarcasm)
This is NOT an April fool joke or a publicity stunt. It is also not meant to provoke discussion for the sake of provocation.
For some time I have studied, in both the US and India, what makes governments, academia, and businesses work or fail. A common thread is the quality of the people involved. Someone who is a wasteful businessman will be a wasteful politician. Someone who is a flamboyant businessman with more flair than substance will continue that way in public life.
Accordingly I have created a Facebook cause-
If Donald Trump can run for President, I can think of no one who has done more for the American South. Unlike the tech-heavy, Stanford-dominated boom in California, the Midwest and South have been declining centers of influence. Cities like Austin, Texas or Raleigh, North Carolina are the exception rather than the norm there. A friend who went to Duke once told me the worst thing in America is to be born a poor, rural white male. There are no groups lobbying for education or blazing-fast internet for you. Socially you are expected to walk and thrive alone.
The Southern Baptist Church has managed to infiltrate and influence young minds there; the average conservative American seemed better off and happier in his moderated social behaviour. But the Church exacts a 10% tithe, and it is efficient in stretching every dollar and every cent of church donations. Government works with the best intentions, but a bureaucrat spending someone else’s money (your tax money) is always more inefficient than the actual owner spending it himself. Taxes are higher than the 10% tithe and seem to accomplish much less social change. Would you rather go to work or go to war?
Accordingly I find that on the West Coast there are very few tech-savvy leaders with a track record of fiscal pragmatism, educational reform, and job creation. Certainly the industry lobbyist is smarter at evading taxes than the average Joe, and campaign financing is still dependent on deep pockets despite the innovations of internet retail fundraising.
Would you like your Senator to be as considerate about creating jobs as entrepreneurs are? Jim Goodnight here is a metaphor for all entrepreneurs who don’t believe in reckless hire-and-fire or outsourcing, and who take a long-term view of people.
Click here to spread this cause- perhaps it will make existing politicians more efficient just by the threat of new competition.
My annual traffic to this blog was almost 99,000 views. Add in additional views on networking sites plus the 400-plus RSS readers, and I can say traffic was 120,000 for 2010. Nice. Thanks for reading, and I hope it was worth your time. (This is a long post and will take almost 440 seconds to read, but the summary has just been given.)
My intent is to inform you, give you something useful, or at least give you something interesting.
Sandro Saitta from http://www.dataminingblog.com/ just named me for an award on his blog (but my surname is ohRi; Sandro left me without an R. What would I be without R :)) ).
Aw! I am touched. Google for “Data Mining Blog” and you will find Sandro, who is the best there is in data mining writing.
DMR People Award 2010
“There are a lot of active people in the field of data mining. You can discuss with them on forums. You can read their blogs. You can also meet them in events such as PAW or KDD. Among the people I follow on a regular basis, I have elected:
He has been very active in 2010, especially on his blog. Good work Ajay and continue sharing your experience with us!”
What did I write in 2010? Stuff.
What did you read on this blog? Well, that’s the top posts list.
|Top 10 Graphical User Interfaces in Statistical Software||6,237|
|Wealth = function (numeracy, memory recall)||2,014|
|Matlab-Mathematica-R and GPU Computing||1,946|
|The Top Statistical Softwares (GUI)||1,405|
|Using Facebook Analytics (Updated)||1,313|
|Test drive a Chrome notebook.||1,170|
|Top ten RRReasons R is bad for you ?||1,157|
|Interview Hadley Wickham R Project Data Visualization Guru||1,007|
|Using Red R- R with a Visual Interface||854|
|SAS Institute files first lawsuit against WPS- Episode 1||790|
|Interview Professor John Fox Creator R Commander||764|
|R Package Creating||754|
|Windows Azure vs Amazon EC2 (and Google Storage)||726|
|Norman Nie: R GUI and More||716|
|Startups for Geeks||682|
|Google Maps – Jet Ski across Pacific Ocean||670|
|Not so AWkward after all: R GUI RKWard||579|
|Red R 1.8- Pretty GUI||570|
|Parallel Programming using R in Windows||569|
|R is an epic fail or is it just overhyped||559|
|Enterprise Linux rises rapidly:New Report||537|
|Rapid Miner- R Extension||518|
|Creating a Blog Aggregator for free||504|
|So which software is the best analytical software? Sigh- It depends||473|
|Revolution R for Linux||465|
|John Sall sets JMP 9 free to tango with R||460|
So how do people come here?
Well, I guess I owe Tal G for almost 9,000 views. (Incidentally, I withdrew my blog from R-Bloggers and the Analyticbridge blogs for SEO keyword reasons and because of some spam I was getting; see below.)
http://r-bloggers.com is still the cat’s whiskers and I read it a lot.
I still don’t know who linked my blog to a free sex movie site with 400 views, but I have a few suspects.
Still reading this post? Gosh, let me sell you some advertising. It is only $100 a month (yes, it’s a recession).
Advertisers are treated on a first-in, last-out (FILO) basis.
I have been told I am obsessed with SEO, but I don’t care much for search engines apart from Google, and yes, SEO is an interesting science (they should really rename it GEO, or Google Engine Optimization).
Apparently Hadley Wickham and Donald Farmer are big keywords for me so I should be more respectful I guess.
|test drive a chrome notebook||467|
|test drive a chrome notebook.||215|
|wps sas lawsuit||158|
|google maps jet ski||123|
|test drive chrome notebook||96|
|sas wps lawsuit||85|
|chrome notebook test drive||83|
|best statistics software||74|
|google maps jetski||72|
|donald farmer microsoft||51|
|best statistical software||49|
What about outgoing links? Apparently I need to find a way to ask Google to pay me for the free advertising I gave their Chrome notebook launch. But since their search engine and browser are free to me, I guess we are even steven.
So in 2010,
SAS remained top daddy in business analytics,
R made revolutionary strides in terms of new packages,
JMP launched a new version,
SPSS got integrated with Cognos,
Oracle sued Google and did build a great Data Mining GUI,
LibreOffice gave you a non-Oracle OpenOffice (or an even more open office).
2011 looks like a fun year. Party safely.
Using WP-Stats, I set about answering this question:
What search keywords lead here?
Clearly Michael Jackson is down this year
And R GUI, Data Mining is up.
How does that affect my writing, given that I get almost 250 visitors daily from search engines alone, even assuming I write nothing on this blog from now on?
It doesn’t. I still write whatever code or poem comes to my mind. So it is hurtful when people misunderestimate the effort in writing and jump to conclusions (especially if I write about a company: I am not on the payroll of that company, just as when I write a poem I am not a full-time poet).
Over to xkcd
|michael jackson history||240|
|wps sas lawsuit||180|
|sas wps lawsuit||100|
|google maps jet ski||94|
|google maps jetski||62|
|sas sues wps||60|
|donald farmer microsoft||45|
|best statistics software||42|
|r gui ubuntu||41|
|tamilnadu advanced technical training institute tatti||37|
|wps sas lawsuit||170|
|sas wps lawsuit||95|
|google maps jet ski||94|
|google maps jetski||62|
|donald farmer microsoft||45|
Here is an interview with Donald Farmer of Microsoft, talking about his passion for the exciting business intelligence projects at Microsoft.
Q) Describe your career from high school to your current job responsibilities at Microsoft. How can technology companies in America work together to grow the home pool of American science students (irrespective of market share battles)?
A) My background is relatively unusual for a technology professional, although at Microsoft one meets people with a very wide range of backgrounds. I had little interest in studying Computer Science formally. For me, software was always a means to an end: a way of solving what were, for me, “more interesting” problems. Of course, I cannot deny that computer science is a compelling subject in itself, just not for me. Yet, from my early teens in Scotland, I had computers to try (starting with the justly famous Sinclair range) and I used them to store, classify and analyze the data I needed for my other work. So, as I studied philosophy and languages, and as I worked in history, archaeology, forestry, fish-farming and so on (through many variations) before I became more completely involved in Business Intelligence, I used database techniques extensively.
I spent some years as a consultant, building all sorts of applications. My first predictive application enabled fish-farmers with private water supplies to balance the needs of fish production and hydro-electric generation based on past, present and predicted rainfall. I believe that application is still in use today, 15 years later!
Later, I joined an excellent group of developers and analysts at AppsMart, building a data mart rapid-development application. That brought me into the Microsoft sphere, as we built on the SQL Server platform and were actively involved in the SQL Server Data Warehouse ecosystem.
With the dot-com bust of 2000, I happily found an opportunity to work with Microsoft. There I started working on Analysis Services, later leading a team of program managers in Integration Services. In that time, we did some really interesting work along with Zhaohui Tang’s team, integrating Data Mining capabilities with our ETL tool, to enable predictive analytics in the flow of data. The implications of this technique are still only being realized: we have used it for imputing missing data, and have an interesting patent on how to use this technique for detecting outliers in streaming data. In addition, we included fuzzy matching techniques from Surajit Chaudhuri’s team, to give even more flexibility.
More recently I have been working in Data Mining, with a marvelous and energetic team under Jamie MacLennan, and then in the last couple of years I have been managing a super team of Program Managers building the client interfaces for our new PowerPivot application.
My current role is not focused on a single product, but rather I look across all the business intelligence products to see how we can engage our engineering knowledge ever more effectively with customers, partners, analysts and, of course, with other teams across Microsoft.
So, as you can see my background is very varied. In some ways, that means that I am not well placed to speak to how the USA can better grow a pool of science students, as I was never one myself. Yet, I do think there are some lessons I can share. Firstly, we should not make the mistake of focusing only on science and technology as an end in itself. We do need to encourage the use of information science techniques in all appropriate fields, including liberal arts, and also “power professions” such as medicine and law. The USA provides wonderful educational opportunities in these fields, but all too often young people have to choose between science and arts. Many of the best talents I have met in the world of analytics have backgrounds which are very diverse.
Q) Describe the current status of SQL Server and Microsoft Data Mining. In which areas of Business Intelligence can we expect much more excitement and innovation from you guys in the coming few months?
A) Data Mining remains one of the most popular technologies in the SQL Server stack. I have presented recently in China, Germany, The Netherlands and the UK, and at every conference the data mining sessions were among the most popular and the most successful. This speaks volumes about the interest in this field. It also reflects how successfully Microsoft has broadened our user base by shipping the Excel Data Mining Add-ins.
Q) How is Microsoft’s cloud computing venture Azure going? How is Sharepoint doing? How do you personally feel about the remote sharing and computing model?
A) Azure and Sharepoint are, of course, very different beasts. Windows Azure, and especially SQL Azure which we launched at PDC in November, are proving to be very popular. In particular SQL Server Azure is really succeeding with its strong development and management story – you design and manage cloud databases with the same tools and techniques as you do for on-premise databases. There has been a fantastic response to this, especially from emerging economies where the idea of having Microsoft manage your data infrastructure at any scale is very attractive. At TechEd South Africa, for example, David Robinson from the SQL Azure team got a tremendous reception. However, there are difficulties in emerging economies because of poor bandwidth. Shortly after David and I were in South Africa, local businesses held a race: they tied a USB stick with files to the leg of a carrier pigeon and set it off home from Pietermaritzburg to Durban, simultaneously trying to download the same files between the same locations online. The pigeon won!
So, I do think the cloud offers tremendous opportunities for business to scale and manage their resources effectively, but it’s early days.
Q) And when can I start doing data mining from within my Excel workbook? I remember working on a SQL Server Analysis plugin for a cloud Excel prototype last year.
A) You should be using Excel for data mining right now. Just go to http://www.sqlserverdatamining.com and look for the links on the right hand side of the page. These are released products. You can also go to http://www.sqlserverdatamining.com/cloud to try an experimental cloud service – but it is only experimental and could be up or down at any time.
For more conventional, OLAP-like, analytics you should also try out PowerPivot in beta. See http://www.powerpivot.com . PowerPivot is an application that plugs into Excel and enables business users to build quite complex models, over basically unlimited data volumes, quickly and easily. It’s proving to be hugely popular already. I am sure it will dominate much of the BI news in 2010.
Q) What are the risks and challenges in creating new technology when working for an industry leader like Microsoft, where the spotlight is on every step you take and the competition is brutal?
A) I simply don’t think about brutal competition. Even in nature I see far more symbiosis than competition. I personally think competition is a very negative mindset although the term “competitor” is the common shorthand for another vendor in the space and I do use it that way myself – but more from habit than conviction.
In the database world, you might say Oracle are our competitors. Yet most of the Oracle customers I know (and I was an Oracle customer myself once) are also SQL Server customers. Often they use Reporting Services, or Analysis Services. Integration Services had to ship a fast-loading Oracle destination, because so many customers want to use SQL Server tools to load Oracle databases. I see far more cases like that, where the picture is complex and symbiotic, than I do of outright competition.
In the analytic space, almost every tool out there has one feature in common – one feature which everyone uses. Export to Excel.
I genuinely love working with our partners, and I am lucky to have good friends throughout the industry: at SAP, Oracle, IBM, SAS … you name it. We all benefit from empowering businesses with better tools. As the old saying goes, “the rising tide lifts all boats.”
Q) In terms of lines of code, Microsoft may have given away the maximum number of shared libraries and code, yet it sometimes suffers from a perception problem because of its vintage. Do you think all cool tech companies become not so cool after some years, even if they don’t fundamentally change?
A) I think the idea of a company being “cool” is itself just a phase we’re going through as an industry as we’re growing up. As the tech industry matures, you’ll see more emphasis on value, and net contribution. In many ways, Microsoft, and IBM I think, are ahead of the curve, as companies which are valued for their stability, resources and our ability to continually provide compelling new solutions and services. I travel a lot, and I see classrooms in western China, and emerging businesses in Africa, and women starting to work in new careers in the Middle East, and I don’t see them prioritizing cool. But I do see them doing amazing things with Microsoft technology.
Q) Describe your blogging style. What tips would you give to technology bloggers?
A) I don’t blog enough, sadly, although I do try.
I have two blogs. One, at http://blogs.technet.com/sqlserverexperts/ is a shared “SQL Server Experts” blog. It’s very focussed on Microsoft technologies, of course. I especially like to blog about trends that I am seeing in my work with customers. My other blog, at http://beyeblogs.com/donaldfarmer/ is more personal, and includes gleanings from my other interests. I especially like doing my first blog of April there – that’s always fun.
My advice to bloggers should probably be “do what I say, not what I do.” However, most important, I think, is to be authentic in your voice. My favorite business intelligence bloggers are Jill Dyche, Evan Levy, David Loshin, William McKnight and Neil Raden – all of them blog quite regularly and are always great to read. There are others out there who are just as interesting, but don’t quite have the same rhythm to their blogging. I admire, but sadly fail to emulate, those who blog regularly and effectively.
Q) What do you do when not at work?
A) My wife is an artist, and she keeps me busy helping out with events and projects. We live on a wild couple of acres in Washington and caring for that is a lot of fun too. Otherwise, I mostly read, cook and play the piano. I love cooking, although I’m not sure how good I am – my son is now a professional chef, so perhaps I had some influence. I play the piano badly, but I can lose myself in that. I read very well. I love to read poetry – and I struggle to read Chinese poetry in the original. It’s such a fascinating language, and the poetry is so complex and yet so simple. That will be a lifetime study.
Donald Farmer is the Principal Program Manager, SQL Server Data Mining, at Microsoft Corp.