Common Analytical Tasks

Some common analytical tasks from the diary of the glamorous life of a business analyst:

1) removing duplicates from a dataset based on certain key values/variables
2) merging two datasets based on one or more common keys/variables
3) creating a subset based on a conditional value of a variable
4) creating a subset based on a conditional value of a date-time variable
5) converting a date-time variable from one format to another
6) computing means grouped or classified at a level of aggregation
7) creating a new variable based on an if-then condition
8) creating a macro to run the same program with different parameters
9) creating a logistic regression model and scoring a dataset
10) transforming variables
11) checking ROC curves of a model
12) splitting a dataset into a random sample (repeatable with a random seed)
13) creating a cross tab of all variables in a dataset against one response variable
14) creating bins or ranks from the values of a variable
15) graphically examining cross tabs
16) histograms
17) density plots, e.g. plot(density())
18) creating a pie chart
19) creating a line graph, creating a bar graph
20) creating a bubble chart
21) running a goal-seek kind of simulation/optimization
22) creating a tabular report for multiple metrics grouped by one time period/variable
23) creating a basic time series forecast
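
A few of the tasks above (1, 2, 3, 5 and 6) can be sketched in plain Python using only the standard library. This is a minimal illustration, not a recommended toolchain: the sample rows, the column names (cust_id, region, amount) and the helper function names are all made up for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical sample data; the column names are illustrative only.
customers = [
    {"cust_id": 1, "region": "North"},
    {"cust_id": 2, "region": "South"},
    {"cust_id": 2, "region": "South"},  # duplicate key value
    {"cust_id": 3, "region": "North"},
]
orders = [
    {"cust_id": 1, "amount": 120.0},
    {"cust_id": 2, "amount": 80.0},
    {"cust_id": 3, "amount": 200.0},
]


def drop_duplicates(rows, key):
    """Task 1: keep the first row seen for each key value."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def inner_merge(left, right, key):
    """Task 2: inner join of two row lists on a common key."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def reformat_date(text, old_format, new_format):
    """Task 5: parse a date string in one format and write it in another."""
    return datetime.strptime(text, old_format).strftime(new_format)


def grouped_mean(rows, group_key, value_key):
    """Task 6: mean of value_key at each level of group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


unique_customers = drop_duplicates(customers, "cust_id")
merged = inner_merge(unique_customers, orders, "cust_id")
big_orders = [row for row in merged if row["amount"] > 100]  # Task 3
iso_date = reformat_date("22/09/2010", "%d/%m/%Y", "%Y-%m-%d")
mean_by_region = grouped_mean(merged, "region", "amount")

print(big_orders)
print(iso_date)
print(mean_by_region)
```

For real datasets a data-frame library would normally replace these helpers; the point here is only how small each step is once it is isolated.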

And some case studies I could think of:

As the Director, Analytics, you have to examine current marketing efficiency as well as help optimize sales force efficiency across various channels. You also have to examine multiple sales channels, including inbound telephone, outgoing direct mail, and internet email campaigns. The data warehouse is an RDBMS, but it has multiple data quality issues that need to be checked. In addition, you need to submit your estimates for next year’s annual marketing budget so as to maximize sales return on investment.

As the Director, Risk, you have to examine the overdue mortgages book that your predecessor left you. You need to optimize collections and minimize fraud and write-offs, and your efforts will be measured by how far they maximize your department’s profits.

As a social media consultant, you have been asked to maximize social media analytics and social media exposure for your client. You need to create a mechanism to report on particular brand keywords, automated triggers for unusual web activity, and statistical analysis of the website analytics metrics. Above all, it needs to be set up as an automated reporting dashboard.

As a consultant to a telecommunications company, you are asked to monitor churn and review the existing churn models. You also need to optimize advertising spend across various channels. The problem is that a large number of promotions are always running, some of the data is incorrectly coded, and there are interaction effects between the various promotions.

As a modeller you need to do the following:
1) Check ROC curves and Hosmer-Lemeshow (H-L) statistics for the existing model
2) Divide the dataset into random splits of 40:60
3) Create multiple aggregated variables from the basic variables
4) Run the regression again and again
5) Evaluate the statistical robustness and fit of the model
6) Display results graphically
All these steps can be broken down into small pieces of code, which I am putting together as a list.
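
As an example of how small these pieces can be, here is a rough Python sketch of two of the modeller's steps: a repeatable 40:60 random split and the area under the ROC curve. The function names, the seed value and the toy labels and scores are assumptions made for the illustration; the AUC is computed by the pairwise-comparison definition rather than by any particular package.

```python
import random


def split_rows(rows, first_share=0.4, seed=42):
    """Shuffle a copy of rows with a fixed seed, then cut it at first_share."""
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * first_share)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


rows = list(range(100))              # stand-in for 100 dataset rows
part_a, part_b = split_rows(rows)    # 40 rows and 60 rows

labels = [1, 0, 1, 1, 0, 0]          # toy actual outcomes
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.5]  # toy model scores
auc = roc_auc(labels, scores)

print(len(part_a), len(part_b))
print(round(auc, 3))
```

Calling split_rows again with the same seed returns the same two parts, which is what makes the split repeatable across runs.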
Are there any common data analysis tasks or case studies that you think I am missing? Let me know.

Book Reviews- Hindu Myths- Mere Christianity

During the month-long break I took, I firmed up my ideas for R for Analytics, and I also took some time off to read a few books. Here are brief reviews of two or three of them:

1) Hindu Myths

This is a classic book, translated from the original Sanskrit by Professor Wendy O’Flaherty of the University of Chicago. I found some of the older myths very interesting for their contradictions, for the way one classic retells the same story in a modified form from another, and for the beautiful, poetic and fantastic imagery that Hindu myths evoke. Some stories are as relevant in prayers, fasts and religious ceremonies today as they were around 11000 years ago, while most have been morphed, edited or even distorted.

It should help the non-Indian reader understand why hundreds of millions of conservative Indians worship the Shiv Ling (literally an idol of the phallus of Shiva), the Hindu two cents on the creation of the universe, and the somewhat fantastic stories of superheroes and gods in the ancient world.

The book suffers from a few drawbacks in my opinion-

1) Sanskrit is a bit like Latin: in translation you can lose not just the flavor but also the original meaning of words and their situational context. Some of the stories made better sense when I read a more recent Hindi translation.

2) An excessive emphasis on sexual imagery rather than emotional imagery. The author seems wonderstruck to read, and translate, that ancient Indians were so matter-of-fact about physical relationships. However, the original words were always written as discreet poetry rather than crass soft pornography.

3) Almost no drawings or figures. This makes the book a bit dense to read at 300 pages.

I liked another book on Hindu myths, Myth = Mithya, which I read in 2009; you may want to read it if you find the topic interesting.

A Handbook of Hindu Mythology

Hindus have one God.
They also have 330 million gods: male gods, female gods, personal gods, family gods, household gods, village gods, gods of space and time, gods for specific castes and particular professions, gods who reside in trees, in animals, in minerals, in geometrical patterns and in man-made objects.
Then there are a whole host of demons.
But no Devil.


2) Mere Christianity

Mere Christianity by C S Lewis is a classic book on reinterpreting Christianity for modern times. However, the author wrote it while World War 2 was on, and it reads more like a British or Anglo-Saxon interpretation of the beliefs of Jesus Christ, who was actually a Jewish teacher born in the Middle East.

While the language makes it very easy to read, it is aimed more at Western audiences than Eastern ones, as some of the parables seem to be a more palatable reinterpretation of the New Testament. The Bible is a deceptively easy book to read: the language is short and beautiful, and the original parables in the Gospels remain powerful and easy to understand.

C S Lewis tends to emphasize morality rather than religiosity or faith, and there is not much comparison with any other faith or alternative morality. Dumbing down the Bible so as to market it better to reluctant consumers seems to be Mr Lewis’s intention, and the result is less a scholarly work than an exercise in pure prose.

However, it is quite good as a self-improvement book, and much better than the “You Can Win” kind of books or even business concept books.

Note: I find reading books on religion a good exercise in reading the fountain source of philosophies. As a polytheist, I tend to read about more than one faith.

2010 in review and WP-Stats

The following is an auto-generated post, thanks to the WordPress.com stats team. Clearly they have got some things wrong:

1) The speedometer is not defined quantitatively

2) The busiest day numbers are plain wrong (2 views??)

3) There is still no geographic data in WordPress.com stats (unlike Google Analytics), and I can’t enable Google Analytics on a WordPress.com hosted site.

The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

Healthy blog!

The Blog-Health-o-Meter™ reads Wow.

Crunchy numbers

The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 97,000 times in 2010. If it were an exhibit at The Louvre Museum, it would take 4 days for that many people to see it.

In 2010, there were 367 new posts, growing the total archive of this blog to 1191 posts. There were 411 pictures uploaded, taking up a total of 121 MB. That’s about 1 picture per day.
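
Since some of the auto-generated numbers are suspect, the two comparisons above can be checked with a few lines of arithmetic. This is only a sanity check on the figures quoted in the summary, assuming a 365-day year; the variable names are mine. It works out to roughly 4.2 days of Louvre traffic and about 1.1 pictures per day, so these two claims hold up.

```python
# Figures quoted in the WordPress.com summary above.
louvre_visitors_per_year = 8_500_000
blog_views_2010 = 97_000
pictures_uploaded = 411
days_in_year = 365  # assumption: a plain 365-day year

louvre_visitors_per_day = louvre_visitors_per_year / days_in_year
days_to_match_views = blog_views_2010 / louvre_visitors_per_day
pictures_per_day = pictures_uploaded / days_in_year

print(f"Days of Louvre traffic: {days_to_match_views:.1f}")
print(f"Pictures per day: {pictures_per_day:.2f}")
```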

The busiest day of the year was September 22nd with 2 views. The most popular post that day was Top 10 Graphical User Interfaces in Statistical Software.

Where did they come from?

The top referring sites in 2010 were r-bloggers.com, reddit.com, rattle.togaware.com, twitter.com, and Google Reader.

Some visitors came searching, mostly for libre office, facebook analytics, test drive a chrome notebook, test drive a chrome notebook., and wps sas lawsuit.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

1) Top 10 Graphical User Interfaces in Statistical Software, April 2010 (8 comments and 1 Like on WordPress.com)

2) Wealth = function (numeracy, memory recall), December 2009 (1 Like on WordPress.com)

3) Matlab-Mathematica-R and GPU Computing, September 2010 (1 Like on WordPress.com)

4) About DecisionStats, July 2008

5) The Top Statistical Softwares (GUI), May 2010 (1 comment and 1 Like on WordPress.com)

Light Cycle of Tron review

I really enjoyed the Light Cycle race in Tron, so instead of naming this the Tron Legacy review, I am calling it the Light Cycle review.

The movie is a must for geeks, so check it out; the mix of music, models, cars, and lights can be heady at first. The younger Jeff Bridges looks like Beowulf, and his son is OK. But Olivia Wilde is nice, and the cars and bikes are superb. If you like playing video games, then check out the free game at http://armagetronad.net/downloads.php (it’s called Armagetron Advanced).

And boy, the 80s were a great time for pop music and video games.

Test Drive a Google Chrome Notebook: Last Two Days left

Chrome

Test drive a Chrome notebook.

We have a limited number of Chrome notebooks to distribute, and we need to ensure that they find good homes. That’s where you come in. Everything is still very much a work in progress, and it’s users, like you, that often give us our best ideas about what feels clunky or what’s missing. So if you live in the United States, are at least 18 years old, and would like to be considered for our small Pilot program, please fill this out. It should take about 15 minutes. We’ll review the requests that come in and contact you if you’ve been selected.

This application will be open until 11:59:59 PM PST on December 21, 2010.

What type of user are you?

https://services.google.com/fb/forms/cr48advanced/

Business
Education
Non-Profit
Developer
Individual

Test drive a Chrome notebook.

Want to test out the new Chrome OS?

Go to https://services.google.com/fb/forms/cr48basic/ and fill out the form.

Chrome

Test drive a Chrome notebook.

We have a limited number of Chrome notebooks to distribute, and we need to ensure that they find good homes. That’s where you come in. Everything is still very much a work in progress, and it’s users, like you, that often give us our best ideas about what feels clunky or what’s missing. So if you live in the United States, are at least 18 years old, and would like to be considered for our small Pilot program, please fill this out. We’ll review the requests that come in and contact you if you’ve been selected.


Sugar CRM: Forrester Webinar

https://sugarcrmevents.webex.com/mw0306lb/mywebex/default.do?nomenu=true&siteurl=sugarcrmevents&service=6&main_url=https://sugarcrmevents.webex.com/ec0605lb/eventcenter/event/eventAction.do%3FtheAction%3Ddetail%26confViewID%3D279191911%26siteurl%3Dsugarcrmevents%26%26%26

Date and time:

Thursday, December 2, 2010 11:00 am Pacific Standard Time (San Francisco, GMT-08:00)
Thursday, December 2, 2010 2:00 pm Eastern Standard Time (New York, GMT-05:00)
Thursday, December 2, 2010 7:00 pm Western European Time (London, GMT)
Thursday, December 2, 2010 8:00 pm Europe Time (Berlin, GMT+01:00)

Duration: 1 hour

Description:
Every organization wants to improve the way it manages its customer relationships. But until recently, adding robust CRM tools was a time-consuming and cost-prohibitive endeavor for many resource-constrained organizations. Until now. On December 2, join us to learn how new developments in technology, like open source, cloud computing and web 2.0, are making it easier than ever to add a top-notch CRM system to your operations.

This live webinar hosted by SugarCRM will feature Forrester Research, Inc. Vice President William Band, named one of CRM Magazine’s 2007 Influential Leaders. Mr. Band will discuss the current state of the market, review the major trends affecting the CRM landscape, and discuss some criteria you can use to ensure your next CRM decision is the right one.

In addition, all attendees of the live webinar will receive a complimentary download of a recent Forrester Wave™ Report! Register today!

Speakers:

William Band, Vice President, Forrester Research
Martin Schneider, Sr. Director Communications, SugarCRM

Who Should Attend:
VPs of Sales, VPs of Marketing, CIOs, Heads of Customer Support and other technical decision makers