KDNuggets Survey on R

From http://www.kdnuggets.com/2011/03/new-poll-r-in-analytics-data-mining-work.html?k11n07

A new poll/survey on actual usage of R in Data Mining

R has been steadily growing in popularity among data miners and analytic professionals.

In the KDnuggets 2010 Data Mining / Analytic Tools Poll, R was used by 30% of respondents.
In the 2010 Rexer Analytics Data Miner Survey, R was the most popular tool, used by 43% of the data miners.

Another aspect of a tool's usefulness is how much it helps with the entire data mining process: data preparation and cleaning, modeling, evaluation, visualization and presentation (excluding deployment).

The new KDnuggets poll asks:
What part of your analytics / data mining work in the past 12 months was done in R?

http://www.kdnuggets.com/2011/03/new-poll-r-in-analytics-data-mining-work.html?k11n07

 

Assumptions on Guns

Sitting in Delhi, India, I sometimes notice that there is one big newsworthy gun-related incident in the United States every six months or so (the latest being the Gabrielle Giffords shooting), along with mentions of the mythical NRA (which seems just as powerful as the equally mythical Jewish American or Cuban American lobbies). I once trained to fire guns (.22 and SLR rifles, actually), I come from a gun-friendly culture (namely Punjabi, North Indian), and my dad sometimes carried a gun as a police officer during his 30-plus years of service. Even so, I don't really like guns (except when they are in a movie). My 3-year-old son likes guns a lot (for some peculiar genetic reason, even though we are careful not to show him any violent TV or movies at all).

So to settle the whole "guns are good, guns are bad" thing, I turned to the one resource I had: the Internet.

Here are some findings:

1) A lot of hard statistical data on guns is biased by the perspective of the writer. It reminds me of the old saying: lies, damned lies, and statistics.

2) There is not a lot of hard data from universally accepted research that can be quoted. Unlike, say, the finding that cigarettes cause lung cancer, there is no broad research that is definitive in this regard.

3) American, European and Asian attitudes on guns actually seem to be a function of historical availability, historic crime rates and cultural propensity for guns.

Switzerland and the United States are two extreme outlier examples in statistics on whether guns cause violence.

4) A lot of old and outdated data is quoted selectively.

It seems you can fudge data about guns in the following ways:

1) Use relative per capita numbers instead of aggregate numbers (or the other way round)

2) Compare and contrast gun numbers with crime numbers selectively

3) Remove the drill-down by type of firearm, like handguns, rifles, automatic and semi-automatic weapons
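
To see how the first trick works, here is a minimal Python sketch. The two countries and their figures are hypothetical, made up purely for illustration; they are not taken from any of the sources listed in this post. The helper name rate_per_100k is also just my own choice.

```python
# Hypothetical figures for illustration only (NOT real data).
countries = {
    "Country A": {"gun_homicides": 9000, "population": 300_000_000},
    "Country B": {"gun_homicides": 600, "population": 5_000_000},
}


def rate_per_100k(count, population):
    """Return the number of incidents per 100,000 people."""
    return count / population * 100_000


# Rank by aggregate count, largest first.
by_total = sorted(
    countries,
    key=lambda c: countries[c]["gun_homicides"],
    reverse=True,
)

# Rank by per capita rate, largest first.
by_rate = sorted(
    countries,
    key=lambda c: rate_per_100k(
        countries[c]["gun_homicides"], countries[c]["population"]
    ),
    reverse=True,
)

print("Ranked by total:", by_total)
print("Ranked by rate per 100,000:", by_rate)
```

With these made-up numbers the two rankings come out in opposite order, so a writer can pick whichever measure suits the argument.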

Maybe I am being simplistic, but I found it easier to list credible data sources on guns than to summarize all the assumptions about guns. Are guns good or bad? I don't know; it depends. Any research you can quote is welcome.

Data Sources on Guns and Firearms and Crime-

1) http://www.justfacts.com/guncontrol.asp

Ownership

* As of 2009, the United States has a population of 307 million people.[5]

* Based on production data from firearm manufacturers,[6] there are roughly 300 million firearms owned by civilians in the United States as of 2010. Of these, about 100 million are handguns.[7]

* Based upon surveys, the following are estimates of private firearm ownership in the U.S. as of 2010:

              Households With a Gun   Adults Owning a Gun   Adults Owning a Handgun
Percentage    40-45%                  30-34%                17-19%
Number        47-53 million           70-80 million         40-45 million

[8]

* A 2005 nationwide Gallup poll of 1,012 adults found the following levels of firearm ownership:

Category      Percentage Owning a Firearm

Households 42%
Individuals 30%
Male 47%
Female 13%
White 33%
Nonwhite 18%
Republican 41%
Independent 27%
Democrat 23%

[9]

* In the same poll, gun owners stated they own firearms for the following reasons:

Protection Against Crime 67%
Target Shooting 66%
Hunting 41%
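
As a rough sanity check on the ownership figures above, here is a small Python sketch of the arithmetic. It uses only the numbers quoted from justfacts.com; pairing the low count with the low percentage (and high with high) is my own assumption about how the ranges line up.

```python
# Figures quoted above from justfacts.com (2009/2010 estimates).
population = 307_000_000   # U.S. population, 2009
firearms = 300_000_000     # civilian-owned firearms, 2010 (rough)
handguns = 100_000_000     # of which handguns (rough)

firearms_per_person = firearms / population
handgun_share = handguns / firearms

# Survey ranges for adults owning a gun, as (count, percentage) pairs.
# Assumption: low count goes with low percentage, high with high.
adults_owning = [(70_000_000, 0.30), (80_000_000, 0.34)]

# Dividing a count by its percentage gives the base it was computed on.
implied_adults = [count / pct for count, pct in adults_owning]

print(f"Firearms per person: {firearms_per_person:.2f}")
print(f"Handgun share of firearms: {handgun_share:.0%}")
print(
    "Implied adult population (millions):",
    [round(n / 1e6, 1) for n in implied_adults],
)
```

That works out to roughly one firearm per person, while only about a third of adults report owning one, so the per capita figure alone says little about how ownership is spread.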

2) NationMaster.com

http://www.nationmaster.com/graph/cri_mur_wit_fir-crime-murders-with-firearms

Showing latest available data.

Rank Countries Amount
# 1 South Africa: 31,918
# 2 Colombia: 21,898
# 3 Thailand: 20,032
# 4 United States: 9,369
# 5 Philippines: 7,708
# 6 Mexico: 2,606
# 7 Slovakia: 2,356
# 8 El Salvador: 1,441
# 9 Zimbabwe: 598
# 10 Peru: 442
# 11 Germany: 269
# 12 Czech Republic: 181
# 13 Ukraine: 173
# 14 Canada: 144
# 15 Albania: 135
# 16 Costa Rica: 131
# 17 Azerbaijan: 120
# 18 Poland: 111
# 19 Uruguay: 109
# 20 Spain: 97
# 21 Portugal: 90
# 22 Croatia: 76
# 23 Switzerland: 68
# 24 Bulgaria: 63
# 25 Australia: 59
# 26 Sweden: 58
# 27 Bolivia: 52
# 28 Japan: 47
# 29 Slovenia: 39
= 30 Hungary: 38
= 30 Belarus: 38
# 32 Latvia: 28
# 33 Burma: 27
# 34 Macedonia, The Former Yugoslav Republic of: 26
# 35 Austria: 25
# 36 Estonia: 21
# 37 Moldova: 20
# 38 Lithuania: 16
= 39 United Kingdom: 14
= 39 Denmark: 14
# 41 Ireland: 12
# 42 New Zealand: 10
# 43 Chile: 9
# 44 Cyprus: 4
# 45 Morocco: 1
= 46 Iceland: 0
= 46 Luxembourg: 0
= 46 Oman: 0
Total: 100,693
Weighted average: 2,097.8
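
A quick Python check of the table's summary rows, using the amounts exactly as listed above. It suggests that the figure labelled "Weighted average" is simply the plain mean across the 48 listed countries, with no weighting by population.

```python
# Firearm homicide totals per country, copied from the ranked table above
# (same order as the table, rank 1 first).
amounts = [
    31918, 21898, 20032, 9369, 7708, 2606, 2356, 1441, 598, 442,
    269, 181, 173, 144, 135, 131, 120, 111, 109, 97,
    90, 76, 68, 63, 59, 58, 52, 47, 39, 38,
    38, 28, 27, 26, 25, 21, 20, 16, 14, 14,
    12, 10, 9, 4, 1, 0, 0, 0,
]

total = sum(amounts)
mean = total / len(amounts)

# Share of the listed total contributed by the top three countries.
top3_share = sum(amounts[:3]) / total

print("Countries listed:", len(amounts))
print("Total:", total)
print(f"Simple mean: {mean:.1f}")
print(f"Top three countries' share of total: {top3_share:.0%}")
```

Since these are aggregate counts, a handful of large countries dominate the total, which is the aggregate versus per capita problem mentioned earlier.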

DEFINITION: Total recorded intentional homicides committed with a firearm. Crime statistics are often better indicators of prevalence of law enforcement and willingness to report crime, than actual prevalence.

SOURCE: The Eighth United Nations Survey on Crime Trends and the Operations of Criminal Justice Systems (2002) (United Nations Office on Drugs and Crime, Centre for International Crime Prevention)

3) Bureau of Justice Statistics

See

http://bjs.ojp.usdoj.gov/dataonline/Search/Homicide/State/RunHomTrendsInOneVar.cfm

or the brand new website (data until 2009), on which I CANNOT get gun crime figures but can get total murder rates:

http://www.ucrdatatool.gov/

Estimated murder rate *
Year United States-Total

1960 5.1
1961 4.8
1962 4.6
1963 4.6
1964 4.9
1965 5.1
1966 5.6
1967 6.2
1968 6.9
1969 7.3
1970 7.9
1971 8.6
1972 9.0
1973 9.4
1974 9.8
1975 9.6
1976 8.7
1977 8.8
1978 9.0
1979 9.8
1980 10.2
1981 9.8
1982 9.1
1983 8.3
1984 7.9
1985 8.0
1986 8.6
1987 8.3
1988 8.5
1989 8.7
1990 9.4
1991 9.8
1992 9.3
1993 9.5
1994 9.0
1995 8.2
1996 7.4
1997 6.8
1998 6.3
1999 5.7
2000 5.5
2001 5.6
2002 5.6
2003 5.7
2004 5.5
2005 5.6
2006 5.7
2007 5.6
2008 5.4
2009 5.0
Notes: National or state offense totals are based on data from all reporting agencies and estimates for unreported areas.
* Rates are the number of reported offenses per 100,000 population
  • United States-Total –
    • The 168 murder and nonnegligent homicides that occurred as a result of the bombing of the Alfred P. Murrah Federal Building in Oklahoma City in 1995 are included in the national estimate.
    • The 2,823 murder and nonnegligent homicides that occurred as a result of the events of September 11, 2001, are not included in the national estimates.

     

  • Sources: FBI, Uniform Crime Reports as prepared by the National Archive of Criminal Justice Data


4) The United Nations statistics from 2002 were too old, in my opinion.

Wikipedia seems too broad-based to qualify as a research article, but it is easily accessible: http://en.wikipedia.org/wiki/Gun_violence_in_the_United_States

To actually buy a gun, or to see guns available for purchase in the United States, see
http://www.usautoweapons.com/

KDNuggets Poll on SAS: Churn in Analytics Users

Here are some surprising results from the Bible of all data miners, KDNuggets.com, with some interesting comments about SAS being the Microsoft of analytics.

I believe technically advanced users will probably want to try out R before going in for a commercial license from Revolution Analytics, as it is free to try out. WPS also offers a one-month free preview of its software. Its latest release competes with SAS/STAT, SAS/ACCESS, SAS/GRAPH and Base SAS, so anyone with these installations on a server would be interested in at least testing it for free. WPS would also be interested in increasing the number of its data engines (like the ones it has for Oracle and Teradata).

One very crucial difference for SAS is its ability to pull in data from almost all data formats. So if you are using SAS/CONNECT to remote submit code, you may not be able to switch soon.

Also, the more license-heavy customers are not the kind of customers who keep lots of data on their local desktops; their data is usually pulled and then crunched before being analysed. R has recently made some strides with the RevoScaleR package from Revolution Analytics, but its effectiveness will be tried and tested in the coming months. It seems like a great step in the right direction.

For SAS, the feedback should be a call to improve their product bundling, some of which can feel like overselling at times. But they have been fighting off challenges for the past four decades and have the pockets and the intention to sustain market share battles, including discounts (for repeat customers, SAS can be much cheaper than, say, a first-time user of WPS or R).

http://teamwpc.co.uk/home

This really should come as a surprise to some people. You can see the comments on WPS and R at the site itself. Interesting stuff, and we can check back after, say, one year to see how many actually DID switch.

http://www.kdnuggets.com/polls/2010/switching-from-sas-to-wps.html