Legal Copyrights: Some History

Here is an interesting blog post on why software giants like Google and Microsoft will be rich forever. Ironically, Microsoft has given away the maximum number of free programs, DLLs, extensions and patches. When I say free, I mean really free: they did not sell your identity to advertisers.

Back in 1998, representatives of the Walt Disney Company came to Washington looking for help. Disney’s copyright on Mickey Mouse, who made his screen debut in the 1928 cartoon short “Steamboat Willie,” was due to expire in 2003, and Disney’s rights to Pluto, Goofy and Donald Duck were to expire a few years later.

Rather than allow Mickey and friends to enter the public domain, Disney and its friends – a group of Hollywood studios, music labels, and PACs representing content owners – told Congress that they wanted an extension bill passed.

Prompted perhaps by the Disney group’s lavish donations of campaign cash – more than $6.3 million in 1997-98, according to the nonprofit Center for Responsive Politics – Congress passed and President Clinton signed the Sonny Bono Copyright Term Extension Act.

The CTEA extended the term of protection by 20 years for works copyrighted after January 1, 1923. Works copyrighted by individuals since 1978 got “life plus 70” rather than the existing “life plus 50”. Works made by or for corporations (referred to as “works made for hire”) got 95 years. Works copyrighted before 1978 were shielded for 95 years, regardless of how they were produced.

How to do Logistic Regression

Logistic regression is a widely used technique in database marketing for creating scoring models, and in risk classification. It helps develop propensity-to-buy and propensity-to-default scores (and even propensity-to-fraud scores).

This is more of a practical approach to building the model than a theory-based approach. (I was never good at the theory 😉)
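To make the idea of a propensity score concrete, here is a minimal sketch in plain Python (no statistics package). The data, the variable names (past_purchases, bought) and the choice of simple gradient ascent are all my own assumptions for illustration. This is not what SPSS, SAS or R do internally; it only shows what a fitted logistic model produces: a probability between 0 and 1 for each customer.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, labels, learning_rate=0.1, epochs=5000):
    """Fit intercept and coefficients by batch gradient ascent on the log-likelihood."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)  # weights[0] is the intercept
    n = len(rows)
    for _ in range(epochs):
        gradient = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            error = y - p  # observed outcome minus predicted probability
            gradient[0] += error
            for j, v in enumerate(x):
                gradient[j + 1] += error * v
        weights = [w + learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights

def score(weights, x):
    """Propensity score: predicted probability of the event for one record."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

# Hypothetical data: number of past purchases and whether the customer bought (1) or not (0).
past_purchases = [[0], [1], [2], [3], [4], [5], [6], [7]]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

weights = fit_logistic(past_purchases, bought)
for x in past_purchases:
    print(x[0], round(score(weights, x), 3))
```

In a real scoring model you would have many predictors, a holdout sample for validation, and a proper optimizer; the packages linked below take care of that for you.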

If you need to do Logistic Regression using SPSS, a very good tutorial is available here

http://www2.chass.ncsu.edu/garson/PA765/logistic.htm

(Note: Copyright 1998, 2008 by G. David Garson.
Last update 5/21/08.)

For SAS a very good tutorial is here –

SAS Annotated Output
Ordered Logistic Regression. UCLA: Academic Technology Services, Statistical Consulting Group.

from http://www.ats.ucla.edu/stat/sas/output/sas_ologit_output.htm (accessed July 23, 2007).

For R, the documentation is here (note: still searching for more on R’s logistic regression)
http://lib.stat.cmu.edu/S/Harrell/help/Design/html/lrm.html

lrm(formula, data, subset, na.action=na.delete, method="lrm.fit", model=FALSE, x=FALSE, y=FALSE, linear.predictors=TRUE, se.fit=FALSE, penalty=0, penalty.matrix, tol=1e-7, strata.penalty=0, var.penalty=c('simple','sandwich'), weights, normwt, ...)

For linear models in R –
http://datamining.togaware.com/survivor/Linear_Model0.html

If you want to work with R and do not have time to learn it, an extremely good option is to use the Rattle GUI and look at this book

http://datamining.togaware.com/survivor/Contents.html

SAS Fun: Sudoku

Here is a SAS program to help you beat others at Sudoku and impress people. It was written by a chap named Ryan Howard in 2006, and I am thankful to him for allowing me to share it. Let us know if you find a puzzle it could not solve, or if you tweak the program a bit. The code is pasted below.

Have fun !

There was also a SAS paper on this at SAS Global Forum 2007. The resulting
paper, “SAS and Sudoku”, was written by Richard DeVenezia, John Gerlach,
Larry Hoyle, Talbot Katz and Rick Langston, and can be viewed at
http://www2.sas.com/proceedings/forum2007/011-2007.pdf.

(P.S. I haven’t tested this on WPS, which still doesn’t have the SAS Macro language, but let me know if you have an equivalent in SPSS or R 🙂 )

*=============================================================================;
* sudoku.sas                                                                  ;
* Written by: Ryan Howard                                                     ;
* Date: Sept. 2006                                                            ;
*-----------------------------------------------------------------------------;
* Summary: This program solves sudoku puzzles consisting of a 9X9 matrix.     ;
*-----------------------------------------------------------------------------;
* Upgrade Ideas:  1. Add a GUI to collect the input numbers and display output;
*                 2. Expand logic to work for 16X16 matrices                  ;
*=============================================================================;

title;
options nodate nonumber;

data _null_;

    *-----------------------------------------------------------------------------;
    * input initial values for each cell from puzzle                              ;
    *-----------------------------------------------------------------------------;

    _1111=9; _1112=.; _1113=.;   _1211=.; _1212=.; _1213=.;   _1311=1; _1312=.; _1313=.;
    _1121=5; _1122=.; _1123=.;   _1221=.; _1222=6; _1223=.;   _1321=.; _1322=4; _1323=2;
    _1131=.; _1132=.; _1133=.;   _1231=7; _1232=1; _1233=.;   _1331=5; _1332=.; _1333=.;

    _2111=.; _2112=.; _2113=2;   _2211=.; _2212=.; _2213=.;   _2311=.; _2312=1; _2313=.;
    _2121=.; _2122=3; _2123=.;   _2221=.; _2222=.; _2223=.;   _2321=2; _2322=9; _2323=.;
    _2131=.; _2132=7; _2133=.;   _2231=.; _2232=.; _2233=6;   _2331=.; _2332=.; _2333=3;

    _3111=.; _3112=2; _3113=.;   _3211=.; _3212=.; _3213=8;   _3311=.; _3312=.; _3313=.;
    _3121=.; _3122=.; _3123=4;   _3221=5; _3222=.; _3223=.;   _3321=.; _3322=.; _3323=.;
    _3131=.; _3132=.; _3133=.;   _3231=.; _3232=3; _3233=.;   _3331=8; _3332=.; _3333=9;

    %macro printmatrix;
    *----------------------------------------------------------------;
    * print the result matrix                                        ;
    *----------------------------------------------------------------;
       *---------------------------------------------------;
       * Assign column positions for printing matrix       ;
       *---------------------------------------------------;
       c1=1;
       c2=10;
       c3=20;

Review – The Dark Knight

The much-anticipated sequel to Batman Begins, and Heath Ledger’s last movie, is everything you waited for and more. Ledger makes the Joker so much his own that you forget Jack Nicholson ever played the funny man. A gaunt-looking Batman battles media, politicians, hype, the mafia and the Joker to the point that you almost feel worried for the caped fellow.

Some references to telecom companies used for spying, some Ground Zero-like shots of collapsed buildings, some Chinese thugs who steal money, and you wish Hollywood escaped reality and politics. But enter the Joker, and he captures the screen to the point of overshadowing everyone else. Many a critic has predicted Oscars for best actor. That’s for you to judge as you say hello to the Dark Knight and goodbye to Heath Ledger in this most splendid action hero movie of the year. There is no Katie Holmes to distract you, while the otherwise perfect cast returns with Bale, Caine, Oldman and Freeman to push the franchise forward.

 

The action is breathtaking and unrelenting, the suspense builds up till it echoes the lovely soundtrack and Batman returns as the Dark Knight of Gotham.

Review – HanCock(y)

When Will Smith releases a movie, we all sit up and take notice. Apart from Wild Wild West, the fresh prince of Hollywood has been having one enjoyable hit after another. He’s got a bit of Denzel’s looks (and a smaller bit of the acting), some of Chris Tucker’s motor mouth (but not too much) and some of Wesley Snipes’ action appeal. He is the perfect leading man as of today, with no controversies either. And this time he comes back with an unconventional script (as with I, Robot and I Am Legend…) about a superhero who drinks too much, fumbles his timing, hits on women (!) and generally acts like a …. (Don’t call him that!!!) but you need to see the movie at this point.

Charlize Theron is great as the sceptical wife of the public relations executive who tries to clean up the man. The movie is both a spoof of action heroes and an action hero movie at the same time. Memorable scenes include a drunk and flying John Hancock (well, Superman never drank!!) chasing bad guys, a super villain (!) and prison scenes. Enough said. The movie strikes you with its realistic special effects (yes, it looks like he is actually flying, not swinging by a rope)… and great, lovely, cocky Will Smith humour.

Watch the movie: it’s gonna be a cool, cool summer and the action heroes are coming back. Bring ’Em On.

Cloud Computing across LANs?

The concept of cloud computing is interesting and actually quite old. It lacked major backing till Google came along, and it is now increasingly seen as the alternative to the PC (given that other alternatives like the Tablet PC came and went).

This diagram and definition are from Wikipedia, of course:

“Cloud computing refers to computing resources being accessed which are typically owned and operated by a third-party provider on a consolidated basis in Data Center locations. Consumers of cloud computing services purchase computing capacity on-demand and are not generally concerned with the underlying technologies used to achieve the increase in server capability. There are however increasing options for developers that allow for platform services in the cloud where developers do care about the underlying technology.”

What prevents local area networks from enforcing clouds beats me. Put all the apps and ALL the storage on the server. Since most PC OEMs insist on their standard 80 GB hard disk configuration, the IT team of a company has to work harder to enforce it, but once done, they have fewer tickets to attend to. Just put thin-shell Ubuntu PCs with OpenOffice on each local machine. This also makes compliance and productivity tracking much easier to do: just check the server logs. One bottleneck of course remains: IT compliance teams in companies rarely seek to maximize business value, which ensures they are the first to be transferred to other teams or downsized in downturns, as a cost unit rather than a core unit.

You can also try Google Apps for enterprise for such initiatives. The software is now ready, which wasn’t the case a few years back.

More Advanced SAS Modeling Procs

A special thanks to Peter Flom (www.peterflom.com) for suggesting the following –

5) Proc NLMIXED

The models fit by PROC NLMIXED can be viewed as generalizations of the random coefficient models fit by the MIXED procedure. This generalization allows the random coefficients to enter the model nonlinearly, whereas in PROC MIXED they enter linearly. With PROC MIXED you can perform both maximum likelihood and restricted maximum likelihood (REML) estimation, whereas PROC NLMIXED only implements maximum likelihood. This is because the analog to the REML method in PROC NLMIXED would involve a high dimensional integral over all of the fixed-effects parameters, and this integral is typically not available in closed form. Finally, PROC MIXED assumes the data to be normally distributed, whereas PROC NLMIXED enables you to analyze data that are normal, binomial, or Poisson or that have any likelihood programmable with SAS statements.

http://aerg.canberra.edu.au/envirostats/bm/SASHelp/stat/chap46/sect4.htm

6) Proc GLIMMIX

PROC GLIMMIX fits statistical models to data with correlations or nonconstant variability and where the response is not necessarily normally distributed. These generalized linear mixed models (GLMM), like linear mixed models, assume normal (Gaussian) random effects. Conditional on these random effects, data can have any distribution in the exponential family. The binary, binomial, Poisson, and negative binomial distributions, for example, are discrete members of this family. The normal, beta, gamma, and chi-square distributions are representatives of the continuous distributions in this family.

Some PROC GLIMMIX features are:

  • Flexible covariance structures for random effects and correlated errors
  • Programmable link and variance functions
  • Bias-adjusted empirical covariance estimators
  • Univariate and multivariate low-rank smoothing
  • Joint modeling for multivariate data

Besides including performance enhancements and various fixes, the production release of the GLIMMIX procedure provides numerous additional features. These include:

  • ODS statistical graphics to display LS-means and confidence limits
  • Analysis of Means
  • Odds ratios
  • Custom hypotheses concerning LS-means with the LSMESTIMATE statement
  • New multiplicity adjustments
  • Beta regression

www2.sas.com/proceedings/sugi30/196-30.pdf

http://support.sas.com/rnd/app/da/glimmix.html

3) Proc QUANTREG

www.stat.uiuc.edu/~x-he/ENAR-Tutorial.pdf

Ordinary least squares regression models the relationship between one or more covariates X and the conditional mean of the response variable Y given X=x. Quantile regression extends the regression model to conditional quantiles of the response variable, such as the 90th percentile. Quantile regression is particularly useful when the rate of change in the conditional quantile, expressed by the regression coefficients, depends on the quantile. The main advantage of quantile regression over least squares regression is its flexibility for modeling data with heterogeneous conditional distributions. Data of this type occur in many fields, including biomedicine, econometrics, and ecology.
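To see the difference between modeling a mean and modeling a quantile, here is a small sketch in plain Python. It covers only the simplest case, a model with an intercept and no covariates, and it finds the answer by brute force over the data points; the data and function names are my own assumptions. PROC QUANTREG uses proper algorithms (listed below) and handles covariates, so treat this purely as an illustration of the "check" loss that quantile regression minimizes.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def total_check_loss(values, candidate, tau):
    """Sum of check losses when every response is predicted by one constant."""
    return sum(check_loss(v - candidate, tau) for v in values)

def best_constant(values, tau):
    """Intercept-only quantile fit: search the observed values for the minimizer."""
    return min(sorted(set(values)), key=lambda c: total_check_loss(values, c, tau))

# Hypothetical responses with one large outlier.
responses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

mean = sum(responses) / len(responses)   # what least squares would fit
median = best_constant(responses, 0.5)   # tau = 0.5 gives the median
p90 = best_constant(responses, 0.9)      # tau = 0.9 gives the 90th percentile

print(mean, median, p90)
```

The outlier drags the mean well above most of the data, while the fitted quantiles stay at observed values. That is the flexibility the paragraph above is talking about.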

Some PROC QUANTREG features are:

  • Implements the simplex, interior point, and smoothing algorithms for estimation
  • Provides three methods to compute confidence intervals for the regression quantile parameter: sparsity, rank, and resampling
  • Provides two methods to compute the covariance and correlation matrices of the estimated parameters: an asymptotic method and a bootstrap method
  • Provides two tests for the regression parameter estimates: the Wald test and a likelihood ratio test
  • Uses robust multivariate location and scale estimates for leverage point detection
  • Multithreaded for parallel computing when multiple processors are available

4) Proc CATMOD

http://www.uidaho.edu/ag/statprog/sas/workshops/catmod/outline.html

Categorical data with more than two factors are referred to as multi-dimensional distributions. Procedure CATMOD will be used for analyses concerning such data. PROC CATMOD may also be used to analyze one- and two-way data structures; however, it is an effective means to approach more complex data structures.

PROC CATMOD uses a different technique for categorical analysis than the ‘Pearson type’ chi-square. The analysis is based on a transformation of the cell probabilities. This transformation is called the response function. The exact form of the response function depends on the data type, and it is normally motivated by certain theoretical considerations. SAS offers many different forms of response functions and even allows users to specify their own; however, the most common (default) is the Generalized Logit. This function is defined as:

Generalized Logit = LOG(p_i / p_k),
where p_i is the ith cell probability and p_k is the last cell probability. The ratio p_i / p_k is the odds of the ith category relative to the last (often loosely called an odds ratio), and its log is just a comparison of the ith category to the last, on a log scale. The logit can be rewritten as:
Generalized Logit = LOG(p_i) - LOG(p_k).
It should be noted that if there are k categories, then there will be only k-1 response functions, since the kth one, LOG(p_k / p_k), is always zero.
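The definition above is easy to check by hand. Here is a minimal Python sketch, with made-up cell probabilities and function names of my own, that computes the generalized logits against the last cell and then converts them back into probabilities. It only illustrates the formula and is not how PROC CATMOD is implemented.

```python
import math

def generalized_logits(cell_probabilities):
    """Return LOG(p_i / p_k) for every cell, using the last cell as the reference."""
    reference = cell_probabilities[-1]
    return [math.log(p / reference) for p in cell_probabilities]

def probabilities_from_logits(logits):
    """Invert the transformation: exponentiate and rescale so the cells sum to 1."""
    exponentials = [math.exp(value) for value in logits]
    total = sum(exponentials)
    return [e / total for e in exponentials]

# Hypothetical cell probabilities for a three-category response.
probabilities = [0.5, 0.3, 0.2]

logits = generalized_logits(probabilities)
recovered = probabilities_from_logits(logits)

print(logits)     # the last entry is zero, so only k-1 logits carry information
print(recovered)
```

Because the last logit is always zero, a three-category response gives two response functions, matching the k-1 rule above.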