Ten ways to build a wrong scoring model


Some ways to build a wrong scoring model are listed below. The author offers no guarantee if your modeling team uses one of these and still gets a correct model.

1) Overfit the model to the sample. You can check for overfitting by drawing a fresh random sample, scoring it with the equation, and comparing predicted conversion rates with actual conversion rates. An overfit model does not rank-order: deciles with a lower average probability may show equal or more conversions than deciles with higher probability scores.

2) Choose non-random samples for building and validating the scoring equation. See overfitting above.

3) Use multicollinearity (http://en.wikipedia.org/wiki/Multicollinearity) without business judgment to remove variables that may make business sense. This usually happens a few years after you studied, and forgot, multicollinearity.

If you don't know the difference between multicollinearity and heteroskedasticity (http://en.wikipedia.org/wiki/Heteroskedasticity), this could be the real deal breaker for you.
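As a rough illustration of point 3, here is a minimal sketch that flags highly correlated pairs of predictors for review instead of dropping them automatically. The function names, the sample column names, and the 0.8 threshold are assumptions made for illustration, not anything from a particular library. Pairwise correlation is also only a first screen: multicollinearity can involve three or more variables at once, which this check will not catch.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists.
    Assumes neither list is constant (otherwise this divides by zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlated_pairs(columns, threshold=0.8):
    """columns: dict of variable name -> list of values.
    Returns (name_a, name_b, r) for every pair with |r| >= threshold.
    The threshold is an illustrative assumption, not a standard."""
    flagged = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical data: the flagged pair is a prompt for a business
# conversation, not an instruction to delete a variable.
columns = {
    "income": [1, 2, 3, 4, 5],
    "spend": [2, 4, 6, 8, 10.5],
    "age": [5, 1, 4, 2, 3],
}
for a, b, r in correlated_pairs(columns):
    print(f"review {a} vs {b}: r = {r}")
```

The output is a list to look at, on purpose. Which of the two variables stays in the model is the business-judgment part.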

4) Using legacy code for running scoring, usually with stepwise forward and backward regression. This usually happens on Fridays and when in a hurry to make models.

5) Ignoring the signs or magnitudes of parameter estimates (that is, the output, or the weight of each variable in the equation).

6) Not knowing the difference between Type 1 and Type 2 errors, especially when rejecting variables based on p-value. (Not knowing what a p-value is means you may kindly stop reading and click the YouTube video in the right margin.)

7) Excessive zeal in removing variables. Why? Ask yourself this question every time you remove a variable.

8) Using the wrong causal event (like mailings for loans) to predict the future with the scoring model (for mailings of deposit accounts), or using the right causal event in the wrong environment (a rapid decline or rise in sales due to factors not present in the model, like competitor entry or exit, oil prices, credit shocks, sob sob sigh).

9) Overfitting.

10) Learning about creating models from blogs and not reading and refreshing your old statistics textbooks.
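The rank-ordering check described in point 1 can be sketched roughly as follows. It assumes you already have predicted probabilities and actual 0/1 conversion outcomes for a fresh random sample. The function names decile_table and rank_order_breaks are made up for this illustration, and the data at the bottom is invented.

```python
def decile_table(scores, outcomes, n_bins=10):
    """Sort records by predicted score (highest first), split them into
    n_bins groups of near-equal size, and return one
    (mean predicted score, actual conversion rate) tuple per group."""
    pairs = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    n = len(pairs)
    table = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer records than bins
        mean_score = sum(s for s, _ in chunk) / len(chunk)
        actual_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_score, actual_rate))
    return table

def rank_order_breaks(table):
    """Return positions i where group i+1 (lower scores) converts at an
    equal or higher rate than group i (higher scores). Ties count as
    breaks, including two adjacent groups with zero conversions."""
    return [i for i in range(len(table) - 1)
            if table[i + 1][1] >= table[i][1]]

# Invented validation sample, split into 4 groups to keep it readable.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]
table = decile_table(scores, outcomes, n_bins=4)
print(rank_order_breaks(table))
```

An empty list means the model rank-orders on this sample. Any index in the list points at a pair of adjacent groups worth a closer look before the model goes anywhere near production.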

Author: Ajay Ohri

