# KXEN – Automated Regression Modeling

I have used KXEN many times to build and test propensity models. Its regression modeling feature is excellent because it makes models very easy to build and deliver.

The KXEN package responsible for this is K2R, which uses robust regression. First, a word on the basic mathematical theory behind KXEN’s automated modeling: the technique is called Structural Risk Minimization. You can read more about it at http://www.svms.org/srm/. The following is an extract from that source.

Structural risk minimization (SRM) (Vapnik and Chervonenkis, 1974) is an inductive principle for model selection used for learning from finite training data sets. It describes a general model of capacity control and provides a trade-off between hypothesis space complexity (the VC dimension of approximating functions) and the quality of fitting the training data (empirical error). The procedure is outlined below.

1. Using a priori knowledge of the domain, choose a class of functions, such as polynomials of degree n, neural networks having n hidden layer neurons, a set of splines with n nodes or fuzzy logic models having n rules.
2. Divide the class of functions into a hierarchy of nested subsets in order of increasing complexity. For example, polynomials of increasing degree.
3. Perform empirical risk minimization on each subset (this is essentially parameter selection).
4. Select the model in the series whose sum of empirical risk and VC confidence is minimal.

Sewell (2006)

SVMs use the spirit of the SRM principle.

“Structural risk minimization (SRM) (Vapnik 1995) uses a set of models ordered in terms of their complexities. An example is polynomials of increasing order. The complexity is generally given by the number of free parameters. VC dimension is another measure of model complexity. In equation 4.37, we can have a set of decreasing λ_i to get a set of models ordered in increasing complexity. Model selection by SRM then corresponds to finding the model simplest in terms of order and best in terms of empirical error on the data.”
Alpaydin (2004), pages 80-81
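
The four-step procedure quoted above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own, not KXEN's implementation: the nested family is polynomials of increasing degree, the capacity h is taken to be the number of free parameters (degree + 1), and the confidence term is a commonly quoted form of Vapnik's bound for bounded losses, used here on squared error purely to show the mechanics. The function names `vc_confidence` and `srm_select` are hypothetical.

```python
import numpy as np

def vc_confidence(h, n, eta=0.05):
    """Confidence term for capacity h and sample size n (assumed form of the bound)."""
    return np.sqrt((h * (np.log(2 * n / h) + 1) - np.log(eta / 4)) / n)

def srm_select(x, y, max_degree=8):
    """Return (best_degree, rows); each row is (degree, empirical_risk, bound)."""
    n = len(x)
    rows = []
    for degree in range(max_degree + 1):           # step 2: nested subsets
        coeffs = np.polyfit(x, y, degree)          # step 3: empirical risk minimization
        risk = float(np.mean((np.polyval(coeffs, x) - y) ** 2))
        h = degree + 1                             # assumption: capacity = free parameters
        rows.append((degree, risk, risk + float(vc_confidence(h, n))))
    best_degree = min(rows, key=lambda r: r[2])[0] # step 4: smallest risk + confidence
    return best_degree, rows

# Synthetic data generated from a degree-2 polynomial plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 1.0 - 2.0 * x + 3.0 * x ** 2 + rng.normal(0, 0.1, 200)

best, table = srm_select(x, y)
for degree, risk, bound in table:
    print(f"degree={degree}  empirical risk={risk:.4f}  risk + confidence={bound:.4f}")
print("selected degree:", best)
```

The empirical risk can only go down as the degree grows, so on its own it would always pick the most complex model. The confidence term grows with capacity and stops the selection once extra terms no longer pay for themselves.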

Now back to the automated regression modeling.

Robust Regression (K2R)

K2R is a universal solution for Classification, Regression, and Attribute Importance. It enables the prediction of behaviors (nominal targets) or quantities (continuous targets).

Unlike traditional regression algorithms, K2R can safely handle a very high number of input attributes (over 10,000) in an automated fashion. K2R provides indicators and graphs to ensure that the quality and robustness of trained models can be easily assessed. K2R graphically displays the attribute importance, which provides the relative importance of each attribute for explaining a given business question. At the same time it gives a clear indication of which attributes either contain no relevant information or are redundant with other attributes.

Benefits: The business value of a data mining project is increased by either training more models or completing the project faster. The ability to train more models allows a larger number of scenarios to be tested at a higher level of granularity. For example, if a direct marketing campaign benefits from separate models trained per region, per customer segment, and per month, the automation of K2R allows all of these models to be trained and safely deployed using the same amount or fewer resources than with traditional tools.

What: K2R is a regression algorithm that allows building models to predict categories or continuous variables.

Why: Traditionally, building robust predictive models required a lot of time and expertise, which prevented companies from using data mining as part of their everyday business decisions. K2R makes it easy to build and deploy predictive models in a fraction of the time it takes using classical statistical tools.

How: K2R maps a set of descriptive attributes (model inputs) to target attributes (model outputs). It uses an algorithm patented by KXEN, which is a derivation of a principle described by V. Vapnik as “Structural Risk Minimization.” Instead of looking for the best performance on a known dataset, K2R automatically finds the best compromise between quality and robustness. The resulting models are expressed as a polynomial expression of the input values. The only element specified by the user is the polynomial degree. To improve modeling speed, K2R can also build multi-target models.
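
To make the quality-versus-robustness compromise concrete, here is a generic sketch of one common way to get it. It is not KXEN's patented algorithm, whose details are not given here. It assumes a polynomial model of user-chosen degree fitted by ridge regression, with the penalty strength picked on a held-out validation split rather than on the data used for fitting. The data is synthetic and the helper names (`poly_features`, `ridge_fit`, `mse`) are hypothetical.

```python
import numpy as np

def poly_features(x, degree):
    """Columns 1, x, x**2, ..., x**degree for a 1-D input array."""
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, lam):
    """Closed-form ridge solution; the intercept column (index 0) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))

# Synthetic data: a smooth signal plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(0, 0.3, 60)

degree = 9                                   # the one setting chosen by the user
X = poly_features(x, degree)
X_est, y_est = X[:40], y[:40]                # estimation split: fit the weights
X_val, y_val = X[40:], y[40:]                # validation split: judge robustness

results = []
for lam in [0.0, 1e-4, 1e-2, 1.0, 100.0]:
    w = ridge_fit(X_est, y_est, lam)
    results.append((lam, mse(X_est, y_est, w), mse(X_val, y_val, w)))

best_lam = min(results, key=lambda r: r[2])[0]
for lam, fit_err, val_err in results:
    print(f"lambda={lam:<8g} estimation MSE={fit_err:.3f}  validation MSE={val_err:.3f}")
print("penalty chosen on validation data:", best_lam)
```

The estimation error is always lowest with no penalty, which is the "best performance on a known dataset" that the text warns against chasing. Choosing the penalty by validation error trades a little fit for a model that behaves similarly on data it has not seen.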

Benefits for the business user: K2R allows the business user to easily build and understand advanced predictive models without statistical knowledge. A model can be created in a matter of minutes. Two performance indicators describe model quality (Ki) and model reliability, or the ability to produce similar results on new data (Kr).

K2R graphically displays the individual variable contribution to the model, which helps to select the most important variables explaining a given business question. At the same time it avoids focusing on data that contains no information.

Models can also be applied directly in simulation mode to a single input dataset, predicting the score for an individual business question in real time.

Benefits for the Data Mining expert: K2R frees time for Data Mining professionals to apply their expertise in areas where they add more value instead of spending several days tuning a model. K2R produces results within minutes (less than 15 seconds on a laptop with 50,000 lines and 20 variables).

Here is a case study from the company itself.

Marketing campaign usage scenario

* Send a “Test mailing” to 5000 customers to offer them a new product
* Collect the results of your test mailing to build a “Training” data set that associates things you know about customers prior to the mailing with the answers to your business question
* Train a model to “predict” the Yes/No answer
* Check the quality and robustness of your model (Ki, Kr)
* Apply the model to the 1,000,000 other customers in your database: this model associates each individual customer with a probability for answering Yes. Because you are using a robust model, the sum of probabilities is a good indicator of how many people will answer yes to this mail
* Send your mailing only to those customers with a high probability to respond positively, or use our built-in profit curves to optimize your return on the campaign
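
The campaign workflow above can be tried end to end on synthetic data. The sketch below rests on several assumptions of my own: the customer population is simulated, a plain logistic regression fitted by gradient descent stands in for the K2R model, and the "top 10%" mailing cutoff is arbitrary. None of the function names come from KXEN.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_customers(n):
    """Synthetic customers: two attributes and a Yes/No answer drawn from a hidden probability."""
    X = rng.normal(size=(n, 2))
    p_true = 1.0 / (1.0 + np.exp(-(-3.0 + 1.2 * X[:, 0] - 0.8 * X[:, 1])))
    return X, (rng.random(n) < p_true).astype(float)

def add_intercept(X):
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, y, steps=1000, lr=1.0):
    """Gradient-descent logistic regression (a stand-in, not K2R)."""
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        grad = Xb.T @ (p - y) / len(y)
        w -= lr * grad
    return w

def score(X, w):
    """Probability of a Yes answer for each customer."""
    return 1.0 / (1.0 + np.exp(-(add_intercept(X) @ w)))

# Test mailing to 5000 customers, then train on their answers.
X_test, y_test = make_customers(5_000)
w = fit_logistic(X_test, y_test)

# Apply to the 1,000,000 other customers (y_rest would be unknown in practice).
X_rest, y_rest = make_customers(1_000_000)
probs = score(X_rest, w)

expected_yes = probs.sum()                 # sum of probabilities = forecast of Yes answers
actual_yes = y_rest.sum()

cutoff = np.quantile(probs, 0.90)          # mail only the top 10% by probability
selected = probs >= cutoff
top_rate = y_rest[selected].mean()
overall_rate = y_rest.mean()

print(f"expected Yes: {expected_yes:,.0f}   actual Yes: {actual_yes:,.0f}")
print(f"response rate, selected: {top_rate:.1%}   everyone: {overall_rate:.1%}")
```

Comparing `expected_yes` with `actual_yes` shows how closely the sum of probabilities tracks the real number of responders, and the two response rates show what targeting by probability buys over mailing everyone.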

Example: Regression: Dealer evaluation usage scenario

* Collect information about your dealers' performance two years ago and associate it with how much of your product they sold one year ago
* Train a model to predict how much a dealer will sell based on the available information
* Check the quality and robustness of the model (Ki, Kr)
* Apply the model to all of your dealers today: the model associates each dealer with an estimate of how many products they will sell
* Sum up the estimates to predict how much you will sell next year. This is the base line for your sales forecast.

In my next post I will include screenshots showing how to build an automated regression model using KXEN.

Ajay

Disclaimer: I am a consultant to KXEN for social networks.