RFM is a method used for analyzing customer behavior and defining market segments. It is commonly used in database marketing and direct marketing and has received particular attention in retail.
RFM stands for:
- Recency – How recently did the customer purchase?
- Frequency – How often do they purchase?
- Monetary Value – How much do they spend?
To create an RFM analysis, one creates categories for each attribute. For instance, the Recency attribute might be broken into three categories: customers with purchases within the last 90 days; between 91 and 365 days; and longer than 365 days. Such categories may be arrived at by applying business rules, or using a data mining technique, such as CHAID, to find meaningful breaks.
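The three-way Recency split described above can be sketched with base R's `cut()`. The `recency_days` values are made up for illustration; only the 90-day and 365-day boundaries come from the text.

```r
# Hypothetical input: days since each customer's most recent purchase
recency_days <- c(12, 90, 91, 200, 365, 366, 800)

# Business-rule breaks. cut() uses right-closed intervals by default,
# so the bands are (-Inf, 90], (90, 365] and (365, Inf).
recency_band <- cut(
  recency_days,
  breaks = c(-Inf, 90, 365, Inf),
  labels = c("0-90 days", "91-365 days", "over 365 days")
)

# Number of customers in each band
table(recency_band)
```

Because the intervals are closed on the right, a customer at exactly 90 days falls in the first band and one at exactly 365 days falls in the second, which matches the wording of the categories.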
If you are new to RFM or need more step-by-step help, please read here.
Here is the R code. Note that for direct marketing you also need to compute Monetization based on response rates (based on offer date).

    ## Creating random sales data in the format
    ## CustomerId (unique to each customer), Sales Value, sales.dates
    sales <- data.frame(sample(1000:1999, replace = TRUE, size = 10000),
                        abs(round(rnorm(10000, 28, 13))))
    names(sales) <- c("CustomerId", "Sales Value")
    sales.dates <- as.Date("2010/1/1") + 700 * sort(stats::runif(10000))  # generating random dates
    sales <- cbind(sales, sales.dates)
    str(sales)

    ## Recency of each sale, in days before today
    sales$recency <- round(as.numeric(difftime(Sys.Date(), sales[, 3], units = "days")))

    ## If you have existing sales data, you just need to shape it in this format
    library(gdata)  # rename.vars() lives in gdata (formerly part of the gregmisc bundle)
    sales <- rename.vars(sales, from = "Sales Value", to = "Purchase.Value")  # assign the result, or the rename is lost

    ## Creating total sales (Monetization), Frequency, and days since last purchase (Recency) for each customer
    salesM <- aggregate(sales[, 2], list(sales$CustomerId), sum)
    names(salesM) <- c("CustomerId", "Monetization")
    salesF <- aggregate(sales[, 2], list(sales$CustomerId), length)
    names(salesF) <- c("CustomerId", "Frequency")
    salesR <- aggregate(sales[, 4], list(sales$CustomerId), min)  # smallest recency = most recent purchase
    names(salesR) <- c("CustomerId", "Recency")

    ## Merging R, F, M
    test1 <- merge(salesF, salesR, "CustomerId")
    salesRFM <- merge(salesM, test1, "CustomerId")

    ## Creating R, F, M levels (five equal-width intervals each)
    salesRFM$rankR <- cut(salesRFM$Recency, 5, labels = FALSE)       # rankR 1 is very recent while rankR 5 is least recent
    salesRFM$rankF <- cut(salesRFM$Frequency, 5, labels = FALSE)     # rankF 1 is least frequent while rankF 5 is most frequent
    salesRFM$rankM <- cut(salesRFM$Monetization, 5, labels = FALSE)  # rankM 1 is lowest sales while rankM 5 is highest sales

    ## Looking at RFM tables
    table(salesRFM[, c("rankR", "rankF")])
    table(salesRFM[, c("rankF", "rankM")])
    table(salesRFM[, c("rankR", "rankF", "rankM")])
Note: you can also use the quantile function to compute the breaks passed to the cut function. This changes the bins from equal-width intervals to groups containing roughly equal numbers of customers. Or see other methods for finding breaks for categories.
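As a rough illustration of that difference, the sketch below ranks one skewed variable both ways. The `monetization` vector is simulated and only stands in for a column such as `salesRFM$Monetization`; the seed and the exponential distribution are assumptions made so the example is reproducible.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical per-customer spend, deliberately right-skewed
monetization <- round(rexp(1000, rate = 1 / 250))

# Quintile boundaries at 0%, 20%, 40%, 60%, 80%, 100%
breaks_q <- quantile(monetization, probs = seq(0, 1, by = 0.2))

# Quantile-based ranks: each rank holds roughly the same number of customers.
# unique() guards against duplicated boundaries when the data has many ties;
# include.lowest = TRUE keeps the minimum value inside rank 1.
rank_q <- cut(monetization, breaks = unique(breaks_q),
              include.lowest = TRUE, labels = FALSE)

# Equal-width ranks, as in the script above: five intervals of equal length
rank_w <- cut(monetization, breaks = 5, labels = FALSE)

table(rank_q)  # group sizes are close to equal
table(rank_w)  # most customers pile into the lowest interval
```

On skewed data like this, equal-width intervals leave the top ranks nearly empty, whereas quantile breaks trade interpretable interval boundaries for balanced group sizes.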