Used for marketing to customers
Always improves response and profits
Better than any demographic model
The most powerful segmentation method for predicting response
RFM Analysis
RFM analysis is a common approach for understanding customer purchase behavior and is especially popular in the retail industry. As its name implies, it involves calculating and examining three KPIs (recency, frequency, and monetary value) that summarize the corresponding dimensions of the customer's relationship with the organization. Recency indicates the time since the customer's last purchase transaction. Frequency denotes the number and rate of purchase transactions. Monetary indicates the value of the purchases. These indicators are typically calculated at a customer (card ID) level through simple processing of the available transactional data.
RFM analysis can be used to identify good customers with the best scores in the relevant KPIs, who generally tend to be good prospects for additional purchases. It can also identify other purchasing patterns and the corresponding customer types of interest, such as infrequent big spenders or customers with small but frequent purchases, who might also offer sales prospects, depending on the market and the specific product promoted.
In the retail industry, the RFM dimensions are usually defined as follows:
Recency: The time (in units such as days, months, or years) since the most recent purchase transaction or shopping visit.
Frequency: The total number of purchase transactions or shopping visits in the period examined. An alternative, and probably better defined, approach that also takes into account the tenure of the customer calculates frequency as the average number of transactions per unit of time, for instance the monthly average number of transactions.
Monetary: The total value of the purchases within the period examined, or the average value per time unit (e.g., the monthly average value). According to an alternative, but less popular, definition, the monetary indicator is the average transaction value (average value per purchase transaction). Since the total value tends to be correlated with the frequency of the transactions, the reasoning behind this alternative definition is to capture a different and supplementary aspect of purchase behavior.
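The definitions above can be illustrated with a minimal sketch that derives the three measures from raw transactions. The transaction tuples, the `as_of` analysis date, and the function name `rfm_measures` are hypothetical and chosen only for illustration. The sketch reports the total-value and the average-transaction-value variants of the monetary measure side by side.

```python
from datetime import date

# Hypothetical transactions: (card_id, purchase_date, amount)
transactions = [
    ("A", date(2024, 1, 5), 40.0),
    ("A", date(2024, 3, 20), 60.0),
    ("B", date(2024, 2, 10), 300.0),
]
as_of = date(2024, 4, 1)  # assumed analysis date


def rfm_measures(rows, as_of):
    """Return {card_id: measures} computed over all rows supplied."""
    grouped = {}
    for card_id, day, amount in rows:
        grouped.setdefault(card_id, []).append((day, amount))

    result = {}
    for card_id, items in grouped.items():
        last_purchase = max(day for day, _ in items)
        total = sum(amount for _, amount in items)
        result[card_id] = {
            # Recency: days since the most recent purchase
            "recency_days": (as_of - last_purchase).days,
            # Frequency: number of transactions in the period examined
            "frequency": len(items),
            # Monetary: total value of the purchases
            "monetary_total": total,
            # Alternative monetary definition: average value per transaction
            "avg_transaction_value": total / len(items),
        }
    return result


measures = rfm_measures(transactions, as_of)
```

In this toy data, customer B is an infrequent big spender: one transaction, but a higher average transaction value than customer A.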
The derived R, F, and M bins become the components of the RFM cell assignment: they are simply concatenated to form the cell label. Customers in the top quintile (5) on all three measures are assigned to cell 555. Similarly, customers with average recency (quintile 3), top frequency (quintile 5), and the lowest monetary values (quintile 1) form cell 351, and so on.
The 125 RFM cells (5 × 5 × 5) produced by binning into quintiles (groups of 20%).
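As a small illustration of the concatenation step, the sketch below builds a cell label from three quintile numbers and enumerates every possible cell. The function name `rfm_cell` is an assumption made for this example, not part of any tool mentioned here.

```python
def rfm_cell(r_bin, f_bin, m_bin):
    """Concatenate three quintile numbers (1 to 5) into a cell label such as '351'."""
    for value in (r_bin, f_bin, m_bin):
        if value not in range(1, 6):
            raise ValueError("bins must be integers from 1 to 5")
    return f"{r_bin}{f_bin}{m_bin}"


# Every combination of recency, frequency, and monetary quintiles
all_cells = [
    rfm_cell(r, f, m)
    for r in range(1, 6)
    for f in range(1, 6)
    for m in range(1, 6)
]
```

With quintiles, `all_cells` contains 5 × 5 × 5 = 125 distinct labels, from 111 to 555.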
$$\text{RFM score} = (\text{recency score} \times \text{recency weight}) + (\text{frequency score} \times \text{frequency weight}) + (\text{monetary score} \times \text{monetary weight})$$
As an example, consider the customer previously assigned to RFM cell 351. With equal weights of 10.0 for all three RFM components, this customer would receive a score of 90 on a scale ranging from 30 to 150. These scores can be rescaled to the 0 to 1 range according to the following formula:

$$\text{Rescaled RFM score} = \frac{\text{RFM score} - \text{minimum RFM score}}{\text{maximum RFM score} - \text{minimum RFM score}}$$

For the customer in cell 351, the rescaled score is (90 - 30) / (150 - 30) = 0.5.
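The weighted score and its rescaling can be checked with a few lines of code. This is only a sketch of the arithmetic described above; the function names `rfm_score` and `rescale` are hypothetical.

```python
def rfm_score(r, f, m, weights=(10.0, 10.0, 10.0)):
    """Weighted sum of the three quintile scores."""
    w_r, w_f, w_m = weights
    return r * w_r + f * w_f + m * w_m


def rescale(score, minimum, maximum):
    """Min-max rescaling of a score to the 0 to 1 range."""
    return (score - minimum) / (maximum - minimum)


score = rfm_score(3, 5, 1)     # customer in cell 351
lowest = rfm_score(1, 1, 1)    # worst possible cell, 111
highest = rfm_score(5, 5, 5)   # best possible cell, 555
rescaled = rescale(score, lowest, highest)
```

With equal weights of 10.0, `score` is 90.0, the scale runs from 30.0 to 150.0, and `rescaled` is 0.5. Unequal weights change both the score and the scale limits, so the minimum and maximum should be recomputed with the same weights.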
The IBM SPSS Modeler Aggregate node for summarizing purchase data at a customer level.
Date of the latest purchase transaction (the maximum, in the case of a date or timestamp field). This information was then used to derive recency as the number of days since the most recent purchase transaction. The IBM SPSS Modeler Derive node and a date function were used to return the number of days from the last transaction to the current date (represented by IBM SPSS Modeler's @TODAY function), as displayed in the screenshot.
Deriving the recency component of the RFM score with a date function in IBM SPSS Modeler.
Monthly average number of distinct purchase transactions. This information defined the frequency component of the RFM analysis. As shown in the screenshot, a conditional Derive node in IBM SPSS Modeler was used to divide the total number of transactions by the appropriate number of months: six months for old customers, and the number of months as a registered customer (tenure) for new customers.
Monthly average amount spent. This information defined the monetary component. It was calculated with a formula similar to the one used for the frequency measure.
Deriving the frequency component with a conditional derive node in IBM SPSS Modeler.
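Outside IBM SPSS Modeler, the same conditional logic could be sketched as follows. The function name `monthly_average` and its parameters are hypothetical, and the sketch assumes that tenure is measured in whole months and is at least one month.

```python
def monthly_average(total, tenure_months, window_months=6):
    """Monthly average over the observation window.

    Established customers are averaged over the full window (six months);
    newer customers are averaged over their tenure instead.
    """
    if tenure_months >= window_months:
        months = window_months
    else:
        months = tenure_months
    if months <= 0:
        raise ValueError("tenure must be at least one month")
    return total / months


# Frequency component: 12 transactions from a customer registered for 24 months
old_customer_frequency = monthly_average(12, tenure_months=24)
# Frequency component: 6 transactions from a customer registered for 3 months
new_customer_frequency = monthly_average(6, tenure_months=3)
# Monetary component: the same formula applied to the amount spent
new_customer_monetary = monthly_average(450.0, tenure_months=3)
```

Dividing by tenure avoids understating the purchase rate of customers who have not been registered for the full six-month window.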
4. Development of the RFM cells through binning: Customers were sorted independently according to each of the individual RFM components and then binned into five groups of 20%. The resulting bins (quintiles) were then combined (concatenated) to form the RFM cell assignment.
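The binning step can be sketched with a simple rank-based assignment. This is an illustrative approximation, not the procedure used by any particular tool: ties are broken by sort order here, which is an assumption, and the customer data and the name `quintile_bins` are hypothetical.

```python
def quintile_bins(values, higher_is_better=True, n_bins=5):
    """Map {customer: value} to {customer: bin}; the top bin holds the best 20%."""
    # Worst customers first, so that the best customers land in the top bin
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {
        customer: (position * n_bins) // n + 1
        for position, customer in enumerate(ordered)
    }


# Hypothetical measures for ten customers
customers = [f"c{i}" for i in range(10)]
recency_days = {c: 10 * (i + 1) for i, c in enumerate(customers)}
frequency = {c: i + 1 for i, c in enumerate(customers)}
monetary = {c: 100.0 - 10 * i for i, c in enumerate(customers)}

# Fewer days since the last purchase is better, so recency is ranked in reverse
r_bins = quintile_bins(recency_days, higher_is_better=False)
f_bins = quintile_bins(frequency)
m_bins = quintile_bins(monetary)

# Concatenate the three independent bins into the RFM cell
cells = {c: f"{r_bins[c]}{f_bins[c]}{m_bins[c]}" for c in customers}
```

Because each measure is sorted independently, a customer can fall in the top bin on one measure and the bottom bin on another. In this toy data, customer `c0` is the most recent and highest-spending but least frequent buyer, and lands in cell 515.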
IBM SPSS Modeler also offers a tool, the RFM Analysis node, that can directly group the R, F, and M measures into the selected number of bins (for example, quintiles). Users can then concatenate the derived ordinal scores to form the corresponding cell-based segmentation. This node can also sum the individual components, using user-defined weights, to produce a continuous RFM score.
5. Development of the RFM segments through clustering
The majority of customers with average RFM patterns were assigned to the Typical segment. The Dormant segment contained customers on the verge of inactivity with very low purchase rates and the worst RFM profile. At the other end stood the Superstars and the Golden customers. These were high-value customers with increased frequency of transactions. Superstars in particular seemed to be the most loyal ones, with an increased number of visits/transactions and a high preference for private label products.
Everyday shoppers made frequent but low-value transactions, probably to cover their daily needs. They also showed an increased preference for private label brands. Occasional customers, on the other hand, made infrequent visits to the store branches, but with a high average value per visit.
A deployment procedure was also developed to support future updating of the segments. The clustering model was supplemented with a classification model, specifically a decision tree, which identified the input patterns associated with each revealed RFM segment. The tree rules were saved for assigning new records to segments. The deployment plan also included a periodic cohort analysis, a type of before-after examination of the customer base, with simple reports that could identify the migrations of customers across segments over time.
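A before-after migration report of this kind could be sketched as a count of customers by their segment in two snapshots. The snapshot data and the function name `segment_migrations` are hypothetical. The sketch covers only the reporting step, not the clustering or the decision tree.

```python
from collections import Counter


def segment_migrations(before, after):
    """Count customers by (segment before, segment after).

    Only customers present in both snapshots are counted.
    """
    return Counter(
        (before[customer], after[customer])
        for customer in before
        if customer in after
    )


# Hypothetical segment assignments at two points in time
before = {"c1": "Typical", "c2": "Typical", "c3": "Golden", "c4": "Dormant"}
after = {
    "c1": "Typical",
    "c2": "Everyday shoppers",
    "c3": "Superstars",
    "c4": "Dormant",
    "c5": "Typical",  # new customer, absent from the first snapshot
}

migrations = segment_migrations(before, after)
```

Pairs with identical segments show customers who stayed put, and the remaining pairs show movements such as Golden to Superstars. New customers and customers who left would need separate handling.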
Superstars
The most loyal customers
Highest value
Highest frequency
High spending on private labels
Golden customers
Typical customers
Occasional customers
The second lowest frequency after Dormant customers
Large basket
Low recency values (long time since their last visit)
Everyday shoppers
Increased frequency of transactions
Small basket
Private labels
Medium to low value
Dormant customers
Lowest frequency and value
Long time since their last visit (lowest recency values)
An obvious drawback of this approach is that it usually ends up with almost the same target list of good customers, who could become annoyed with repeated contacts. Although useful, the RFM approach, when not combined with other important customer attributes such as product preferences, fails to provide a complete understanding of customer behavior. An enterprise should have a complete view of the customer and use all the available information to guide its business decisions.