Thursday, December 29, 2016

33. SCORING MODELS

OBJECTIVE
Define the priority of action concerning customers, employees, products, and so on.


DESCRIPTION
Scoring models help to decide which elements to act on first, based on the score that each one obtains. For example, we can create a scoring model to prevent employees from leaving the company, in which the score depends on both the probability of leaving and the performance (we will act first on those employees who have a higher probability of leaving and are important to the company). Scoring models are also quite useful in marketing; for example, we can score customers based on their probability of responding positively to a telemarketing call and, based on our resources, call just the first “X” customers.
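As a minimal sketch of the "call just the first X customers" idea, the snippet below picks the X highest-scoring customers from a dictionary of scores. The customer names, the scores, and the helper name top_x are invented for illustration only.

```python
import heapq

def top_x(scores, x):
    """Return the x ids with the highest score, best first."""
    return heapq.nlargest(x, scores, key=scores.get)

# Hypothetical response probabilities for a telemarketing campaign.
response_probability = {"ana": 0.62, "ben": 0.15, "cho": 0.81, "dev": 0.40}

# With resources for only two calls, we call the two best-scoring customers.
to_call = top_x(response_probability, 2)
print(to_call)
```

If the whole list has to be ranked anyway, a plain sort does the same job; heapq.nlargest is just convenient when X is small compared with the database.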

The model that I propose scores customers’ value based on the probability that they will purchase a product and on the amount that they are likely to spend. It is the result of two sub-models:

  • Purchase probability: We will use a logistic regression to estimate the purchase probability of a customer in the next period (see 26. RFM MODEL and 60. LOGISTIC REGRESSION);
  • Amount: We will use a linear regression to estimate the amount that each customer is likely to spend on his or her next purchase (see 38. LINEAR REGRESSION).

The first step is to choose the predictor variables. In our case I suggest using recency, first purchase, frequency, average amount, and maximum amount of year -2, but we could try additional or different variables. The target variable will be a binary variable that represents whether the customer made a purchase during the following period (year -1). We then run a logistic regression, transforming the variables where necessary and after verifying that all the necessary assumptions are met (see 36. INTRODUCTION TO REGRESSIONS and 60. LOGISTIC REGRESSION).
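The data preparation described above could be sketched as follows. This is only an illustration under several assumptions: the transaction tuples are invented, year -2 is taken to be 2015 and year -1 to be 2016, recency and first purchase are measured in days up to the end of year -2, and the function names build_features and build_target are hypothetical.

```python
from datetime import date

def build_features(transactions, period_end):
    """Summarise each customer's purchases made on or before period_end."""
    by_customer = {}
    for customer_id, day, amount in transactions:
        if day <= period_end:
            by_customer.setdefault(customer_id, []).append((day, amount))
    features = {}
    for customer_id, purchases in by_customer.items():
        days = [d for d, _ in purchases]
        amounts = [a for _, a in purchases]
        features[customer_id] = {
            "recency": (period_end - max(days)).days,         # days since last purchase
            "first_purchase": (period_end - min(days)).days,  # days since first purchase
            "frequency": len(purchases),
            "avg_amount": sum(amounts) / len(amounts),
            "max_amount": max(amounts),
        }
    return features

def build_target(transactions, features, start, end):
    """1 if the customer bought between start and end (inclusive), else 0."""
    buyers = {c for c, d, _ in transactions if start <= d <= end}
    return {c: int(c in buyers) for c in features}

# Invented transactions: (customer id, purchase date, amount).
transactions = [
    ("A", date(2015, 3, 1), 50.0),
    ("A", date(2015, 12, 1), 70.0),
    ("A", date(2016, 6, 15), 60.0),
    ("B", date(2015, 7, 10), 20.0),
]

features = build_features(transactions, period_end=date(2015, 12, 31))
target = build_target(transactions, features, date(2016, 1, 1), date(2016, 12, 31))
```

The features dictionary holds the predictor variables and target holds the binary variable; both can then be passed to whatever tool is used to run the logistic regression.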

Coefficients of the Logistic and Linear Regressions

In the second part of the model, we can use, for example, only the average amount and the maximum amount of year -2 as predictors, with the total amount spent in year -1 as the target variable. We run a multiple linear regression, transforming the variables where necessary and after verifying that all the necessary assumptions are met (see 36. INTRODUCTION TO REGRESSIONS, 38. LINEAR REGRESSION, and 39. OTHER REGRESSIONS). It is important to note that in this regression we will not use the whole customer database but select only those customers who made a purchase in year -1.
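To illustrate the point about using only the customers who purchased in year -1, here is a rough sketch that filters the database and then fits the linear regression by ordinary least squares. The customer records are made up (the amounts were generated from known coefficients so the fit can be checked), and fit_ols is a hypothetical helper written for this example; in practice a spreadsheet or statistics package would do the fitting.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]  # [intercept, coef 1, coef 2, ...]

# Invented customers; amount_y1 = 0 means no purchase in year -1.
customers = [
    {"avg_amount": 20, "max_amount": 30, "amount_y1": 0},
    {"avg_amount": 40, "max_amount": 60, "amount_y1": 37},
    {"avg_amount": 10, "max_amount": 50, "amount_y1": 20},
    {"avg_amount": 30, "max_amount": 30, "amount_y1": 26},
    {"avg_amount": 50, "max_amount": 90, "amount_y1": 48},
    {"avg_amount": 25, "max_amount": 40, "amount_y1": 0},
]

# Keep only the customers who made a purchase in year -1.
purchasers = [c for c in customers if c["amount_y1"] > 0]
rows = [(c["avg_amount"], c["max_amount"]) for c in purchasers]
targets = [c["amount_y1"] for c in purchasers]

coefficients = fit_ols(rows, targets)
print(coefficients)
```

Including the non-purchasers, whose amount is zero, would pull the estimated amounts down and mix the "will they buy?" question into a model that is only meant to answer "how much?".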

The last step is to put together the two regressions to score customers based on both their purchase probability and the likely amount that they will spend. We use the regression coefficients (see the figure Coefficients of the Logistic and Linear Regressions) to compute the estimates for each customer. In the linear regression, we multiply each variable’s coefficient by the customer’s actual value for that variable and add the intercept to estimate the amount.[1] In the logistic regression, however, the same sum gives only the log-odds of purchasing, so we need the exponential function to convert it into a probability:

Probability = 1 / (1 + exp(-(Intercept + Coefficient 1 * Variable 1 + ... + Coefficient n * Variable n)))
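The two estimates can be sketched in a few lines of code. The coefficient values below are hypothetical placeholders, not the ones from the regressions in this chapter, and the function names are my own.

```python
import math

def linear_predictor(intercept, coefficients, values):
    """Intercept plus the sum of coefficient * value for each variable."""
    return intercept + sum(c * v for c, v in zip(coefficients, values))

def estimated_amount(intercept, coefficients, values):
    """Linear regression: the linear predictor is the estimate itself."""
    return linear_predictor(intercept, coefficients, values)

def purchase_probability(intercept, coefficients, values):
    """Logistic regression: turn the log-odds into a probability."""
    z = linear_predictor(intercept, coefficients, values)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and one customer's values.
amount = estimated_amount(5.0, [0.5, 0.2], [40, 60])           # avg, max amount
probability = purchase_probability(-1.0, [-0.01, 0.3], [30, 2])  # recency, frequency
print(round(amount, 2), round(probability, 4))
```

Both functions share the same weighted sum; the only difference is the final logistic step, which keeps the purchase probability between 0 and 1.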


Result Table with the Purchase Probability, Estimated Amount, and Final Score

Now that we have two more columns in our database, we just need to add a third one for the final score, which will be the purchase probability times the estimated amount (Figure above). With this indicator we can either rank our customers (to prioritize marketing and resource allocation for some customers) or use this indicator to estimate next-period revenues.
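A small sketch of this last step, with invented probabilities and amounts for three customers: the score column is the product of the two estimates, the ranking is a sort on that column, and the sum of the scores gives a rough next-period revenue estimate.

```python
# Invented results of the two sub-models for three customers.
customers = [
    {"id": "A", "probability": 0.80, "amount": 50.0},
    {"id": "B", "probability": 0.05, "amount": 400.0},
    {"id": "C", "probability": 0.50, "amount": 120.0},
]

# Final score = purchase probability * estimated amount.
for c in customers:
    c["score"] = c["probability"] * c["amount"]

# Rank customers from the highest to the lowest score.
ranking = sorted(customers, key=lambda c: c["score"], reverse=True)
ranked_ids = [c["id"] for c in ranking]

# Summing the scores gives an estimate of next-period revenue.
expected_revenue = sum(c["score"] for c in customers)
print(ranked_ids, round(expected_revenue, 2))
```

Note how customer B, who has by far the largest estimated amount, ends up last because the purchase is so unlikely; this is exactly the trade-off that the combined score is meant to capture.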






[1] Estimated amount = Intercept + Coefficient 1 * Variable 1 + Coefficient 2 * Variable 2.
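Be aware that, if we have transformed some of the variables, we cannot simply multiply the coefficient by the raw value: we must first apply the same transformation to the customer’s value (for example, take its logarithm) and, if the target variable was transformed, convert the estimate back to the original scale.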
