Customer segmentation – LifeCycle Grids, CLV and CAC with R (repost)

In the previous post we studied a very powerful approach to customer segmentation, based on the customer’s lifecycle. We used two metrics: frequency and recency. It is also possible, and very helpful, to add monetary value to the segmentation. If you have customer acquisition cost (CAC) and customer lifetime value (CLV), you can easily add these data to the calculations.

We will create the same data sample as in the previous post, but with two added data frames:

cac, our expenses for each customer acquisition,
gr.margin, gross margin of each product.

# loading libraries
library(dplyr)
library(reshape2)
library(ggplot2)

# creating data sample
set.seed(10)
# note: the prob weights for product do not sum to 1; sample() rescales them internally
data <- data.frame(orderId=sample(c(1:1000), 5000, replace=TRUE),
                   product=sample(c('NULL','a','b','c'), 5000, replace=TRUE,
                                  prob=c(0.15, 0.65, 0.3, 0.15)))
order <- data.frame(orderId=c(1:1000),
                    clientId=sample(c(1:300), 1000, replace=TRUE))
gender <- data.frame(clientId=c(1:300),
                     gender=sample(c('male', 'female'), 300, replace=TRUE, prob=c(0.40, 0.60)))
date <- data.frame(orderId=c(1:1000),
                   orderdate=sample((1:100), 1000, replace=TRUE))
orders <- merge(data, order, by='orderId')
orders <- merge(orders, gender, by='clientId')
orders <- merge(orders, date, by='orderId')
# dropping the placeholder 'NULL' product rows
orders <- orders[orders$product!='NULL', ]
orders$orderdate <- as.Date(orders$orderdate, origin="2012-01-01")

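# creating data frames with CAC and Gross margin
# one CAC value per client; the number of clients is taken from the data
# instead of a hardcoded 289, which fails if the random sample differs
cac <- data.frame(clientId=unique(orders$clientId),
                  cac=sample(c(10:15), length(unique(orders$clientId)), replace=TRUE))
gr.margin <- data.frame(product=c('a', 'b', 'c'), grossmarg=c(1, 2, 3))
rm(data, date, order, gender)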

Next, we will calculate CLV to date (actual amount that we earned) using gross margin values and orders of the products. We will use the following code:

# reporting date
today <- as.Date('2012-04-11', format='%Y-%m-%d')

# calculating customer lifetime value
orders <- merge(orders, gr.margin, by='product')
clv <- orders %>%
  group_by(clientId) %>%
  summarise(clv=sum(grossmarg))
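As a small worked example of the CLV-to-date calculation above, the sketch below uses a made-up toy data set. The names toy.orders, toy.margin and toy.clv and the five sample rows are assumptions for illustration only, and base R's aggregate() stands in for the dplyr pipeline so the snippet runs without extra packages.

```r
# hypothetical orders: one row per purchased product
toy.orders <- data.frame(clientId = c(1, 1, 2, 2, 2),
                         product  = c('a', 'c', 'b', 'b', 'a'))

# gross margin per product, same values as gr.margin in the post
toy.margin <- data.frame(product = c('a', 'b', 'c'), grossmarg = c(1, 2, 3))

# attach the gross margin to every purchased product
toy.orders <- merge(toy.orders, toy.margin, by = 'product')

# CLV to date = sum of gross margins over everything a client has bought
toy.clv <- aggregate(grossmarg ~ clientId, data = toy.orders, FUN = sum)
names(toy.clv)[2] <- 'clv'
print(toy.clv)
```

Client 1 bought products a and c (1 + 3), client 2 bought b twice and a once (2 + 2 + 1), so their CLV to date is 4 and 5 respectively.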

# processing data
orders <- dcast(orders, orderId + clientId + gender + orderdate ~ product, value.var='product', fun.aggregate=length)

orders <- orders %>%
  group_by(clientId) %>%
  mutate(frequency=n(),
         recency=as.numeric(today-orderdate)) %>%
  filter(orderdate==max(orderdate)) %>%
  filter(orderId==max(orderId))

orders.segm <- orders %>%
  mutate(segm.freq=ifelse(between(frequency, 1, 1), '1',
                   ifelse(between(frequency, 2, 2), '2',
                   ifelse(between(frequency, 3, 3), '3',
                   ifelse(between(frequency, 4, 4), '4',
                   ifelse(between(frequency, 5, 5), '5', '>5')))))) %>%
  mutate(segm.rec=ifelse(between(recency, 0, 6), '0-6 days',
                  ifelse(between(recency, 7, 13), '7-13 days',
                  ifelse(between(recency, 14, 19), '14-19 days',
                  ifelse(between(recency, 20, 45), '20-45 days',
                  ifelse(between(recency, 46, 80), '46-80 days', '>80 days')))))) %>%
  # creating last cart feature
  mutate(cart=paste(ifelse(a!=0, 'a', ''),
                    ifelse(b!=0, 'b', ''),
                    ifelse(c!=0, 'c', ''), sep='')) %>%
  arrange(clientId)

# defining order of boundaries
orders.segm$segm.freq <- factor(orders.segm$segm.freq, levels=c('>5', '5', '4', '3', '2', '1'))
orders.segm$segm.rec <- factor(orders.segm$segm.rec, levels=c('>80 days', '46-80 days', '20-45 days', '14-19 days', '7-13 days', '0-6 days'))
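The nested ifelse() calls used for the recency boundaries can also be expressed with base R's cut(). This is an alternative sketch, not the code used in the post; the recency vector below is made up, and the approach assumes recency is measured in whole days.

```r
# hypothetical recency values (whole days since the last order)
recency <- c(0, 6, 7, 13, 14, 19, 20, 45, 46, 80, 81, 100)

# cut() builds right-closed intervals (lo, hi] by default, so starting the
# breaks at -1 reproduces the inclusive integer ranges 0-6, 7-13, and so on
segm.rec <- cut(recency,
                breaks = c(-1, 6, 13, 19, 45, 80, Inf),
                labels = c('0-6 days', '7-13 days', '14-19 days',
                           '20-45 days', '46-80 days', '>80 days'))

# reverse the level order to match the boundary order defined in the post
segm.rec <- factor(segm.rec, levels = rev(levels(segm.rec)))
print(table(segm.rec))
```

With cut() the boundaries live in a single breaks vector, which may be easier to adjust than a chain of ifelse() calls when the grid is re-tuned.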

Note: if you prefer to use potential/expected/predicted CLV or total CLV (sum of CLV to date and potential CLV) you can adapt this code or find the example in the next post.

In addition, we need to merge orders.segm with the CAC and CLV data, and combine the data with the segments. We will calculate total CAC and CLV to date, as well as their average with the following code:

orders.segm <- merge(orders.segm, cac, by='clientId')
orders.segm <- merge(orders.segm, clv, by='clientId')

lcg.clv <- orders.segm %>%
  group_by(segm.rec, segm.freq) %>%
  summarise(quantity=n(),
            # calculating cumulative CAC and CLV
            cac=sum(cac),
            clv=sum(clv)) %>%
  ungroup() %>%
  # calculating CAC and CLV per client
  mutate(cac1=round(cac/quantity, 2),
         clv1=round(clv/quantity, 2))

lcg.clv <- melt(lcg.clv, id.vars=c('segm.rec', 'segm.freq', 'quantity'))
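To make the totals and per-client averages concrete, here is a minimal base R sketch on three made-up clients. The toy, toy.lcg and toy.n names and all values are hypothetical; the column names cac, clv, cac1, clv1 and quantity mirror those used in the post.

```r
# hypothetical clients that already carry their lifecycle segments
toy <- data.frame(segm.rec  = c('0-6 days', '0-6 days', '>80 days'),
                  segm.freq = c('1', '1', '>5'),
                  cac       = c(10, 14, 12),
                  clv       = c(3, 5, 20))

# total CAC and CLV to date per grid cell
toy.lcg <- aggregate(cbind(cac, clv) ~ segm.rec + segm.freq, data = toy, FUN = sum)

# number of clients per grid cell
toy.n <- aggregate(cac ~ segm.rec + segm.freq, data = toy, FUN = length)
names(toy.n)[3] <- 'quantity'
toy.lcg <- merge(toy.lcg, toy.n, by = c('segm.rec', 'segm.freq'))

# average CAC and CLV to date per client
toy.lcg$cac1 <- round(toy.lcg$cac / toy.lcg$quantity, 2)
toy.lcg$clv1 <- round(toy.lcg$clv / toy.lcg$quantity, 2)
print(toy.lcg)
```

The "0-6 days / 1 order" cell holds two clients, so its totals are 24 (CAC) and 8 (CLV) and its averages are 12 and 4.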

Ok, let’s plot two charts: the first one representing the totals and the second one representing the averages:

ggplot(lcg.clv[lcg.clv$variable %in% c('clv', 'cac'), ], aes(x=variable, y=value, fill=variable)) +
  theme_bw() +
  theme(panel.grid = element_blank()) +
  geom_bar(stat='identity', alpha=0.6, aes(width=quantity/max(quantity))) +
  geom_text(aes(y=value, label=value), size=4) +
  facet_grid(segm.freq ~ segm.rec) +
  ggtitle("LifeCycle Grids - CLV vs CAC (total)")

ggplot(lcg.clv[lcg.clv$variable %in% c('clv1', 'cac1'), ], aes(x=variable, y=value, fill=variable)) +
  theme_bw() +
  theme(panel.grid = element_blank()) +
  geom_bar(stat='identity', alpha=0.6, aes(width=quantity/max(quantity))) +
  geom_text(aes(y=value, label=value), size=4) +
  facet_grid(segm.freq ~ segm.rec) +
  ggtitle("LifeCycle Grids - CLV vs CAC (average)")

In the grids, the width of each bar depends on the number of customers in that cell. I think these visualizations are very helpful. You can see the difference between CLV to date and CAC and make decisions about paid campaigns or initiatives, such as:

does it make sense to spend extra money to reactivate some customers (e.g. those in the “1 order / >80 days” cell or those in the “>5 orders / 20-45 days” cell)?
how much money is appropriate to spend?
and so on.
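One simple way to read the averages when weighing such questions is to subtract the average CAC from the average CLV to date in each cell. The sketch below is only an illustration: the segm data frame, the net1 column and all numbers are made up, not results from the post.

```r
# hypothetical per-client averages for the two cells mentioned above
segm <- data.frame(cell = c('1 order / >80 days', '>5 orders / 20-45 days'),
                   cac1 = c(12.5, 12.0),
                   clv1 = c(2.0, 25.0))

# net value earned per client so far;
# a negative value means the acquisition cost has not been recovered yet
segm$net1 <- segm$clv1 - segm$cac1
print(segm)
```

In this made-up case the first cell is still 10.5 below break-even per client, while the second has earned 13 above its acquisition cost.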

As a result, we have a very interesting visualization. We can analyze and make decisions based on three customer lifecycle metrics: recency, frequency and monetary value.

Thank you for reading this!

Reposted from: http://analyzecore.com/2015/02/19/customer-segmentation-lifecycle-grids-clv-and-cac-with-r/
