Customer segmentation – LifeCycle Grids, CLV and CAC with R (repost)
In the previous post we studied a very powerful approach to customer segmentation that is based on the customer's lifecycle. We used two metrics: frequency and recency. It is also possible, and very helpful, to add monetary value to the segmentation. If you have customer acquisition cost (CAC) and customer lifetime value (CLV), you can easily add these data to the calculations.
We will create the same data sample as in the previous post, but with two added data frames:
- cac, our expenses for each customer acquisition,
- gr.margin, gross margin of each product.
    # loading libraries
    library(dplyr)
    library(reshape2)
    library(ggplot2)

    # creating data sample
    set.seed(10)
    data <- data.frame(orderId=sample(c(1:1000), 5000, replace=TRUE),
                       product=sample(c('NULL', 'a', 'b', 'c'), 5000, replace=TRUE,
                                      prob=c(0.15, 0.65, 0.3, 0.15)))
    order <- data.frame(orderId=c(1:1000),
                        clientId=sample(c(1:300), 1000, replace=TRUE))
    gender <- data.frame(clientId=c(1:300),
                         gender=sample(c('male', 'female'), 300, replace=TRUE, prob=c(0.40, 0.60)))
    date <- data.frame(orderId=c(1:1000),
                       orderdate=sample((1:100), 1000, replace=TRUE))
    orders <- merge(data, order, by='orderId')
    orders <- merge(orders, gender, by='clientId')
    orders <- merge(orders, date, by='orderId')
    orders <- orders[orders$product!='NULL', ]
    orders$orderdate <- as.Date(orders$orderdate, origin="2012-01-01")

    # creating data frames with CAC and Gross margin
    # one CAC value per client; the count is taken from the data so the
    # columns always have matching lengths
    cac <- data.frame(clientId=unique(orders$clientId),
                      cac=sample(c(10:15), length(unique(orders$clientId)), replace=TRUE))
    gr.margin <- data.frame(product=c('a', 'b', 'c'), grossmarg=c(1, 2, 3))

    rm(data, date, order, gender)
Next, we will calculate CLV to date (the actual amount that we have earned so far) using the gross margin values and the products ordered. We will use the following code:
    # reporting date
    today <- as.Date('2012-04-11', format='%Y-%m-%d')

    # calculating customer lifetime value
    orders <- merge(orders, gr.margin, by='product')

    clv <- orders %>%
      group_by(clientId) %>%
      summarise(clv=sum(grossmarg))

    # processing data
    orders <- dcast(orders, orderId + clientId + gender + orderdate ~ product,
                    value.var='product', fun.aggregate=length)

    orders <- orders %>%
      group_by(clientId) %>%
      mutate(frequency=n(),
             recency=as.numeric(today-orderdate)) %>%
      filter(orderdate==max(orderdate)) %>%
      filter(orderId==max(orderId))

    orders.segm <- orders %>%
      mutate(segm.freq=ifelse(between(frequency, 1, 1), '1',
                       ifelse(between(frequency, 2, 2), '2',
                       ifelse(between(frequency, 3, 3), '3',
                       ifelse(between(frequency, 4, 4), '4',
                       ifelse(between(frequency, 5, 5), '5', '>5')))))) %>%
      mutate(segm.rec=ifelse(between(recency, 0, 6), '0-6 days',
                      ifelse(between(recency, 7, 13), '7-13 days',
                      ifelse(between(recency, 14, 19), '14-19 days',
                      ifelse(between(recency, 20, 45), '20-45 days',
                      ifelse(between(recency, 46, 80), '46-80 days', '>80 days')))))) %>%
      # creating last cart feature
      mutate(cart=paste(ifelse(a!=0, 'a', ''),
                        ifelse(b!=0, 'b', ''),
                        ifelse(c!=0, 'c', ''), sep='')) %>%
      arrange(clientId)

    # defining order of boundaries
    orders.segm$segm.freq <- factor(orders.segm$segm.freq,
                                    levels=c('>5', '5', '4', '3', '2', '1'))
    orders.segm$segm.rec <- factor(orders.segm$segm.rec,
                                   levels=c('>80 days', '46-80 days', '20-45 days',
                                            '14-19 days', '7-13 days', '0-6 days'))
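The nested ifelse() calls above assign each client to a frequency band and a recency band. As an alternative sketch (not the method used in this post), base R's cut() can express the same boundaries more compactly. The vectors below are made-up values for illustration, and the sketch assumes that recency is a non-negative whole number of days, in which case the bands match the ones defined above:

```r
# Hypothetical frequency and recency values for six clients
frequency <- c(1, 2, 5, 6, 3, 12)
recency   <- c(0, 6, 7, 19, 45, 100)

# cut() uses right-closed intervals by default: (0,1], (1,2], ..., (5,Inf]
segm.freq <- cut(frequency,
                 breaks = c(0, 1, 2, 3, 4, 5, Inf),
                 labels = c('1', '2', '3', '4', '5', '>5'))

# (-Inf,6], (6,13], (13,19], (19,45], (45,80], (80,Inf]
segm.rec <- cut(recency,
                breaks = c(-Inf, 6, 13, 19, 45, 80, Inf),
                labels = c('0-6 days', '7-13 days', '14-19 days',
                           '20-45 days', '46-80 days', '>80 days'))

# Reverse the level order so it matches the grid layout used in the post
segm.freq <- factor(segm.freq, levels = rev(levels(segm.freq)))
segm.rec  <- factor(segm.rec,  levels = rev(levels(segm.rec)))

print(data.frame(frequency, segm.freq, recency, segm.rec))
```

With fractional or negative recency values the two approaches would diverge, so the ifelse() version remains the reference here.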
Note: if you prefer to use potential (expected, predicted) CLV or total CLV (the sum of CLV to date and potential CLV), you can adapt this code or find an example in the next post.
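Before swapping in a different CLV definition, it can help to see what CLV to date amounts to on a sample small enough to check by hand. The sketch below uses a made-up table of order lines (the toy.* names and values are assumptions for illustration) with the same gross margins as in the post, and uses base R's aggregate() instead of dplyr so that it runs without extra packages:

```r
# Toy order lines: one row per product sold (hypothetical values)
toy.orders <- data.frame(
  clientId = c(1, 1, 1, 2, 2, 3),
  product  = c('a', 'b', 'c', 'a', 'a', 'c')
)

# Same gross margins as in the post
toy.margin <- data.frame(product = c('a', 'b', 'c'), grossmarg = c(1, 2, 3))

# Attach the gross margin to every order line, then sum it per client
toy.lines <- merge(toy.orders, toy.margin, by = 'product')
toy.clv <- aggregate(grossmarg ~ clientId, data = toy.lines, FUN = sum)
names(toy.clv)[2] <- 'clv'

print(toy.clv)
```

Client 1 bought one unit of each product (1 + 2 + 3 = 6), client 2 bought product a twice (2), and client 3 bought product c once (3).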
In addition, we need to merge orders.segm with the CAC and CLV data, and combine the data with the segments. We will calculate total CAC and CLV to date, as well as their average with the following code:
    orders.segm <- merge(orders.segm, cac, by='clientId')
    orders.segm <- merge(orders.segm, clv, by='clientId')

    lcg.clv <- orders.segm %>%
      group_by(segm.rec, segm.freq) %>%
      summarise(quantity=n(),
                # calculating cumulative CAC and CLV
                cac=sum(cac),
                clv=sum(clv)) %>%
      ungroup() %>%
      # calculating CAC and CLV per client
      mutate(cac1=round(cac/quantity, 2),
             clv1=round(clv/quantity, 2))

    lcg.clv <- melt(lcg.clv, id.vars=c('segm.rec', 'segm.freq', 'quantity'))
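To see what the totals and the per-client averages look like for a single grid cell, here is a minimal base R sketch on five made-up clients. The toy.* names and all values are assumptions for illustration; the column names cac, clv, quantity, cac1 and clv1 mirror the ones used above:

```r
# Five hypothetical clients that have already been assigned to grid cells
toy.segm <- data.frame(
  segm.rec  = c('0-6 days', '0-6 days', '>80 days', '>80 days', '>80 days'),
  segm.freq = c('2', '2', '1', '1', '1'),
  cac       = c(10, 14, 12, 15, 12),
  clv       = c(6, 9, 1, 2, 3)
)

# Each row is one client, so summing a column of ones counts clients per cell
toy.segm$quantity <- 1

# Totals per grid cell
toy.lcg <- aggregate(cbind(quantity, cac, clv) ~ segm.rec + segm.freq,
                     data = toy.segm, FUN = sum)

# Averages per client
toy.lcg$cac1 <- round(toy.lcg$cac / toy.lcg$quantity, 2)
toy.lcg$clv1 <- round(toy.lcg$clv / toy.lcg$quantity, 2)

print(toy.lcg)
```

In this toy data the '>80 days' / '1' cell holds three clients with a total CAC of 39 and a total CLV to date of 6, which gives averages of 13 and 2 per client.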
Ok, let’s plot two charts: the first one representing the totals and the second one representing the averages:
    ggplot(lcg.clv[lcg.clv$variable %in% c('clv', 'cac'), ], aes(x=variable, y=value, fill=variable)) +
      theme_bw() +
      theme(panel.grid = element_blank()) +
      geom_bar(stat='identity', alpha=0.6, aes(width=quantity/max(quantity))) +
      geom_text(aes(y=value, label=value), size=4) +
      facet_grid(segm.freq ~ segm.rec) +
      ggtitle("LifeCycle Grids - CLV vs CAC (total)")

    ggplot(lcg.clv[lcg.clv$variable %in% c('clv1', 'cac1'), ], aes(x=variable, y=value, fill=variable)) +
      theme_bw() +
      theme(panel.grid = element_blank()) +
      geom_bar(stat='identity', alpha=0.6, aes(width=quantity/max(quantity))) +
      geom_text(aes(y=value, label=value), size=4) +
      facet_grid(segm.freq ~ segm.rec) +
      ggtitle("LifeCycle Grids - CLV vs CAC (average)")
In the grids, the width of each bar depends on the number of customers in that cell. I think these visualizations are very helpful. You can see the difference between CLV to date and CAC and make decisions about paid campaigns or initiatives, such as:
- does it make sense to spend extra money to reactivate some customers (e.g. those who are in the "1 order / >80 days" cell or those who are in the ">5 orders / 20-45 days" cell)?
- how much money is appropriate to spend?
- and so on.
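One rough way to approach these questions is to compare the average CLV to date with the average CAC in each cell. The sketch below is only an illustration: the numbers are made up, and the headroom column (average CLV to date minus average CAC) is an assumption introduced here, not a metric defined in this post. A negative value means that clients in the cell have not yet paid back their acquisition cost:

```r
# Hypothetical per-client averages for two grid cells
toy.cells <- data.frame(
  cell = c('1 order / >80 days', '>5 orders / 20-45 days'),
  cac1 = c(12.5, 12.0),
  clv1 = c(2.1, 14.3)
)

# Average CLV to date minus average CAC (illustrative helper column)
toy.cells$headroom <- toy.cells$clv1 - toy.cells$cac1

# TRUE when the average client in the cell has covered the acquisition cost
toy.cells$paid.back <- toy.cells$headroom >= 0

print(toy.cells)
```

How much to spend on reactivation still depends on the expected future value of these clients, which CLV to date alone does not capture.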
As a result, we have a very interesting visualization. We can analyze it and make decisions based on three customer lifecycle metrics: recency, frequency and monetary value.
Thank you for reading this!
Reposted from: http://analyzecore.com/2015/02/19/customer-segmentation-lifecycle-grids-clv-and-cac-with-r/