I want to describe a forecasting approach that I really like and use frequently. It makes it possible to include the effect of promo campaigns (or other activities and variables) in the prediction of a total amount. I will use a fictitious example and data in this post, but the approach works really well with my real data. So you can adapt this algorithm to your requirements and test it. It should also be approachable for non-math people because it is almost completely automated.

Suppose we sell a service and our business depends on the number of subscribers we attract. Naturally, we track the number of customers and want to predict it. If we know the customer lifetime value (CLV), we can predict total revenue from the number of customers and the CLV. So the case looks justified.

The first method we could use to solve this problem is multiple regression. We can find a large number of relevant indicators that influence the number of subscribers: service price, seasonality, promotional activities, even the S&P or Dow Jones index, etc. Once we have found all the parameters that affect the number of customers, we can fit a formula and predict the number of customers.
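To make the regression idea concrete, here is a minimal sketch in R. The data frame, its column names (price, promo, month, customers) and all numbers are made up for illustration; they are not taken from the data set used later in this post.

```r
# Hypothetical data: every column and number below is invented for illustration
set.seed(1)
n <- 36
d <- data.frame(
  price = runif(n, 8, 12),      # average service price
  promo = rbinom(n, 1, 0.25),   # 1 if a promo ran in that month
  month = factor(rep(month.abb, length.out = n), levels = month.abb)  # seasonality
)
d$customers <- 1000 - 30 * d$price + 200 * d$promo + rnorm(n, sd = 20)

# Multiple regression: customers as a linear function of the indicators
fit <- lm(customers ~ price + promo + month, data = d)

# To predict, we must supply a value for every indicator in the forecast period
new_month <- data.frame(price = 10, promo = 1,
                        month = factor("Jan", levels = month.abb))
pred <- predict(fit, newdata = new_month)
```

Note that the last step needs a future value for each indicator, which is where the disadvantages of this approach come from.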

This approach has disadvantages:

  • we should collect all these indicators in one place and have historical data of all of them,
  • they should be measured at the same time intervals,
  • most importantly, we would have to predict all of these indicators as well. If our customers buy several different packages of our service for different periods, even our average price is hard to predict (not to mention the S&P index). And if we use predicted indicators, their prediction errors will carry over into the final prediction.

On the other hand, stock market analysts use time-series forecasting. They accept the fact that stock prices are influenced by a great number of indicators, so they look for dependencies inside the price curve itself. This approach is not fully suitable for us either. If we regularly attracted extra customers via promos (we can see peaks on the curve), time-series algorithms can treat those peaks as seasonality and draw a future curve with the same peaks. But what if we are not planning promos in those periods, or we are going to run extra promos or change their intensity?

One final point before we start working on our prediction algorithm: I’m sure it is important for marketers to see how their promos or activities affect the number of customers (subscribers in our case) or revenue.

So our task is to create a model that, on the one hand, doesn’t depend on a large number of predictors (like time-series forecasting) and, on the other hand, includes the effect of promos on the total number of subscribers (like regression).

My answer is the extended ARIMA model. ARIMA stands for AutoRegressive Integrated Moving Average. “Extended” means we can include some other information in time-series forecasting based on the ARIMA model. In our case, the other information is the effect of the promos we have run and the ones we plan to run in the future. If we repeated promo campaigns every year in the same period and got approximately the same number of new customers, a plain (not extended) ARIMA model would be enough, because it should recognize the peaks as seasonality. We won’t review that case here.
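As a rough illustration of what “extended” means, the sketch below fits an ARIMA model with one external regressor using base R’s stats::arima and its predict method. The data are synthetic, and the fixed order c(1, 0, 0) and the variable names (promo_effect, planned) are my assumptions for the example; the post itself uses auto.arima from the ‘forecast’ package, which chooses the order automatically.

```r
# Synthetic illustration: monthly growth with an irregular promo-driven component
set.seed(42)
n <- 36
promo_effect <- rep(0, n)
promo_effect[c(5, 14, 20, 33)] <- c(400, 650, 300, 500)  # promos at irregular months
growth <- ts(500 + promo_effect + rnorm(n, sd = 30),
             start = c(2010, 1), frequency = 12)

# ARIMA(1,0,0) with the promo effect as an external regressor
fit <- arima(growth, order = c(1, 0, 0), xreg = promo_effect, method = "ML")

# Forecast 3 months ahead with a promo planned for the second month only
planned <- c(0, 600, 0)
pred <- predict(fit, n.ahead = 3, newxreg = planned)$pred
```

Because the promo effect enters through the regressor, the forecast peaks only in the month where a promo is planned, rather than wherever peaks happened to fall in the past.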

Let’s start. Suppose our data is:

We have (from left to right):

  • # of period,
  • year,
  • month,
  • number of subscribers,
  • monthly growth (difference between the number of subscribers in consecutive months),
  • extended (sum of the promo effects),
  • several types of promo campaigns that affected the number of customers (promo1, promo2, etc.). You can also see that some subscribers from a particular promo have left (negative numbers). When we run special low-price promos, we know that some of these customers won’t extend their subscriptions. So this example includes the negative effect of promo campaigns as well.
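The two derived columns can be computed from the raw ones. The sketch below uses a small made-up data frame in the layout just described; the column names and numbers are hypothetical, and it assumes that the growth for a month is the change in subscribers versus the previous month.

```r
# Hypothetical rows in the layout described above; all numbers are invented
df <- data.frame(
  Period      = 1:5,
  Year        = 2010,
  Month       = 1:5,
  Subscribers = c(10000, 10450, 11600, 12150, 12700),
  Promo1      = c(0, 0, 700, 0, 0),
  Promo2      = c(0, 0, 0, -40, 0)   # negative: promo subscribers who left
)

# Monthly growth (assumption: change versus the previous month; first month unknown)
df$Growth <- c(NA, diff(df$Subscribers))

# Extended: net effect of all promo campaigns in the month
df$Extended <- rowSums(df[, c("Promo1", "Promo2")])
```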

We need only two variables to make the prediction (‘growth’ and ‘extended’). The other variables are there just for your information. The last two months have no number of subscribers (these are values we are going to predict), but they must have the promo effect we are planning to get in the future. Further, the heat maps of the growth and extended variables look alike, so we can conclude that they are connected.

In the example we will predict values for periods 37 to 42, so that we can check the accuracy of the prediction against actual data.

The R code could look like this:

#load libraries
library(forecast)
library(TSA)
#load data set
df <- read.csv(file='data.csv')
#define periods (convenient for the future: just change these values to move periods between actual and predicted)
s.date <- c(2010,1) #start date - actual
e.date <- c(2012,12) #end date - actual (period 36)
f.s.date <- c(2013,1) #start date - prediction (period 37)
f.e.date <- c(2013,6) #end date - prediction (period 42)
#transform values to time series and define past and future periods
growth <- ts(df$Growth, start=s.date, end=e.date, frequency=12)
ext <- ts(df$Extended, start=s.date, end=f.e.date, frequency=12)
past <- window(ext, s.date, e.date)
future <- window(ext, f.s.date, f.e.date)
#ARIMA model
fit <- auto.arima(growth, xreg=past, stepwise=FALSE, approximation=FALSE) #determine model
fc <- forecast(fit, xreg=future) #make prediction (named fc so it does not mask the forecast() function)
plot(fc) #plot chart
summary(fc) #print predicted values

We should get the following chart and values:

As you remember, we have actual data for Jan. 2013 to Apr. 2013 to compare with the prediction (actual vs predicted): 384 vs 451, 1224 vs 1271, 709 vs 796 and 699 vs 753. Although the values are not very close, the model did pick up the February promo and predicted a peak. After we add Jan. 2013 to Mar. 2013 to the actual periods, our prediction for April becomes 718, which is closer to the actual 699 than 753 was. That means that as soon as new actual data arrive, we should recalculate and refine the prediction.
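The comparison can be summarized with a couple of standard error measures. The numbers below are the four actual/predicted pairs quoted above; the choice of mean absolute error and mean absolute percentage error is mine, not something the original analysis reports.

```r
# Actual vs predicted values for Jan-Apr 2013, as quoted in the text
actual    <- c(Jan = 384, Feb = 1224, Mar = 709, Apr = 699)
predicted <- c(Jan = 451, Feb = 1271, Mar = 796, Apr = 753)

err  <- predicted - actual              # the model over-predicts in every month
mae  <- mean(abs(err))                  # mean absolute error: 63.75
mape <- mean(abs(err) / actual) * 100   # mean absolute percentage error, roughly 10%
```

Recomputing these after each refit gives a simple way to see whether adding new actual periods improves the forecast.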

Thus, we have predicted the number of subscribers including the effect of promo campaigns. If we are not satisfied with this number, we can add some activity and look at the new prediction. Suppose we add a new activity that attracts 523 new customers in April 2013 (this means Extended will be 500 instead of -23). In this case our prediction will be:

We got a new peak of 1295 in April instead of 753 in the previous prediction. Thus, we have a tool for targeting the number of subscribers; the only thing left is to actually attract the subscribers we put into the prediction ;).
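A what-if comparison of this kind can be sketched as follows. The history is synthetic, the model is a fixed-order stats::arima rather than the auto.arima fit from the post, and the January to March plan values are invented; only the April change from -23 to 500 comes from the example above.

```r
# Synthetic history (made-up numbers) standing in for the real data set
set.seed(7)
n <- 36
extended <- rep(0, n)
extended[c(4, 11, 19, 26, 32)] <- c(450, -30, 600, 380, 520)
growth <- ts(600 + extended + rnorm(n, sd = 40),
             start = c(2010, 1), frequency = 12)
fit <- arima(growth, order = c(1, 0, 0), xreg = extended, method = "ML")

# Planned promo effect for Jan-Apr 2013; only April differs between the scenarios
baseline <- c(0, 500, 0, -23)
scenario <- c(0, 500, 0, 500)   # extra activity: 523 more customers in April

pred_base <- predict(fit, n.ahead = 4, newxreg = baseline)$pred
pred_scen <- predict(fit, n.ahead = 4, newxreg = scenario)$pred

# Month-by-month difference between the two forecasts
uplift <- as.numeric(pred_scen - pred_base)
```

Only the April forecast changes, by roughly the size of the added activity scaled by the fitted regression coefficient.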

Note: to predict more periods, just add values of the extended variable to the initial data and change the prediction period in the R code.

If the described approach works poorly for you, I can recommend the great book on forecasting by Prof. Rob J Hyndman, the creator of the ‘forecast’ package, for a deeper dive into the subject.

Have accurate predictions!

Reposted from: http://analyzecore.com/2014/06/27/include-promo-effect-into-prediction/
