Time Series Analysis

Best MSE (Mean Square Error) Predictor

Among all possible prediction functions \(f(X_{n})\), find the one that minimizes \(\mathbb{E}\big[\big(X_{n+h} - f(X_{n})\big)^{2} \big]\). Denote this predictor by \(m(X_{n})\) and call it the best MSE predictor, i.e.,

\[m(X_{n}) = \mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]
\]

It is a standard result that the solution of \(\mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]\) is the conditional expectation:

\[\mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\]

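Proof:

We want to minimize \(\mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]\) over functions of \(X_{n}\). By the tower property, \(\mathbb{E}[\,\cdot\,] = \mathbb{E}\big[\mathbb{E}[\,\cdot ~ | ~ X_{n}]\big]\), so it suffices to minimize the conditional expectation for each value of \(X_{n}\):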

\[\mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big] \iff \mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big]
\]

  • Arguably the more rigorous form is \(\mathop{\text{argmin}}\limits_{f} ~ \mathbb{E}\Big[\Big(X_{n+h} - f\big( X_{n}\big)\Big)^{2} ~ | ~ \mathcal{F}_{n}\Big]\), where \(\left\{ \mathcal{F}_{t}\right\}_{t\geq 0}\) is the natural filtration associated with \(\left\{ X_{t} \right\}_{t\geq 0}\), but that distinction is not pursued here.

Expanding the right-hand side:

\[\begin{align*}
\mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] & = \mathbb{E}[X_{n+h}^{2} ~ | ~ X_{n}] - 2f(X_{n})\mathbb{E}[X_{n+h} ~ | ~ X_{n}] + f^{2}(X_{n}) \\
\end{align*}
\]

Here, note that:

\[\begin{align*}
Var(X_{n+h} ~ | ~ X_{n}) & = \mathbb{E}\Big[ \big( X_{n+h} - \mathbb{E}\big[ X_{n+h} ~ | ~ X_{n} \big] \big)^{2} ~ \Big| ~ X_{n} \Big] \\
& = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - 2\mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] + \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] \\
& = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\end{align*}
\]

which gives that:

\[\implies Var(X_{n+h} ~ | ~ X_{n}) = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\]

Therefore,

\[\begin{align*}
\mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] & = Var(X_{n+h} ~ | ~ X_{n}) + \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n}\big] - 2f(X_{n})\mathbb{E}[X_{n+h} ~ | ~ X_{n}] + f^{2}(X_{n}) \\
& = Var(X_{n+h} ~ | ~ X_{n}) + \Big( \mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n}\big] - f(X_{n}) \Big)^{2}
\end{align*}
\]

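The conditional variance \(Var(X_{n+h} ~ | ~ X_{n})\) does not depend on \(f\), and the squared term is non-negative and vanishes exactly when \(f(X_{n})\) equals the conditional mean. The optimal solution \(m(X_{n})\) is therefore: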

\[m(X_{n}) = \mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\]
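The decomposition above can be checked numerically. The following is a minimal sketch under an assumed toy model that does not appear in these notes: a pair \((X, Y)\) with \(Y = \sin(X) + \text{noise}\), so that \(\mathbb{E}[Y ~ | ~ X] = \sin(X)\) is known exactly. The variable names and the model are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model (not from the notes): Y = sin(X) + noise,
# so the conditional mean E[Y | X] = sin(X) is known in closed form.
n = 200_000
x = rng.normal(size=n)
noise = rng.normal(scale=0.5, size=n)  # Var(Y | X) = 0.25
y = np.sin(x) + noise


def mse(pred):
    """Empirical mean squared error of a predictor of y."""
    return float(np.mean((y - pred) ** 2))


# Predictor 1: the conditional mean itself.
mse_cond_mean = mse(np.sin(x))

# Predictor 2: the conditional mean plus a constant bias of 0.3.
# The decomposition says this should add 0.3**2 = 0.09 to the MSE.
mse_shifted = mse(np.sin(x) + 0.3)

# Predictor 3: the least-squares straight line in x.
slope, intercept = np.polyfit(x, y, 1)
mse_linear = mse(slope * x + intercept)

print(f"conditional mean : {mse_cond_mean:.4f}")
print(f"shifted by 0.3   : {mse_shifted:.4f}")
print(f"best straight line: {mse_linear:.4f}")
```

In this sketch the conditional mean should give the smallest MSE, close to the noise variance, and any other predictor pays an extra squared-bias term.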

Now suppose \(\left\{ X_{t} \right\}\) is a stationary Gaussian time series, i.e.,

\[\begin{pmatrix}
X_{n+h}\\
X_{n}
\end{pmatrix} \sim N \begin{pmatrix}
\begin{pmatrix}
\mu \\
\mu
\end{pmatrix}, ~ \begin{pmatrix}
\gamma(0) & \gamma(h) \\
\gamma(h) & \gamma(0)
\end{pmatrix}
\end{pmatrix}
\]

Then we have:

\[X_{n+h} ~ | ~ X_{n} \sim N\Big( \mu + \rho(h)\big(X_{n} - \mu\big), ~ \gamma(0)\big(1 - \rho^{2}(h)\big) \Big)
\]
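This is the usual conditioning formula for a bivariate normal vector. As a short worked step, using only the mean \(\mu\) and the autocovariances \(\gamma(0), \gamma(h)\) from the joint distribution above, and writing \(\rho(h) = \gamma(h) / \gamma(0)\):

\[\begin{align*}
\mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big] & = \mu + \frac{\gamma(h)}{\gamma(0)} \big( X_{n} - \mu \big) = \mu + \rho(h) \big( X_{n} - \mu \big) \\
Var(X_{n+h} ~ | ~ X_{n}) & = \gamma(0) - \frac{\gamma^{2}(h)}{\gamma(0)} = \gamma(0) \big( 1 - \rho^{2}(h) \big)
\end{align*}
\]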

where \(\rho(h) = \gamma(h) / \gamma(0)\) is the ACF of \(\left\{ X_{t} \right\}\). Therefore,

\[\mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big] = m(X_{n}) = \mu + \rho(h) \big( X_{n} - \mu \big)
\]
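A simulation can make the Gaussian case concrete. The sketch below assumes a specific example that is not in the notes, a stationary Gaussian AR(1) process with \(\rho(h) = \phi^{h}\), and compares the empirical mean and variance of \(X_{n+h}\) among observations with \(X_{n}\) near a chosen value `x0` against the formulas above. All parameter values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed example: stationary Gaussian AR(1),
#   X_t - mu = phi * (X_{t-1} - mu) + eps_t,  eps_t ~ N(0, sigma^2)
mu, phi, sigma, h = 2.0, 0.7, 1.0, 3
n = 200_000

gamma0 = sigma**2 / (1 - phi**2)  # stationary variance gamma(0)
rho_h = phi**h                    # ACF of an AR(1) at lag h

x = np.empty(n)
x[0] = mu + rng.normal(scale=np.sqrt(gamma0))  # draw from the stationary law
eps = rng.normal(scale=sigma, size=n)
for t in range(1, n):
    x[t] = mu + phi * (x[t - 1] - mu) + eps[t]

past, future = x[:-h], x[h:]

# Condition (approximately) on X_n = x0 by keeping a narrow window around x0.
x0 = 3.5
mask = np.abs(past - x0) < 0.1

cond_mean_emp = float(future[mask].mean())
cond_var_emp = float(future[mask].var())

cond_mean_formula = mu + rho_h * (x0 - mu)
cond_var_formula = gamma0 * (1 - rho_h**2)

print(f"conditional mean: empirical {cond_mean_emp:.3f}, formula {cond_mean_formula:.3f}")
print(f"conditional var : empirical {cond_var_emp:.3f}, formula {cond_var_formula:.3f}")
```

The window width trades bias for sample size; a narrower window conditions more sharply on `x0` but leaves fewer observations to average over.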

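Note:

If \(\left\{ X_{t} \right\}\) is a Gaussian time series, the best MSE predictor can always be computed in closed form. If \(\left\{ X_{t} \right\}\) is not Gaussian, computing the conditional expectation is usually very difficult.

For this reason we usually do not look for the best MSE predictor, and look for the best linear predictor instead.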


Best Linear Predictor (BLP)

Under the BLP assumption, we look for a predictor of the form \(f(X_{n}) = aX_{n} + b\).

The objective is then:

\[\text{minimize: } ~ S(a,b) = \mathbb{E} \big[ \big( X_{n+h} - aX_{n} -b \big)^{2} \big]
\]

Derivation:

Take the partial derivatives with respect to \(a\) and \(b\) in turn, starting with \(b\):

\[\begin{align*}
\frac{\partial}{\partial b} S(a, b) & = \frac{\partial}{\partial b} \mathbb{E} \big[ \big( X_{n+h} - aX_{n} -b \big)^{2} \big] \\
& = -2 \mathbb{E} \big[ X_{n+h} - aX_{n} - b \big] \\
\end{align*}
\]

Set:

\[\frac{\partial}{\partial b} S(a, b) = 0
\]

Then:

\[\begin{align*}
-2 \cdot & \mathbb{E} \big[ X_{n+h} - aX_{n} - b \big] = 0 \\
\implies & \qquad \mathbb{E}[X_{n+h}] - a\mathbb{E}[X_{n}] - b = 0\\
\implies & \qquad \mu - a\mu - b = 0 \\
\implies & \qquad b^{\star} = (1 - a^{\star}) \mu
\end{align*}
\]

Substitute \(b = (1 - a)\mu\) back in and take the partial derivative with respect to \(a\):

\[\begin{align*}
\frac{\partial}{\partial a} S(a, b) & = \frac{\partial}{\partial a} \mathbb{E} \big[ \big( X_{n+h} - aX_{n} - (1 - a)\mu \big)^{2} \big] \\
& = \frac{\partial}{\partial a} \mathbb{E} \Big[ \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)^{2} \Big] \\
& = -2 \, \mathbb{E} \Big[ \big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] \\
\end{align*}
\]

Set:

\[\frac{\partial}{\partial a} S(a, b) = 0
\]

Then:

\[\begin{align*}
& \mathbb{E} \Big[ - \big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] = 0 \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] = 0 \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \big(X_{n+h} - \mu \big) - a \big( X_{n} - \mu \big) \big( X_{n} - \mu \big) \Big] = 0 \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \big(X_{n+h} - \mu \big) \Big] = a \cdot \mathbb{E} \Big[\big( X_{n} - \mu \big) \big( X_{n} - \mu \big) \Big] \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mathbb{E}[X_{n}] \big) \big(X_{n+h} - \mathbb{E}[X_{n+h}] \big) \Big] = a \cdot \mathbb{E} \Big[\big( X_{n} - \mathbb{E}[X_{n}] \big)^{2} \Big] \\
\implies & \qquad \text{Cov}(X_{n}, X_{n+h}) = a \cdot \text{Var}(X_{n}) \\
\implies & \qquad a^{\star} = \frac{\gamma(h)}{\gamma(0)} = \rho(h)
\end{align*}
\]

In summary, the BLP for the time series \(\left\{ X_{t} \right\}\) is:

\[f(X_{n}) = l(X_{n}) = \mu + \rho(h) \big( X_{n} - \mu \big)
\]
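In practice \(\mu\) and \(\rho(h)\) are unknown and have to be estimated. The sketch below, with hypothetical helper names `sample_autocov` and `blp_coefficients`, plugs the sample mean and sample autocovariance into \(a^{\star} = \rho(h)\), \(b^{\star} = (1 - a^{\star})\mu\) and compares the result with an ordinary least-squares fit of \(X_{n+h}\) on \(X_{n}\). The test series is an assumed AR(1) with centred exponential (non-Gaussian) innovations, chosen to stress that the BLP only uses the mean and autocovariance.

```python
import numpy as np


def sample_autocov(x, lag):
    """Sample autocovariance at the given lag, with the usual 1/n normalisation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    return float(np.dot(xc[: n - lag], xc[lag:]) / n)


def blp_coefficients(x, h):
    """Plug-in estimate of (a, b) for the h-step predictor a * X_n + b."""
    a = sample_autocov(x, h) / sample_autocov(x, 0)  # estimate of rho(h)
    b = (1.0 - a) * float(np.mean(x))                # estimate of (1 - a) * mu
    return a, b


# Assumed test series: AR(1) with non-Gaussian, mean-zero innovations.
rng = np.random.default_rng(2)
mu, phi, h, n = 5.0, 0.6, 2, 100_000
eps = rng.exponential(1.0, size=n) - 1.0
x = np.empty(n)
x[0] = mu  # not a stationary start, but the effect is negligible for large n
for t in range(1, n):
    x[t] = mu + phi * (x[t - 1] - mu) + eps[t]

a, b = blp_coefficients(x, h)
slope, intercept = np.polyfit(x[:-h], x[h:], 1)  # least-squares line for comparison

print(f"plug-in BLP  : a = {a:.4f}, b = {b:.4f}")
print(f"least squares: a = {slope:.4f}, b = {intercept:.4f}")
print(f"theoretical rho(h) = phi**h = {phi**h:.4f}")
```

The two sets of coefficients should agree up to small edge effects, since least squares solves the sample version of the same first-order conditions.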

The MSE of the BLP is:

\[\begin{align*}
\text{MSE} & = \mathbb{E}\big[ \big( X_{n+h} - l(X_{n}) \big)^{2} \big] \\
& = \mathbb{E} \Big[ \Big( X_{n+h} - \mu - \rho(h) \big( X_{n} - \mu \big) \Big)^{2} \Big] \\
& = \gamma(0) - 2\rho(h)\gamma(h) + \rho^{2}(h)\gamma(0) \\
& = \gamma(0) \cdot \big( 1 - \rho^{2}(h) \big)
\end{align*}
\]
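The MSE formula \(\gamma(0)\big(1 - \rho^{2}(h)\big)\) can also be checked by simulation. This sketch assumes a Gaussian MA(1) process, \(X_{t} = \mu + Z_{t} + \theta Z_{t-1}\), which is not discussed in the notes but has a simple known ACF: \(\rho(1) = \theta / (1 + \theta^{2})\) and \(\rho(h) = 0\) for \(h \geq 2\). The function name and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed example: Gaussian MA(1), X_t = mu + Z_t + theta * Z_{t-1}.
mu, theta, sigma = 1.0, 0.8, 1.0
n = 200_000
z = rng.normal(scale=sigma, size=n + 1)
x = mu + z[1:] + theta * z[:-1]

gamma0 = (1 + theta**2) * sigma**2  # gamma(0) of an MA(1)


def rho(h):
    """Theoretical ACF of the MA(1) process above."""
    if h == 0:
        return 1.0
    if h == 1:
        return theta / (1 + theta**2)
    return 0.0


def empirical_and_theoretical_mse(h):
    """Empirical MSE of the BLP at horizon h, next to gamma(0) * (1 - rho(h)^2)."""
    pred = mu + rho(h) * (x[:-h] - mu)
    emp = float(np.mean((x[h:] - pred) ** 2))
    theo = gamma0 * (1 - rho(h) ** 2)
    return emp, theo


emp1, theo1 = empirical_and_theoretical_mse(1)
emp2, theo2 = empirical_and_theoretical_mse(2)

print(f"h = 1: empirical {emp1:.4f}, theoretical {theo1:.4f}")
print(f"h = 2: empirical {emp2:.4f}, theoretical {theo2:.4f}")
```

At \(h = 2\) the ACF is zero, so the BLP reduces to the constant \(\mu\) and the MSE equals \(\gamma(0)\): observing \(X_{n}\) gives no linear information about \(X_{n+2}\).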
