Time Series Analysis

Best MSE (Mean Square Error) Predictor

Among all possible predictor functions \(f(X_{n})\), we look for the \(f\) that minimizes \(\mathbb{E}\big[\big(X_{n+h} - f(X_{n})\big)^{2} \big]\). Denote this predictor by \(m(X_{n})\) and call it the best MSE predictor, i.e.,

\[m(X_{n}) = \mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]
\]

It is a standard result that the solution of \(\mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]\) is the conditional expectation:

\[\mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\]

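Proof:

We want to minimize \(\mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]\) over functions of \(X_{n}\). By the tower property, this expectation equals \(\mathbb{E}\Big[ \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] \Big]\), so it suffices to minimize the inner conditional expectation for each value of \(X_{n}\):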

\[\mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big] \iff \mathop{\arg\min}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big]
\]

  • I think a more rigorous way to write this is \(\mathop{\text{argmin}}\limits_{f} ~ \mathbb{E}\Big[\Big(X_{n+h} - f\big( X_{n}\big)\Big)^{2} ~ | ~ \mathcal{F}_{n}\Big]\), where \(\left\{ \mathcal{F}_{t}\right\}_{t\geq 0}\) is the natural filtration associated with \(\left\{ X_{t} \right\}_{t\geq 0}\), but whatever.

The right-hand side expands as:

\[\begin{align*}
\mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] & = \mathbb{E}[X_{n+h}^{2} ~ | ~ X_{n}] - 2f(X_{n})\mathbb{E}[X_{n+h} ~ | ~ X_{n}] + f^{2}(X_{n}) \\
\end{align*}
\]

Here, since:

\[\begin{align*}
Var(X_{n+h} ~ | ~ X_{n}) & = \mathbb{E}\Big[ \big( X_{n+h} - \mathbb{E}\big[ X_{n+h} ~ | ~ X_{n} \big] \big)^{2} ~ \Big| ~ X_{n} \Big] \\
& = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - 2\mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] + \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] \\
& = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\end{align*}
\]

Rearranging, the conditional second moment is:

\[\mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] = Var(X_{n+h} ~ | ~ X_{n}) + \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\]

Therefore,

\[\begin{align*}
\mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] & = Var(X_{n+h} ~ | ~ X_{n}) + \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n}\big] - 2f(X_{n})\mathbb{E}[X_{n+h} ~ | ~ X_{n}] + f^{2}(X_{n}) \\
& = Var(X_{n+h} ~ | ~ X_{n}) + \Big( \mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n}\big] - f(X_{n}) \Big)^{2}
\end{align*}
\]

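The conditional variance \(Var(X_{n+h} ~ | ~ X_{n})\) does not depend on \(f\), so only the squared term can be reduced, and it vanishes exactly when \(f(X_{n})\) equals the conditional mean. The optimal solution \(m(X_{n})\) is therefore: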

\[m(X_{n}) = \mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big]
\]
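The decomposition above can be checked exactly on a toy example. The sketch below assumes a small, made-up joint distribution for the pair \((X_{n}, X_{n+h})\); the names `JOINT_PMF`, `conditional_mean`, and `mse` are illustrative and do not come from the derivation itself. It confirms that any competing predictor pays an extra \(\mathbb{E}\big[(m(X_{n}) - f(X_{n}))^{2}\big]\) on top of the MSE of the conditional mean.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X_n, X_{n+h}) on a small grid.
# The weights are illustrative only; they sum to 1.
JOINT_PMF = {
    (0, 0): Fraction(3, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 2): Fraction(2, 10),
    (2, 1): Fraction(1, 10),
    (2, 3): Fraction(2, 10),
}


def conditional_mean(pmf):
    """Return m(x) = E[Y | X = x] as a dict keyed by x."""
    mass, weighted = {}, {}
    for (x, y), p in pmf.items():
        mass[x] = mass.get(x, 0) + p
        weighted[x] = weighted.get(x, 0) + p * y
    return {x: weighted[x] / mass[x] for x in mass}


def mse(pmf, predictor):
    """Return E[(Y - f(X))^2] for a predictor given as a dict x -> f(x)."""
    return sum(p * (y - predictor[x]) ** 2 for (x, y), p in pmf.items())


m = conditional_mean(JOINT_PMF)
best = mse(JOINT_PMF, m)  # equals E[Var(Y | X)]

# An arbitrary competing predictor, chosen only for illustration.
other = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(2)}
penalty = sum(p * (m[x] - other[x]) ** 2 for (x, _), p in JOINT_PMF.items())

print("MSE of conditional mean:", best)
print("MSE of other predictor: ", mse(JOINT_PMF, other))
print("best + penalty:         ", best + penalty)
```

Exact rational arithmetic is used so the two sides of the identity can be compared with `==` rather than up to rounding error.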

Now suppose \(\left\{ X_{t} \right\}\) is a stationary Gaussian time series, i.e.,

\[\begin{pmatrix}
X_{n+h}\\
X_{n}
\end{pmatrix} \sim N \begin{pmatrix}
\begin{pmatrix}
\mu \\
\mu
\end{pmatrix}, ~ \begin{pmatrix}
\gamma(0) & \gamma(h) \\
\gamma(h) & \gamma(0)
\end{pmatrix}
\end{pmatrix}
\]

Then we have:

\[X_{n+h} ~ | ~ X_{n} \sim N\Big( \mu + \rho(h)\big(X_{n} - \mu\big), ~ \gamma(0)\big(1 - \rho^{2}(h)\big) \Big)
\]

where \(\rho(h) = \gamma(h) / \gamma(0)\) is the ACF of \(\left\{ X_{t} \right\}\). Therefore,

\[\mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big] = m(X_{n}) = \mu + \rho(h) \big( X_{n} - \mu \big)
\]
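A quick simulation can make the Gaussian formula concrete. The sketch below assumes a stationary Gaussian AR(1) process as the example series (an assumption made here for illustration, not something the derivation requires) and uses the standard fact that its ACF is \(\rho(h) = \phi^{h}\). It averages \(X_{n+h}\) over all times where \(X_{n}\) falls near a chosen value `x0` and compares that with \(\mu + \rho(h)(x_{0} - \mu)\). All function and parameter names are hypothetical.

```python
import random


def simulate_gaussian_ar1(n, phi, mu, sigma, seed):
    """Simulate a stationary Gaussian AR(1): X_t - mu = phi (X_{t-1} - mu) + eps_t."""
    rng = random.Random(seed)
    gamma0 = sigma ** 2 / (1 - phi ** 2)   # stationary variance gamma(0)
    x = rng.gauss(mu, gamma0 ** 0.5)       # draw X_0 from the stationary law
    xs = [x]
    for _ in range(n - 1):
        x = mu + phi * (x - mu) + rng.gauss(0.0, sigma)
        xs.append(x)
    return xs


def local_average(xs, h, x0, half_width):
    """Average X_{n+h} over all n with |X_n - x0| <= half_width."""
    hits = [xs[i + h] for i in range(len(xs) - h) if abs(xs[i] - x0) <= half_width]
    return sum(hits) / len(hits)


phi, mu, sigma, h = 0.6, 2.0, 1.0, 2
xs = simulate_gaussian_ar1(200_000, phi, mu, sigma, seed=0)

x0 = 3.0
rho_h = phi ** h                            # ACF of an AR(1) at lag h
predicted = mu + rho_h * (x0 - mu)          # m(x0) from the formula
empirical = local_average(xs, h, x0, half_width=0.1)
print(f"formula: {predicted:.3f}, simulation: {empirical:.3f}")
```

The local average is only a crude estimate of the conditional mean, so the two numbers should agree up to sampling noise rather than exactly.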

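Note:

If \(\left\{ X_{t} \right\}\) is a Gaussian time series, the best MSE predictor can always be computed. If \(\left\{ X_{t} \right\}\) is not a Gaussian time series, the computation is usually very complicated.

For this reason, we usually look for the best linear predictor rather than the best MSE predictor.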


Best Linear Predictor (BLP)

Under the BLP assumption, we look for a predictor of the form \(f(X_{n}) = aX_{n} + b\).

The objective is then:

\[\text{minimize: } ~ S(a,b) = \mathbb{E} \big[ \big( X_{n+h} - aX_{n} -b \big)^{2} \big]
\]

Derivation:

Take the partial derivatives with respect to \(b\) and \(a\) in turn:

\[\begin{align*}
\frac{\partial}{\partial b} S(a, b) & = \frac{\partial}{\partial b} \mathbb{E} \big[ \big( X_{n+h} - aX_{n} -b \big)^{2} \big] \\
& = -2 \mathbb{E} \big[ X_{n+h} - aX_{n} - b \big] \\
\end{align*}
\]

Set:

\[\frac{\partial}{\partial b} S(a, b) = 0
\]

Then:

\[\begin{align*}
-2 \cdot & \mathbb{E} \big[ X_{n+h} - aX_{n} - b \big] = 0 \\
\implies & \qquad \mathbb{E}[X_{n+h}] - a\mathbb{E}[X_{n}] - b = 0\\
\implies & \qquad \mu - a\mu - b = 0 \\
\implies & \qquad b^{\star} = (1 - a^{\star}) \mu
\end{align*}
\]

Substitute \(b = (1 - a)\mu\) back and take the partial derivative with respect to \(a\):

\[\begin{align*}
\frac{\partial}{\partial a} S(a, b) & = \frac{\partial}{\partial a} \mathbb{E} \big[ \big( X_{n+h} - aX_{n} - (1 - a)\mu \big)^{2} \big] \\
& = \frac{\partial}{\partial a} \mathbb{E} \Big[ \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)^{2} \Big] \\
& = 2 \, \mathbb{E} \Big[ - \big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big]
\end{align*}
\]

Set:

\[\frac{\partial}{\partial a} S(a, b) = 0
\]

Then:

\[\begin{align*}
& \mathbb{E} \Big[ - \big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] = 0 \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] = 0 \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \big(X_{n+h} - \mu \big) - a \big( X_{n} - \mu \big) \big( X_{n} - \mu \big) \Big] = 0 \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \big(X_{n+h} - \mu \big) \Big] = a \cdot \mathbb{E} \Big[\big( X_{n} - \mu \big) \big( X_{n} - \mu \big) \Big] \\
\implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mathbb{E}[X_{n}] \big) \big(X_{n+h} - \mathbb{E}[X_{n+h}] \big) \Big] = a \cdot \mathbb{E} \Big[\big( X_{n} - \mathbb{E}[X_{n}] \big)^{2} \Big] \\
\implies & \qquad \text{Cov}(X_{n}, X_{n+h}) = a \cdot \text{Var}(X_{n}) \\
\implies & \qquad a^{\star} = \frac{\gamma(h)}{\gamma(0)} = \rho(h)
\end{align*}
\]

In summary, the BLP for the time series \(\left\{ X_{t} \right\}\) is:

\[f(X_{n}) = l(X_{n}) = \mu + \rho(h) \big( X_{n} - \mu \big)
\]
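Because the BLP depends only on \(\mu\) and \(\rho(h)\), it can be estimated from the sample mean and sample autocovariances without any distributional assumption. The sketch below is one way to do that; the helper names are hypothetical, and the test series is an AR(1) driven by uniform (non-Gaussian) noise, chosen here only as an example for which the true \(\rho(h) = \phi^{h}\) is known.

```python
import random


def sample_autocovariance(xs, h):
    """Sample autocovariance gamma_hat(h) with the usual 1/n normalisation."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((xs[i] - mean) * (xs[i + h] - mean) for i in range(n - h)) / n


def blp_coefficients(xs, h):
    """Return (a, b) with a = gamma_hat(h) / gamma_hat(0) and b = (1 - a) * mean."""
    mean = sum(xs) / len(xs)
    a = sample_autocovariance(xs, h) / sample_autocovariance(xs, 0)
    return a, (1 - a) * mean


def simulate_ar1_uniform(n, phi, mu, seed, burn_in=500):
    """AR(1) around mu with Uniform(-sqrt(3), sqrt(3)) innovations (variance 1)."""
    rng = random.Random(seed)
    half = 3 ** 0.5
    x = mu
    xs = []
    for t in range(n + burn_in):
        x = mu + phi * (x - mu) + rng.uniform(-half, half)
        if t >= burn_in:          # discard the start-up transient
            xs.append(x)
    return xs


phi, mu, h = 0.6, 2.0, 2
xs = simulate_ar1_uniform(100_000, phi, mu, seed=0)
a_hat, b_hat = blp_coefficients(xs, h)
print(f"a_hat = {a_hat:.3f} (theory {phi ** h:.3f})")
print(f"b_hat = {b_hat:.3f} (theory {(1 - phi ** h) * mu:.3f})")
```

The predictor for a new observation `x_n` is then `a_hat * x_n + b_hat`, which is the sample version of \(\mu + \rho(h)(X_{n} - \mu)\).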

and the MSE associated with the BLP is:

\[\begin{align*}
\text{MSE} & = \mathbb{E}\big[ \big( X_{n+h} - l(X_{n}) \big)^{2} \big] \\
& = \mathbb{E} \Big[ \Big( \big( X_{n+h} - \mu \big) - \rho(h) \big( X_{n} - \mu \big) \Big)^{2} \Big] \\
& = \gamma(0) - 2\rho(h)\gamma(h) + \rho^{2}(h)\gamma(0) \\
& = \gamma(0) \cdot \big( 1 - \rho^{2}(h) \big)
\end{align*}
\]
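As a numerical check, the MSE of the BLP should come out close to \(\gamma(0)\big(1 - \rho^{2}(h)\big)\). The sketch below again assumes a stationary Gaussian AR(1) series as a convenient test case, for which \(\gamma(0) = \sigma^{2} / (1 - \phi^{2})\) and \(\rho(h) = \phi^{h}\) are standard results; the parameter values are arbitrary.

```python
import random

phi, mu, sigma, h = 0.6, 2.0, 1.0, 2
rng = random.Random(1)

gamma0 = sigma ** 2 / (1 - phi ** 2)   # stationary variance of the AR(1)
rho_h = phi ** h                        # ACF at lag h

# Simulate the series, starting from the stationary distribution.
x = rng.gauss(mu, gamma0 ** 0.5)
xs = [x]
for _ in range(199_999):
    x = mu + phi * (x - mu) + rng.gauss(0.0, sigma)
    xs.append(x)

# Prediction errors of the BLP l(X_n) = mu + rho(h) (X_n - mu).
errors = [xs[i + h] - (mu + rho_h * (xs[i] - mu)) for i in range(len(xs) - h)]
empirical_mse = sum(e * e for e in errors) / len(errors)
theoretical_mse = gamma0 * (1 - rho_h ** 2)

print(f"empirical MSE:   {empirical_mse:.3f}")
print(f"theoretical MSE: {theoretical_mse:.3f}")
```

The true \(\mu\) and \(\rho(h)\) are plugged in here to isolate the formula itself; with estimated coefficients the empirical MSE would carry some additional estimation error.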
