Autocorrelation in Time Series Data
Why Time Series Data Is Unique
A time series is a sequence of data points indexed by time. Because the data is ordered, it often displays serial dependence, which sets it apart from most other kinds of data. Serial dependence occurs when the value of a data point at one time is statistically dependent on the value of a data point at another time. This property violates one of the fundamental assumptions of many statistical analyses: that the data points are statistically independent.
What Is Autocorrelation?
Autocorrelation is a type of serial dependence. Specifically, autocorrelation is when a time series is linearly related to a lagged version of itself. By contrast, ordinary correlation is when two different variables are linearly related.
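To make the definition concrete, here is a minimal sketch of the sample autocorrelation at a single lag, written in plain Python. The function name `autocorrelation` and the sine-wave data are illustrative assumptions, not something taken from the article's dataset.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Numerator: how the series co-moves with its own lagged copy.
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom

# Hypothetical data: a sine wave that repeats every 12 samples.
wave = [math.sin(2 * math.pi * t / 12) for t in range(120)]
r12 = autocorrelation(wave, 12)  # one full period apart: strongly positive
r6 = autocorrelation(wave, 6)    # half a period apart: strongly negative
print(round(r12, 2), round(r6, 2))
```

A value near 1 or -1 means the series is strongly linearly related to its lagged copy, while a value near 0 means there is little linear relationship at that lag.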
Why Autocorrelation Matters
Often, one of the first steps in any data analysis is performing regression analysis. However, one of the assumptions of regression analysis is that the errors are not autocorrelated. This can be frustrating, because if you run a regression on data with autocorrelation, the results (in particular the standard errors and significance tests) can be misleading.
Additionally, some time series forecasting methods (specifically regression modeling) rely on the assumption that there isn't any autocorrelation in the residuals (the differences between the fitted model and the data). People often use the residuals to assess whether their model is a good fit while ignoring the assumption that the residuals have no autocorrelation (or that the errors are independent and identically distributed, i.i.d.). This mistake can lead people to believe that their model is a good fit when in fact it isn't. I highly recommend reading the article "How (not) to use Machine Learning for time series forecasting: Avoiding the pitfalls", in which the author demonstrates how the increasingly popular LSTM (Long Short-Term Memory) network can appear to be an excellent univariate time series predictor when in reality it is just overfitting the data. The author goes on to explain how this misconception results from accuracy metrics failing in the presence of autocorrelation.
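One common way to check residuals for lag-1 autocorrelation is the Durbin-Watson statistic. The article does not name this test; it is used here only as an illustration, and the residual lists below are made up. Values near 2 suggest no lag-1 autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.

```python
import math

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denom = sum(e * e for e in residuals)
    if denom == 0:
        raise ValueError("all residuals are zero")
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    return num / denom

# Hypothetical residuals that drift slowly: neighbours are very similar,
# so the statistic lands close to 0 (positive autocorrelation).
smooth = [math.sin(2 * math.pi * t / 50) for t in range(100)]
dw_smooth = durbin_watson(smooth)

# Hypothetical residuals that flip sign every step: neighbours are
# opposite, so the statistic lands close to 4 (negative autocorrelation).
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]
dw_alternating = durbin_watson(alternating)

print(dw_smooth, dw_alternating)
```

If a fitted model leaves residuals like the first list, a low error metric on its own says little about whether the model has captured the structure in the data.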
Finally, perhaps the most compelling aspect of autocorrelation analysis is how it can help us uncover hidden patterns in our data and select the correct forecasting methods. Specifically, we can use it to help identify seasonality and trend in our time series data. Additionally, analyzing the autocorrelation function (ACF) and partial autocorrelation function (PACF) together is a standard step in selecting an appropriate ARIMA model for your time series prediction.
How to Determine if Your Time Series Data Has Autocorrelation Using Python
For this exercise, I'm using InfluxDB and the InfluxDB Python client library. I am using publicly available data from the National Oceanic and Atmospheric Administration's (NOAA) Center for Operational Oceanographic Products and Services. Specifically, I will be looking at the water levels and water temperatures of a river in Santa Monica.
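As a rough illustration of how the ACF can reveal seasonality, the sketch below computes the autocorrelation at several lags and reports the lag with the largest value. The helper name `acf`, the repeating weekly pattern, and the choice of 20 lags are all assumptions made for this example. In practice a library routine and a plot would normally be used instead.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 1..max_lag, returned as a dict."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    result = {}
    for lag in range(1, max_lag + 1):
        num = sum((series[t] - mean) * (series[t - lag] - mean)
                  for t in range(lag, n))
        result[lag] = num / denom
    return result

# Hypothetical data: a pattern that repeats every 7 samples,
# for example a daily measurement with a weekly cycle.
week = [0, 1, 3, 6, 3, 1, 0]
series = week * 10

correlations = acf(series, max_lag=20)
# The lag with the largest autocorrelation is a candidate seasonal period.
seasonal_lag = max(correlations, key=correlations.get)
print(seasonal_lag, round(correlations[seasonal_lag], 2))
```

On real data the peaks are less clean, so it is usual to compare each value against an approximate significance bound (roughly plus or minus 1.96 divided by the square root of the series length) before reading a peak as seasonality.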