Autocorrelation in Time Series Data
Why Time Series Data Is Unique
A time series is a series of data points indexed in time. The fact that time series data is ordered makes it unique in the data space because it often displays serial dependence序列依赖. Serial dependence occurs when the value of a datapoint at one time is statistically dependent on another datapoint in another time. However, this attribute of time series data violates违反 one of the fundamental assumptions of many statistical analyses — that data is statistically independent.
What Is Autocorrelation?
Autocorrelation is a type of serial dependence. Specifically, autocorrelation is when a time series is linearly related to a lagged version of itself. By contrast, correlation is simply when two independent variables are linearly related.
Why Autocorrelation Matters
Often, one of the first steps in any data analysis is performing regression analysis. However, one of the assumptions of regression analysis is that the data has no autocorrelation. This can be frustrating because if you try to do a regression analysis on data with autocorrelation, then your analysis will be misleading.
Additionally, some time series forecasting methods (specifically regression modeling) rely on the assumption that there isn’t any autocorrelation in the residuals (the difference between the fitted model and the data). People often use the residuals to assess whether their model is a good fit while ignoring that assumption that the residuals have no autocorrelation (or that the errors are independent and identically distributed or i.i.d). This mistake can mislead people into believing that their model is a good fit when in fact it isn’t. I highly recommend reading this article about How (not) to use Machine Learning for time series forecasting: Avoiding the pitfalls in which the author demonstrates how the increasingly popular LSTM (Long Short Term Memory) Network can appear to be an excellent univariate time series predictor, when in reality it’s just overfitting the data. He goes further to explain how this misconception is the result of accuracy metrics failing due to the presence of autocorrelation.
Finally, perhaps the most compelling aspect of autocorrelation analysis is how it can help us uncover hidden patterns in our data and help us select the correct forecasting methods. Specifically, we can use it to help identify seasonality and trend in our time series data. Additionally, analyzing the autocorrelation function (ACF) and partial autocorrelation function (PACF) in conjunction is necessary for selecting the appropriate ARIMA model for your time series prediction.
How to Determine if Your Time Series Data Has Autocorrelation by python
For this exercise, I’m using InfluxDB and the InfluxDB Python CL. I am using available data from the National Oceanic and Atmospheric Administration’s (NOAA) Center for Operational Oceanographic Products and Services. Specifically, I will be looking at the water levels and water temperatures of a river in Santa Monica.
Autocorrelation in Time Series Data的更多相关文章
- 3.1.7. Cross validation of time series data
3.1.7. Cross validation of time series data Time series data is characterised by the correlation bet ...
- 增长中的时间序列存储(Scaling Time Series Data Storage) - Part I
本文摘译自 Netflix TechBlog : Scaling Time Series Data Storage - Part I 重点:扩容.缓存.冷热分区.分块. 时序数据 - 会员观看历史 N ...
- 时间序列大数据平台建设(Time Series Data,简称TSD)
来源:https://blog.csdn.net/bluishglc/article/details/79277455 引言在大数据的生态系统里,时间序列数据(Time Series Data,简称T ...
- vehicle time series data analysis
以HADOOP为代表的云计算提供的仅仅是一个算法执行环境,为大数据的并行计算提供了在现有软硬件水平下最好的(近似)方法.并不能解决大数据应用中的全部问题.从详细应用而言,通过物联网方式接入IT圈的数据 ...
- PP: Tripoles: A new class of relationships in time series data
Problem: ?? mining relationships in time series data; A new class of relationships in time series da ...
- Time Series data 与 sequential data 的区别
It is important to note the distinction between time series and sequential data. In both cases, the ...
- PP: Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
From: Stanford University; Jure Leskovec, citation 6w+; Problem: subsequence clustering. Challenging ...
- Anomaly Detection for Time Series Data with Deep Learning——本质分类正常和异常的行为,对于检测异常行为,采用预测正常行为方式来做
A sample network anomaly detection project Suppose we wanted to detect network anomalies with the un ...
- Time series data mining
from here 论文Timeseries data mining(2012)中提出:时间序列数据挖掘包括7个基本任务和3个基础问题: 7 tasks: query by content clust ...
随机推荐
- ECharts展示后台数据
/** * Created by Administrator on 2015/11/10 010. */ var home = function () { //项目预警分析 var getProAla ...
- jQuery---$冲突的解决方案
$冲突的解决方案 遇到其他js文件也用$包装了函数.可以把jQuery放在后面,并释放下$的控制权,也可以换个字符替代原来的$,例如$$ 或者,jQuery //jQuery释放$的控制权 $$ = ...
- Linux命令详解之–chmod命令
在Linux中,一般使用chmod命令来修改文件的属性. 利用 chmod 可以藉以控制文件如何被他人所调用.此命令所有使用者都可使用. 一.Linux chmod命令语法Linux chmod 命令 ...
- CentOS7安装docker和docker-compose
1.安装docker # 使用yum安装docker yum -y install docker # 启动 systemctl start docker.service # 设置为开机自启动 syst ...
- js模拟form提交 导出数据
//创建模拟提交formfunction dataExport(option) { var form = $("<form method='get'></form>& ...
- 数据结构(集合)学习之Map(一)
集合 框架关系图: 补充:HashTable父类是Dictionary,不是AbstractMap. Map: Map(接口)和Collection都属于集合,但是Map不是Collection的子类 ...
- ubuntu--- tracker/libdeepsort.so 找不到cv报错
一.刚开始解决尝试:因为“删掉lib下的libdeepsort.so报错”,原先以为是 libdeepsort.so 需要拷贝到 /lib路径下的问题,可是因为后来的工程有的好使,又的不好使了.''' ...
- (node:7584) UnhandledPromiseRejectionWarning: MongooseTimeoutError: Server selection timed out after 30000 ms
记录一次学习node.js犯的低级错误 这里遇到一个这样的问题 express连接mongoose时报错(node:7584) UnhandledPromiseRejectionWarning: Mo ...
- Mac下git的安装配置以及gerrit初次使用
1.Mac下git下载 在终端首次运行git命令,若未安装,会提示下载开发者工具Xcode,根据提示下载即可: 2.查看git版本 git version 2.首次使用git配置 git config ...
- 白面系列 docker
在讲docker之前,首先区分2个概念,容器和虚拟机. 容器: 虚拟机: 简单来说,容器虚拟化操作系统:虚拟机虚拟化硬件. 容器粒度更小更灵活:虚拟机包含资源更多更大. docker就是用来做容器化的 ...