Autocorrelation in Time Series Data
Why Time Series Data Is Unique
A time series is a sequence of data points indexed in time order. Because the data is ordered, it often displays serial dependence, which sets it apart from other kinds of data. Serial dependence occurs when the value of a data point at one time is statistically dependent on a data point at another time. However, this attribute of time series data violates one of the fundamental assumptions of many statistical analyses: that the data is statistically independent.
What Is Autocorrelation?
Autocorrelation is a type of serial dependence. Specifically, autocorrelation is when a time series is linearly related to a lagged version of itself. By contrast, correlation is simply when two separate variables are linearly related.
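To make the definition concrete, here is a minimal sketch that measures lag-1 autocorrelation as the ordinary Pearson correlation between a series and a shifted copy of itself. The helper name `lag_autocorr` and the synthetic series are assumptions made for illustration; they are not part of the original exercise.

```python
import numpy as np

def lag_autocorr(x, lag=1):
    """Pearson correlation between x[t] and x[t - lag] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[lag:], x[:-lag])[0, 1]

rng = np.random.default_rng(0)
noise = rng.normal(size=2000)   # independent draws: no serial dependence
walk = np.cumsum(noise)         # random walk: each value builds on the previous one

noise_r1 = lag_autocorr(noise, lag=1)
walk_r1 = lag_autocorr(walk, lag=1)

print(f"white noise lag-1 autocorrelation: {noise_r1:.3f}")
print(f"random walk lag-1 autocorrelation: {walk_r1:.3f}")
```

The independent noise should give a value near zero, while the random walk should give a value close to one, because each point is the previous point plus a small step.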
Why Autocorrelation Matters
Often, one of the first steps in any data analysis is performing regression analysis. However, one of the assumptions of regression analysis is that the errors are not autocorrelated. This can be frustrating because if you run a regression on data with autocorrelation, your analysis will be misleading: the standard errors and significance tests assume independent errors, so they can overstate how confident you should be in the fit.
Additionally, some time series forecasting methods (specifically regression modeling) rely on the assumption that there isn’t any autocorrelation in the residuals (the difference between the fitted model and the data). People often use the residuals to assess whether their model is a good fit while ignoring the assumption that the residuals have no autocorrelation (or that the errors are independent and identically distributed, i.i.d.). This mistake can mislead people into believing that their model is a good fit when in fact it isn’t. I highly recommend reading the article “How (not) to use Machine Learning for time series forecasting: Avoiding the pitfalls”, in which the author demonstrates how the increasingly popular LSTM (Long Short-Term Memory) network can appear to be an excellent univariate time series predictor when in reality it’s just overfitting the data. He goes on to explain how this misconception is the result of accuracy metrics failing due to the presence of autocorrelation.
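One common way to check residuals for autocorrelation is the Durbin-Watson statistic, which is close to 2 when residuals are uncorrelated and falls toward 0 under positive autocorrelation. The sketch below is an illustration on synthetic data, not the method used in the article mentioned above: the straight-line model, the AR(1) coefficient of 0.8, and the helper name `durbin_watson` are all assumptions.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(1)
n = 500
t = np.arange(n, dtype=float)

# Assumed error process: AR(1), each error carries over 80% of the previous one.
shocks = rng.normal(size=n)
errors = np.zeros(n)
for i in range(1, n):
    errors[i] = 0.8 * errors[i - 1] + shocks[i]

# A linear trend plus autocorrelated errors.
y = 2.0 + 0.05 * t + errors

# Ordinary least-squares straight-line fit, then residuals.
slope, intercept = np.polyfit(t, y, 1)
residuals = y - (intercept + slope * t)

dw_fit = durbin_watson(residuals)                # autocorrelated residuals
dw_iid = durbin_watson(rng.normal(size=n))       # independent residuals, for comparison

print(f"Durbin-Watson, fitted model residuals: {dw_fit:.2f}")
print(f"Durbin-Watson, independent noise:      {dw_iid:.2f}")
```

The fitted line recovers the trend well, yet the low statistic for its residuals signals that the independence assumption does not hold, which is exactly the situation where residual-based goodness-of-fit checks can mislead.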
Finally, perhaps the most compelling aspect of autocorrelation analysis is how it can help us uncover hidden patterns in our data and help us select the correct forecasting methods. Specifically, we can use it to help identify seasonality and trend in our time series data. Additionally, analyzing the autocorrelation function (ACF) and partial autocorrelation function (PACF) in conjunction is necessary for selecting the appropriate ARIMA model for your time series prediction.
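As a rough illustration of how the ACF can reveal seasonality, the sketch below computes a sample autocorrelation function with NumPy and looks for the lag with the strongest correlation. The synthetic series with a 12-step cycle, the helper name `acf`, and the search window of lags 6 to 18 are assumptions chosen for the example; in practice a library routine such as the ACF and PACF functions in statsmodels would typically be used.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    values = [1.0]
    for k in range(1, max_lag + 1):
        # Covariance between the series and itself shifted by k, scaled by the lag-0 value.
        values.append(np.sum(x[k:] * x[:-k]) / denom)
    return np.array(values)

rng = np.random.default_rng(2)
t = np.arange(480)
# Assumed data: a cycle that repeats every 12 steps, plus a little noise.
seasonal = np.sin(2 * np.pi * t / 12) + 0.2 * rng.normal(size=t.size)

r = acf(seasonal, max_lag=24)

# Look for the strongest positive autocorrelation among lags 6..18.
peak_lag = int(np.argmax(r[6:19])) + 6

print(f"autocorrelation at lag 6:  {r[6]:.2f}")
print(f"autocorrelation at lag 12: {r[12]:.2f}")
print(f"strongest lag in 6..18:    {peak_lag}")
```

A seasonal series shows ACF peaks at multiples of its period and troughs at half-periods, whereas a trending series tends to show an ACF that decays slowly across many lags.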
How to Determine if Your Time Series Data Has Autocorrelation with Python
For this exercise, I’m using InfluxDB and the InfluxDB Python client library. The data comes from the National Oceanic and Atmospheric Administration’s (NOAA) Center for Operational Oceanographic Products and Services. Specifically, I will be looking at the water levels and water temperatures of a river in Santa Monica.