3.1.7. Cross validation of time series data

Time series data is characterised by the correlation between observations that are near in time (autocorrelation). However, classical cross-validation techniques such as KFold and ShuffleSplit assume the samples are independent and identically distributed, and would result in unreasonable correlation between training and testing instances (yielding poor estimates of generalisation error) on time series data. Therefore, it is very important to evaluate our model for time series data on the “future” observations least like those that are used to train the model. To achieve this, one solution is provided by TimeSeriesSplit.

3.1.7.1. Time Series Split

TimeSeriesSplit is a variation of k-fold which returns first  folds as train set and the  th fold as test set. Note that unlike standard cross-validation methods, successive training sets are supersets of those that come before them. Also, it adds all surplus data to the first training partition, which is always used to train the model.

This class can be used to cross-validate time series data samples that are observed at fixed time intervals.

Example of 3-split time series cross-validation on a dataset with 6 samples:

>>>

>>> from sklearn.model_selection import TimeSeriesSplit

>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> tscv = TimeSeriesSplit(n_splits=3)
>>> print(tscv)
TimeSeriesSplit(n_splits=3)
>>> for train, test in tscv.split(X):
... print("%s %s" % (train, test))
[0 1 2] [3]
[0 1 2 3] [4]
[0 1 2 3 4] [5]

3.1.7. Cross validation of time series data的更多相关文章

  1. 交叉验证(Cross Validation)原理小结

    交叉验证是在机器学习建立模型和验证模型参数时常用的办法.交叉验证,顾名思义,就是重复的使用数据,把得到的样本数据进行切分,组合为不同的训练集和测试集,用训练集来训练模型,用测试集来评估模型预测的好坏. ...

  2. 交叉验证 Cross validation

    来源:CSDN: boat_lee 简单交叉验证 hold-out cross validation 从全部训练数据S中随机选择s个样例作为训练集training set,剩余的作为测试集testin ...

  3. Cross Validation done wrong

    Cross Validation done wrong Cross validation is an essential tool in statistical learning 1 to estim ...

  4. 交叉验证(cross validation)

    转自:http://www.vanjor.org/blog/2010/10/cross-validation/ 交叉验证(Cross-Validation): 有时亦称循环估计, 是一种统计学上将数据 ...

  5. 10折交叉验证(10-fold Cross Validation)与留一法(Leave-One-Out)、分层采样(Stratification)

    10折交叉验证 我们构建一个分类器,输入为运动员的身高.体重,输出为其从事的体育项目-体操.田径或篮球. 一旦构建了分类器,我们就可能有兴趣回答类似下述的问题: . 该分类器的精确率怎么样? . 该分 ...

  6. Cross Validation(交叉验证)

    交叉验证(Cross Validation)方法思想 Cross Validation一下简称CV.CV是用来验证分类器性能的一种统计方法. 思想:将原始数据(dataset)进行分组,一部分作为训练 ...

  7. S折交叉验证(S-fold cross validation)

    S折交叉验证(S-fold cross validation) 觉得有用的话,欢迎一起讨论相互学习~Follow Me 仅为个人观点,欢迎讨论 参考文献 https://blog.csdn.net/a ...

  8. 交叉验证(Cross Validation)简介

    参考    交叉验证      交叉验证 (Cross Validation)刘建平 一.训练集 vs. 测试集 在模式识别(pattern recognition)与机器学习(machine lea ...

  9. cross validation笔记

    preface:做实验少不了交叉验证,平时常用from sklearn.cross_validation import train_test_split,用train_test_split()函数将数 ...

随机推荐

  1. 《linux系统及其编程》实验课记录(四)

    实验4:组织目录和文件 实验目标: 熟悉几个基本的操作系统文件和目录的命令的功能.语法和用法, 整理出一个更有条理的主目录,每个文件都位于恰当的子目录. 实验背景: 你的主目录中已经积压了一些文件,你 ...

  2. python 处理抓取网页乱码

    python 处理抓取网页乱码问题一招鲜   相信用python的人一定在抓取网页时,被编码问题弄晕过一阵 前几天写了一个测试网页的小脚本,并查找是否包含指定的信息. 在html = urllib2. ...

  3. 创建4个线程,两个对j加一,两个对j减一(j两同两内)

    package multithread; public class MyThread { //j变量私有 private int j; //同步的+1方法 private synchronized v ...

  4. Java多线程实现自然同步(内含演示案例)

    1.准备一个生产者类: public class Producer extends Thread{ private String name; private Market mkt; static in ...

  5. java基础---->数组的基础使用(一)

    数组是一种效率最高的存储和随机访问对象引用序列的方式,我们今天来对数组做简单的介绍.手写瑶笺被雨淋,模糊点画费探寻,纵然灭却书中字,难灭情人一片心. 数组的简单使用 一.数组的赋值 String[] ...

  6. ios UITableView中Cell重用机制导致内容重复解决方法

    UITableView继承自UIScrollview,是苹果为我们封装好的一个基于scroll的控件.上面主要是一个个的 UITableViewCell,可以让UITableViewCell响应一些点 ...

  7. POI导出EXCEL经典实现(转载)

    1.Apache POI简介 Apache POI是Apache软件基金会的开放源码函式库,POI提供API给Java程式对Microsoft Office格式档案读和写的功能. .NET的开发人员则 ...

  8. git sourcetree忽略某些文件提交

    打开sourcetree 点击edit按钮,在文件中加入如下内容.*.iws*.iml*.iprtarget/.settings.project.classpath.externalToolBuild ...

  9. Java--运算符的优先级表

    Java运算符的优先级表:

  10. MySQL备份1356错误提示修复办法

    mysqldump备份出现错误提示 mysqldump: Couldn't execute 'SHOW FIELDS FROM `view_videos`': View 'hekegame_video ...