3.1.7. Cross validation of time series data
3.1.7. Cross validation of time series data
Time series data is characterised by the correlation between observations that are near in time (autocorrelation). However, classical cross-validation techniques such as KFold
and ShuffleSplit
assume the samples are independent and identically distributed, and would result in unreasonable correlation between training and testing instances (yielding poor estimates of generalisation error) on time series data. Therefore, it is very important to evaluate our model for time series data on the “future” observations least like those that are used to train the model. To achieve this, one solution is provided by TimeSeriesSplit
.
3.1.7.1. Time Series Split
TimeSeriesSplit
is a variation of k-fold which returns first folds as train set and the
th fold as test set. Note that unlike standard cross-validation methods, successive training sets are supersets of those that come before them. Also, it adds all surplus data to the first training partition, which is always used to train the model.
This class can be used to cross-validate time series data samples that are observed at fixed time intervals.
Example of 3-split time series cross-validation on a dataset with 6 samples:
>>> from sklearn.model_selection import TimeSeriesSplit >>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4, 5, 6])
>>> tscv = TimeSeriesSplit(n_splits=3)
>>> print(tscv)
TimeSeriesSplit(n_splits=3)
>>> for train, test in tscv.split(X):
... print("%s %s" % (train, test))
[0 1 2] [3]
[0 1 2 3] [4]
[0 1 2 3 4] [5]
3.1.7. Cross validation of time series data的更多相关文章
- 交叉验证(Cross Validation)原理小结
交叉验证是在机器学习建立模型和验证模型参数时常用的办法.交叉验证,顾名思义,就是重复的使用数据,把得到的样本数据进行切分,组合为不同的训练集和测试集,用训练集来训练模型,用测试集来评估模型预测的好坏. ...
- 交叉验证 Cross validation
来源:CSDN: boat_lee 简单交叉验证 hold-out cross validation 从全部训练数据S中随机选择s个样例作为训练集training set,剩余的作为测试集testin ...
- Cross Validation done wrong
Cross Validation done wrong Cross validation is an essential tool in statistical learning 1 to estim ...
- 交叉验证(cross validation)
转自:http://www.vanjor.org/blog/2010/10/cross-validation/ 交叉验证(Cross-Validation): 有时亦称循环估计, 是一种统计学上将数据 ...
- 10折交叉验证(10-fold Cross Validation)与留一法(Leave-One-Out)、分层采样(Stratification)
10折交叉验证 我们构建一个分类器,输入为运动员的身高.体重,输出为其从事的体育项目-体操.田径或篮球. 一旦构建了分类器,我们就可能有兴趣回答类似下述的问题: . 该分类器的精确率怎么样? . 该分 ...
- Cross Validation(交叉验证)
交叉验证(Cross Validation)方法思想 Cross Validation一下简称CV.CV是用来验证分类器性能的一种统计方法. 思想:将原始数据(dataset)进行分组,一部分作为训练 ...
- S折交叉验证(S-fold cross validation)
S折交叉验证(S-fold cross validation) 觉得有用的话,欢迎一起讨论相互学习~Follow Me 仅为个人观点,欢迎讨论 参考文献 https://blog.csdn.net/a ...
- 交叉验证(Cross Validation)简介
参考 交叉验证 交叉验证 (Cross Validation)刘建平 一.训练集 vs. 测试集 在模式识别(pattern recognition)与机器学习(machine lea ...
- cross validation笔记
preface:做实验少不了交叉验证,平时常用from sklearn.cross_validation import train_test_split,用train_test_split()函数将数 ...
随机推荐
- python XlsxWriter Example: Hello World
http://xlsxwriter.readthedocs.io/example_hello_world.html The simplest possible spreadsheet. This is ...
- jQuery功能函数详解
jQuery通过$.browser对象获取浏览器信息. 属性 说明msie 如果是ie为true,否则为falsemozilla 如果是mozilla相关的浏览器为true,否则为falsesafar ...
- jQuery实现提交按钮点击后变成正在处理字样并禁止点击的方法
本文实例讲述了jQuery实现提交按钮点击后变成正在处理字样并禁止点击的方法.分享给大家供大家参考.具体实现方法如下: 这里主要通过val方法设置按钮的文字,并用attr方法修改disabled属性实 ...
- C语言关系运算符
在上节<C语言if else语句>中看到,if 的判断条件中使用了<=.>.!=等符号,它们专门用在判断条件中,让程序决定下一步的操作,称为关系运算符(Relational O ...
- 谷歌浏览器chrome://inspect/#devices调试webview的页面和控制台布局错乱问题
谷歌浏览器chrome://inspect/#devices调试webview的页面和控制台布局错乱问题 : 谷歌浏览器的版本过高,选择60版本即可: 版本 60.0.3080.5(正式版本)
- Oracle中与日期时间有关的运算函数
1 ADD_MONTHS 格式:ADD_MONTHS(D,N) 说明:返回日期时间D加N月后对应的日期时间.N为正时则表示D之后:N为负时则表示为D之前:N为小数则会自动先删除小 ...
- vue 阻止表单默认的提交事件
form <form autocomplete="off" @submit.prevent="onSubmit"> <input type=& ...
- angular4 *ngFor获取index
angular 中的*ngFor指令的使用 <ul> <li *ngFor="let item in items">{{item}}</li> ...
- Groovy中的脚本与类
包名 当你在groovy中定义类的时候需要指定包名,这和java中类似不多介绍. 导入 groovy中的导入也跟java类似,有一下五种: 默认导入 groovy默认导入了一下几个包和类: impor ...
- 谨防in、or 公用性能问题
今天遇到一个奇葩的问题:在where条件中用了 m in(×××) or m>=10,查询直接超时,我看了一下,数据库中就2万条数据 我将查询改为了union all 结果就不超时了