PP: Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
From: Stanford University; Jure Leskovec (60,000+ citations)
Problem:
subsequence clustering.
Challenge:
Discovering patterns is challenging because it requires simultaneous segmentation and clustering of the time series, and because interpreting the cluster results is difficult.
Why is discovering time series patterns a challenge? Think it through yourself! There are already many distance measures (DTW, manifold distance) and clustering methods (k-means, etc.). But I admit that interpretation is difficult.
Introduction:
long time series --(break down)--> a sequence of states/patterns --> the time series can be expressed as a sequential timeline of a few key states --> discover repeated patterns, understand trends, detect anomalies, and better interpret large, high-dimensional datasets.
Key steps: simultaneously segment and cluster the time series.
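To make "simultaneously segment and cluster" concrete, here is a minimal sketch of one ingredient: assigning each time step to a cluster while charging a penalty every time the label changes, solved by dynamic programming. The function name, the `cost` table, and the penalty `beta` are illustrative assumptions, not the authors' code; in TICC the per-step cost would come from each cluster's Gaussian likelihood.

```python
def assign_with_switch_penalty(cost, beta):
    """Pick one cluster per time step.

    cost[t][k] is the (hypothetical) cost of putting step t in cluster k,
    e.g. a negative log-likelihood. The result minimises
    sum of costs + beta * (number of label changes).
    """
    num_steps, num_clusters = len(cost), len(cost[0])
    best = list(cost[0])   # best[k]: cheapest path ending in cluster k
    back = []              # back-pointers, one list per step after the first
    for t in range(1, num_steps):
        cheapest_prev = min(range(num_clusters), key=lambda k: best[k])
        new_best, pointers = [], []
        for k in range(num_clusters):
            stay = best[k]
            switch = best[cheapest_prev] + beta
            if stay <= switch:
                new_best.append(stay + cost[t][k])
                pointers.append(k)
            else:
                new_best.append(switch + cost[t][k])
                pointers.append(cheapest_prev)
        best = new_best
        back.append(pointers)
    # Walk the back-pointers from the cheapest final state.
    k = min(range(num_clusters), key=lambda j: best[j])
    labels = [k]
    for pointers in reversed(back):
        k = pointers[k]
        labels.append(k)
    labels.reverse()
    return labels


# Toy costs: step 2 slightly prefers cluster 1, steps 4-6 clearly prefer it.
toy_cost = [[0, 1], [0, 1], [1, 0.9], [0, 1], [1, 0], [1, 0], [1, 0]]
labels_no_penalty = assign_with_switch_penalty(toy_cost, beta=0.0)
labels_smoothed = assign_with_switch_penalty(toy_cost, beta=0.5)
```

With `beta=0` every step is labelled independently; a positive `beta` removes the one-step blip and yields two contiguous segments, which is the segmentation effect.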
Unsupervised learning: hard to interpret; after clustering, you have to go back and inspect the data itself.
How to discover interpretable structure in the data?
Traditional clustering methods are not particularly well-suited to discovering interpretable structure in the data, because they typically rely on distance-based metrics such as DTW.
Distance-based algorithms are at a disadvantage on multivariate time series: they cannot see subtle structural similarities in the data.
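A small illustration of the point about distance-based metrics, under toy assumptions: the three two-sensor segments below are made up, and plain Pearson correlation stands in for the partial correlations TICC uses. Segments `a` and `b` share the same dependency (the sensors move together) but are far apart in raw values, while `a` and `c` are close in raw values but have the opposite dependency.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def euclidean(seg1, seg2):
    """Euclidean distance between two segments given as (sensor_x, sensor_y)."""
    return math.sqrt(sum((a - b) ** 2
                         for s1, s2 in zip(seg1, seg2)
                         for a, b in zip(s1, s2)))


# Each segment is (sensor_x readings, sensor_y readings); values are illustrative.
seg_a = ([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])            # sensors move together
seg_b = ([50, 40, 30, 20, 10], [100, 80, 60, 40, 20])  # same dependency, other scale
seg_c = ([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])            # sensors move oppositely

dist_ab = euclidean(seg_a, seg_b)
dist_ac = euclidean(seg_a, seg_c)
corr_a, corr_b, corr_c = pearson(*seg_a), pearson(*seg_b), pearson(*seg_c)
```

A distance-based method would group `a` with `c`; a structure-based view groups `a` with `b`.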
Proposes a new method for multivariate time series clustering, TICC:
- defines each cluster as a dependency network showing the relationships between the different sensors in a short subsequence.
- each cluster is a Markov random field (MRF).
- in these MRFs, an edge represents a partial correlation between two variables.
- learns each cluster's MRF by estimating a sparse Gaussian inverse covariance matrix.
- this network has multiple layers.
- the number of layers corresponds to the window size of a short subsequence.
- the inverse covariance matrix defines the adjacency matrix of the MRF dependency network.
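The bullets above can be sketched as follows, assuming NumPy is available. This is only an illustration: it uses a plain matrix inverse and a hand-picked threshold instead of a sparse estimator, so it is not the TICC estimation procedure, and the helper names (`stack_windows`, `partial_correlations`) and the synthetic chain data are hypothetical.

```python
import numpy as np


def stack_windows(series, w):
    """Turn a (T, n) series into (T - w + 1, n * w) rows.

    Each row concatenates w consecutive observations, so the resulting
    network has w "layers" of n sensors each.
    """
    T, _ = series.shape
    return np.hstack([series[i:T - w + 1 + i] for i in range(w)])


def partial_correlations(X):
    """Return the inverse covariance of X and the implied partial correlations."""
    theta = np.linalg.inv(np.cov(X, rowvar=False))
    d = np.sqrt(np.diag(theta))
    pcorr = -theta / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return theta, pcorr


# Synthetic chain of three sensors: a drives b, b drives c.
rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b = a + rng.normal(size=5000)
c = b + rng.normal(size=5000)
series = np.column_stack([a, b, c])

# Single-layer network: edges are the non-negligible partial correlations.
_, pcorr = partial_correlations(series)
adjacency = (np.abs(pcorr) > 0.1) & ~np.eye(3, dtype=bool)

# Two-layer network: window size 2 gives a 6 x 6 inverse covariance.
stacked = stack_windows(series, 2)
theta_w, _ = partial_correlations(stacked)
```

Here `a` and `c` are strongly correlated, yet their partial correlation is near zero because `b` explains the link, so the adjacency matrix keeps only the edges a-b and b-c.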
Related work:
time series clustering and convex optimization;
variations of DTW; symbolic representations; rule-based motif discovery;
However, these methods generally rely on distance-based metrics.
TICC is a model-based clustering method, like clustering based on ARMA, Gaussian mixture, or hidden Markov models.
- define each cluster by a Gaussian inverse covariance.
- so the Gaussian inverse covariance defines a Markov random field encoding the structural representation.
- K clusters, hence K inverse covariances.
Selecting the number of clusters: cross-validation; normalized mutual information; BIC or silhouette score.
I don't really understand this part yet.
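As a note on one of the selection criteria listed above, BIC trades fit against model size: BIC = p * ln(n) - 2 * ln(L), and the candidate with the lowest value wins. The sketch below only shows the arithmetic; the log-likelihoods and parameter counts are made-up placeholders, not results from TICC or from any dataset.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """Bayesian information criterion; lower is better."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


# Hypothetical candidates: K -> (log-likelihood, number of free parameters).
candidates = {2: (-1200.0, 20), 3: (-1100.0, 30), 4: (-1095.0, 40)}
num_samples = 500

scores = {k: bic(ll, p, num_samples) for k, (ll, p) in candidates.items()}
best_k = min(scores, key=scores.get)
```

In this toy table, going from 3 to 4 clusters barely improves the likelihood, so the extra parameters are not worth their penalty and K = 3 has the lowest score.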
Supplementary knowledge:
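1. For unsupervised learning, interpreting the results and choosing intermediate parameters currently rely entirely on experience.
2. Aarhus data, Martin: multivariate time series forecasting.
3. Toeplitz matrices: constant-diagonal matrices (each descending diagonal holds a single repeated value).
4. TICC code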
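For item 3, a minimal pure-Python sketch of what "constant diagonal" means. The helper names are illustrative; TICC applies the same idea at the block level, with sub-matrices repeating along the diagonals of each inverse covariance.

```python
def toeplitz(first_col, first_row):
    """Build a Toeplitz matrix from its first column and first row."""
    if first_col[0] != first_row[0]:
        raise ValueError("first_col[0] and first_row[0] must match")
    return [[first_col[i - j] if i >= j else first_row[j - i]
             for j in range(len(first_row))]
            for i in range(len(first_col))]


def is_toeplitz(matrix):
    """True if every entry equals its upper-left neighbour."""
    return all(matrix[i][j] == matrix[i - 1][j - 1]
               for i in range(1, len(matrix))
               for j in range(1, len(matrix[0])))


example = toeplitz([1, 2, 3], [1, 4, 5])
# example is:
# [[1, 4, 5],
#  [2, 1, 4],
#  [3, 2, 1]]
```

Because entry (i, j) depends only on i - j, a relationship between two positions depends on their offset and not on where they sit, which is what lets a short window describe a time-invariant pattern.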