PP: Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
From: Stanford University; Jure Leskovec (60,000+ citations).
Problem:
subsequence clustering.
Challenge:
Discovering patterns is challenging because it requires simultaneous segmentation and clustering of the time series, and because interpreting the clustering results is difficult.
Why is discovering time series patterns a challenge? Think it through yourself: there are already many distance measures (DTW, manifold distance) and clustering methods (kNN, k-means, etc.). Interpretation, though, is admittedly difficult.
Introduction:
A long time series is broken down into a sequence of states/patterns, so the time series can be expressed as a sequential timeline of a few key states. This makes it possible to discover repeated patterns, understand trends, detect anomalies, and better interpret large, high-dimensional datasets.
Key steps: simultaneously segment and cluster the time series.
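To make "simultaneously segment and cluster" concrete, here is a minimal sketch of the assignment half of the problem. It assumes a per-time-step cost for each cluster is already available (for example, a negative log-likelihood under that cluster's model) and adds a switching penalty `beta` that discourages changing cluster between neighbouring time steps. The function name, the cost table, and `beta` are illustrative assumptions, not the paper's code.

```python
def assign_with_switch_penalty(costs, beta):
    """Pick one cluster per time step, minimising
    sum(costs[t][k_t]) + beta * (number of cluster switches).

    costs[t][k] is the (assumed, precomputed) cost of putting step t in cluster k.
    Solved by dynamic programming over time steps.
    """
    T, K = len(costs), len(costs[0])
    best = list(costs[0])  # best[k]: cheapest path ending in cluster k at the current step
    back = []              # back[t-1][k]: cluster used at step t-1 on that cheapest path
    for t in range(1, T):
        cheapest_prev = min(range(K), key=lambda k: best[k])
        new_best, ptr = [], []
        for k in range(K):
            stay = best[k]                       # remain in cluster k
            switch = best[cheapest_prev] + beta  # jump in from the cheapest cluster
            if stay <= switch:
                new_best.append(stay + costs[t][k])
                ptr.append(k)
            else:
                new_best.append(switch + costs[t][k])
                ptr.append(cheapest_prev)
        best = new_best
        back.append(ptr)
    # Trace the optimal path backwards from the cheapest final cluster.
    k = min(range(K), key=lambda j: best[j])
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path


# Made-up costs for 6 time steps and 2 clusters.
costs = [[0, 1], [0, 1], [0.6, 0.5], [0, 1], [1, 0], [1, 0]]
print(assign_with_switch_penalty(costs, beta=0))  # [0, 0, 1, 0, 1, 1]
print(assign_with_switch_penalty(costs, beta=1))  # [0, 0, 0, 0, 1, 1]
```

With `beta=0` every step simply takes its cheapest cluster; with `beta=1` the noisy flip at step 2 is smoothed away, so the result is two contiguous segments.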
Unsupervised learning is hard to interpret: after clustering, you still have to inspect the data itself.
How can we discover interpretable structure in the data?
Traditional clustering methods are not particularly well-suited to discovering interpretable structure in the data. This is because they typically rely on distance-based metrics such as DTW.
Distance-based algorithms are at a disadvantage on multivariate time series: they cannot see subtle structural similarities in the data.
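For reference, the DTW distance mentioned above can be sketched in a few lines. This is the textbook dynamic-programming form for two 1-D sequences with absolute difference as the local cost; it is only an illustration of a distance-based metric, not something taken from the paper.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0: same shape, different speed
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

DTW compares raw values along a warped time axis. It says nothing about how the different sensors depend on each other, which is the kind of structure TICC tries to capture.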
The paper proposes TICC, a new method for multivariate time series clustering:
- Define each cluster as a dependency network showing the relationships between the different sensors in a short subsequence.
- Each cluster is a Markov random field (MRF).
- In these MRFs, an edge represents a partial correlation between two variables.
- Learn each cluster's MRF by estimating a sparse Gaussian inverse covariance matrix.
- This network has multiple layers.
- The number of layers corresponds to the window size of a short subsequence.
- The inverse covariance matrix defines the adjacency matrix of the MRF dependency network.
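The link between the inverse covariance matrix and the MRF edges in the list above can be sketched as follows. For a Gaussian with inverse covariance Θ, the partial correlation between variables i and j is -Θ_ij / sqrt(Θ_ii Θ_jj), so a zero entry in Θ means no edge. The matrix `theta` below is a made-up 3-variable example and the helper names are hypothetical; estimating a sparse Θ from data (e.g. with graphical lasso) is not shown.

```python
import math


def partial_correlations(theta):
    """Convert an inverse covariance (precision) matrix into partial correlations."""
    n = len(theta)
    rho = [[1.0] * n for _ in range(n)]  # diagonal is 1 by convention
    for i in range(n):
        for j in range(n):
            if i != j:
                rho[i][j] = -theta[i][j] / math.sqrt(theta[i][i] * theta[j][j])
    return rho


def adjacency(theta, tol=1e-8):
    """MRF adjacency matrix: an edge wherever the off-diagonal entry is nonzero."""
    n = len(theta)
    return [[int(i != j and abs(theta[i][j]) > tol) for j in range(n)]
            for i in range(n)]


# Made-up sparse precision matrix for three sensors (a chain 0 - 1 - 2).
theta = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]

print(partial_correlations(theta)[0][1])  # 0.5
print(adjacency(theta))                   # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```

Sensors 0 and 2 have no direct edge: given sensor 1 they are conditionally independent, which is what makes the network readable as a dependency structure.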
Related work:
Time series clustering and convex optimization;
variations of DTW; symbolic representations; rule-based motif discovery.
However, these methods generally rely on distance-based metrics.
TICC is a model-based clustering method, like ARMA, Gaussian mixture, or hidden Markov models.
- Define each cluster by a Gaussian inverse covariance.
- The Gaussian inverse covariance then defines a Markov random field encoding the structural representation.
- K clusters means K inverse covariances.
Selecting the number of clusters: cross-validation, normalized mutual information, BIC, or silhouette score.
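As one illustration of the BIC option, the sketch below scores candidate values of K with the standard formula BIC = p ln(n) - 2 ln(L) and keeps the smallest. It assumes each candidate model has already been fitted, so its log-likelihood and parameter count are known. The numbers are invented purely to show the mechanics, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def pick_k_by_bic(candidates, n_samples):
    """candidates maps K -> (log_likelihood, n_params) of an already fitted model."""
    scores = {k: bic(ll, p, n_samples) for k, (ll, p) in candidates.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Invented values: the fit improves a lot from K=2 to K=3, barely from K=3 to K=4.
candidates = {2: (-5200.0, 20), 3: (-5000.0, 30), 4: (-4990.0, 40)}
best_k, scores = pick_k_by_bic(candidates, n_samples=1000)
print(best_k)  # 3
```

The extra parameters of K=4 cost more than the small likelihood gain they buy, so BIC settles on K=3 here.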
I don't really understand this part yet T T
Supplementary knowledge:
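1. For unsupervised learning, interpreting the results and choosing intermediate parameters currently rely entirely on experience.
2. Aarhus data, Martin: multivariate time series forecasting.
3. Toeplitz matrices: constant-diagonal matrices.
4. TICC code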
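A small sketch of item 3: in a Toeplitz matrix, entry (i, j) depends only on i - j, so every diagonal is constant. The helpers below are hypothetical and work on plain lists. In TICC the constraint is applied at block level (sub-blocks are repeated along the diagonals of the inverse covariance), which this scalar example does not show.

```python
def toeplitz(first_col, first_row=None):
    """Build a Toeplitz matrix from its first column and first row.

    If first_row is omitted, the matrix is symmetric.
    """
    if first_row is None:
        first_row = first_col
    if first_row[0] != first_col[0]:
        raise ValueError("first_col[0] and first_row[0] must match")
    n, m = len(first_col), len(first_row)
    return [[first_col[i - j] if i >= j else first_row[j - i] for j in range(m)]
            for i in range(n)]


def is_toeplitz(mat):
    """True if every entry equals its upper-left neighbour (constant diagonals)."""
    return all(mat[i][j] == mat[i - 1][j - 1]
               for i in range(1, len(mat))
               for j in range(1, len(mat[0])))


print(toeplitz([1, 2, 3]))                     # [[1, 2, 3], [2, 1, 2], [3, 2, 1]]
print(is_toeplitz([[1, 2], [3, 1]]))           # True
print(is_toeplitz([[1, 2], [3, 4]]))           # False
```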