PP: Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
From: Stanford University; Jure Leskovec (60,000+ citations).
Problem:
subsequence clustering.
Challenges:
Discovering patterns is challenging because it requires simultaneous segmentation and clustering of the time series, and because interpreting the clustering results is difficult.
Why is discovering time series patterns a challenge? Think about it yourself! There are already many distance measures (DTW, manifold distance) and clustering methods (k-NN, k-means, etc.). But I admit that interpretation is difficult.
Introduction:
long time series --(break down)--> a sequence of states/patterns --> the time series can be expressed as a sequential timeline of a few key states --> discover repeated patterns / understand trends / detect anomalies / better interpret large, high-dimensional datasets.
Key steps: simultaneously segment and cluster the time series.
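To make "simultaneously segment and cluster" concrete, here is a minimal sketch of the assignment half of such a procedure: given a per-time-step cost for each cluster, pick the label sequence that minimizes total cost plus a penalty for every change of cluster. The names `assign_with_switch_penalty`, `costs` and `beta` are my own for illustration; the costs are assumed to come from some fitted cluster model (for example a negative log-likelihood), which is not shown here.

```python
def assign_with_switch_penalty(costs, beta):
    """Label each time step with a cluster, penalizing label changes.

    costs[t][k] is the cost of putting time step t in cluster k (assumed given).
    Returns the labels minimizing sum of costs + beta * (number of switches).
    """
    T, K = len(costs), len(costs[0])
    best = list(costs[0])  # best[k]: cheapest cost of any path ending in cluster k
    back = []              # back[t-1][k]: previous cluster on that cheapest path
    for t in range(1, T):
        k_min = min(range(K), key=lambda k: best[k])
        new_best, pointers = [], []
        for k in range(K):
            # Either stay in cluster k, or switch in from the cheapest previous cluster.
            if best[k] <= best[k_min] + beta:
                prev, base = k, best[k]
            else:
                prev, base = k_min, best[k_min] + beta
            new_best.append(base + costs[t][k])
            pointers.append(prev)
        back.append(pointers)
        best = new_best
    # Trace the cheapest path backwards from the last time step.
    k = min(range(K), key=lambda j: best[j])
    labels = [k]
    for pointers in reversed(back):
        k = pointers[k]
        labels.append(k)
    return labels[::-1]
```

With `beta = 0` every point simply takes its cheapest cluster; a larger `beta` merges short blips into their neighbours, which is what turns a per-point clustering into a segmentation.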
Unsupervised learning: the results are hard to interpret; after clustering, you still have to go back and inspect the data itself.
How to discover interpretable structure in the data?
Traditional clustering methods are not particularly well suited to discovering interpretable structure in the data, because they typically rely on distance-based metrics such as DTW.
Distance-based algorithms are at a disadvantage on multivariate time series: they cannot see subtle structural similarities in the data.
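For reference, this is roughly what a distance-based comparison looks like: a minimal sketch of classic dynamic time warping for two univariate sequences, with absolute difference as the local cost. The function name `dtw_distance` is my own; real libraries add windowing constraints and multivariate costs.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences, O(len(a) * len(b))."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

The result is a single number comparing raw values. It says nothing about how the variables of a multivariate series relate to each other, which is the kind of structure the note above says distance-based methods miss.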
The paper proposes a new method for multivariate time series clustering, TICC:
- define each cluster as a dependency network showing the relationships between the different sensors in a short subsequence.
- each cluster is a Markov random field (MRF).
- In these MRFs, an edge represents a partial correlation between two variables.
- learn each cluster's MRF by estimating a sparse Gaussian inverse covariance matrix.
- This network has multiple layers.
- the number of layers corresponds to the window size of a short subsequence.
- The inverse covariance matrix defines the adjacency matrix of the MRF dependency network.
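The list above can be sketched in a few lines, under some loud assumptions: the data is synthetic, a plain matrix inverse stands in for the sparse estimate (the paper's estimator is not reproduced here), and the threshold `0.2` is a hypothetical stand-in for the sparsity that an L1 penalty would give. The helper names `stack_windows` and `partial_correlations` are mine.

```python
import numpy as np

def stack_windows(X, w):
    """Turn a (T, n) series into (T - w + 1, n * w) rows of w consecutive observations."""
    T, n = X.shape
    return np.stack([X[t:t + w].reshape(-1) for t in range(T - w + 1)])

def partial_correlations(theta):
    """Partial correlation from an inverse covariance: -theta_ij / sqrt(theta_ii * theta_jj)."""
    d = np.sqrt(np.diag(theta))
    pc = -theta / np.outer(d, d)
    np.fill_diagonal(pc, 1.0)
    return pc

rng = np.random.default_rng(0)
T, n, w = 2000, 3, 2                 # 3 sensors, window of 2 time steps = 2 layers
X = rng.normal(size=(T, n))
X[:, 1] += 0.8 * X[:, 0]             # synthetic: sensor 1 depends on sensor 0

S = stack_windows(X, w)              # columns 0..2 are layer 1, columns 3..5 are layer 2
cov = np.cov(S, rowvar=False)
theta = np.linalg.inv(cov + 1e-6 * np.eye(n * w))  # plain inverse, NOT a sparse estimator
pc = partial_correlations(theta)

adjacency = np.abs(pc) > 0.2         # hypothetical threshold in place of L1 sparsity
np.fill_diagonal(adjacency, False)
```

Here `adjacency` is an `(n * w) x (n * w)` matrix: entry `(i, j)` says whether variable `i` and variable `j` (each a sensor at a given offset inside the window) are linked once all the other variables are accounted for. In this synthetic setup the only edges expected are between sensor 0 and sensor 1 within the same layer.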
Related work:
time series clustering and convex optimization;
variations of DTW; symbolic representations; rule-based motif discovery;
However, these methods generally rely on distance-based metrics.
TICC: a model-based clustering method, like ARMA, Gaussian mixture, or hidden Markov models.
- define each cluster by a Gaussian inverse covariance.
- so the Gaussian inverse covariance defines a Markov random field encoding the structural representation.
- K clusters/ inverse covariances.
Selecting the number of clusters: cross-validation, normalized mutual information, BIC, or silhouette score.
I can't follow this part yet T T
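Of the criteria listed, BIC is the easiest to write down: BIC = (number of parameters) * ln(number of samples) - 2 * (log-likelihood), and lower is better. A minimal sketch follows; `bic` and `pick_k` are hypothetical helper names, and the log-likelihood and parameter count for each candidate K are assumed to come from whatever model was fitted.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def pick_k(candidates, n_samples):
    """candidates maps K -> (log_likelihood, n_params); return the K with the smallest BIC."""
    return min(candidates, key=lambda k: bic(*candidates[k], n_samples))
```

The penalty term grows with the number of parameters, so adding clusters only pays off if the likelihood improves enough to outweigh it.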
Supplementary knowledge:
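1. For unsupervised learning, interpreting the results and choosing intermediate parameters currently relies entirely on experience.
2. Aarhus data, Martin: multivariate time series forecasting.
3. Toeplitz matrices: diagonal-constant matrices (every diagonal holds a single repeated value).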
4. ticc code
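As a small illustration of item 3, here is a sketch that assembles a symmetric block Toeplitz matrix, where the "entries" that stay constant along each diagonal are whole blocks. The function name `block_toeplitz` and the layout convention (block `A_k` below the diagonal, its transpose above) are assumptions of this sketch, not taken from the TICC code.

```python
import numpy as np

def block_toeplitz(blocks):
    """Build a symmetric block Toeplitz matrix from square blocks A_0, ..., A_{w-1}.

    Block (i, j) is A_{i-j} when i >= j and the transpose of A_{j-i} otherwise.
    A_0 is assumed symmetric so that the whole matrix is symmetric.
    """
    w = len(blocks)
    n = blocks[0].shape[0]
    out = np.zeros((n * w, n * w))
    for i in range(w):
        for j in range(w):
            k = i - j
            blk = blocks[k] if k >= 0 else blocks[-k].T
            out[i * n:(i + 1) * n, j * n:(j + 1) * n] = blk
    return out
```

Because the block at position `(i, j)` depends only on `i - j`, the relationship between two layers depends only on how far apart they are in time, not on where the window starts.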