Problem: clustering

A clustering network transforms the data into another space and then selects one of the clusters. Next, the autoencoder associated with this cluster is used to reconstruct the data-point.

Introduction:

traditional method: data------> extract a feature vector from each object --------> aggregate groups of vectors in a feature space.

cluster is represented by an autoencoder network. ??how

common method: k-means; but for the high-dimensional dataset, it's less useful because inter-point distances become less informative in high-dimensional spaces.

如果对于找一个序列的pattern来说,是不是就是时间维度作为高维情况,每个pattern作为一个cluster,而有的子序列不能归到cluster当中。

representation learning has been used to map the input data into a low-dimensional feature space.

Attempts: apply unsupervised deep learning approaches for clustering.  ??how

However, most focus on clustering over a low-dimensional feature space.

Transform the data into more clustering-friendly representations:

A deep version of k-means is based on learning a data representation and applying k-means in the embedded space.

How to represent a cluster:

a vector VS an autoencoder network.

Data collapsing problem: 数据崩溃问题,对于每个数据库,你必须重新调一遍程序。

for multivariate time series, how to find patterns.

1. find patterns: SAX; TICC; slide windows; 导数

2. VG, statistic features.

3.

Supplementary knowledge: 

1. Pattern recognition and clustering

Pattern recognition is a mature field in computer science with well-established techniques for the assignment of unknown patterns to categories, or classes. A pattern is defined as a vector of some number of measurements, called features. Usually, a pattern recognition system uses training samples from known categories to form a decision rule for unknown patterns. The unknown pattern is assigned to one of the categories according to the decision rule. Since we are interested in the classes of documents that have been assigned by the user, we can use pattern recognition techniques to try to classify previously unseen documents into the user's categories. While pattern recognition techniques require that the number and labels of categories are known, clustering techniques are unsupervised, requiring no external knowledge of categories. Clustering methods simply try to group similar patterns into clusters whose members are more similar to each other (according to some distance measure) than to members of other clusters. There is no a priori knowledge of patterns that belong to certain groups, or even how many groups are appropriate. Refer to basic pattern recognition and clustering texts such as [567] for further information.

We first employ pattern recognition techniques on documents to attempt to find features for classification, then focus on clustering the raw features of the documents.

PP: Deep clustering based on a mixture of autoencoders的更多相关文章

  1. 基于图嵌入的高斯混合变分自编码器的深度聚类(Deep Clustering by Gaussian Mixture Variational Autoencoders with Graph Embedding, DGG)

    基于图嵌入的高斯混合变分自编码器的深度聚类 Deep Clustering by Gaussian Mixture Variational Autoencoders with Graph Embedd ...

  2. 论文翻译:2021_Towards model compression for deep learning based speech enhancement

    论文地址:面向基于深度学习的语音增强模型压缩 论文代码:没开源,鼓励大家去向作者要呀,作者是中国人,在语音增强领域 深耕多年 引用格式:Tan K, Wang D L. Towards model c ...

  3. 论文解读SDCN《Structural Deep Clustering Network》

    前言 主体思想:深度聚类需要考虑数据内在信息以及结构信息. 考虑自身信息采用 基础的 Autoencoder ,考虑结构信息采用 GCN. 1.介绍 在现实中,将结构信息集成到深度聚类中通常需要解决以 ...

  4. 【论文阅读】Deep Clustering for Unsupervised Learning of Visual Features

    文章:Deep Clustering for Unsupervised Learning of Visual Features 作者:Mathilde Caron, Piotr Bojanowski, ...

  5. 【RS】Deep Learning based Recommender System: A Survey and New Perspectives - 基于深度学习的推荐系统:调查与新视角

    [论文标题]Deep Learning based Recommender System: A Survey and New Perspectives ( ACM Computing Surveys  ...

  6. 论文笔记: Deep Learning based Recommender System: A Survey and New Perspectives

    (聊两句,突然记起来以前一个学长说的看论文要能够把论文的亮点挖掘出来,合理的进行概括23333) 传统的推荐系统方法获取的user-item关系并不能获取其中非线性以及非平凡的信息,获取非线性以及非平 ...

  7. Predicting effects of noncoding variants with deep learning–based sequence model | 基于深度学习的序列模型预测非编码区变异的影响

    Predicting effects of noncoding variants with deep learning–based sequence model PDF Interpreting no ...

  8. Deep Clustering Algorithms

    Deep Clustering Algorithms 作者:凯鲁嘎吉 - 博客园 http://www.cnblogs.com/kailugaji/ 本文研究路线:深度自编码器(Deep Autoen ...

  9. Paper Reading——LEMNA:Explaining Deep Learning based Security Applications

    Motivation: The lack of transparency of the deep  learning models creates key barriers to establishi ...

随机推荐

  1. Idea自定义代码块【学习笔记】

    前言 idea有一个自定义代码块的功能,可以自定义代码块,方便以后工作中减少一些重复操作,这里就简单记录一下idea好用的模板吧,现在有一个关于日志的模板,用于写一个ServiceImpl方法的时候, ...

  2. 1058 - Parallelogram Counting 计算几何

    1058 - Parallelogram Counting There are n distinct points in the plane, given by their integer coord ...

  3. JVM源码分析之临门一脚的OutOfMemoryError完全解读

    概述 OutOfMemoryError,说的是java.lang.OutOfMemoryError,是JDK里自带的异常,顾名思义,说的就是内存溢出,当我们的系统内存严重不足的时候就会抛出这个异常(P ...

  4. vue路由--命名路由

    有时我们通过一个名称来标识一个路由显得更方便一些,特别是在链接一个路由,或者是执行一些跳转的时候.你可以在创建 Router 实例的时候,在 routes 配置中给某个路由设置名称. 我们直接在路由下 ...

  5. Python学习小记(2)---[list, iterator, and, or, zip, dict.keys]

    1.List行为 可以用 alist[:] 相当于 alist.copy() ,可以创建一个 alist 的 shallo copy,但是直接对 alist[:] 操作却会直接操作 alist 对象 ...

  6. js面试相关

    〇,字符串,数值,数组的转化 (0)检测数据类型 参考连接:http://www.cnblogs.com/onepixel/p/5126046.html 1,, typeof 操作符 :  能检测到( ...

  7. [红日安全]Web安全Day1 - SQL注入实战攻防

    本文由红日安全成员: Aixic 编写,如有不当,还望斧正. 大家好,我们是红日安全-Web安全攻防小组.此项目是关于Web安全的系列文章分享,还包含一个HTB靶场供大家练习,我们给这个项目起了一个名 ...

  8. SAP VL10B 报错 - 4500000317 000010 交付 $ 1 的交付项目 000010 与 POD 无关-

    SAP VL10B 报错 - 4500000317 000010 交付 $ 1 的交付项目 000010 与 POD 无关- 如下公司间STO单据, 业务背景是货物从公司代码LYSP转入公司代码BTS ...

  9. 微信小程序如何下载超过大小限制(10M)的视频?(苹果用户仔细看,安卓用户快速看)

    众所周知,微信小程序对下载的文件大小有限制,目前是最大支持10M.我们在用去水印小程序保存视频的时候,如果遇到长视频,视频大小可能就超过限制.遇到这种情况,我们如何才能把视频保存到手机相册呢? 首先, ...

  10. Pycharm每次新建工程都要重新安装相关库的解决办法

    之前自己每次重建工程时,都不厌其烦的重新安装了第三方的库,直接在pycharm的terminal中利用pip安装,或者鼠标放在所需库的红色波浪线上 直接点击Install Package XXX 后面 ...