https://en.wikipedia.org/wiki/Cluster_analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς "grape") and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest.

Cluster analysis was originated in anthropology by Driver and Kroeber in 1932 and introduced to psychology by Zubin in 1938 and Robert Tryon in 1939[1][2] and famously used by Cattell beginning in 1943[3] for trait theory classification in personality psychology.

Cluster analysis的更多相关文章

  1. cluster analysis in data mining

    https://en.wikipedia.org/wiki/K-means_clustering k-means clustering is a method of vector quantizati ...

  2. UVALive 6906 Cluster Analysis 并查集

    Cluster Analysis 题目连接: https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemi ...

  3. UVALive 6906 A - Cluster Analysis

    思路:排个序,依次选就好了. #include <bits/stdc++.h> #define PB push_back #define MP make_pair using namesp ...

  4. K-Means 聚类算法

    K-Means 概念定义: K-Means 是一种基于距离的排他的聚类划分方法. 上面的 K-Means 描述中包含了几个概念: 聚类(Clustering):K-Means 是一种聚类分析(Clus ...

  5. 地理信息系统 - ArcGIS - 高/低聚类分析工具(High/Low Clustering ---Getis-Ord General G)

    前段时间在学习空间统计相关的知识,于是把ArcGIS里Spatial Statistics工具箱里的工具好好研究了一遍,同时也整理了一些笔记上传分享.这一篇先聊一些基础概念,工具介绍篇随后上传. 空间 ...

  6. K-means聚类算法

    聚类分析(英语:Cluster analysis,亦称为群集分析) K-means也是聚类算法中最简单的一种了,但是里面包含的思想却是不一般.最早我使用并实现这个算法是在学习韩爷爷那本数据挖掘的书中, ...

  7. Bioinformatics Glossary

    原文:http://homepages.ulb.ac.be/~dgonze/TEACHING/bioinfo_glossary.html Affine gap costs: A scoring sys ...

  8. (转) Deep Learning Research Review Week 2: Reinforcement Learning

      Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...

  9. Spark入门实战系列--8.Spark MLlib(下)--机器学习库SparkMLlib实战

    [注]该系列文章以及使用到安装包/测试数据 可以在<倾情大奉送--Spark入门实战系列>获取 .MLlib实例 1.1 聚类实例 1.1.1 算法说明 聚类(Cluster analys ...

随机推荐

  1. android开发 NDK 编译和使用静态库、动态库 (转)

    在eclipse工程目录下建立一个jni的文件夹 在jni文件夹中建立Android.mk和Application.mk文件 Android.mk文件: Android提供的一种makefile文件, ...

  2. 那些年不错的Android开源项目

    那些年不错的Android开源项目 转载自 eoe 那些年不错的Android开源项目-个性化控件篇 第一部分 个性化控件(View) 主要介绍那些不错个性化的View,包括ListView.Acti ...

  3. Python开发的10个小贴士

    下面是十个Python中很有用的贴士和技巧.其中一些是初学这门语言常常会犯的错误. 注意:假设我们都用的是Python 3 1. 列表推导式 你有一个list:bag = [1, 2, 3, 4, 5 ...

  4. kali 安装火狐

    转自:http://www.kali.org.cn/thread-21271-1-1.html 安装火狐浏览器 打开终端 第一步:apt-get remove iceweasel 第二步: echo ...

  5. java的split的坑,会忽略空值

    String test = "@@@@"; String[] arrayTest = test.split("\\@"); System.out.println ...

  6. cmd.ExecuteNonQuery();和cmd.ExecuteScalar();

    C#...cmd.ExecuteNonQuery();是返回执行命令后影响的参数 返回符合你条件的所有语句,如果你要数据库里某张表的数据,说执行这个命令后他返回的是就是这张表的全部数据cmd.Exec ...

  7. MAXIMO系统 java webservice 中PDA移动应用系统开发

    MAXIMO系统 java webservice 中PDA移动应用系统开发  平时经常用的wince PDA手持设备调用c#写的webservice, 当然PDA也可以调用java webservic ...

  8. HashMap两种遍历数据的方式

    HashMap的遍历有两种方式,一种是entrySet的方式,另外一种是keySet的方式. 第一种利用entrySet的方式: Map map = new HashMap(); Iterator i ...

  9. CentOS6.4 配置HAProxy+Keepalived

    安装HAProxy请参考 http://www.cnblogs.com/kgdxpr/p/3272861.html 安装Keepalived 1.下载安装依赖包 yum install -y wget ...

  10. BZOJ1224: [HNOI2002]彩票

    Description 某地发行一套彩票.彩票上写有1到M这M个自然数.彩民可以在这M个数中任意选取N个不同的数打圈.每个彩民只能买一张彩票,不同的彩民的彩票上的选择不同.每次抽奖将抽出两个自然数X和 ...