https://en.wikipedia.org/wiki/Cluster_analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς "grape") and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest.

Cluster analysis was originated in anthropology by Driver and Kroeber in 1932 and introduced to psychology by Zubin in 1938 and Robert Tryon in 1939[1][2] and famously used by Cattell beginning in 1943[3] for trait theory classification in personality psychology.

Cluster analysis的更多相关文章

  1. cluster analysis in data mining

    https://en.wikipedia.org/wiki/K-means_clustering k-means clustering is a method of vector quantizati ...

  2. UVALive 6906 Cluster Analysis 并查集

    Cluster Analysis 题目连接: https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemi ...

  3. UVALive 6906 A - Cluster Analysis

    思路:排个序,依次选就好了. #include <bits/stdc++.h> #define PB push_back #define MP make_pair using namesp ...

  4. K-Means 聚类算法

    K-Means 概念定义: K-Means 是一种基于距离的排他的聚类划分方法. 上面的 K-Means 描述中包含了几个概念: 聚类(Clustering):K-Means 是一种聚类分析(Clus ...

  5. 地理信息系统 - ArcGIS - 高/低聚类分析工具(High/Low Clustering ---Getis-Ord General G)

    前段时间在学习空间统计相关的知识,于是把ArcGIS里Spatial Statistics工具箱里的工具好好研究了一遍,同时也整理了一些笔记上传分享.这一篇先聊一些基础概念,工具介绍篇随后上传. 空间 ...

  6. K-means聚类算法

    聚类分析(英语:Cluster analysis,亦称为群集分析) K-means也是聚类算法中最简单的一种了,但是里面包含的思想却是不一般.最早我使用并实现这个算法是在学习韩爷爷那本数据挖掘的书中, ...

  7. Bioinformatics Glossary

    原文:http://homepages.ulb.ac.be/~dgonze/TEACHING/bioinfo_glossary.html Affine gap costs: A scoring sys ...

  8. (转) Deep Learning Research Review Week 2: Reinforcement Learning

      Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...

  9. Spark入门实战系列--8.Spark MLlib(下)--机器学习库SparkMLlib实战

    [注]该系列文章以及使用到安装包/测试数据 可以在<倾情大奉送--Spark入门实战系列>获取 .MLlib实例 1.1 聚类实例 1.1.1 算法说明 聚类(Cluster analys ...

随机推荐

  1. Java Hour1

    有句名言,叫做10000小时成为某一个领域的专家.姑且不辩论这句话是否正确,让我们到达10000小时的时候再回头来看吧. 本文作者Java 经验约为0 Hour,请各位不吝赐教. Hour1 : 简单 ...

  2. 用sqlplus登陆数据库时,oracle 11g出现ORA-12514问题

    转自:http://zhidao.baidu.com/question/144648216.html 启动服务 然后在sqlplus / as sysdba;执行启动startup nomount;a ...

  3. android操作XML的几种方式(转)

    XML作为一种业界公认的数据交换格式,在各个平台与语言之上,都有广泛使用和实现.其标准型,可靠性,安全性......毋庸置疑.在android平台上,我们要想实现数据存储和数据交换,经常会使用到xml ...

  4. Android Studio 获取 sha1-wangfeng520@

    WIN+R   打开“运行”  输入  CMD 回车 2 CD C:\Program Files\Java\jdk1.7.0_71\bin     (JDK安装路径) keytool -list -v ...

  5. 使用OUYA第一次启动OUYA

    使用OUYA第一次启动OUYA 1.4  使用OUYA 初次使用OUYA时,其启动以后的设置过程耗时较长,也比较繁琐,因此本节将会对其做个详细介绍,让读者的使用过程更加顺利些!好的开端总归是一个不错的 ...

  6. 理解Null,Undefined,NAN

    1.null表示尚未存在的对象,转为数值时为0.它表示"没有对象",即该处不应该有值,常用来表示函数企图返回一个不存在的对象.null是一种特殊的object(引用类型),代表一个 ...

  7. css构造文本

    1. 1. 文本缩进text-indent:值:值为数字,最常用的数值单位是px(像素),也可以直接是百分比!text-indent:100px;text-indent:10%;2. 文本对齐text ...

  8. POJ2240 Arbitrage(Floyd判负环)

    跑完Floyd后,d[u][u]就表示从u点出发可以经过所有n个点回到u点的最短路,因此只要根据数组对角线的信息就能判断是否存在负环. #include<cstdio> #include& ...

  9. 李洪强-C语言5-函数

    C语言函数 一.函数 C语言程序是由函数构成的,每个函数负责完成一部分的功能,函数将工恩呢该封装起来,以供程序调用. 二.函数定义 目的:将一些常用的功能封装起来,以供日后调用. 步骤:确定函数名,确 ...

  10. shell中的case语句

    case语法: case $arg in arg1) 语句1 ;; arg2) 语句2 ;; *) help 语句 ;; esac eg: eg: