https://en.wikipedia.org/wiki/Cluster_analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς "grape") and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest.

Cluster analysis was originated in anthropology by Driver and Kroeber in 1932 and introduced to psychology by Zubin in 1938 and Robert Tryon in 1939[1][2] and famously used by Cattell beginning in 1943[3] for trait theory classification in personality psychology.

Cluster analysis的更多相关文章

  1. cluster analysis in data mining

    https://en.wikipedia.org/wiki/K-means_clustering k-means clustering is a method of vector quantizati ...

  2. UVALive 6906 Cluster Analysis 并查集

    Cluster Analysis 题目连接: https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemi ...

  3. UVALive 6906 A - Cluster Analysis

    思路:排个序,依次选就好了. #include <bits/stdc++.h> #define PB push_back #define MP make_pair using namesp ...

  4. K-Means 聚类算法

    K-Means 概念定义: K-Means 是一种基于距离的排他的聚类划分方法. 上面的 K-Means 描述中包含了几个概念: 聚类(Clustering):K-Means 是一种聚类分析(Clus ...

  5. 地理信息系统 - ArcGIS - 高/低聚类分析工具(High/Low Clustering ---Getis-Ord General G)

    前段时间在学习空间统计相关的知识,于是把ArcGIS里Spatial Statistics工具箱里的工具好好研究了一遍,同时也整理了一些笔记上传分享.这一篇先聊一些基础概念,工具介绍篇随后上传. 空间 ...

  6. K-means聚类算法

    聚类分析(英语:Cluster analysis,亦称为群集分析) K-means也是聚类算法中最简单的一种了,但是里面包含的思想却是不一般.最早我使用并实现这个算法是在学习韩爷爷那本数据挖掘的书中, ...

  7. Bioinformatics Glossary

    原文:http://homepages.ulb.ac.be/~dgonze/TEACHING/bioinfo_glossary.html Affine gap costs: A scoring sys ...

  8. (转) Deep Learning Research Review Week 2: Reinforcement Learning

      Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...

  9. Spark入门实战系列--8.Spark MLlib(下)--机器学习库SparkMLlib实战

    [注]该系列文章以及使用到安装包/测试数据 可以在<倾情大奉送--Spark入门实战系列>获取 .MLlib实例 1.1 聚类实例 1.1.1 算法说明 聚类(Cluster analys ...

随机推荐

  1. 同网段下,windows自带远程桌面连接

    1.服务器关闭防火墙 2.右键点击’我的电脑‘进入’属性‘点击左侧菜单栏中的’远程设置‘: 把远程桌面选项设置成’允许运行任意版本远程桌面的计算机连接‘. 3.客户端点击“开始”在附件菜单下面找到“远 ...

  2. EAS使用中FineUI的配置

    <?xml version="1.0" encoding="utf-8"?> <configuration> <configSec ...

  3. poj 2245 水题

    求组合数,dfs即可 #include<cstdio> #include<iostream> #include<algorithm> #include<cst ...

  4. GDUT 校赛01 dp

    aaarticlea/png;base64,iVBORw0KGgoAAAANSUhEUgAABT8AAAILCAIAAAChHn9YAAAgAElEQVR4nOy9f4il13nneUGgxrRYux ...

  5. JavaEE路径陷阱之getRealPath

    转自:http://blog.csdn.net/shendl/article/details/1427637   JavaEE路径陷阱之getRealPath   本文是<Java路径问题最终解 ...

  6. WebService站点服务的地址

    天气的地址 http://webservice.webxml.com.cn/WebServices/WeatherWS.asmx

  7. Spark metrics on wordcount example

    I read the section Metrics on spark website. I wish to try it on the wordcount example, I can't make ...

  8. 【框架】RefreshListView下拉刷新

    布局: <com.example.administrator.d30_myrefreshlistview.RefreshListView android:id="@+id/refres ...

  9. synchronized的理解

    用法解释 synchronized是Java中的关键字,是一种同步锁.它修饰的对象有以下几种: 1. 修饰一个代码块,被修饰的代码块称为同步语句块,其作用的范围是大括号{}括起来的代码,作用的对象是调 ...

  10. DataTable转换成List<T>

    很多时候需要将DataTable转换成一组model,直接对model执行操作会更加方便直观. 代码如下: public static class DataTableToModel { public ...