Cluster analysis

https://en.wikipedia.org/wiki/Cluster_analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It is often necessary to modify data preprocessing and model parameters until the result achieves the desired properties.

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς "grape") and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification the resulting discriminative power is of interest.

Cluster analysis was originated in anthropology by Driver and Kroeber in 1932 and introduced to psychology by Zubin in 1938 and Robert Tryon in 1939^[1]^[2] and famously used by Cattell beginning in 1943^[3] for trait theory classification in personality psychology.

Cluster analysis的更多相关文章

cluster analysis in data mining
https://en.wikipedia.org/wiki/K-means_clustering k-means clustering is a method of vector quantizati ...
UVALive 6906 Cluster Analysis 并查集
Cluster Analysis 题目连接: https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemi ...
UVALive 6906 A - Cluster Analysis
思路:排个序,依次选就好了. #include <bits/stdc++.h> #define PB push_back #define MP make_pair using namesp ...
K-Means 聚类算法
K-Means 概念定义: K-Means 是一种基于距离的排他的聚类划分方法. 上面的 K-Means 描述中包含了几个概念: 聚类(Clustering):K-Means 是一种聚类分析(Clus ...
地理信息系统 - ArcGIS - 高/低聚类分析工具(High/Low Clustering ---Getis-Ord General G)
前段时间在学习空间统计相关的知识,于是把ArcGIS里Spatial Statistics工具箱里的工具好好研究了一遍,同时也整理了一些笔记上传分享.这一篇先聊一些基础概念,工具介绍篇随后上传. 空间 ...
K-means聚类算法
聚类分析(英语:Cluster analysis,亦称为群集分析) K-means也是聚类算法中最简单的一种了,但是里面包含的思想却是不一般.最早我使用并实现这个算法是在学习韩爷爷那本数据挖掘的书中, ...
Bioinformatics Glossary
原文:http://homepages.ulb.ac.be/~dgonze/TEACHING/bioinfo_glossary.html Affine gap costs: A scoring sys ...
(转) Deep Learning Research Review Week 2: Reinforcement Learning
Deep Learning Research Review Week 2: Reinforcement Learning 转载自: https://adeshpande3.github.io/ad ...
Spark入门实战系列--8.Spark MLlib（下）--机器学习库SparkMLlib实战
[注]该系列文章以及使用到安装包/测试数据可以在<倾情大奉送--Spark入门实战系列>获取 .MLlib实例 1.1 聚类实例 1.1.1 算法说明聚类(Cluster analys ...

随机推荐

在Linux中创建静态库.a和动态库.so
转自:http://www.cnblogs.com/laojie4321/archive/2012/03/28/2421056.html 在Linux中创建静态库.a和动态库.so 我们通常把一些公用 ...
java程序员必须会的技能
1.语法:必须比较熟悉,在写代码的时候IDE的编辑器对某一行报错应该能够根据报错信息知道是什么样的语法错误并且知道任何修正. 2.命令:必须熟悉JDK带的一些常用命令及其常用选项,命令至少需要熟悉:a ...
Apache服务器的下载与安装
关于PHP的运行环境搭载,网上文章繁杂,遂自己整理一篇! PHP的运行必然少不了服务器的支持,何为服务器?通俗讲就是在一台计算机上,安装个服务器软件,这台计算机便可以称之为服务器,服务器软件和计算机本 ...
spark streaming中使用checkpoint
从官方的Programming Guides中看到的我理解streaming中的checkpoint有两种,一种指的是metadata的checkpoint,用于恢复你的streaming:一种是r ...
HeapByteBuffer和DirectByteBuffer以及回收DirectByteBuffer
由于HeapByteBuffer和DirectByteBuffer类都是default类型的,所以你无法字节访问到,你只能通过ByteBuffer间接访问到它,因为JVM不想让你访问到它. 分配Hea ...
SU unif2命令学习
结合ItemsControl在Canvas中动态添加控件的最MVVM的方式
今天很开心的收获: ItemsControl 中 ItemsPanel的重定义和 ItemContainerStyle 以及 ItemTemplate 三者的巧妙结合,在后台代码不实例化任何控件的前提 ...
2016 Multi-University Training Contest 7
6/12 2016 Multi-University Training Contest 7 期望 B Balls and Boxes(BH) 题意: n个球放到m个盒子里,xi表示第i个盒子里的球的数 ...
cocos2d ccitemimage
#ifndef __HELLOWORLD_SCENE_H__ #define __HELLOWORLD_SCENE_H__ #include "cocos2d.h" class H ...
常用正则表达式(?i)忽略字母的大小写！
1.^/d+$ //匹配非负整数(正整数 + 0) 2.^[0-9]*[1-9][0-9]*$ //匹配正整数 3.^((-/d+)|(0+))$ //匹配非正整数(负整数 + 0) 4.^-[0-9 ...

Cluster analysis

Cluster analysis的更多相关文章

随机推荐

热门专题