https://en.wikipedia.org/wiki/K-means_clustering

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

The problem is computationally difficult (NP-hard); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions via an iterative refinement approach employed by both algorithms. Additionally, they both use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.

The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier on the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as nearest centroid classifier or Rocchio algorithm[citation needed].

cluster analysis in data mining的更多相关文章

  1. Machine Learning and Data Mining(机器学习与数据挖掘)

    Problems[show] Classification Clustering Regression Anomaly detection Association rules Reinforcemen ...

  2. Cluster analysis

    https://en.wikipedia.org/wiki/Cluster_analysis Cluster analysis or clustering is the task of groupin ...

  3. Data Mining的十种分析方法——摘自《市场研究网络版》谢邦昌教授

    Data Mining的十种分析方法: 记忆基础推理法(Memory-Based Reasoning:MBR)        记忆基础推理法最主要的概念是用已知的案例(case)来预测未来案例的一些属 ...

  4. A web crawler design for data mining

    Abstract The content of the web has increasingly become a focus for academic research. Computer prog ...

  5. Weka 3: Data Mining Software in Java

    官方网站: Weka 3: Data Mining Software in Java 相关使用方法博客 WEKA使用教程(经典教程转载) (实例数据:bank-data.csv) Weka初步一.二. ...

  6. data mining,machine learning,AI,data science,data science,business analytics

    数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics ...

  7. 数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics)之间有什么关系?

    本来我以为不需要解释这个问题的,到底数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)有什么区别,但是前几天因为有个学弟问我,我想了想发现我竟然也回答 ...

  8. 论文翻译:Data mining with big data

    原文: Wu X, Zhu X, Wu G Q, et al. Data mining with big data[J]. IEEE transactions on knowledge and dat ...

  9. 18 Candidates for the Top 10 Algorithms in Data Mining

    Classification============== #1. C4.5 Quinlan, J. R. 1993. C4.5: Programs for Machine Learning.Morga ...

随机推荐

  1. Laravel 4 系列入门教程(一)

    默认条件 本文默认你已经有配置完善的PHP+MySQL运行环境,懂得PHP网站运行的基础知识.跟随本教程走完一遍,你将会得到一个基础的包含登录的简单blog系统,并将学会如何使用一些强大的Larave ...

  2. hdu 4021 n数码

    好题,6666 转自:http://www.cnblogs.com/kuangbin/archive/2012/08/23/2652410.html 题意:给出一个board,上面有24个位置,其中2 ...

  3. vb.net三层实现登录例子

    看三层已经很长时间了,中间有经过了期末考试.回家等等琐事,寒假开学的我已经回想不起什么事三层了,经过了三四天的重新复习,再加上查看各期师哥师姐的博客,终于,自己完成了C#视频中的登录小例子,下面就和大 ...

  4. 导入/导出Excel

    --从Excel文件中,导入数据到SQL数据库中,很简单,直接用下面的语句:/*============================================================ ...

  5. 【HTML5】Web Workers

    什么是 Web Worker? 当在 HTML 页面中执行脚本时,页面的状态是不可响应的,直到脚本已完成. web worker 是运行在后台的 JavaScript,独立于其他脚本,不会影响页面的性 ...

  6. Kali Linux 2016.2初体验使用总结

    Kali Linux 2016.2初体验使用总结 Kali Linux官方于8月30日发布Kali Linux 2016的第二个版本Kali Linux 2016.2.该版本距离Kali Linux  ...

  7. Encoding

    Problem Description Given a string containing only 'A' - 'Z', we could encode it using the following ...

  8. LightOJ1064 Throwing Dice(DP)

    第一眼以为是概率DP,我还不会.不过看题目那么短就读读,其实这应该还不是概率DP,只是个水水的DP.. dp[n][s]表示掷n次骰子点数和为s的情况数 dp[0][0]=1 dp[i][j]=∑dp ...

  9. HDU 1429 (BFS+记忆化状压搜索)

    题目链接: http://acm.hdu.edu.cn/showproblem.php?pid=1429 题目大意:最短时间内出迷宫,可以走回头路,迷宫内有不同的门,对应不同的钥匙. 解题思路: 要是 ...

  10. 【BZOJ】1054: [HAOI2008]移动玩具(bfs+hash)

    http://www.lydsy.com/JudgeOnline/problem.php?id=1054 一开始我还以为要双向广搜....但是很水的数据,不需要了. 直接bfs+hash判重即可. # ...