https://en.wikipedia.org/wiki/K-means_clustering

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

The problem is computationally difficult (NP-hard); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions via an iterative refinement approach employed by both algorithms. Additionally, they both use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.

The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier on the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as nearest centroid classifier or Rocchio algorithm[citation needed].

cluster analysis in data mining的更多相关文章

  1. Machine Learning and Data Mining(机器学习与数据挖掘)

    Problems[show] Classification Clustering Regression Anomaly detection Association rules Reinforcemen ...

  2. Cluster analysis

    https://en.wikipedia.org/wiki/Cluster_analysis Cluster analysis or clustering is the task of groupin ...

  3. Data Mining的十种分析方法——摘自《市场研究网络版》谢邦昌教授

    Data Mining的十种分析方法: 记忆基础推理法(Memory-Based Reasoning:MBR)        记忆基础推理法最主要的概念是用已知的案例(case)来预测未来案例的一些属 ...

  4. A web crawler design for data mining

    Abstract The content of the web has increasingly become a focus for academic research. Computer prog ...

  5. Weka 3: Data Mining Software in Java

    官方网站: Weka 3: Data Mining Software in Java 相关使用方法博客 WEKA使用教程(经典教程转载) (实例数据:bank-data.csv) Weka初步一.二. ...

  6. data mining,machine learning,AI,data science,data science,business analytics

    数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics ...

  7. 数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics)之间有什么关系?

    本来我以为不需要解释这个问题的,到底数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)有什么区别,但是前几天因为有个学弟问我,我想了想发现我竟然也回答 ...

  8. 论文翻译:Data mining with big data

    原文: Wu X, Zhu X, Wu G Q, et al. Data mining with big data[J]. IEEE transactions on knowledge and dat ...

  9. 18 Candidates for the Top 10 Algorithms in Data Mining

    Classification============== #1. C4.5 Quinlan, J. R. 1993. C4.5: Programs for Machine Learning.Morga ...

随机推荐

  1. node.js简单的页面输出

    在node.js基本上没有兼容问题(如果你不是从早期的node.js玩起来),而且原生对象又加了这么多扩展,再加上node.js自带的库,每个模块都提供了花样繁多的API,如果还嫌不够,github上 ...

  2. 关于服务器响应,浏览器请求的理解以及javaWeb项目的编码问题

    1.服务器(Server)响应,浏览器(Brower)请求: 对于B/S的软件,数据的传递体现在,用户利用浏览器请求,以获得服务器响应.在JavaWeb项目中,大致包含.java文件的数据处理模块,和 ...

  3. 微信绑定后台是验证token失败

    /AX/dapeng/VfanCms/Lib/ORG/ 在ORG文件夹中,找到Wechat.class.php文件,去掉解释,验证完后改回来!应该是为了防止后台被别人绑定了去.

  4. 利用Bundle在activity之间传递对象

    (2010-12-04 09:45:54) 转载▼ 标签: it 分类: android开发 转自:http://chen592969029.javaeye.com/blog/772656 假如需要在 ...

  5. spark streaming中使用checkpoint

    从官方的Programming Guides中看到的 我理解streaming中的checkpoint有两种,一种指的是metadata的checkpoint,用于恢复你的streaming:一种是r ...

  6. Seismic Unix的一些历史

    本文是我从官网上拷贝过来的,上国外网越来越慢了……(离题了). At the Society of Exploration Geophysicists (SEG) Annual Meeting in ...

  7. ember.js:使用笔记7 页面中插入效果

    在某些情况下,我们需要根据数据生成某些效果:由于每个模版的controller可能不同,在不同页面之间跳转可能会无法随即更新的问题. controller: 直接使用标签:{{}},适用于在子项目内切 ...

  8. XCOJ 1102 (树形DP+背包)

    题目链接: http://xcacm.hfut.edu.cn/oj/problem.php?id=1102 题目大意:树上取点.父亲出现了,其儿子包括孙子...都不能出现.给定预算,问最大值. 解题思 ...

  9. Ubuntu(Linux) 下 unzip 命令使用详解

    1.功能作用:解压缩zip文件 2.位置:/usr/bin/unzip 3.格式用法:unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist ...

  10. 通过console口连接交换机

    最近发现有人不会通过console口连接交换机. 想想当初我还是小白的时候也是如此啊,如是写下教程. 虽然略简单... 1.连线: console线:(Console---usb) 2.安装驱动 (可 ...