cluster analysis in data mining
https://en.wikipedia.org/wiki/K-means_clustering
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.
The problem is computationally difficult (NP-hard); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions via an iterative refinement approach employed by both algorithms. Additionally, they both use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.
The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier on the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as nearest centroid classifier or Rocchio algorithm[citation needed].
cluster analysis in data mining的更多相关文章
- Machine Learning and Data Mining(机器学习与数据挖掘)
Problems[show] Classification Clustering Regression Anomaly detection Association rules Reinforcemen ...
- Cluster analysis
https://en.wikipedia.org/wiki/Cluster_analysis Cluster analysis or clustering is the task of groupin ...
- Data Mining的十种分析方法——摘自《市场研究网络版》谢邦昌教授
Data Mining的十种分析方法: 记忆基础推理法(Memory-Based Reasoning:MBR) 记忆基础推理法最主要的概念是用已知的案例(case)来预测未来案例的一些属 ...
- A web crawler design for data mining
Abstract The content of the web has increasingly become a focus for academic research. Computer prog ...
- Weka 3: Data Mining Software in Java
官方网站: Weka 3: Data Mining Software in Java 相关使用方法博客 WEKA使用教程(经典教程转载) (实例数据:bank-data.csv) Weka初步一.二. ...
- data mining,machine learning,AI,data science,data science,business analytics
数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics ...
- 数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)的区别是什么? 数据科学(data science)和商业分析(business analytics)之间有什么关系?
本来我以为不需要解释这个问题的,到底数据挖掘(data mining),机器学习(machine learning),和人工智能(AI)有什么区别,但是前几天因为有个学弟问我,我想了想发现我竟然也回答 ...
- 论文翻译:Data mining with big data
原文: Wu X, Zhu X, Wu G Q, et al. Data mining with big data[J]. IEEE transactions on knowledge and dat ...
- 18 Candidates for the Top 10 Algorithms in Data Mining
Classification============== #1. C4.5 Quinlan, J. R. 1993. C4.5: Programs for Machine Learning.Morga ...
随机推荐
- J2EE面试题集锦(附答案)
转自:http://blog.sina.com.cn/s/blog_4e8be0590100fbb8.html J2EE面试题集锦(附答案)一.基础问答 1.下面哪些类可以被继承? java.lang ...
- Android判断App是否在前台运行(转)
原文地址: http://blog.csdn.net/zuolongsnail/article/details/8168689 Android开发中,有时候需要判断App是否在前台运行. 代码实现如下 ...
- matlab练习程序(最小包围矩形)
又是计算几何,我感觉最近对计算几何上瘾了. 当然,工作上也会用一些,不过工作上一般直接调用boost的geometry库. 上次写过最小包围圆,这次是最小包围矩形,要比最小包围圆复杂些. 最小包围矩形 ...
- Android WebView访问SSL证书网页(onReceivedSslError)
Android WebView访问https SSL证书网页,如淘宝,需要在onReceivedSslError添加SSL支持 webview.setWebViewClient(new WebView ...
- 理解Null,Undefined,NAN
1.null表示尚未存在的对象,转为数值时为0.它表示"没有对象",即该处不应该有值,常用来表示函数企图返回一个不存在的对象.null是一种特殊的object(引用类型),代表一个 ...
- POJ1155 TELE(树形DP)
题目是说给一棵树,叶子结点有负权,边有正权,问最多能选多少个叶子结点,使从叶子到根的权值和小于等于0. 考虑数据规模表示出状态:dp[u][k]表示在u结点为根的子树中选择k个叶子结点的最小权值 最后 ...
- POJ3057 Evacuation(二分图最大匹配)
人作X部:把门按时间拆点,作Y部:如果某人能在某个时间到达某门则连边.就是个二分图最大匹配. 时间可以二分枚举,或者直接从1枚举时间然后加新边在原来的基础上进行增广. 谨记:时间是个不可忽视的维度. ...
- BZOJ2062 : 素颜2(face2)
写个cmp然后sort就好了. cmp的话,需要快速知道两个串的lcp,于是倍增+Hash即可. #include<cstdio> #include<algorithm> ty ...
- javascript生成n至m的随机整数
摘要: 本文讲解如何使用js生成n到m间的随机数字,主要目的是为后期的js生成验证码做准备. Math.random()函数返回0和1之间的伪随机数,可能为0,但总是小于1,[0,1) 生成n-m,包 ...
- TYVJ P1038/P1039 忠诚 标签:线段树
做题记录:2016-08-12 16:30:14 //P1038 描述 老管家是一个聪明能干的人.他为财主工作了整整10年,财主为了让自已账目更加清楚.要求管家每天记k次账,由于管家聪明能干,因而管家 ...