cluster analysis in data mining
https://en.wikipedia.org/wiki/K-means_clustering
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. It aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as the prototype of that cluster. The result is a partitioning of the data space into Voronoi cells.
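The goal described above is commonly written as minimizing the within-cluster sum of squares. The symbols below are notation assumed here for illustration: x is an observation, S = {S_1, ..., S_k} is the partition, and \mu_i is the mean of the points in S_i.

$$
\underset{S}{\arg\min} \; \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2,
\qquad
\mu_i = \frac{1}{\lvert S_i \rvert} \sum_{x \in S_i} x
$$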
The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms are commonly employed and converge quickly to a local optimum. These heuristics resemble the expectation-maximization (EM) algorithm for mixtures of Gaussian distributions: both proceed by iterative refinement, and both use cluster centers to model the data. The difference is that k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.
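One such heuristic is the standard iteration often called Lloyd's algorithm, which alternates an assignment step and an update step until the assignments stop changing. The following is a minimal sketch in plain Python, not a reference implementation. The function names, the random initialization from the data points, and the choice to leave an empty cluster's center unchanged are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Assignment step: index of the nearest center for each point."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def update(points, labels, centers):
    """Update step: move each center to the mean of its assigned points."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(old)
            continue
        new_centers.append(
            tuple(sum(p[d] for p in members) / len(members) for d in range(len(old)))
        )
    return new_centers


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Alternate assignment and update until the labels stop changing."""
    rng = random.Random(seed)
    # Assumption: initialize the centers with k distinct data points.
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break  # converged to a local optimum
        labels = new_labels
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centers, labels = lloyd_kmeans(data, k=2)
    print(centers, labels)
```

The result depends on the initial centers, which is why the procedure only guarantees a local optimum; in practice it is often restarted from several initializations.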
The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier to the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as the nearest centroid classifier or Rocchio algorithm.
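Classifying a new point against already computed cluster centers can be sketched as follows. The function name `nearest_centroid` and the example centers are hypothetical, chosen only to illustrate the idea; the centers are assumed to come from an earlier k-means run.

```python
def nearest_centroid(point, centers):
    """Return the index of the center closest to `point` (1-nearest neighbor on the centers)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((x - y) ** 2 for x, y in zip(point, centers[j])),
    )


if __name__ == "__main__":
    # Hypothetical centers, assumed to be the output of a previous k-means run.
    centers = [(0.0, 0.5), (10.0, 10.5)]
    print(nearest_centroid((1.0, 1.0), centers))
    print(nearest_centroid((9.0, 9.0), centers))
```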