18 Candidates for the Top 10 Algorithms in Data Mining
Classification
==============
#1. C4.5
Quinlan, J. R. 1993. C4.5: Programs for Machine Learning.
Morgan Kaufmann Publishers Inc.
Google Scholar Count in October 2006: 6907
#2. CART
L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and
Regression Trees. Wadsworth, Belmont, CA, 1984.
Google Scholar Count in October 2006: 6078
#3. K Nearest Neighbours (kNN)
Hastie, T. and Tibshirani, R. 1996. Discriminant Adaptive Nearest
Neighbor Classification. IEEE Trans. Pattern
Anal. Mach. Intell. (TPAMI). 18, 6 (Jun. 1996), 607-616.
DOI= http://dx.doi.org/10.1109/34.506411
Google SCholar Count: 183
#4. Naive Bayes
Hand, D.J., Yu, K., 2001. Idiot's Bayes: Not So Stupid After All?
Internat. Statist. Rev. 69, 385-398.
Google Scholar Count in October 2006: 51
Statistical Learning
====================
#5. SVM
Vapnik, V. N. 1995. The Nature of Statistical Learning
Theory. Springer-Verlag New York, Inc.
Google Scholar Count in October 2006: 6441
#6. EM
McLachlan, G. and Peel, D. (2000). Finite Mixture Models.
J. Wiley, New York.
Google Scholar Count in October 2006: 848
Association Analysis
====================
#7. Apriori
Rakesh Agrawal and Ramakrishnan Srikant. Fast Algorithms for Mining
Association Rules. In Proc. of the 20th Int'l Conference on Very Large
Databases (VLDB '94), Santiago, Chile, September 1994.
http://citeseer.comp.nus.edu.sg/agrawal94fast.html
Google Scholar Count in October 2006: 3639
#8. FP-Tree
Han, J., Pei, J., and Yin, Y. 2000. Mining frequent patterns without
candidate generation. In Proceedings of the 2000 ACM SIGMOD
international Conference on Management of Data (Dallas, Texas, United
States, May 15 - 18, 2000). SIGMOD '00. ACM Press, New York, NY, 1-12.
DOI= http://doi.acm.org/10.1145/342009.335372
Google Scholar Count in October 2006: 1258
Link Mining
===========
#9. PageRank
Brin, S. and Page, L. 1998. The anatomy of a large-scale hypertextual
Web search engine. In Proceedings of the Seventh international
Conference on World Wide Web (WWW-7) (Brisbane,
Australia). P. H. Enslow and A. Ellis, Eds. Elsevier Science
Publishers B. V., Amsterdam, The Netherlands, 107-117.
DOI= http://dx.doi.org/10.1016/S0169-7552(98)00110-X
Google Shcolar Count: 2558
#10. HITS
Kleinberg, J. M. 1998. Authoritative sources in a hyperlinked
environment. In Proceedings of the Ninth Annual ACM-SIAM Symposium on
Discrete Algorithms (San Francisco, California, United States, January
25 - 27, 1998). Symposium on Discrete Algorithms. Society for
Industrial and Applied Mathematics, Philadelphia, PA, 668-677.
Google Shcolar Count: 2240
Clustering
==========
#11. K-Means
MacQueen, J. B., Some methods for classification and analysis of
multivariate observations, in Proc. 5th Berkeley Symp. Mathematical
Statistics and Probability, 1967, pp. 281-297.
Google Scholar Count in October 2006: 1579
#12. BIRCH
Zhang, T., Ramakrishnan, R., and Livny, M. 1996. BIRCH: an efficient
data clustering method for very large databases. In Proceedings of the
1996 ACM SIGMOD international Conference on Management of Data
(Montreal, Quebec, Canada, June 04 - 06, 1996). J. Widom, Ed.
SIGMOD '96. ACM Press, New York, NY, 103-114.
DOI= http://doi.acm.org/10.1145/233269.233324
Google Scholar Count in October 2006: 853
Bagging and Boosting
====================
#13. AdaBoost
Freund, Y. and Schapire, R. E. 1997. A decision-theoretic
generalization of on-line learning and an application to
boosting. J. Comput. Syst. Sci. 55, 1 (Aug. 1997), 119-139.
DOI= http://dx.doi.org/10.1006/jcss.1997.1504
Google Scholar Count in October 2006: 1576
Sequential Patterns
===================
#14. GSP
Srikant, R. and Agrawal, R. 1996. Mining Sequential Patterns:
Generalizations and Performance Improvements. In Proceedings of the
5th international Conference on Extending Database Technology:
Advances in Database Technology (March 25 - 29, 1996). P. M. Apers,
M. Bouzeghoub, and G. Gardarin, Eds. Lecture Notes In Computer
Science, vol. 1057. Springer-Verlag, London, 3-17.
Google Scholar Count in October 2006: 596
#15. PrefixSpan
J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal and
M-C. Hsu. PrefixSpan: Mining Sequential Patterns Efficiently by
Prefix-Projected Pattern Growth. In Proceedings of the 17th
international Conference on Data Engineering (April 02 - 06,
2001). ICDE '01. IEEE Computer Society, Washington, DC.
Google Scholar Count in October 2006: 248
Integrated Mining
=================
#16. CBA
Liu, B., Hsu, W. and Ma, Y. M. Integrating classification and
association rule mining. KDD-98, 1998, pp. 80-86.
http://citeseer.comp.nus.edu.sg/liu98integrating.html
Google Scholar Count in October 2006: 436
Rough Sets
==========
#17. Finding reduct
Zdzislaw Pawlak, Rough Sets: Theoretical Aspects of Reasoning about
Data, Kluwer Academic Publishers, Norwell, MA, 1992
Google Scholar Count in October 2006: 329
Graph Mining
============
#18. gSpan
Yan, X. and Han, J. 2002. gSpan: Graph-Based Substructure Pattern
Mining. In Proceedings of the 2002 IEEE International Conference on
Data Mining (ICDM '02) (December 09 - 12, 2002). IEEE Computer
Society, Washington, DC.
Google Scholar Count in October 2006: 155
18 Candidates for the Top 10 Algorithms in Data Mining的更多相关文章
- Top 10 Algorithms for Coding Interview--reference
By X Wang Update History:Web Version latest update: 4/6/2014PDF Version latest update: 1/16/2014 The ...
- Top 10 Algorithms of 20th and 21st Century
Top 10 Algorithms of 20th and 21st Century MATH 595 (Section TTA) Fall 2014 TR 2:00 pm - 3:20 pm, Ro ...
- 转:Top 10 Algorithms for Coding Interview
The following are top 10 algorithms related concepts in coding interview. I will try to illustrate t ...
- Favorites of top 10 rules for success
Dec. 31, 2015 Stayed up to last minute of 2015, 12:00am, watching a few of videos about top 10 rules ...
- [转]Top 10 DTrace scripts for Mac OS X
org link: http://dtrace.org/blogs/brendan/2011/10/10/top-10-dtrace-scripts-for-mac-os-x/ Top 10 DTra ...
- Top 10 Methods for Java Arrays
作者:X Wang 出处:http://www.programcreek.com/2013/09/top-10-methods-for-java-arrays/ 转载文章,转载请注明作者和出处 The ...
- Top 10 Universities for Artificial Intelligence
1. Massachusetts Institute of Technology, Cambridge, MA Massachusetts Institute of Technology is a p ...
- Top 10 Free Wireless Network hacking/monitoring tools for ethical hackers and businesses
There are lots of free tools available online to get easy access to the WiFi networks intended to he ...
- TOP 10开源的推荐系统简介
最近这两年推荐系统特别火,本文搜集整理了一些比较好的开源推荐系统,即有轻量级的适用于做研究的SVDFeature.LibMF.LibFM等,也有重量级的适用于工业系统的 Mahout.Oryx.Eas ...
随机推荐
- 如何区分浏览器发起的是基于http/1.x还是http/2的请求?
前言 随着2015年http2.0被推出以来,主流的现代浏览器大多都开始慢慢去实现这个协议,那么如果查看自己的浏览器是否支持发送http2.0的请求,或者如何查看浏览器发送的请求是基于哪一个 ...
- CEIWEI USBMonitor监控驱动 OCX/SDK USB 监控精灵 USB过滤驱动
CEIWEI USBMonitor监控精灵软件SDK USBMonitorX.dll SDK,能够嵌入到你的App程序中,从而在你的App中实现USB端口协议分析.调试USB设备的协议信息,并可以拦截 ...
- mongdb学习
1.启动mongdb 服务(注启动之前把data 文件夹清空) 2.开发和jpa一样 只是extendMongoRepository 3.实体类只有id注解,属性全为String
- Selenium绕过登录的实现
1.使用命令行启动Chrome:Mac:/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome -remote-debugging ...
- 腾讯云+阿里云 搭建hadoop + hbase
目录 服务器配置 hadoop hbase JAVA测试 历时两天,踩了无数坑最后搭建成功... 准备 两台服务器都安装jdk1.8(最好装在相同路径). hadoop 下载 hbase 下载 这里使 ...
- [数据结构] - ArrayList探究
一 概述 ArrayList可以理解为动态数组,与java的数组相比,它的容量能动态曾长,ArrayList是List接口的可变数组的实现,允许包括null值在内的所有元素.除了实现List接口外,此 ...
- [DEBUG] Springboot打包jar/war后访问包外的路径
======================================================== 我是Ruriko,我爱这个世界:)
- 题解 luoguP3554 【[POI2013]LUK-Triumphal arch】
代码的关键部分 inline void dfs(int u,int fa) { ; for(int i=first[u]; i; i=nxt[i]) { int v=go[i]; if(v==fa)c ...
- VMware安装windows2003
一.安装vm 这一项大家应该都会,网上也有很多教程. 二.搭建Windows server 2003 1.镜像下载- 2.虚拟机安装 首先是新建虚拟机,我选的是自定义,也可以选典型 第一步默认下一步, ...
- Mybatis-Plus myBatis的增强工具
1. Mybatis-Plus简介 1.1. 什么是Mybatis-Plus MyBatis-Plus(简称 MP)是一个 MyBatis 的增强工具,在 MyBatis 的基础上只做增强不做改变,为 ...