18 Candidates for the Top 10 Algorithms in Data Mining
Classification
==============
#1. C4.5
Quinlan, J. R. 1993. C4.5: Programs for Machine Learning.
Morgan Kaufmann Publishers Inc.
Google Scholar Count in October 2006: 6907
#2. CART
L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and
Regression Trees. Wadsworth, Belmont, CA, 1984.
Google Scholar Count in October 2006: 6078
#3. K Nearest Neighbours (kNN)
Hastie, T. and Tibshirani, R. 1996. Discriminant Adaptive Nearest
Neighbor Classification. IEEE Trans. Pattern
Anal. Mach. Intell. (TPAMI). 18, 6 (Jun. 1996), 607-616.
DOI= http://dx.doi.org/10.1109/34.506411
Google SCholar Count: 183
#4. Naive Bayes
Hand, D.J., Yu, K., 2001. Idiot's Bayes: Not So Stupid After All?
Internat. Statist. Rev. 69, 385-398.
Google Scholar Count in October 2006: 51
Statistical Learning
====================
#5. SVM
Vapnik, V. N. 1995. The Nature of Statistical Learning
Theory. Springer-Verlag New York, Inc.
Google Scholar Count in October 2006: 6441
#6. EM
McLachlan, G. and Peel, D. (2000). Finite Mixture Models.
J. Wiley, New York.
Google Scholar Count in October 2006: 848
Association Analysis
====================
#7. Apriori
Rakesh Agrawal and Ramakrishnan Srikant. Fast Algorithms for Mining
Association Rules. In Proc. of the 20th Int'l Conference on Very Large
Databases (VLDB '94), Santiago, Chile, September 1994.
http://citeseer.comp.nus.edu.sg/agrawal94fast.html
Google Scholar Count in October 2006: 3639
#8. FP-Tree
Han, J., Pei, J., and Yin, Y. 2000. Mining frequent patterns without
candidate generation. In Proceedings of the 2000 ACM SIGMOD
international Conference on Management of Data (Dallas, Texas, United
States, May 15 - 18, 2000). SIGMOD '00. ACM Press, New York, NY, 1-12.
DOI= http://doi.acm.org/10.1145/342009.335372
Google Scholar Count in October 2006: 1258
Link Mining
===========
#9. PageRank
Brin, S. and Page, L. 1998. The anatomy of a large-scale hypertextual
Web search engine. In Proceedings of the Seventh international
Conference on World Wide Web (WWW-7) (Brisbane,
Australia). P. H. Enslow and A. Ellis, Eds. Elsevier Science
Publishers B. V., Amsterdam, The Netherlands, 107-117.
DOI= http://dx.doi.org/10.1016/S0169-7552(98)00110-X
Google Shcolar Count: 2558
#10. HITS
Kleinberg, J. M. 1998. Authoritative sources in a hyperlinked
environment. In Proceedings of the Ninth Annual ACM-SIAM Symposium on
Discrete Algorithms (San Francisco, California, United States, January
25 - 27, 1998). Symposium on Discrete Algorithms. Society for
Industrial and Applied Mathematics, Philadelphia, PA, 668-677.
Google Shcolar Count: 2240
Clustering
==========
#11. K-Means
MacQueen, J. B., Some methods for classification and analysis of
multivariate observations, in Proc. 5th Berkeley Symp. Mathematical
Statistics and Probability, 1967, pp. 281-297.
Google Scholar Count in October 2006: 1579
#12. BIRCH
Zhang, T., Ramakrishnan, R., and Livny, M. 1996. BIRCH: an efficient
data clustering method for very large databases. In Proceedings of the
1996 ACM SIGMOD international Conference on Management of Data
(Montreal, Quebec, Canada, June 04 - 06, 1996). J. Widom, Ed.
SIGMOD '96. ACM Press, New York, NY, 103-114.
DOI= http://doi.acm.org/10.1145/233269.233324
Google Scholar Count in October 2006: 853
Bagging and Boosting
====================
#13. AdaBoost
Freund, Y. and Schapire, R. E. 1997. A decision-theoretic
generalization of on-line learning and an application to
boosting. J. Comput. Syst. Sci. 55, 1 (Aug. 1997), 119-139.
DOI= http://dx.doi.org/10.1006/jcss.1997.1504
Google Scholar Count in October 2006: 1576
Sequential Patterns
===================
#14. GSP
Srikant, R. and Agrawal, R. 1996. Mining Sequential Patterns:
Generalizations and Performance Improvements. In Proceedings of the
5th international Conference on Extending Database Technology:
Advances in Database Technology (March 25 - 29, 1996). P. M. Apers,
M. Bouzeghoub, and G. Gardarin, Eds. Lecture Notes In Computer
Science, vol. 1057. Springer-Verlag, London, 3-17.
Google Scholar Count in October 2006: 596
#15. PrefixSpan
J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal and
M-C. Hsu. PrefixSpan: Mining Sequential Patterns Efficiently by
Prefix-Projected Pattern Growth. In Proceedings of the 17th
international Conference on Data Engineering (April 02 - 06,
2001). ICDE '01. IEEE Computer Society, Washington, DC.
Google Scholar Count in October 2006: 248
Integrated Mining
=================
#16. CBA
Liu, B., Hsu, W. and Ma, Y. M. Integrating classification and
association rule mining. KDD-98, 1998, pp. 80-86.
http://citeseer.comp.nus.edu.sg/liu98integrating.html
Google Scholar Count in October 2006: 436
Rough Sets
==========
#17. Finding reduct
Zdzislaw Pawlak, Rough Sets: Theoretical Aspects of Reasoning about
Data, Kluwer Academic Publishers, Norwell, MA, 1992
Google Scholar Count in October 2006: 329
Graph Mining
============
#18. gSpan
Yan, X. and Han, J. 2002. gSpan: Graph-Based Substructure Pattern
Mining. In Proceedings of the 2002 IEEE International Conference on
Data Mining (ICDM '02) (December 09 - 12, 2002). IEEE Computer
Society, Washington, DC.
Google Scholar Count in October 2006: 155
18 Candidates for the Top 10 Algorithms in Data Mining的更多相关文章
- Top 10 Algorithms for Coding Interview--reference
By X Wang Update History:Web Version latest update: 4/6/2014PDF Version latest update: 1/16/2014 The ...
- Top 10 Algorithms of 20th and 21st Century
Top 10 Algorithms of 20th and 21st Century MATH 595 (Section TTA) Fall 2014 TR 2:00 pm - 3:20 pm, Ro ...
- 转:Top 10 Algorithms for Coding Interview
The following are top 10 algorithms related concepts in coding interview. I will try to illustrate t ...
- Favorites of top 10 rules for success
Dec. 31, 2015 Stayed up to last minute of 2015, 12:00am, watching a few of videos about top 10 rules ...
- [转]Top 10 DTrace scripts for Mac OS X
org link: http://dtrace.org/blogs/brendan/2011/10/10/top-10-dtrace-scripts-for-mac-os-x/ Top 10 DTra ...
- Top 10 Methods for Java Arrays
作者:X Wang 出处:http://www.programcreek.com/2013/09/top-10-methods-for-java-arrays/ 转载文章,转载请注明作者和出处 The ...
- Top 10 Universities for Artificial Intelligence
1. Massachusetts Institute of Technology, Cambridge, MA Massachusetts Institute of Technology is a p ...
- Top 10 Free Wireless Network hacking/monitoring tools for ethical hackers and businesses
There are lots of free tools available online to get easy access to the WiFi networks intended to he ...
- TOP 10开源的推荐系统简介
最近这两年推荐系统特别火,本文搜集整理了一些比较好的开源推荐系统,即有轻量级的适用于做研究的SVDFeature.LibMF.LibFM等,也有重量级的适用于工业系统的 Mahout.Oryx.Eas ...
随机推荐
- Selenium ? 也要学...!
一.selenium 简介 Selenium是ThroughtWorks公司一个强大的开源Web功能测试工具系列,包括Selenium-IDE.Selenium-RC.Selenium-Webdriv ...
- Unity接入九游SDK学习与踩坑
学习之路漫漫,应修之期远兮.持之以恒,方得始终. 这几日接入九游SDK,于浑浑噩噩中成长. 下面是步骤: 一:下载九游SDK 二:打开Android Studio新建一个工程,并且新建一个Androi ...
- NET Core 3.0中的WPF
在.NET Core 3.0中的WPF中使用IOC图文教程 我们都知道.NET Core 3.0已经发布了第六个预览版,我们也知道.NET Core 3.0现在已经支持创建WPF项目了,刚好今天在 ...
- 日常工作问题解决:rhel7下使用teamd配置双网卡绑定
目录 1.情景描述 2.准备工作 2.1 确认网卡信息 2.2 删除原有网卡配置信息 3.配置网卡绑定 3.1 配置千兆网卡双网卡热备用作心跳 3.2 配置网兆网卡双网卡负载均衡用作业务 1.情景描述 ...
- 表单绑定 v-model
<!DOCTYPE html><html lang="en"><head> <meta charset="UTF-8" ...
- Django mysql-client
sudo apt-get install libmysqlclient-dev 报错,mysqlclient 没有安装 :实际已经安装,这是因为mysql安装时没有安装好 https://www.cn ...
- 快速创建一个SpringBoot项目并整合Mybatis
2019-09-15 一.Maven环境搭建 1.导入jar坐标 <project xmlns="http://maven.apache.org/POM/4.0.0" xml ...
- gdb 常用命令总结(精优)
格式说明: [xxx]:可选参数,即可以指定可以不指定,实际输入的内容是 xxx <xxx>:占位参数,即必须指定的参数,实际输入的内容是 xxx gdb 常用命令: gdb [file] ...
- 第2章:LeetCode--第二部分
本部分是非Top的一些常见题型及不常见题 LeetCode -- Longest Palindromic Substring class Solution { public: int isPalind ...
- SQL概要与表的创建
SQL概要与表的创建 1.表的结构 关系数据库通过类似Excel 工作表那样的.由行和列组成的二维表来管理数据.用来管理数据的二维表在关系数据库中简称为表. 根据SQL 语句的内容返回的数据同 ...