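CLIQUE (Clustering In QUEst) is a simple grid-based clustering method for discovering density-based clusters in subspaces. CLIQUE partitions each dimension into non-overlapping intervals, thereby partitioning the entire embedding space of the data objects into cells. It uses a density threshold to separate dense cells from sparse ones: a cell is dense if the number of objects mapped to it exceeds the density threshold. CLIQUE's main strategy for identifying the candidate search space is to exploit the monotonicity of dense cells with respect to dimensionality. This is based on the Apriori property used in frequent-pattern and association-rule mining. In the context of subspace clustering, monotonicity states the following: a k-dimensional cell c (k > 1) can have at least l points only if every (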
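The grid partitioning, density test, and Apriori-style pruning described above can be sketched roughly as follows. This is a minimal illustration, not the original CLIQUE implementation: the function name `dense_units`, the parameters `xi` (intervals per dimension) and `tau` (density threshold), and the assumption that every coordinate lies in [0, 1) are all hypothetical choices made for the example. It also stops at finding dense cells and does not merge adjacent dense cells into clusters.

```python
from collections import Counter
from itertools import combinations


def dense_units(points, xi, tau):
    """Return {k: set of dense k-dimensional cells}.

    A cell is a tuple of (dimension, interval_index) pairs sorted by dimension.
    Assumption for this sketch: every coordinate lies in [0, 1).
    """
    n_dims = len(points[0])
    # Map each point to its interval index in every dimension.
    grid = [tuple(min(int(x * xi), xi - 1) for x in p) for p in points]

    def keep_dense(candidates):
        counts = Counter()
        for g in grid:
            for cell in candidates:
                if all(g[d] == i for d, i in cell):
                    counts[cell] += 1
        # Dense means the count exceeds the threshold.
        return {cell for cell, c in counts.items() if c > tau}

    # Level 1: every interval of every single dimension is a candidate.
    level = keep_dense({((d, i),) for d in range(n_dims) for i in range(xi)})
    result = {}
    k = 1
    while level:
        result[k] = level
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            merged = tuple(sorted(set(a) | set(b)))
            distinct_dims = len({d for d, _ in merged})
            if len(merged) != k or distinct_dims != k:
                continue
            # Apriori-style pruning: keep the candidate only if each of its
            # (k-1)-dimensional projections was already found dense.
            if all(tuple(x for x in merged if x != drop) in level
                   for drop in merged):
                candidates.add(merged)
        level = keep_dense(candidates)
    return result


if __name__ == "__main__":
    pts = [(0.10, 0.10), (0.15, 0.12), (0.12, 0.18), (0.18, 0.11),
           (0.90, 0.90), (0.95, 0.20), (0.50, 0.95)]
    for dim, cells in sorted(dense_units(pts, xi=4, tau=2).items()):
        print(dim, sorted(cells))
```

The pruning step is what keeps the search tractable: a k-dimensional candidate is counted only when all of its lower-dimensional projections are already known to be dense, so most of the exponentially many subspaces are never examined.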
http://www.onjava.com/pub/a/onjava/2001/05/30/optimization.html Comparing the performance of LinkedLists and ArrayLists (and Vectors) (Page last updated May 2001, Added 2001-06-18, Author Jack Shirazi, Publisher OnJava). Tips: ArrayList is faster than
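A Java resource roundup from the Awesome series. awesome-java is a list of Java resources started and maintained by akullpp, covering build tools, databases, frameworks, templates, security, code analysis, logging, third-party libraries, books, Java sites, and more. Classic tools and libraries (Ancients): In existence since the beginning of time and which will continue being used long after the hype has waned. Apache Ant - Build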
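Data work has two directions: one leans toward computer science, the other toward economics. Since you have studied Java, you can lean toward the computer science side. Foundations: 1. Reading. "Introduction to Data Mining" is easy to follow, has no complicated advanced formulas, and suits beginners. You can also use "Data Mining: Concepts and Techniques" as a reference; this second book is thicker and adds some material on data warehousing. If you enjoy algorithms, you can go on to read "Introduction to Machine Learning". Of course, there is also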
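http://www.cnblogs.com/zhangchaoyang/articles/2200800.html http://blog.csdn.net/qll125596718/article/details/6895291 BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies) was designed from the start for very large data sets (large enough, at least, that they do not fit in your memory), and it can run within any given amount of memory. I will not go into BIRCH's other features for now; I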
Build Tool: Tools which handle the build cycle of an application. Apache Maven - Declarative build and dependency management which favors convention over configuration. It's preferable to Apache Ant which uses a rather procedural approach and can be di
Tomcat Clustering - A Step By Step Guide Apache Tomcat is a great performer on its own, but if you're expecting more traffic as your site expands, or are thinking about the best way to provide high availability, you'll be happy to know that Tomcat al
Using MLLib in Scala: Following code snippets can be executed in spark-shell. Binary Classification: The following code snippet illustrates how to load a sample dataset, execute a training algorithm on this training data using a static method in the algo
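Reposted from: http://www.cnblogs.com/vivounicorn/archive/2011/09/23/2186483.html Learning Mahout: Canopy Clustering. Clustering is an important family of methods in machine learning. The basic principle is to place objects with "similar properties" in the same cluster as far as possible, while keeping objects in different clusters as dissimilar as possible. (This raises the question of the similarity criterion, for example similarity based on a probability distribution model versus similarity based on distance.) For clustering algorithms there are three big mountains to climb: (1) a large number of cl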