data ------> knowledge

Are all patterns interesting?

No. only a small fraction of the patterns potentially generated would actually be of interest to a given user.

What makes a pattern interesting?

  • easily understood by humans
  • valid
  • potentially useful
  • novel
  • An interesting pattern represents knowledge.

Can a data mining system generate all of the interesting patterns?

It is often unrealistic and inefficient for data mining systems to generate all possible pattern.

1.7 Major issue in data mining

major issues:

  1. mining methodology
  2. user interaction
  3. efficiency and scalability可扩展性
  4. diversity of data types
  5. data mining and society

BK: Data mining的更多相关文章

  1. BK: Data mining: concepts and techniques (1)

    Chapter 1 data mining is knowledge discovery from data; The knowledge discovery process is an iterat ...

  2. BK: Data mining, Chapter 2 - getting to know your data

    Why: real-world data are typically noisy, enormous in volume, and may originate from a hodgepodge of ...

  3. Distributed Databases and Data Mining: Class timetable

    Course textbooks Text 1: M. T. Oszu and P. Valduriez, Principles of Distributed Database Systems, 2n ...

  4. What is the most common software of data mining? (整理中)

    What is the most common software of data mining? 1 Orange? 2 Weka? 3 Apache mahout? 4 Rapidminer? 5 ...

  5. What’s the difference between data mining and data warehousing?

    Data mining is the process of finding patterns in a given data set. These patterns can often provide ...

  6. A web crawler design for data mining

    Abstract The content of the web has increasingly become a focus for academic research. Computer prog ...

  7. Datasets for Data Mining and Data Science

    https://github.com/mattbane/RecommenderSystem http://grouplens.org/datasets/movielens/ KDDCUP-2012官网 ...

  8. cluster analysis in data mining

    https://en.wikipedia.org/wiki/K-means_clustering k-means clustering is a method of vector quantizati ...

  9. Weka 3: Data Mining Software in Java

    官方网站: Weka 3: Data Mining Software in Java 相关使用方法博客 WEKA使用教程(经典教程转载) (实例数据:bank-data.csv) Weka初步一.二. ...

随机推荐

  1. vue 路由过渡动效

    <router-view> 是基本的动态组件,所以我们可以用 <transition> 组件给它添加一些过渡效果: <transition name="slid ...

  2. MS14-068提权和impacket工具包提权

    ms14-068提权 工具利用 a)拿下边界机win7,并已经有win7上任意一个账号的密码 -u 用户名@域 -p 用户密码 -s 用户sid -d 域控 ms14-068.exe -u test3 ...

  3. Hash存储模型、B-Tree存储模型、LSM存储模型介绍

    每一种数据存储系统,对应有一种存储模型,或者叫存储引擎.我们今天要介绍的是三种比较流行的存储模型,分别是: Hash存储模型 B-Tree存储模型 LSM存储模型 不同存储模型的应用情况 1.Hash ...

  4. PHP0026:PHP 博客项目开发3

  5. cf912D

    题意简述:往n*m的网格中放k条鱼,一个网格最多放一条鱼,然后用一个r*r的网随机去捞鱼,问怎么怎么放鱼能使得捞鱼的期望最大,输出这个期望 题解:肯定优先往中间放,这里k不大,因此有别的简单方法,否则 ...

  6. kaks calculator批量计算多个基因的选择压力kaks值

    欢迎来到"bio生物信息"的世界 今天给大家带来"批量计算kaks值"的技能. 关于kaks的背景知识我就不介绍了,感兴趣的自行搜索,这里直接开始讲怎么批量计算 ...

  7. IntelliJ IDEA 更新

    一. 下载最新版本的idea 1. 官网下载,官网地址:http://www.jetbrains.com/idea/download/#section=windows 2. 百度网盘直接下载:http ...

  8. CentOS7防火墙设置常用命令

    目录 开/关/重启防火墙 查看所有开启的端口号 CentOS7环境下防火墙常用命令 开/关/重启防火墙 查看防火墙状态 firewall-cmd --state 启动防火墙 systemctl sta ...

  9. webkit 技术内幕 笔记 二

    浏览器历史 80年代末-90年代初:worldwideweb(nexus) -- Berners-Lee 1993: Mosaic浏览器,后来叫网景(Netscape)--Marc Andreesse ...

  10. oracle基础知识点

    一.count(*).count(1).count(字段名)的区别select count(*) from t_md_inst --153797 --包含字段为null 的记录select count ...