http://blog.csdn.net/pipisorry/article/details/48894963

海量数据挖掘Mining Massive Datasets(MMDs) -Jure Leskovec courses学习笔记之Nearest-Neighbor Learning,KNN最近邻学习

{The module is about large scale machine learning.}

Supervised Learning监督学习

Note: y有多种不同的形式,对应不同的问题。如为实数时,属于回归问题。

下面我们主要讲解分类问题

大规模机器学习方法

how do we efficiently train?Or build a model based on the based on the data?

So in a sense the main question is how do I find this function f.That takes the input features and predicts the class variable.

皮皮blog

Instance based learning基于实例的学习

最近邻分类器Nearest nerghbor

最近邻分类器要考虑的问题

Note: 最后一个要考虑的问题就是:How to take all these nearest neighbors and combine their values into a single point that I can use as prediction.

1-Nearest Nerghbor

1-Nearest nerghbor的重大缺陷:预测值附近变化大,用一个值来预测不准确。the method is suffering from It is making lots of very spiky, or sharp decisions, because we are only looking at the one nearest neighbor.

K-Nearest Nerghbor

Note: f(x) is much smoother than what is was before.

Kernel Regression核回归

皮皮blog

寻找最近邻的方法

一般扫描数据点方法的时间复杂度:线性时间

solution would require a linear pass over the data, so it would take linear time.

使用LSH的时间复杂度:常数时间(可用于大规模数据)

using locality sensitive hashing, we could find, nearest neighbors in near constant time.So that would be a good way how to really make nearest neighbor classifiers scale to large scale data.

具体是怎么实现的?

from:http://blog.csdn.net/pipisorry/article/details/48894963

ref:论文:GPU上的K近邻并行暴力搜索Brute-Force k-Nearest Neighbors Search on the GPU

海量数据挖掘MMDS week2: Nearest-Neighbor Learning最近邻学习的更多相关文章

  1. 海量数据挖掘MMDS week2: 局部敏感哈希Locality-Sensitive Hashing, LSH

    http://blog.csdn.net/pipisorry/article/details/48858661 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  2. 海量数据挖掘MMDS week2: 频繁项集挖掘 Apriori算法的改进:非hash方法

    http://blog.csdn.net/pipisorry/article/details/48914067 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  3. 海量数据挖掘MMDS week2: 频繁项集挖掘 Apriori算法的改进:基于hash的方法

    http://blog.csdn.net/pipisorry/article/details/48901217 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  4. 海量数据挖掘MMDS week2: Association Rules关联规则与频繁项集挖掘

    http://blog.csdn.net/pipisorry/article/details/48894977 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  5. 海量数据挖掘MMDS week2: LSH的距离度量方法

    http://blog.csdn.net/pipisorry/article/details/48882167 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  6. 海量数据挖掘MMDS week7: 局部敏感哈希LSH(进阶)

    http://blog.csdn.net/pipisorry/article/details/49686913 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  7. 海量数据挖掘MMDS week3:社交网络之社区检测:高级技巧

    http://blog.csdn.net/pipisorry/article/details/49052255 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  8. 海量数据挖掘MMDS week6: 支持向量机Support-Vector Machines,SVM

    http://blog.csdn.net/pipisorry/article/details/49445387 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

  9. 海量数据挖掘MMDS week5: 聚类clustering

    http://blog.csdn.net/pipisorry/article/details/49427989 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

随机推荐

  1. 31. Next Permutation(中等,搞清楚啥是 next permutation)

    Implement next permutation, which rearranges numbers into the lexicographically next greater permuta ...

  2. Oracle中查询和删除相同记录的3种方法

    --创建测试表 )); ,'); ,'); ,'); ,'); ,'); ,'); commit; select * from test; --查询相同记录 ); select id,name fro ...

  3. 彻底理解Oracle中的集合操作与复合查询

    --Oracle中的复合查询 复合查询:包含集合运算(操作)的查询 常见的集合操作有: union: 两个查询的并集(无重复行.按第一个查询的第一列升序排序) union all:两个查询的并集(有重 ...

  4. Docker Kubernetes 项目

    Kubernetes 是 Google 团队发起并维护的基于Docker的开源容器集群管理系统,它不仅支持常见的云平台,而且支持内部数据中心. 建于Docker之上的Kubernetes可以构建一个容 ...

  5. 分布式改造剧集2---DIY分布式锁

    前言: ​ 好了,终于又开始播放分布式改造剧集了.前面一集中(http://www.cnblogs.com/Kidezyq/p/8748961.html)我们DIY了一个Hessian转发实现,最后我 ...

  6. activiti源码分析

    http://blog.csdn.net/vote/candidate.html?username=qq_30739519 欢迎大家投票吧谢谢

  7. Python 2.7 闭包的局限

    想法源自:http://stackoverflow.com/questions/141642/what-limitations-have-closures-in-python-compared-to- ...

  8. Python 继承标准类时发生了什么

    定义标准类dict的一个子类c: >>> class c(dict): pass >>> y=c({1:2,3:4}) >>> y {1: 2, ...

  9. Scikit-learn:scikit-learn快速教程及实例

    http://blog.csdn.net/pipisorry/article/details/52251305 :7] y = dataset[:,8] 我们将在下面所有的例子里使用这个数据组,换言之 ...

  10. JAVA面向对象-----内部类的概述

    JAVA面向对象-–内部类的概述s 将类定义在另一个类的内部则成为内部类.其实就是类定义的位置发生了变化. 在一个类中,定义在类中的叫成员变量,定义在函数中的叫成员函数,那么根据类定义的位置也可以分为 ...