海量数据挖掘MMDS week2: Nearest-Neighbor Learning最近邻学习

http://blog.csdn.net/pipisorry/article/details/48894963

海量数据挖掘Mining Massive Datasets(MMDs) -Jure Leskovec courses学习笔记之Nearest-Neighbor Learning，KNN最近邻学习

{The module is about large scale machine learning.}

Supervised Learning监督学习

Note: y有多种不同的形式，对应不同的问题。如为实数时，属于回归问题。

下面我们主要讲解分类问题

大规模机器学习方法

how do we efficiently train?Or build a model based on the based on the data?

So in a sense the main question is how do I find this function f.That takes the input features and predicts the class variable.

皮皮blog

Instance based learning基于实例的学习

最近邻分类器Nearest nerghbor

最近邻分类器要考虑的问题

Note: 最后一个要考虑的问题就是：How to take all these nearest neighbors and combine their values into a single point that I can use as prediction.

1-Nearest Nerghbor

1-Nearest nerghbor的重大缺陷：预测值附近变化大，用一个值来预测不准确。the method is suffering from It is making lots of very spiky, or sharp decisions, because we are only looking at the one nearest neighbor.

K-Nearest Nerghbor

Note: f(x) is much smoother than what is was before.

Kernel Regression核回归

皮皮blog

寻找最近邻的方法

一般扫描数据点方法的时间复杂度：线性时间

solution would require a linear pass over the data, so it would take linear time.

使用LSH的时间复杂度：常数时间（可用于大规模数据）

using locality sensitive hashing, we could find, nearest neighbors in near constant time.So that would be a good way how to really make nearest neighbor classifiers scale to large scale data.

具体是怎么实现的？

from:http://blog.csdn.net/pipisorry/article/details/48894963

ref:论文:GPU上的K近邻并行暴力搜索Brute-Force k-Nearest Neighbors Search on the GPU

海量数据挖掘MMDS week2: Nearest-Neighbor Learning最近邻学习的更多相关文章

海量数据挖掘MMDS week2: 局部敏感哈希Locality-Sensitive Hashing, LSH
http://blog.csdn.net/pipisorry/article/details/48858661 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week2: 频繁项集挖掘 Apriori算法的改进：非hash方法
http://blog.csdn.net/pipisorry/article/details/48914067 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week2: 频繁项集挖掘 Apriori算法的改进：基于hash的方法
http://blog.csdn.net/pipisorry/article/details/48901217 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week2: Association Rules关联规则与频繁项集挖掘
http://blog.csdn.net/pipisorry/article/details/48894977 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week2: LSH的距离度量方法
http://blog.csdn.net/pipisorry/article/details/48882167 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week7: 局部敏感哈希LSH（进阶）
http://blog.csdn.net/pipisorry/article/details/49686913 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week3:社交网络之社区检测：高级技巧
http://blog.csdn.net/pipisorry/article/details/49052255 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week6: 支持向量机Support-Vector Machines,SVM
http://blog.csdn.net/pipisorry/article/details/49445387 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...
海量数据挖掘MMDS week5: 聚类clustering
http://blog.csdn.net/pipisorry/article/details/49427989 海量数据挖掘Mining Massive Datasets(MMDs) -Jure Le ...

随机推荐

Logistic Regression 算法向量化实现及心得
Author: 相忠良(Zhong-Liang Xiang) Email: ugoood@163.com Date: Sep. 23st, 2017 根据 Andrew Ng 老师的深度学习课程课后作 ...
js页面(页面上无服务端控件，且页面不刷新)实现请求一般处理程序下载文件方法
对于js页面来说,未使用服务端控件,点击下载按钮时不会触发服务端事件,且不会提交数据到服务端页面后台进行数据处理,所以要下载文件比较困难.且使用jQ的post来请求一般处理程序也不能实现文件的下载,根 ...
python学习之路前端-CSS
CSS概述 css是英文Cascading Style Sheets的缩写,称为层叠样式表,用于对页面进行美化. 存在方式有三种:元素内联.页面嵌入和外部引入,比较三种方式的优缺点. 语法:style ...
Clojure新手入门
官方网站 clojure.org 环境安装 Java(JDK) Leiningen 编辑工具 Eclipse插件 -- Counterclockwise IntelliJ插件 -- Cursive E ...
Ubuntu 16.04: How to resolve libqt5x11extras5 (>= 5.1.0) but it is not going to be installed
Issue: When you install Virtualbox 5.1 on Ubuntu 16.04, you may encounter following error: root@XIAY ...
Android动态修改ToolBar的Menu菜单
Android动态修改ToolBar的Menu菜单效果图实现实现很简单,就是一个具有3个Action的Menu,在我们滑动到不同状态的时候,把对应的Action隐藏了. 开始上货 Menu Me ...
在Spring Boot中输出REST资源
前面我们我们已经看了Spring Boot中的很多知识点了,也见识到Spring Boot带给我们的各种便利了,今天我们来看看针对在Spring Boot中输出REST资源这一需求,Spring Bo ...
制定一个apk路径然后跳出安装界面
制定一个apk的路径然后跳出界面让用户选择是否安装我们系统有一个写好的Activity来协助我们完成这一功能我们来看看它的清单文件 <?xml version="1.0" ...
JDK 源码学习——ByteBuffer
ByteBuffer 在NIO的作用 Java SE4 开始引入Java NIO,相比较于老的IO,更加依赖底层实现.引入通道(Channels),选择器(selector),缓冲(Buffers). ...
硬件模块化机器人操作系统 Hardware Robot Operating System (H-ROS)
原文网址:http://www.ros.org/news/2016/10/hardware-robot-operating-system-h-ros.html 推荐网址:https://h-ros.c ...

海量数据挖掘MMDS week2: Nearest-Neighbor Learning最近邻学习

海量数据挖掘MMDS week2: Nearest-Neighbor Learning最近邻学习的更多相关文章

随机推荐

热门专题