ML_note1
Supervised Learning
In supervised learning, we are given a data set and already know what the correct output should look like, on the assumption that there is a relationship between the input and the output.
Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
Example 1:
Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.
We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.
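The difference between the two output types in Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sizes, prices, and asking price are made-up toy numbers, and the helper names (`fit_line`, `predict_price`, `sells_above_asking`) are hypothetical. The regression part uses ordinary least squares with a single feature. The classification part only shows what a discrete output looks like; a real classifier would be trained on labeled examples.

```python
# Hypothetical toy data: house sizes (square feet) and sale prices (in $1000s).
sizes = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200.0, 300.0, 400.0, 500.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_price(size, intercept, slope):
    """Regression: the output is a continuous value."""
    return intercept + slope * size


def sells_above_asking(sale_price, asking_price):
    """Classification: the output is one of two discrete labels."""
    return "above asking" if sale_price > asking_price else "at or below asking"


intercept, slope = fit_line(sizes, prices)
predicted = predict_price(1800.0, intercept, slope)        # a real number
label = sells_above_asking(predicted, asking_price=350.0)  # one of two labels
print(predicted, label)
```

The same input (house size) feeds both cases; only the kind of output changes, a number for regression and a category for classification.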
Example 2:
(a) Regression - Given a picture of a person, we have to predict their age on the basis of that picture.
(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.
Unsupervised Learning
Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.
We can derive this structure by clustering the data based on relationships among the variables in the data.
With unsupervised learning there is no feedback based on the prediction results.
Example:
Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.
Non-clustering: The "Cocktail Party Algorithm" allows you to find structure in a chaotic environment (e.g. identifying individual voices and music from a mesh of sounds at a cocktail party).
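To make the clustering example concrete, here is a minimal sketch of grouping unlabeled values. The notes do not name a specific algorithm; k-means is used here as one common choice. The data, the starting centers, and the function name `kmeans_1d` are all assumptions for illustration. Note that no labels are supplied: the groups emerge from the data alone.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny k-means on 1-D data; `centers` holds the initial guesses."""
    centers = list(centers)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        assignments = [
            min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            for v in values
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(len(centers)):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    return assignments, centers


# Hypothetical measurements of a single variable for six items.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
assignments, centers = kmeans_1d(values, centers=[0.0, 5.0])
print(assignments, centers)
```

In this toy run the first three values end up in one cluster and the last three in the other, without any "correct answer" being provided in advance.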
Cost Function
The cost function used here is otherwise called the "squared error function" or "mean squared error".
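A minimal sketch of such a cost function follows. It assumes a linear hypothesis h(x) = theta0 + theta1 * x and the common convention of dividing the summed squared errors by 2m, where m is the number of training examples. Neither assumption is stated in these notes, and a plain mean squared error would divide by m instead. The function and parameter names are hypothetical.

```python
def squared_error_cost(theta0, theta1, xs, ys):
    """Half the mean of squared differences between predictions and targets."""
    m = len(xs)
    total = sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Toy data lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect = squared_error_cost(0.0, 1.0, xs, ys)  # hypothesis matches the data
worse = squared_error_cost(0.0, 0.5, xs, ys)    # slope too small, higher cost
print(perfect, worse)
```

The cost is zero when the hypothesis fits every point exactly and grows as the predictions move away from the targets, which is what makes it usable as a measure of fit.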