Abstract: This article is the verbatim transcript of Lesson 4, "Unsupervised Learning", from Chapter 1, "Introduction: Getting to Know Machine Learning", of Andrew Ng's Machine Learning course. I recorded it word by word while watching the videos, for my own later reference, and am now sharing it with everyone. If there are any errors, corrections are welcome and sincerely appreciated. I hope it is helpful to your studies as well.

In this video (article), we'll talk about the second major type of machine learning problem, called Unsupervised Learning. In the last video, we talked about Supervised Learning. Back then, we got data sets that look like this, where each example was labeled either as a positive or negative example, whether it was a benign or a malignant tumor. So for each example in Supervised Learning, we were told explicitly what is the so-called right answer, whether it's benign or malignant.

In Unsupervised Learning, we're given data that looks different, data that looks like this, that doesn't have any labels, or that all has the same label, or really no labels. So we're given the data set, and we're not told what to do with it, and we're not told what each data point is. Instead we're just told: here is a data set, can you find some structure in the data? Given this data set, an Unsupervised Learning algorithm might decide that the data lives in two different clusters. So there's one cluster, and there's a different cluster, and the unsupervised learning algorithm may break the data into these two separate clusters. This is called a clustering algorithm, and it turns out to be used in many places.
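The transcript doesn't name a specific clustering algorithm, but k-means is one common choice, and a minimal sketch makes the idea concrete: repeatedly assign each unlabeled point to its nearest center, then move each center to the mean of its points. The 2-D points and k=2 below are made up for illustration.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[i].append(p)
        for i, c in enumerate(clusters):
            if c:  # keep the old center if a cluster goes empty
                centers[i] = tuple(sum(x) / len(c) for x in zip(*c))
    return centers, clusters

# Two obvious groups of 2-D points, with no labels given.
data = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9),
        (8.0, 8.1), (7.9, 8.0), (8.1, 7.9)]
centers, clusters = kmeans(data, k=2)
```

Nobody told the algorithm which group each point belongs to; it discovers the two clusters from the data alone, which is exactly the setting described above.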

One example where clustering is used is in Google News, and if you haven't seen this before, you can actually go to the URL news.google.com to take a look. What Google News does is, every day, it goes and looks at tens of thousands or hundreds of thousands of news stories on the web, and it groups them into cohesive news stories. For example, let's look here (red rectangle). The URLs here link to different news stories about the BP Oil Well story. So, let's click on one of these URLs. What we get to is a web page like this. Here's a Wall Street Journal article about the BP Oil Well Spill, titled "BP Kills Macondo", which is the name of the spill. And if you click on a different URL from that group, then you might get a different story. Here's the CNN story about, again, the BP Oil Spill. And if you click on yet a third link, you might get yet another story. Here's the UK Guardian story about the BP Oil Spill. So what Google News has done is look at tens of thousands of news stories and automatically cluster them together, so that news stories that are all about the same topic get displayed together.

It turns out that clustering algorithms and Unsupervised Learning algorithms are used in many other problems as well. Here's one on understanding genomics. Here's an example of DNA microarray data. The idea is to take a group of different individuals and, for each of them, measure how much they do or do not have a certain gene; technically, you measure how much certain genes are expressed. So these colors, red, green, gray and so on, show the degree to which different individuals do or do not have a specific gene. What you can then do is run a clustering algorithm to group individuals into different categories, or into different types of people. So this is Unsupervised Learning, because we're not telling the algorithm in advance that these are type 1 people, those are type 2 people, those are type 3 people, and so on. Instead, what we're saying is: here's a bunch of data, I don't know what's in this data, I don't know who is of what type, I don't even know what the different types of people are, but can you automatically find structure in the data for me? Can you automatically cluster the individuals into types that I don't know in advance? Because we're not giving the algorithm the right answer for the examples in the data set, this is Unsupervised Learning.

Unsupervised Learning, or clustering, is used for a bunch of other applications. It's used to organize large computer clusters. I had some friends looking at large data centers, that is, large computer clusters, and trying to figure out which machines tend to work together; if you can put those machines together, you can make your data center work more efficiently. The second application is social network analysis. Given knowledge about which friends you email the most, or given your Facebook friends or your Google+ circles, can we automatically identify cohesive groups of friends, that is, groups of people that all know each other? Market segmentation: many companies have huge databases of customer information. So, can you look at this customer data set and automatically discover market segments, and automatically group your customers into different market segments, so that you can more efficiently sell to or market to your different market segments? Again, this is Unsupervised Learning, because we have all this customer data, but we don't know in advance what the market segments are, and for the customers in our data set, we don't know in advance who is in market segment one, who is in market segment two, and so on; we have to let the algorithm discover all this just from the data. Finally, it turns out that Unsupervised Learning is also used in astronomical data analysis, where clustering algorithms have given surprisingly interesting theories of how galaxies are formed. All of these are examples of clustering, which is just one type of Unsupervised Learning. Let me tell you about another one.

I'm gonna tell you about the cocktail party problem. So, you've been to cocktail parties before, right? You can imagine there's a party, a room full of people, all sitting around, all talking at the same time. There are all these overlapping voices, because everyone is talking at the same time, and it's almost hard to hear the person in front of you. So maybe at a cocktail party there are two people, two people talking at the same time, and it's a somewhat small cocktail party. And we're going to put two microphones in the room, and because these microphones are at two different distances from the speakers, each microphone records a different combination of the two speakers' voices. Maybe speaker one is a little louder in microphone one, and maybe speaker two is a little bit louder in microphone two, because the two microphones are at different positions relative to the two speakers, but each microphone records an overlapping combination of both speakers' voices. So, here's an actual recording of two speakers recorded by a researcher. Let me play for you what the first microphone sounds like: one (uno), two (dos), three (tres), four (cuatro), five (cinco), six (seis), seven (siete), eight (ocho), nine (nueve), ten (y diez). All right, maybe not the most interesting cocktail party; there are two people counting from one to ten in two languages, but you know. What you just heard was the first microphone recording; here's the second recording: uno (one), dos (two), tres (three), cuatro (four), cinco (five), seis (six), siete (seven), ocho (eight), nueve (nine), y diez (ten). So what we can do is take these two microphone recordings and give them to an Unsupervised Learning algorithm called the cocktail party algorithm, and tell the algorithm: find structure in this data for me.
And what the algorithm will do is listen to these audio recordings and say, you know, it sounds like two audio sources are being added together, or summed together, to produce the recordings that we had. Moreover, what the cocktail party algorithm will do is separate out the two audio sources that were being added or summed together to form our recordings. And in fact, here's the first output of the cocktail party algorithm: one, two, three, four, five, six, seven, eight, nine, ten. So, it separated out the English voice in one of the recordings. And here's the second output: uno, dos, tres, cuatro, cinco, seis, siete, ocho, nueve, y diez. Not too bad. To give you one more example, here's another recording of a similar situation. Here's the first microphone [with background music]: one, two, three, four, five, six, seven, eight, nine, ten. OK, so the poor guy's gone home from the cocktail party, and he's now sitting in a room by himself talking to his radio. Here's the second microphone recording: one, two, three, four, five, six, seven, eight, nine, ten. When you give these two microphone recordings to the same algorithm, what it does is again say, you know, it sounds like there are two audio sources, and moreover, the algorithm says: here is the first of the audio sources I found: one, two, three, four, five, six, seven, eight, nine, ten. So that wasn't perfect; it got the voice, but it also got a little bit of the music in there. And here's the second output of the algorithm: [the music]. Not too bad; in that second output it managed to get rid of the voice entirely and, you know, cleaned up the music, got rid of the counting from one to ten. So, you might look at an Unsupervised Learning algorithm like this and ask how complicated it is to implement, right?
It seems like in order to build this application, to do this audio processing, you'd need to write a ton of code, or maybe link in a bunch of C++ or Java libraries that process audio; it seems like a really complicated program to separate out the audio and so on. It turns out that the algorithm to do what you just heard can be done with one line of code, shown right here.
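To see why this is even possible, it helps to write down the mixing model the transcript describes: each microphone records a weighted sum of the two voices. A real cocktail-party algorithm (independent component analysis) must undo that mixing without knowing the weights; the one-line solution from the lecture does exactly that. The sketch below is not that algorithm — to stay self-contained it cheats and inverts a known 2×2 mixing matrix — but it illustrates the model of two sources summed into two recordings. The signals and weights are made up for the example.

```python
import math

# Each microphone records a weighted sum of the two speakers:
#   x1 = a11*s1 + a12*s2
#   x2 = a21*s1 + a22*s2

# Two made-up "source" signals (stand-ins for the two voices).
n = 100
s1 = [math.sin(0.1 * t) for t in range(n)]
s2 = [1.0 if (t // 10) % 2 == 0 else -1.0 for t in range(n)]

# Known mixing weights: speaker 1 louder in mic 1, speaker 2 in mic 2.
a11, a12 = 0.9, 0.4
a21, a22 = 0.3, 0.8
x1 = [a11 * u + a12 * v for u, v in zip(s1, s2)]
x2 = [a21 * u + a22 * v for u, v in zip(s1, s2)]

# Invert the 2x2 mixing matrix to recover the sources exactly.
det = a11 * a22 - a12 * a21
r1 = [( a22 * u - a12 * v) / det for u, v in zip(x1, x2)]
r2 = [(-a21 * u + a11 * v) / det for u, v in zip(x1, x2)]
```

The hard part, which ICA solves and this sketch skips, is estimating that unmixing matrix from the recordings alone, using only the assumption that the sources are statistically independent.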

It did take researchers a long time to come up with this line of code; I'm not saying this is an easy problem. But it turns out that when you use the right programming environment, many learning algorithms can be really short programs. So, this is also why in this class we're going to use the Octave programming environment. Octave is free, open-source software, and using a tool like Octave or MATLAB, many learning algorithms become just a few lines of code to implement. Later in this class, I'll teach you a little bit about how to use Octave, and you'll be implementing some of these algorithms in Octave. Or if you have MATLAB, you can use that too. It turns out that in Silicon Valley, for a lot of machine learning algorithms, what we do is first prototype our software in Octave, because Octave makes it incredibly fast to implement these learning algorithms. Each of these functions, like for example the svd function, which stands for singular value decomposition, is a linear algebra routine that is just built into Octave. If you were trying to do this in C++ or Java, it would take many lines of code linking in complex C++ or Java libraries. So, you can implement this stuff in C++ or Java or Python; it's just more complicated to do so in those languages. What I've seen, after having taught machine learning for almost a decade now, is that you learn much faster if you use Octave as your programming environment; if you use Octave as your learning tool and as your prototyping tool, it'll let you learn and prototype learning algorithms much more quickly. And in fact, what many people at the large Silicon Valley companies do is use a tool like Octave to first prototype the learning algorithm, and only after you've gotten it to work do you migrate it to C++ or Java or whatever.
It turns out that by doing things this way, you can often get your algorithm to work much faster than if you were starting out in C++. So, I know that as an instructor I get to say "trust me on this one" only a finite number of times, but for those of you who have never used the Octave programming environment before, I'm going to ask you to trust me on this one. I think your development time is one of the most valuable resources, and having seen lots of people do this, I think that you, as a machine learning researcher or machine learning developer, will be much more productive if you learn to prototype in Octave first, and only then move to some other language.

Finally, to wrap up this video, I have a quick review question for you. We talked about Unsupervised Learning, which is a learning setting where you give the algorithm a ton of data and just ask it to find structure in the data for us. Of the following four examples, which ones do you think are Unsupervised Learning problems, as opposed to Supervised Learning problems? For each of the four check boxes on the left, check the ones for which you think an Unsupervised Learning algorithm would be appropriate, and then click the button on the lower right to check your answer. So, when the video pauses, please answer the question on the slide. Hopefully, you remembered the spam filter problem: if you have labeled data, you know, of spam and non-spam e-mail, we'd treat this as a Supervised Learning problem. The news story example is exactly the Google News example that we saw in this video; we saw how you can use a clustering algorithm to cluster those articles together, so that's Unsupervised Learning. The market segmentation example I talked about a little earlier, you'd treat that as an Unsupervised Learning problem, because I'm just going to give my algorithm data and ask it to discover market segments automatically. And the final example, diabetes, well, that's actually just like our breast cancer example from the last video: only instead of good and bad tumors, benign or malignant, we instead have diabetes or not, and so we would solve that as a Supervised Learning problem, just like we did for the breast tumor data.

So, that is it for Unsupervised Learning, and in the next video (article), we'll delve more into specific learning algorithms, and start to talk about how these algorithms work, and how you can go about implementing them.

<end>
