The mean shift clustering algorithm

MEAN SHIFT CLUSTERING

Mean shift clustering is a general non-parametric cluster finding procedure — introduced by Fukunaga and Hostetler [1], and popular within the computer vision field.
Nicely, and in contrast to the more-well-known K-means clustering algorithm, the output of mean shift does not depend on any explicit assumptions on the shape of the point distribution, the number of clusters, or any form of random initialization.

We describe the mean shift algorithm in some detail in the technical background sectionat the end of this post. However, its essence is readily explained in a few
words: Essentially, mean shift treats the clustering problem by supposing that all points given represent samples from some underlying probability density function, with regions of high sample density corresponding to the local maxima of this distribution.
To find these local maxima, the algorithm works by allowing the points to attract each other, via what might be considered a short-ranged “gravitational” force. Allowing the points to gravitate towards areas of higher density, one can show that they will eventually
coalesce at a series of points, close to the local maxima of the distribution. Those data points that converge to the same local maxima are considered to be members of the same cluster.

In the next couple of sections, we illustrate application of the algorithm to a couple of problems. We make use of the python package SkLearn, which contains a mean shift
implementation. Following this, we provide a quick discussion and an appendix on technical details.

Follow @efavdb

Follow us on twitter for new submission alerts!

MEAN SHIFT CLUSTERING IN ACTION

In today’s post we will have two examples. First, we will show how to use mean shift clustering to identify clusters of data in a 2D data set. Second, we will use the algorithm to segment a picture based on the colors in the image. To do this we need a handful
of libraries from sklearn, numpy, matplotlib, and the Python Imaging Library (PIL) to handle reading in a jpeg image.

1
2
3
4
5
6
import
numpy as np
from
sklearn.cluster import
MeanShift, estimate_bandwidth
from
sklearn.datasets.samples_generator import
make_blobs
import
matplotlib.pyplot as plt
from
itertools import
cycle
from
PIL import
Image

FINDING CLUSTERS IN A 2D DATA SET

This first example is based off of the sklearn tutorial for mean shift clustering: We generate data points centered at 4 locations,
making use of sklearn’s make_blobs library. To apply the clustering algorithm to the points generated, we must first set the attractive interaction length between examples, also know as the algorithm’s bandwidth. Sklearn’s implementation contains a built-in
function that allows it to automatically estimate a reasonable value for this, based upon the typical distance between examples. We make use of that below, carry out the clustering, and then plot the results.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#%% Generate sample data
centers
=
[[
1,
1], [-.75,
-1], [1,
-1], [-3,
2]]
X, _
=
make_blobs(n_samples
=10000, centers=centers,
cluster_std
=0.6)
 
#%% Compute clustering with MeanShift
 
# The bandwidth can be automatically estimated
bandwidth
=
estimate_bandwidth(X, quantile
=.1,
                               n_samples=500)
ms
=
MeanShift(bandwidth
=bandwidth, bin_seeding=True)
ms.fit(X)
labels
=
ms.labels_
cluster_centers
=
ms.cluster_centers_
 
n_clusters_
=
labels.
max()+1
 
#%% Plot result
plt.figure(1)
plt.clf()
 
colors
=
cycle(
'bgrcmykbgrcmykbgrcmykbgrcmyk')
for
k, col in
zip(range(n_clusters_), colors):
    my_members
=
labels
==
k
    cluster_center
=
cluster_centers[k]
    plt.plot(X[my_members,
0], X[my_members,
1], col
+
'.'
)
    plt.plot(cluster_center[0], cluster_center[1],
             'o', markerfacecolor=col,
             markeredgecolor='k',
markersize
=14)
plt.title('Estimated number of clusters: %d'
%
n_clusters_)
plt.show()

As you can see below, the algorithm has found clusters centered on each of the blobs we generated.

SEGMENTING A COLOR PHOTO

In the first example, we were using mean shift clustering to look for spatial clusters. In our second example, we will instead explore 3D color space, RGB, by considering pixel values taken from an image of a toy car. The procedure is similar — here, we cluster
points in 3d, but instead of having data(x,y) we have data(r,g,b) taken from the image’s RGB pixel values. Clustering these color values in this 3d space returns a series of clusters, where the pixels in those clusters are similar in RGB space. Recoloring
pixels according to their cluster, we obtain a segmentation of the original image.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#%% Part 2: Color image segmentation using mean shift
 
image
=
Image.
open('toy.jpg')
image
=
np.array(image)
 
#Need to convert image into feature array based
#on rgb intensities
flat_image=np.reshape(image, [-1,
3])
 
#Estimate bandwidth
bandwidth2
=
estimate_bandwidth(flat_image,
                                quantile=.2,
n_samples
=500)
ms
=
MeanShift(bandwidth2, bin_seeding
=True)
ms.fit(flat_image)
labels=ms.labels_
 
# Plot image vs segmented image
plt.figure(2)
plt.subplot(2,
1,
1)
plt.imshow(image)
plt.axis('off')
plt.subplot(2,
1,
2)
plt.imshow(np.reshape(labels, [851,1280]))
plt.axis('off')

The bottom image below illustrates that one can effectively use this approach to identify the key shapes within an image, all without doing any image processing to get rid of glare or background — pretty great!

DISCUSSION

Although mean shift is a reasonably versatile algorithm, it has primarily been applied to problems in computer vision, where it has been used for image segmentation, clustering, and video tracking. Application to big data problems can be challenging due to
the fact the algorithm can become relatively slow in this limit. However, research is presently underway to speed up its convergence, which should enable its application to larger data sets.

Mean shift pros:

  1. No assumptions on the shape or number of data clusters.
  2. The procedure only has one parameter, the bandwidth.
  3. Output doesn’t depend on initializations.

Mean shift cons:

  1. Output does depend on bandwidth: too small and convergence is slow, too large and some clusters may be missed.
  2. Computationally expensive for large feature spaces.
  3. Often slower than K-Means clustering

Technical details follow.

TECHNICAL BACKGROUND

KERNEL DENSITY ESTIMATION

A general formulation of the mean shift algorithm can be developed through consideration of density kernels. These effectively work by smearing out each point example in space over some small window. Summing up the mass from each of these smeared units gives
an estimate for the probability density at every point in space (by smearing, we are able to obtain estimates at locations that do not sit exactly atop any example). This approach is often referred to as kernel
density estimation
 — a method for density estimation that often converges more quickly than binning, or histogramming, and one that also nicely returns a continuous estimate for the density function.

To illustrate, suppose we are given a data set {ui} of
points in d-dimensional space, sampled from some larger population, and that we have chosen a kernel K having
bandwidth parameter h.
Together, these data and kernel function return the following kernel density estimator for the full population’s density function

fK(u)=1nhd∑i=1nK(u−uih)

The kernel (smearing) function here is required to satisfy the following two conditions:

  1. ∫K(u)du=1
  2. K(u)=K(|u|) for
    all values of u

The first requirement is needed to ensure that our estimate is normalized, and the second is associated with the symmetry of our space. Two popular kernel functions that satisfy these conditions are given by

  1. Flat/Uniform K(u)=12{10−1≤|u|≤1else
  2. Gaussian = K(u)=1(2π)d/2e−12|u|2

Below we plot an example in 1-d using the gaussian kernel to estimate the density of some population along the x-axis. You can see that each sample point adds a small Gaussian to our estimate, centered about it: The equations above may look a bit intimidating,
but the graphic here should clarify that the concept is pretty straightforward.

Example of a kernel density estimation using a gaussian kernel for each data point: Adding up small Gaussians about each example returns our net estimate for the total density, the black curve.

MEAN SHIFT ALGORITHM

Recall that the basic goal of the mean shift algorithm is to move particles in the direction of local increasing density. To obtain an estimate for this direction, a gradient is applied to the kernel density estimate discussed above. Assuming an angularly symmetric
kernel function, K(u)=K(|u|),
one can show that this gradient takes the form

∇fK(u)=2nhd+2(∑i=1ng(∣∣u−uih∣∣))m(u).

where

m(u)=⎛⎝⎜⎜⎜∑i=1nuig(∣∣u−uih∣∣)∑i=1ng(∣∣u−uih∣∣)−u⎞⎠⎟⎟⎟,

and g(|u|)=−K′(|u|) is
the derivative of the selected kernel profile. The vector m(u)here,
called the mean shift vector, points in the direction of increasing density — the direction we must move our example. With this estimate, then, the mean shift algorithm protocol becomes

  • Compute the mean shift vector m(ui),
    evaluated at the location of each training example ui
  • Move each example from ui→ui+m(ui)
  • Repeat until convergence — ie, until the particles have reached equilibrium.

As a final step, one determines which examples have ended up at the same points, marking them as members of the same cluster.

For a proof of convergence and further mathematical details, see Comaniciu & Meer (2002) [2].

1. Fukunaga and Hostetler, “The Estimation of the Gradient of a Density Function, with Applications in Pattern Recognition”, IEEE Transactions on Information Theory vol 21 , pp 32-40 ,1975

2. Dorin Comaniciu and Peter Meer, Mean Shift : A Robust approach towards feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence vol 24 No 5 May 2002.

The mean shift clustering algorithm的更多相关文章

  1. Machine Learning—The k-means clustering algorithm

    印象笔记同步分享:Machine Learning-The k-means clustering algorithm

  2. AP聚类算法(Affinity propagation Clustering Algorithm )

    AP聚类算法是基于数据点间的"信息传递"的一种聚类算法.与k-均值算法或k中心点算法不同,AP算法不需要在运行算法之前确定聚类的个数.AP算法寻找的"examplars& ...

  3. Clustering by density peaks and distance

    这次介绍的是Alex和Alessandro于2014年发表在的Science上的一篇关于聚类的文章[13],该文章的基本思想很简单,但是其聚类效果却兼具了谱聚类(Spectral Clustering ...

  4. 挑子学习笔记:两步聚类算法(TwoStep Cluster Algorithm)——改进的BIRCH算法

    转载请标明出处:http://www.cnblogs.com/tiaozistudy/p/twostep_cluster_algorithm.html 两步聚类算法是在SPSS Modeler中使用的 ...

  5. Study notes for Clustering and K-means

    1. Clustering Analysis Clustering is the process of grouping a set of (unlabeled) data objects into ...

  6. DBSCAN(Density-based spatial clustering of applications with noise)

    Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm ...

  7. HDU 5489 Difference of Clustering 图论

    Difference of Clustering Problem Description Given two clustering algorithms, the old and the new, y ...

  8. mean shift聚类算法的MATLAB程序

    mean shift聚类算法的MATLAB程序 凯鲁嘎吉 - 博客园 http://www.cnblogs.com/kailugaji/ 1. mean shift 简介 mean shift, 写的 ...

  9. 基于K-means Clustering聚类算法对电商商户进行级别划分(含Octave仿真)

    在从事电商做频道运营时,每到关键时间节点,大促前,季度末等等,我们要做的一件事情就是品牌池打分,更新所有店铺的等级.例如,所以的商户分入SKA,KA,普通店铺,新店铺这4个级别,对于不同级别的商户,会 ...

随机推荐

  1. U3D prefab

    1,prefab相当于一个类,字面意思就是预设,预先设计好的类.把一个prefab拖放到场景中就生成了一个实例,把二个prefab放到场景中就生成了两个实例. 不同的实例独立动作,拥有自己独立的状态与 ...

  2. log4j输出日志乱码(转)

    log4j日志文件乱码问题的解决方法 log4j日志文件中文乱码处理方法 log4j 控制台和文件输出乱码问题解决 写在前面,第三篇文章中将原因解释的最清楚,为什么设置为UTF-8或者GBK就生效了, ...

  3. php基础26:文件与目录1

    <meta charset="utf-8"> <?php //绝对路径 $path = "E:\AppServ\www\php\/33-catalog. ...

  4. Caffe学习系列(4):激活层(Activiation Layers)及参数

    在激活层中,对输入数据进行激活操作(实际上就是一种函数变换),是逐元素进行运算的.从bottom得到一个blob数据输入,运算后,从top输入一个blob数据.在运算过程中,没有改变数据的大小,即输入 ...

  5. word2010 数学公式/联立方程/大括号内方程组如何左对齐?

    如何在word中输入的联立方程使其条件左对齐? 如输入: 实现如下对齐: 就是在每个逗号 .前输入一个 & 号就可以了, 注意这个逗号一定要是 位于这个方框里头,然后在其前面输入 & ...

  6. [Android] 安卓模拟器临时文件相关问题

    今天生产环境有台机器的硬盘满了,排查发现我的模块在/tmp/android-username目录下留了一堆形如“emulator-1tpH5l”的文件,占用了很大的空间. 这个模块会反复启停好几个安卓 ...

  7. IOS开发之—— Core Foundation对象与OC对象相对转换的问题

    对ARC盲目依赖的同学: 1过度使用block后,无法解决循环引用问题 2遇到底层Core Foundation对象,需要自己手工管理它们的引用计数时,显得一筹莫展 first:对于底层Core Fo ...

  8. OpenCV Start

    开始学习opencv了. 从官网下载了 opencv-3.0.0-alpha.exe(windows版本) opencv-3.0.0-alpha.zip (linux版本) 从windows版本的安装 ...

  9. WP小游戏产品海外发行经验小结

    在群里和大家聊天的时候,大家最多抱怨的就是国内WP份额低,辛辛苦苦做的APP变现困难.我和大家一样,兼职做一些开发,不过我的APP主要面向的是海外市场,从5月份上线到现在不到两个月的时间,没有花费一分 ...

  10. Mininet的安装与卸载

    1.Mininet的卸载比较简单,只需要执行以下命令: sudo rm -rf /usr/local/bin/mn /usr/local/bin/mnexec /usr/local/lib/pytho ...