This post is an excerpt from the Wikipedia article on the k-d tree. I find the article's explanation vivid and detailed; it is a rare gem.


Overview

A $k$-d tree (short for $k$-dimensional tree) is a binary space-partitioning tree for organizing points in a $k$-dimensional space. $k$-d trees are a useful data structure for searches involving a multidimensional search key.

Construction

The canonical method of $k$-d tree construction has the following constraints:

  • As one moves down the tree, one cycles through the axes used to select the splitting planes.
  • Points are inserted by selecting the median of the points being put into the subtree, with respect to their coordinates in the axis being used to create the splitting plane.

This method leads to a balanced $k$-d tree, in which each leaf node is approximately the same distance from the root. However, balanced trees are not necessarily optimal for all applications.
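The two constraints above can be sketched as follows. This is a minimal illustration, not the article's reference method: the names `Point2`, `arrange` and `levels` are hypothetical, the points are assumed to be 2-dimensional, and the tree is kept implicit (the node of a range `[lo, hi)` is its middle element) rather than stored in explicit nodes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Point2 = std::array<int, 2>;  // hypothetical 2-D point type

// Rearranges pts[lo, hi) in place so that the middle element of every range
// is the median along the axis of that level; the axis cycles with depth.
void arrange(std::vector<Point2>& pts, int lo, int hi, int depth) {
    assert(0 <= lo && hi <= static_cast<int>(pts.size()));
    if (hi - lo <= 1) return;
    const int axis = depth % 2;           // cycle through the axes
    const int mid = lo + (hi - lo) / 2;   // position of the median
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [axis](const Point2& a, const Point2& b) {
                         return a[axis] < b[axis];
                     });
    arrange(pts, lo, mid, depth + 1);      // points on the lower side
    arrange(pts, mid + 1, hi, depth + 1);  // points on the upper side
}

// Number of levels of the implicit tree built over n points.
int levels(int n) {
    if (n <= 0) return 0;
    const int mid = n / 2;
    return 1 + std::max(levels(mid), levels(n - mid - 1));
}
```

For the six points (2,3), (5,4), (9,6), (4,7), (8,1), (7,2), the root becomes (7,2) and the tree has three levels, which is the balance property described above.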

Nearest Neighbour Search

Terms:

  • the split dimensions
  • the splitting (hyper)plane
  • "current best"

The **nearest neighbour** (NN) search algorithm aims to find the point in the tree that is nearest to a given point. This search can be done efficiently by using the tree properties to quickly eliminate large portions of the search space.

Searching for a nearest neighbour in a $k$-d tree proceeds as follows:

  1. Starting with the root node, the algorithm moves down the tree recursively.
  2. Once the algorithm reaches a leaf node, it saves that node point as the "current best".
  3. The algorithm unwinds the recursion of the tree, performing the following steps at each node:
    1. If the current node is closer than the current best, then it becomes the current best.
    2. The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. In concept, this is done by intersecting the splitting hyperplane with a hypersphere around the search point that has a radius equal to the current nearest distance. Since the hyperplanes are all axis-aligned, this is implemented as a simple comparison to see whether the distance between the splitting coordinate of the search point and that of the current node is less than the distance (over all coordinates) from the search point to the current best.
      1. If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
      2. If the hypersphere doesn't intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
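The steps above can be sketched as a short recursive function. This is an illustration under stated assumptions, not a definitive implementation: the names `Pt`, `dist2` and `nearest` are hypothetical, the points are 2-dimensional, and the array is assumed to be already arranged so that the middle element of every range `[lo, hi)` is the median along that level's axis (so the tree is implicit in the array).

```cpp
#include <array>
#include <cassert>
#include <vector>

using Pt = std::array<int, 2>;  // hypothetical 2-D point type

// Squared Euclidean distance between two points.
long long dist2(const Pt& a, const Pt& b) {
    const long long dx = static_cast<long long>(a[0]) - b[0];
    const long long dy = static_cast<long long>(a[1]) - b[1];
    return dx * dx + dy * dy;
}

// Searches pts[lo, hi) for the point nearest to q.
// best is an index into pts (-1 while nothing has been found yet);
// bestD2 is the squared distance to pts[best].
void nearest(const std::vector<Pt>& pts, int lo, int hi, int depth,
             const Pt& q, int& best, long long& bestD2) {
    assert(0 <= lo && hi <= static_cast<int>(pts.size()));
    if (lo >= hi) return;
    const int axis = depth % 2;
    const int mid = lo + (hi - lo) / 2;
    const bool goLeft = q[axis] < pts[mid][axis];

    // Steps 1-2: descend first into the side that contains q.
    if (goLeft) nearest(pts, lo, mid, depth + 1, q, best, bestD2);
    else        nearest(pts, mid + 1, hi, depth + 1, q, best, bestD2);

    // Step 3.1: the current node may be closer than the current best.
    const long long d2 = dist2(pts[mid], q);
    if (best < 0 || d2 < bestD2) { best = mid; bestD2 = d2; }

    // Step 3.2: visit the other side only if the hypersphere around q
    // crosses the splitting plane; otherwise that branch is eliminated.
    const long long gap = static_cast<long long>(q[axis]) - pts[mid][axis];
    if (gap * gap < bestD2) {
        if (goLeft) nearest(pts, mid + 1, hi, depth + 1, q, best, bestD2);
        else        nearest(pts, lo, mid, depth + 1, q, best, bestD2);
    }
}
```

A call starts with `best = -1` and any value for `bestD2`. With the arranged array (2,3), (5,4), (4,7), (7,2), (8,1), (9,6) and the query (9,2), the search ends with the point (8,1) at squared distance 2.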

Generally, the algorithm uses squared distances for comparison to avoid computing square roots. Additionally, it can save computation by holding the squared current best distance in a variable for comparison.

The algorithm can be extended in several ways by simple modifications. It can provide the $k$ nearest neighbours to a point by maintaining $k$ current bests instead of just one. A branch is only eliminated when $k$ points have been found and the branch cannot have points closer than any of the $k$ current bests.
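One common way to maintain the $k$ current bests is a bounded max-heap, whose top is the worst of the candidates held so far. The helper below is a sketch under that assumption; `KBest`, `offer` and `canPrune` are hypothetical names, and only squared distances are stored to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical helper: keeps the k smallest squared distances seen so far.
struct KBest {
    std::size_t k;
    std::priority_queue<long long> heap;  // max-heap: top() is the worst current best

    explicit KBest(std::size_t k_) : k(k_) { assert(k_ > 0); }

    // Offers a candidate; it is kept only if it belongs among the k best.
    void offer(long long d2) {
        if (heap.size() < k) {
            heap.push(d2);
        } else if (d2 < heap.top()) {
            heap.pop();
            heap.push(d2);
        }
    }

    // A branch behind a splitting plane at squared distance planeD2 can be
    // eliminated only once k candidates are held and the plane is at least
    // as far away as the worst of them.
    bool canPrune(long long planeD2) const {
        return heap.size() == k && planeD2 >= heap.top();
    }
};
```

While fewer than `k` candidates are held, `canPrune` always returns false, which matches the rule that a branch is eliminated only after $k$ points have been found.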

Implementation

$k$ nearest neighbours ($k$NN)

#include <bits/stdc++.h>
#define lson (id<<1)
#define rson (id<<1|1)
using namespace std;
using LL=long long;
const int N=5e4+5;

// K-D tree: a special case of binary space partitioning trees
int DIM, idx; // DIM: number of dimensions; idx: axis used by Node::operator<

// a function instead of a macro: no precedence pitfalls, no int overflow
inline LL sqr(LL x){ return x*x; }

struct Node{
    int key[5];
    bool operator<(const Node &rhs)const{
        return key[idx]<rhs.key[idx];
    }
    void read(){
        for(int i=0; i<DIM; i++)
            scanf("%d", key+i);
    }
    // squared Euclidean distance to rhs
    LL dis2(const Node &rhs)const{
        LL res=0;
        for(int i=0; i<DIM; i++)
            res+=sqr((LL)key[i]-rhs.key[i]);
        return res;
    }
    void out(){
        for(int i=0; i<DIM; i++)
            printf("%d%c", key[i], i==DIM-1?'\n':' ');
    }
}p[N];

Node a[N<<2]; // K-D tree
bool f[N<<2]; // f[id]: whether node id exists

// build node id from the points p[l, r)
void build(int id, int l, int r, int dep)
{
    if(l==r) return; // error-prone: empty range, f[id] stays false
    f[id]=true, f[lson]=f[rson]=false;
    // select axis based on depth so that axis cycles through all valid values
    idx=dep%DIM;
    int mid=(l+r)>>1;
    // partially sort the point list and choose the median as pivot element
    nth_element(p+l, p+mid, p+r);
    a[id]=p[mid];
    build(lson, l, mid, dep+1);
    build(rson, mid+1, r, dep+1);
}

using P=pair<LL,Node>;
priority_queue<P> que; // max-heap holding the m current bests

// multidimensional search key
void query(const Node &p, int id, int m, int dep){
    int dim=dep%DIM;
    int x=lson, y=rson;
    // left: <=, right: >=
    if(p.key[dim]>=a[id].key[dim])
        swap(x, y);
    // x is the side containing p, y is the other side
    if(f[x]) query(p, x, m, dep+1);
    P cur{p.dis2(a[id]), a[id]};
    if((int)que.size()<m){
        que.push(cur);
    }
    else if(cur.first<que.top().first){
        que.pop();
        que.push(cur);
    }
    // the other side is skipped only when m points have been found
    // and the splitting plane is no closer than the worst of them
    if(f[y] && ((int)que.size()<m || sqr((LL)a[id].key[dim]-p.key[dim])<que.top().first))
        query(p, y, m, dep+1);
}

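Notes:

  1. The bool array f[] records whether a given node of the complete binary tree exists. The complete-binary-tree layout is not required: two arrays lson[] and rson[] can be used instead, which has the further advantage of saving space, since the arrays only need to be twice the number of nodes.
  2. Ranges are half-open: closed on the left, open on the right.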
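The alternative layout mentioned in note 1 could look roughly like the sketch below. It is an assumption about what that note has in mind, not the author's code: `Pt` and `ChildArrayTree` are hypothetical names, the points are 2-dimensional, and nodes are simply numbered in creation order, so in this sketch one slot per point is enough.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Pt = std::array<int, 2>;  // hypothetical 2-D point type

// Hypothetical container: node i stores a point, and lson[i] / rson[i]
// hold the indices of its children (-1 if the child is absent).
struct ChildArrayTree {
    std::vector<Pt> node;
    std::vector<int> lson, rson;

    // Builds the subtree for pts[l, r) and returns its root index, or -1.
    int build(std::vector<Pt>& pts, int l, int r, int dep) {
        assert(0 <= l && r <= static_cast<int>(pts.size()));
        if (l >= r) return -1;
        const int axis = dep % 2;
        const int mid = l + (r - l) / 2;
        std::nth_element(pts.begin() + l, pts.begin() + mid, pts.begin() + r,
                         [axis](const Pt& a, const Pt& b) {
                             return a[axis] < b[axis];
                         });
        const int id = static_cast<int>(node.size());
        node.push_back(pts[mid]);
        lson.push_back(-1);
        rson.push_back(-1);
        const int lc = build(pts, l, mid, dep + 1);
        const int rc = build(pts, mid + 1, r, dep + 1);
        lson[id] = lc;  // assigned after the recursion, by index
        rson[id] = rc;
        return id;
    }
};
```

With this layout an absent child is marked by -1, so no separate existence array such as f[] is needed.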
