Summary of an MCM (Mathematical Contest in Modeling) Outstanding Winner Paper
Anil S. Damle, Colin G. West, Eric J. Benzel
University of Colorado–Boulder
Boulder, CO
Advisor: Anne Dougherty
Abstract
Research shows that most violent serial criminals tend to commit crimes in a radial band around a central point, such as a home or workplace. Predicting a criminal's spatial patterns in this way is called geographic profiling. We assume that the offender is a violent serial criminal, since research suggests that serial burglars and arsonists are less likely to follow spatial patterns. We treat the single-anchor-point case first, taking the spatial coordinates of the criminal's last strikes and the sequence of the crimes as inputs. For the multiple-anchor-point case, we use a cluster-finding and sorting method.
Assumptions
Domain is Approximately Urban
- Entire domain is a potential crime spot
- Criminal’s movement is unconstrained.
- Domain contains all possible strike points.
Developing a Serial Crime Test Set
- Existing Crime Sets
- The Problem with Simulation
- Pixel Point Analysis
Metrics of Success
The Effectiveness Multiplier
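The effectiveness multiplier compares two likelihood surfaces \(Z_1\) and \(Z_2\) at the location of the actual crime:
\(\kappa=\frac{Z_1(\text{CrimePoint})}{Z_2(\text{CrimePoint})}\)
The standard effectiveness multiplier compares our model with a flat (uniform) likelihood surface:
\(\kappa_s=\frac{Z_{\text{our model}}(\text{CrimePoint})}{Z_{\text{flat}}(\text{CrimePoint})}\)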
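The ratio can be illustrated on a small grid. The following is a minimal sketch, assuming the likelihood surfaces are stored as nested lists whose cells sum to 1. The function names and the 3x3 data are hypothetical and not taken from the paper.

```python
def flat_surface(n_rows, n_cols):
    """Uniform likelihood surface whose cells sum to 1."""
    value = 1.0 / (n_rows * n_cols)
    return [[value] * n_cols for _ in range(n_rows)]


def normalize(surface):
    """Scale a non-negative grid so that its cells sum to 1."""
    total = sum(sum(row) for row in surface)
    return [[cell / total for cell in row] for row in surface]


def effectiveness_multiplier(z_model, z_reference, crime_cell):
    """Ratio of the two surfaces at the cell where the crime occurred."""
    row, col = crime_cell
    return z_model[row][col] / z_reference[row][col]


# Hypothetical 3x3 example: the model puts most of its mass in the centre cell.
raw = [[1, 1, 1],
       [1, 10, 1],
       [1, 1, 1]]
z_model = normalize(raw)
z_flat = flat_surface(3, 3)

kappa_hit = effectiveness_multiplier(z_model, z_flat, (1, 1))   # crime at the peak
kappa_miss = effectiveness_multiplier(z_model, z_flat, (0, 0))  # crime in a corner
```

A value above 1 means the model assigned more likelihood to the true crime cell than the flat surface did. A value below 1 means it did worse than a uniform guess.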
Two Schemes for Spatial Prediction
Serial crime is patterned around the places of the criminal's daily activity, so the key quantity is the center of the crimes. We consider two scenarios: a single anchor point and multiple anchor points.
Single Anchor Point: Centroid Method
Algorithm
Create Search Domain
Construct the smallest rectangle that contains all existing offenses, then scale each dimension by a factor of three. The resulting domain meets the requirements listed in the assumptions.
Find Centroid of Crime Sites
The anchor point is the average of the \(n\) crime coordinates \((x_i,y_i)\).
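The first two steps can be sketched in a few lines. This is an illustrative sketch only: it assumes the rectangle is scaled about its own center, which the summary does not state, and the function names and sample coordinates are hypothetical.

```python
def centroid(points):
    """Average of the crime coordinates, used as the anchor point."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def search_domain(points, scale=3.0):
    """Bounding rectangle of the crimes, scaled about its centre (assumed).

    Returns (x_min, x_max, y_min, y_max).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    half_w = scale * (max(xs) - min(xs)) / 2
    half_h = scale * (max(ys) - min(ys)) / 2
    return (cx - half_w, cx + half_w, cy - half_h, cy + half_h)


# Hypothetical crime sites at the corners of a 4 x 2 rectangle.
crimes = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
anchor = centroid(crimes)
domain = search_domain(crimes)
```

For these four points the anchor is (2.0, 1.0), and the tripled domain spans x from -4 to 8 and y from -2 to 4.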
Build Likelihood Crater
We use a cratering technique. The two-dimensional crime points \(x_i\) are mapped to their radius from the anchor point \(a\). We have \(f:x_i\to r_i\), where \(f(x_i)=\lVert x_i-a\rVert_2\) (a shifted modulus). We then use the set of radii \(r_i\) to generate a crater around the anchor point. The crater reflects two ideas:
- There is a buffer zone around the anchor point.
- Crimes follow a decaying exponential pattern from the anchor point.
We model the radii with the gamma distribution. Define the random variable \(X_i\) to be the distance between the \(i\)th crime point and the anchor point. We let each \(X_i\) have a gamma distribution with shape parameter \(\kappa\) and rate parameter \(\theta\) (here \(\kappa\) is not the effectiveness multiplier): \(X_i \sim \Gamma(\kappa, \theta)\), with probability density function (pdf)
\(f(x;\kappa,\theta)=\frac{\theta^{\kappa}}{\Gamma(\kappa)}x^{\kappa-1}e^{-\theta x}\)
Assuming the \(X_i\) are independent, we compute the maximum likelihood estimates of \(\kappa\) and \(\theta\) and use the resulting distribution to score possible crime locations. The pdf is evaluated at each point of the domain and normalized so that the volume under the likelihood surface is 1.
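The fit and the crater can be sketched with the Python standard library. Two caveats apply: the shape estimate below uses a well-known closed-form approximation to the maximum likelihood solution rather than the exact root, and the discrete grid is normalized so that its cells sum to 1 (cell area is ignored). The function names, radii, and grid are hypothetical.

```python
import math


def fit_gamma(radii):
    """Approximate maximum likelihood fit; returns (kappa, theta), theta a rate."""
    n = len(radii)
    mean = sum(radii) / n
    mean_log = sum(math.log(r) for r in radii) / n
    s = math.log(mean) - mean_log  # positive unless all radii are equal
    # Closed-form approximation to the shape MLE (not the exact root).
    kappa = (3 - s + math.sqrt((s - 3) ** 2 + 24 * s)) / (12 * s)
    theta = kappa / mean  # MLE of the rate for a given shape
    return kappa, theta


def gamma_pdf(x, kappa, theta):
    """Gamma pdf with shape kappa and rate theta; taken as 0 for x <= 0."""
    if x <= 0:
        return 0.0
    log_pdf = (kappa * math.log(theta) - math.lgamma(kappa)
               + (kappa - 1) * math.log(x) - theta * x)
    return math.exp(log_pdf)


def crater_surface(anchor, kappa, theta, xs, ys):
    """Evaluate the pdf at each cell's radius, then normalise cells to sum to 1."""
    ax, ay = anchor
    raw = [[gamma_pdf(math.hypot(x - ax, y - ay), kappa, theta) for x in xs]
           for y in ys]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


# Hypothetical radii of past crimes from the anchor point.
radii = [1.8, 2.1, 2.4, 1.9, 2.6, 2.2]
kappa, theta = fit_gamma(radii)
grid = [i * 0.5 - 5.0 for i in range(21)]  # -5.0 to 5.0 in steps of 0.5
surface = crater_surface((0.0, 0.0), kappa, theta, grid, grid)
```

With these radii the surface is close to zero at the anchor itself (the buffer zone), peaks in a ring near radius 2, and decays farther out.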
Adjust for Temporal Trends
An outward or inward trend in the radii \(r_i\) might indicate that the next crime will follow this trend. We let \(\tilde{X}=X+\overline{\Delta r}\), where \(\Delta r_n=r_n-r_{n-1}\) and \(\overline{\Delta r}\) is the average of these differences. The new random variable \(\tilde{X}\) has the shifted expected value
\(E[\tilde{X}]=E[X+\overline{\Delta r}]=E[X]+\overline{\Delta r}\)
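As a worked illustration of the shift, the sketch below averages the successive differences of a hypothetical radius sequence and adds the result to the expected radius. The sample mean is used as a stand-in for \(E[X]\), and all names and numbers are illustrative assumptions.

```python
def mean_radial_trend(radii):
    """Average of the successive differences r_n - r_(n-1), in crime order."""
    diffs = [later - earlier for earlier, later in zip(radii, radii[1:])]
    return sum(diffs) / len(diffs)


def shifted_expectation(expected_radius, trend):
    """E[X + mean(delta r)] = E[X] + mean(delta r)."""
    return expected_radius + trend


# Hypothetical radii listed in the order the crimes were committed.
radii = [2.0, 2.3, 2.1, 2.6, 2.8]
trend = mean_radial_trend(radii)

# Sample mean used here as a stand-in for E[X].
adjusted = shifted_expectation(sum(radii) / len(radii), trend)
```

Because the differences telescope, the average trend equals the last radius minus the first, divided by the number of steps. Here that is (2.8 - 2.0) / 4 = 0.2, so only the first and last radii determine the size of the correction.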
Results and Analysis
We analyze three offenders by removing the final crime from each data set. We produce the likelihood surface \(Z(x,y)\) from the remaining crimes, use it to estimate the location of the final crime, and calculate the standard effectiveness multiplier \(\kappa_s\).
For offenders B and C, the model is relatively successful, with \(\kappa_s ≈ 12\). With \(\Delta r=-0.276\), the temporal correction to this distribution is negligible.
For offender A, the model fails because of two outlier crimes (\(\kappa_s≈0.4\)), which means it performs worse than a flat likelihood surface. The model should still apply to most offenders, unless some external influence draws the criminal away from the earlier pattern.
Multiple Anchor Points: Cluster Method
Algorithm
We force a minimum of 2 clusters and a maximum of 4. The clustering is carried out in three steps.
- Compute the Euclidean distances between all crime locations.
- Organize the distances into a hierarchical cluster tree, represented by a dendrogram.
- Merge the two clusters that are closest, and continue merging until the desired number of clusters is reached. The height of each link in the dendrogram is the distance between the two clusters at the time they were merged.
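The three steps above can be sketched as a small agglomerative routine. The summary does not say how the distance between two clusters is measured, so the sketch assumes single linkage (the distance between the closest pair of members). The function name and sample points are hypothetical.

```python
import math


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns (clusters, heights): clusters are lists of point indices and
    heights are the merge distances, i.e. the dendrogram link heights.
    """
    # Step 1: pairwise Euclidean distances.
    dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points]
            for p in points]
    # Step 2: start the tree with every point in its own cluster.
    clusters = [[i] for i in range(len(points))]
    heights = []
    # Step 3: repeatedly merge the closest pair of clusters.
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage (assumed): closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        heights.append(d)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, heights


# Hypothetical crime sites: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
clusters, heights = single_linkage(points, 3)
```

On this data the routine groups the first three points, the next two, and leaves the last point as its own cluster. The brute-force search is adequate for the handful of crimes in a typical series, though it would be slow on large data sets.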
To determine the optimal number of clusters, we use the notion of silhouettes. We denote by \(a(P_i)\) the average distance from \(P_i\) to all other points in its cluster and by \(b(P_i,k)\) the average distance from \(P_i\) to the points in a different cluster \(C_k\). Then the silhouette of \(P_i\) is
\(s(P_i)=\frac{\left[\min\limits_{k\,|\,P_i\notin C_k}b(P_i,k)\right] - a(P_i)}{\max\left(a(P_i), \min\limits_{k\,|\,P_i\notin C_k}b(P_i,k)\right)}\)
The silhouette s can take values in \([−1, 1]\): The closer \(s(P_i)\) is to \(1\), the better \(P_i\) fits into its current cluster; and the closer \(s(P_i)\) is to \(−1\), the worse it fits within its current cluster. To optimize the number of clusters, we compute the clusterings for 2, 3, and 4 clusters. We then find the maximum of the three average silhouette values.
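The selection rule can be sketched as follows. The candidate labelings are supplied by hand here so that the example stays self-contained; in practice they would come from the clustering step. A point that is alone in its cluster is given a silhouette of 0, a common convention that the summary does not confirm, and the sketch assumes no duplicate points. All names and data are hypothetical.

```python
import math


def average_silhouette(points, labels):
    """Mean silhouette value over all points for one clustering."""
    def mean_dist(i, members):
        return sum(math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
                   for j in members) / len(members)

    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster: silhouette taken as 0 (assumed)
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != lab)
        total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical crime sites forming three well-separated pairs.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 1)]
candidates = {
    2: [0, 0, 1, 1, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
    4: [0, 0, 1, 1, 2, 3],
}
scores = {k: average_silhouette(points, labels)
          for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)
```

For this data the three-cluster labeling has the highest average silhouette, so it would be the one selected.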
Cluster Loop Algorithm and Combining Cluster Predictions
We compute a likelihood surface around the centroid of each cluster. For a cluster that contains only one point, we use a Gaussian distribution centered at that point as the likelihood surface, with its mean taken from the expected values of the gamma distributions placed over the anchor points of the clusters that have more than one point.
We create our final surface as a normalized linear combination of the individual surfaces, using weights for the number of points in the cluster and for the average temporal index of the events in the cluster.
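The combination step can be sketched as a weighted sum followed by renormalization. The summary names the two weighting factors but not how they are combined, so the product used below is an assumption, as are the function name and the toy surfaces.

```python
def combine_surfaces(surfaces, sizes, mean_time_indices):
    """Normalised linear combination of per-cluster likelihood surfaces."""
    # Assumed weighting: cluster size times average temporal index.
    weights = [n * t for n, t in zip(sizes, mean_time_indices)]
    rows, cols = len(surfaces[0]), len(surfaces[0][0])
    combined = [[sum(w * s[r][c] for w, s in zip(weights, surfaces))
                 for c in range(cols)] for r in range(rows)]
    total = sum(sum(row) for row in combined)
    return [[v / total for v in row] for row in combined]


# Hypothetical 2x2 surfaces for two clusters, each already summing to 1.
surface_a = [[0.5, 0.5], [0.0, 0.0]]
surface_b = [[0.0, 0.0], [0.5, 0.5]]

# Cluster A: 6 crimes with mean index 3.5; cluster B: 2 crimes with mean index 7.5.
final = combine_surfaces([surface_a, surface_b],
                         sizes=[6, 2], mean_time_indices=[3.5, 7.5])
```

Under this weighting a large cluster and a recent cluster both pull likelihood toward themselves. In the toy example cluster A receives weight 21 and cluster B weight 15, so A's cells end up with the larger share of the final surface.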
Results and Analysis
- Offender C: The cluster method identifies the point directly below the centroid as an outlier and excludes it, which slightly reduces the variance and therefore narrows the fitted function.
- Offender B: Although the actual crime point no longer falls in the band of maximum likelihood, the cluster method still outperforms the centroid method, with \(\kappa_s≈23\), because fewer resources are "wasted" on high-likelihood areas where no crime is committed.
- Offender A: Since the outlier points are excluded from the centroid calculation for the larger cluster, the model bets even more aggressively on this cluster, with a resulting \(\kappa_s≈0\).
Summary
- The predictions rest on assumed trends in serial crime behavior, and these trends have been tested on large sets of real-world data. Similar mathematical techniques are used in the anchor-point estimation methods currently employed, which consistently outperform random guesses when tested across data samples.
- The model is applicable only to violent serial criminals. It has also not been validated on a large set of empirical data, and it cannot make use of underlying map data.