KNN in c++

Pseudo Code of KNN

We can implement a KNN model by following the below steps:

Load the data
Initialise the value of k
For getting the predicted class, iterate from 1 to total number of training data points

Calculate the distance between test data and each row of training data. Here we will use Euclidean distance as our distance metric since it’s the most popular method. The other metrics that can be used are Chebyshev, cosine, etc.
Sort the calculated distances in ascending order based on distance values
Get top k rows from the sorted array
Get the most frequent class of these rows
Return the predicted class

把数据作为string类型处理，进行string和double类型转换。

#include <iostream>
#include <string>
#include <fstream>
#include <sstream>
#include <numeric>
#include <functional>
#include <vector>
#include <algorithm>
#include <cmath>
#include <map>

template <class T1, class T2>
double ManhattanDistance(std::vector<T1> &inst1, std::vector<T2> &inst2) {
if(inst1.size() != inst2.size()) {
std::cout<<"the size of the vectors is not the same\n";
return -1;
}
std::vector<double> temp;
for(size_t i=0;i<inst1.size();++i) {
temp.push_back(std::abs(inst1.at(i)-inst2.at(i)));
}
double distance=accumulate(temp.begin(), temp.end(), 0.0);

return distance;
}

template <class DataType1, class DataType2>
double EuclideanDistance(const std::vector<DataType1> &inst1, const std::vector<DataType2> &inst2) {
if(inst1.size() != inst2.size()) {
std::cout<<"the size of the vectors is not the same\n";
return -1;
}
std::vector<double> temp;
for(size_t i=0; i<inst1.size(); ++i) {
temp.push_back(pow(inst1.at(i)-inst2.at(i), 2.0));
}
double distance=accumulate(temp.begin(), temp.end(), 0.0);
distance=sqrt(distance);

return distance;
}

void vstr2vdouble(std::vector<std::string>::const_iterator beg, std::vector<std::string>::const_iterator end, std::vector<double> &vdouble) {
for(std::vector<std::string>::const_iterator it=beg; it!=end; ++it) {
double d;
std::stringstream ss;
ss<<*it;
ss>>d;
vdouble.push_back(d);
}
}

void knn(std::vector<std::vector<std::string> > &trainset, std::vector<std::string> &testdata, int &k) {
std::vector<double> testitem;
vstr2vdouble(testdata.begin(), testdata.end(), testitem);
std::multimap<std::string, std::string> mmap;

for(size_t i=0;i<trainset.size();++i) {
std::vector<double> trainitem;
vstr2vdouble(trainset[i].begin(), trainset[i].end()-1, trainitem);
double distance=EuclideanDistance(testitem, trainitem);
std::string strdis;
std::stringstream ss;
ss<<distance;
ss>>strdis;
mmap.insert(std::pair<std::string, std::string>(strdis, trainset[i].back()));
}
size_t i=0;
for(std::multimap<std::string, std::string>::const_iterator it=mmap.begin(); i<k; ++i,++it) {
std::cout<<it->first<<" "<<it->second<<"\n";
}
}

template <class DataType>
void ReadDataFromFile(std::string &filename, std::vector<std::vector<DataType> > &lines_feat) {
std::ifstream vm_info(filename.c_str());
std::string lines, var;
std::vector<std::string> row;

lines_feat.clear();

while(!vm_info.eof()) {
getline(vm_info, lines);
if(lines.empty())
break;
std::replace(lines.begin(), lines.end(), ',', ' ');
std::stringstream stringin(lines);
row.clear();

while(stringin >> var) {
row.push_back(var);
}
lines_feat.push_back(row);
}
}

template <class DataType>
void Display2DVector(std::vector<std::vector<DataType> > &vv) {
std::cout<<"the total rows of 2d vector_data: "<<vv.size()<<std::endl;

for(size_t i=0;i<vv.size();++i) {
for(typename::std::vector<DataType>::const_iterator it=vv[i].begin();it!=vv[i].end();++it) {
std::cout<<*it<<" ";
}
std::cout<<"\n";
}
std::cout<<"--------the end of the Display2DVector()--------\n";
}

int main() {
std::string trainpath="Iris.data", testpath="knntest.data";
std::vector<std::vector<std::string> > knn_data, test_data;

ReadDataFromFile(trainpath, knn_data);
ReadDataFromFile(testpath, test_data);

Display2DVector(test_data);

int k=3;
for(size_t i=0;i<test_data.size();++i) {
knn(knn_data, test_data[i], k);
}

return 0;
}

KNN in c++的更多相关文章

【Machine Learning】KNN算法虹膜图片识别
K-近邻算法虹膜图片识别实战作者:白宁超 2017年1月3日18:26:33 摘要:随着机器学习和深度学习的热潮,各种图书层出不穷.然而多数是基础理论知识介绍,缺乏实现的深入理解.本系列文章是作者结 ...
K近邻法(KNN)原理小结
K近邻法(k-nearst neighbors,KNN)是一种很基本的机器学习方法了,在我们平常的生活中也会不自主的应用.比如,我们判断一个人的人品,只需要观察他来往最密切的几个人的人品好坏就可以得出 ...
kd树和knn算法的c语言实现
基于kd树的knn的实现原理可以参考文末的链接,都是一些好文章. 这里参考了别人的代码.用c语言写的包括kd树的构建与查找k近邻的程序. code: #include<stdio.h> # ...
k近邻算法(knn)的c语言实现
最近在看knn算法,顺便敲敲代码. knn属于数据挖掘的分类算法.基本思想是在距离空间里,如果一个样本的最接近的k个邻居里,绝大多数属于某个类别,则该样本也属于这个类别.俗话叫,"随大流&q ...
室内定位系列（三）——位置指纹法的实现（KNN）
位置指纹法中最常用的算法是k最近邻(kNN):选取与当前RSS最邻近的k个指纹的位置估计当前位置,简单直观有效.本文介绍kNN用于定位的基本原理与具体实现(matlab.python). 基本原理位 ...
KNN识别图像上的数字及python实现
领导让我每天手工录入BI系统中的数据并判断数据是否存在异常,若有异常点,则检测是系统问题还是业务问题.为了解放双手,我决定写个程序完成每天录入管理驾驶舱数据的任务.首先用按键精灵录了一套脚本把系统中的 ...
k近邻(KNN)复习总结
摘要: 1.算法概述 2.算法推导 3.算法特性及优缺点 4.注意事项 5.实现和具体例子 6.适用场合内容: 1.算法概述 K近邻算法是一种基本分类和回归方法:分类时,根据其K个最近邻的训练实例的类 ...
KNN算法
1.算法讲解 KNN算法是一个最基本.最简单的有监督算法,基本思路就是给定一个样本,先通过距离计算,得到这个样本最近的topK个样本,然后根据这topK个样本的标签,投票决定给定样本的标签: 训练过程 ...
【十大经典数据挖掘算法】kNN
[十大经典数据挖掘算法]系列 C4.5 K-Means SVM Apriori EM PageRank AdaBoost kNN Naïve Bayes CART 1. 引言顶级数据挖掘会议ICDM ...
K近邻模型（k-NN）
原理 K最近邻(k-Nearest Neighbor,KNN)分类算法,是一个理论上比较成熟的方法,也是最简单的机器学习算法之一.该方法的思路是:如果一个样本在特征空间中的k个最相似(即特征空间中最邻 ...

随机推荐

dom4j.jar 的调试方法
1.将jar包的路径写在 classpath下在cmd窗口中,查看 classpath的内容是否已经加上该路径,win7 下cmd窗口一定要是管理员身份执行 2.在D盘新建一个DOM4JWriter ...
【转载】HTTP 请求头与请求体
原文地址: https://segmentfault.com/a/1190000006689767 HTTP Request HTTP 的请求报文分为三个部分请求行.请求头和请求体,格式如图:一个典 ...
【技术累积】【点】【java】【30】代理模式
基础代理模式是Java常见的设计模式之一.所谓代理模式是指客户端并不直接调用实际的对象,而是通过调用代理,来间接的调用实际的对象. 什么是代理参考现实生活中的代理比如某个品牌的某个省的代理商,作 ...
【技术累积】【点】【java】【26】@Value默认值
@Value 该注解可以把配置文件中的值赋给属性 @Value("${shit.config}") private String shit; 要在xml文件中设置扫描包+place ...
【VHDL】组合逻辑电路和时序逻辑电路的区别
简单的说,组合电路,没有时钟:时序电路,有时钟. ↓ 也就是说,组合逻辑电路没有记忆功能,而时序电路具有记忆功能. ↓ 在VHDL语言中,不完整条件语句对他们二者的影响分别是什么?组合逻辑中可能生成锁 ...
Centos7安装gitlab服务器
1.先按照官方教程 https://about.gitlab.com/downloads/#centos7 大概内容如下: 1. Install and configure the necessary ...
[C++] muParser 的简单使用方法
关于 muParser 库许多应用程序需要解析数学表达式.该库的主要目的是提供一种快速简便的方法. muParser是一个用C ++编写的可扩展的高性能数学表达式解析器库. 它的工作原理是将数学表达 ...
Marshal.ReleaseComObject() vs. Marshal.FinalReleaseComObject()
很简单,不翻译了. If you are using COM components on your .NET code, you might be already aware of the Marsh ...
Serverless（baas & faas）无服务器计算
自从2014年AWS推出Lambda服务后,Serverless一词越来越热,已经成为一种新型的软件设计架构,即Serverless Architecture.作为一种原生于公共云的架构,Server ...
noip模拟赛 radius
分析:这道题实在是不好想,一个可以骗分的想法是假定要求的那个点在中心点上,可以骗得不少分.但是在边上的点要怎么确定呢?理论复杂度O(﹢无穷).答案一定是和端点有关的,涉及到最大值最小,考虑二分最大值, ...

KNN in c++

KNN in c++的更多相关文章

随机推荐

热门专题