[OpenCV] Samples 06: logistic regression
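Logistic regression can only solve simple linear binary classification problems, and it does not stand out among the many machine learning classifiers. It can, however, be extended to multi-class classification under another name, softmax, which is a well-known classifier in deep learning.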
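As a rough illustration of how softmax relates to the binary case, here is a minimal sketch in plain C++. It is not OpenCV's implementation, and the function names `softmax` and `sigmoid` are chosen here only for illustration. With two classes and scores (z, 0), the first softmax probability equals the sigmoid of z.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over raw class scores: p_k = exp(z_k) / sum_j exp(z_j).
// The largest score is subtracted first for numerical stability;
// this shift does not change the resulting probabilities.
std::vector<double> softmax(const std::vector<double>& scores)
{
    assert(!scores.empty());
    const double maxScore = *std::max_element(scores.begin(), scores.end());
    std::vector<double> probs(scores.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < scores.size(); ++k)
    {
        probs[k] = std::exp(scores[k] - maxScore);
        sum += probs[k];
    }
    for (double& p : probs)
        p /= sum;
    return probs;
}

// Logistic (sigmoid) function used by binary logistic regression.
double sigmoid(double z)
{
    return 1.0 / (1.0 + std::exp(-z));
}
```

Calling `softmax({2.0, 0.0})` gives a first entry equal to `sigmoid(2.0)`, which is the sense in which softmax generalizes the two-class model.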
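Reference: denny的学习专栏 (a blog by a like-minded author)
- How images are stored in an XML file and read back.
- How to display multiple images held in a single Mat.
- Mat::t(), which transposes a matrix; the sample uses it to print a column of labels as a row.
This post uses logistic regression to classify handwritten digits.
OpenCV 3.0 ships an XML file containing 40 samples: 20 handwritten 0s and 20 handwritten 1s. Each digit was originally a 28*28 image; in the XML file it is stored as a 1*784 vector inside <data>.
File location: \opencv\sources\samples\data\data01.xml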
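The 28*28 to 1*784 layout can be sketched in plain C++ as follows. This is only an illustration under the assumption of row-major order; the actual element order used when data01.xml was produced may differ, and the names `Image`, `flatten` and `unflatten` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kSide = 28;              // image width and height
constexpr std::size_t kPixels = kSide * kSide; // 784 values per sample

using Image = std::array<std::array<float, kSide>, kSide>;

// Flatten a 28x28 image into a 1x784 row vector (row-major order assumed).
std::vector<float> flatten(const Image& img)
{
    std::vector<float> row(kPixels);
    for (std::size_t r = 0; r < kSide; ++r)
        for (std::size_t c = 0; c < kSide; ++c)
            row[r * kSide + c] = img[r][c];
    return row;
}

// Rebuild the 28x28 image from a 1x784 row vector.
Image unflatten(const std::vector<float>& row)
{
    assert(row.size() == kPixels);
    Image img{};
    for (std::size_t i = 0; i < kPixels; ++i)
        img[i / kSide][i % kSide] = row[i];
    return img;
}
```

Under this assumption the pixel at row 3, column 5 lands at index 3 * 28 + 5 of the flattened vector, and `unflatten(flatten(img))` returns the original image.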
/*//////////////////////////////////////////////////////////////////////////////////////
// IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.

// By downloading, copying, installing or using the software you agree to this license.
// If you do not agree to this license, do not download, install,
// copy or use the software.

// This is a implementation of the Logistic Regression algorithm in C++ in OpenCV.

// AUTHOR:
// Rahul Kavi rahulkavi[at]live[at]com
//

// contains a subset of data from the popular Iris Dataset (taken from
// "http://archive.ics.uci.edu/ml/datasets/Iris")

// # You are free to use, change, or redistribute the code in any way you wish for
// # non-commercial purposes, but please maintain the name of the original author.
// # This code comes with no warranty of any kind.

// #
// # You are free to use, change, or redistribute the code in any way you wish for
// # non-commercial purposes, but please maintain the name of the original author.
// # This code comes with no warranty of any kind.

// # Logistic Regression ALGORITHM

// License Agreement
// For Open Source Computer Vision Library

// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2008-2011, Willow Garage Inc., all rights reserved.
// Third party copyrights are property of their respective owners.

// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:

// * Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.

// * Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.

// * The name of the copyright holders may not be used to endorse or promote products
// derived from this software without specific prior written permission.

// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.*/

#include <iostream>

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>
#include <opencv2/highgui.hpp>

using namespace std;
using namespace cv;
using namespace cv::ml;

/*
 * Jeff --> Show multiple images from a Mat.
 */
static void showImage(const Mat &data, int columns, const String &name)
{
    // columns = 28 (side length of each square image)
    Mat bigImage;
    for(int i = 0; i < data.rows; ++i)
    {
        // data.rows is the number of images.
        // Each 1 x 784 row vector is reshaped into a matrix with 28 rows,
        // and push_back stacks these matrices vertically.
        bigImage.push_back(data.row(i).reshape(0, columns));
    }
    // Transposing the stacked matrix lays the images out from left to right.
    imshow(name, bigImage.t());
}

static float calculateAccuracyPercent(const Mat &original, const Mat &predicted)
{
    // Percentage of rows where the predicted label equals the original label.
    return 100 * (float)countNonZero(original == predicted) / predicted.rows;
}

int main()
{
    const String filename = "../data/data01.xml";
    cout << "**********************************************************************" << endl;
    cout << filename
         << " contains digits 0 and 1 of 20 samples each, collected on an Android device" << endl;
    cout << "Each of the collected images are of size 28 x 28 re-arranged to 1 x 784 matrix"
         << endl;
    cout << "**********************************************************************" << endl;

    Mat data, labels;
    {
        /*
         * Jeff --> Load the XML file and read its contents into Mat objects
         * through FileStorage.
         */
        cout << "loading the dataset...";
        // Step 1: open the XML file for reading.
        FileStorage f;
        if(f.open(filename, FileStorage::READ))
        {
            // Step 2: read the sample matrix and the label matrix by node name.
            f["datamat"] >> data;
            f["labelsmat"] >> labels;
            f.release();
        }
        else
        {
            cerr << "file can not be opened: " << filename << endl;
            return 1;
        }
        // Step 3: convert both matrices to 32-bit float.
        data.convertTo(data, CV_32F);
        labels.convertTo(labels, CV_32F);
        cout << "read " << data.rows << " rows of data" << endl;
    }

    Mat data_train, data_test;
    Mat labels_train, labels_test;
    for(int i = 0; i < data.rows; i++)
    {
        // Step 4: even-indexed rows go to the training set, odd-indexed rows to the test set.
        if(i % 2 == 0)
        {
            data_train.push_back(data.row(i));
            labels_train.push_back(labels.row(i));
        }
        else
        {
            data_test.push_back(data.row(i));
            labels_test.push_back(labels.row(i));
        }
    }
    cout << "training/testing samples count: " << data_train.rows << "/" << data_test.rows << endl;

    // display sample image
    showImage(data_train, 28, "train data");
    showImage(data_test, 28, "test data");

    /**************************************************************************/
    // simple case with batch gradient
    cout << "training...";
    // Step (1), create classifier.
    Ptr<LogisticRegression> lr1 = LogisticRegression::create();
    // Step (2), set the training parameters.
    lr1->setLearningRate(0.001);
    lr1->setIterations(10);
    lr1->setRegularization(LogisticRegression::REG_L2);
    lr1->setTrainMethod(LogisticRegression::BATCH);
    lr1->setMiniBatchSize(1);
    // Step (3), train.
    lr1->train(data_train, ROW_SAMPLE, labels_train);
    cout << "done!" << endl;

    //--------------------------------------------------------------------------
    cout << "predicting...";
    // Step (4), predict.
    Mat responses;
    lr1->predict(data_test, responses);
    cout << "done!" << endl;

    // Step (5), show prediction report
    cout << "original vs predicted:" << endl;
    // Jeff --> CV_32S stores each element as a signed 32-bit integer.
    // The labels are converted so they can be compared element-wise with the predictions.
    labels_test.convertTo(labels_test, CV_32S);
    cout << labels_test.t() << endl;
    cout << responses.t() << endl;
    cout << "accuracy: " << calculateAccuracyPercent(labels_test, responses) << "%" << endl;

    // Step (6), save the classifier
    const String saveFilename = "NewLR_Trained.xml";
    cout << "saving the classifier to " << saveFilename << endl;
    lr1->save(saveFilename);
    /****************************** End ***************************************/

    // load the classifier onto new object
    cout << "loading a new classifier from " << saveFilename << endl;
    Ptr<LogisticRegression> lr2 = StatModel::load<LogisticRegression>(saveFilename);

    // predict using loaded classifier
    cout << "predicting the dataset using the loaded classifier...";
    Mat responses2;
    lr2->predict(data_test, responses2);
    cout << "done!" << endl;

    // calculate accuracy
    cout << labels_test.t() << endl;
    cout << responses2.t() << endl;
    cout << "accuracy: " << calculateAccuracyPercent(labels_test, responses2) << "%" << endl;

    waitKey(0);
    return 0;
}
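The alternating train/test split and the accuracy calculation in the sample can be mirrored without OpenCV. The sketch below uses std::vector<int> in place of Mat purely for illustration; `splitEvenOdd` and `accuracyPercent` are hypothetical names, not part of the sample.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Alternating split: even-indexed samples go to the training set,
// odd-indexed samples go to the test set.
std::pair<std::vector<int>, std::vector<int>> splitEvenOdd(const std::vector<int>& samples)
{
    std::vector<int> train, test;
    for (std::size_t i = 0; i < samples.size(); ++i)
        (i % 2 == 0 ? train : test).push_back(samples[i]);
    return {train, test};
}

// Percentage of positions where the predicted label equals the true label.
float accuracyPercent(const std::vector<int>& truth, const std::vector<int>& predicted)
{
    assert(truth.size() == predicted.size() && !truth.empty());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < truth.size(); ++i)
        if (truth[i] == predicted[i])
            ++correct;
    return 100.0f * static_cast<float>(correct) / static_cast<float>(truth.size());
}
```

With 40 samples this split yields 20 training and 20 test samples. Because consecutive rows alternate between the two sets, both sets contain both digits as long as each digit occupies a contiguous run of rows in the file, which is an assumption about how data01.xml is ordered.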
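About logistic regression: http://blog.csdn.net/pakko/article/details/37878837
What is logistic regression?
Logistic regression and multiple linear regression have a great deal in common. The biggest difference lies in the dependent variable; the rest is largely the same. For this reason the two belong to the same family, the generalized linear model.
Models in this family share essentially the same form and differ in the distribution of the dependent variable:
- If it is continuous, the model is multiple linear regression;
- If it follows a binomial distribution, the model is logistic regression;
- If it follows a Poisson distribution, the model is Poisson regression;
- If it follows a negative binomial distribution, the model is negative binomial regression.
The dependent variable of a logistic regression can be binary or multi-class, but the binary case is more common and easier to interpret, so binary logistic regression is what is used most often in practice.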
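For the binary case, the link between the linear part and the probability can be written out explicitly. The notation below is an assumption for illustration: x is the vector of independent variables, θ the vector of regression parameters, and h_θ the hypothesis function.

```latex
P(y = 1 \mid x) = h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}},
\qquad
\log \frac{P(y = 1 \mid x)}{1 - P(y = 1 \mid x)} = \theta^{T} x
```

The second form shows why the model belongs to the generalized linear family: the log-odds of the outcome, rather than the outcome itself, is a linear function of the independent variables.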
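Main uses of logistic regression:
- Finding risk factors: for example, identifying the risk factors of a disease;
- Prediction: using the model to predict how likely a disease or condition is for given values of the independent variables;
- Discrimination: similar to prediction, this uses the model to judge how likely it is that a particular person has a disease or belongs to a given condition.
Logistic regression is used mostly in epidemiology. Typical uses are exploring the risk factors of a disease and predicting the probability of the disease from those factors. For example, to study the risk factors for stomach cancer, one can select two groups of people, one with stomach cancer and one without; the two groups will differ in physical characteristics, lifestyle and so on. The dependent variable is whether the person has stomach cancer, that is, "yes" or "no". The independent variables can cover many things, such as age, sex, dietary habits and Helicobacter pylori infection. Independent variables may be continuous or categorical.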
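To make the prediction and risk-factor uses concrete, here is a minimal sketch in plain C++. The function names are hypothetical, and any coefficients passed in would have to come from a fitted model; none are taken from real data. The odds ratio exp(coefficient) is the standard way to read a logistic regression coefficient as the effect of a one-unit increase in a risk factor, although the text above does not name it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Probability of the outcome for one subject:
// sigmoid(intercept + sum_i coeffs[i] * x[i]).
double predictProbability(double intercept,
                          const std::vector<double>& coeffs,
                          const std::vector<double>& x)
{
    assert(coeffs.size() == x.size());
    double z = intercept;
    for (std::size_t i = 0; i < coeffs.size(); ++i)
        z += coeffs[i] * x[i];
    return 1.0 / (1.0 + std::exp(-z));
}

// Odds ratio for a one-unit increase of a variable whose coefficient is `coeff`.
double oddsRatio(double coeff)
{
    return std::exp(coeff);
}
```

A coefficient of 0 gives an odds ratio of 1, meaning the variable does not change the odds; a positive coefficient gives an odds ratio above 1, which is how a candidate risk factor would show up.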
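General steps
The usual steps for a regression problem are:
- Find the h function (the hypothesis); for logistic regression this is the sigmoid function;
- Construct the J function (the loss function);
- Minimize J and obtain the regression parameters (θ).
See the reference blog for details.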
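The three steps can be sketched in plain C++ as follows. This is a minimal illustration, not OpenCV's implementation: it assumes the cross-entropy loss for J and plain batch gradient descent for the minimization, it omits the L2 regularization that the sample enables, and the names `hypothesis`, `cost` and `train` are hypothetical. Each sample is expected to carry a leading 1.0 so that the first parameter acts as the intercept.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Sample = std::vector<double>;

// Step 1: hypothesis h_theta(x) = sigmoid(theta . x).
double hypothesis(const Sample& theta, const Sample& x)
{
    assert(theta.size() == x.size());
    double z = 0.0;
    for (std::size_t j = 0; j < theta.size(); ++j)
        z += theta[j] * x[j];
    return 1.0 / (1.0 + std::exp(-z));
}

// Step 2: cost J(theta) = -(1/m) * sum_i [ y_i log h_i + (1 - y_i) log(1 - h_i) ].
double cost(const std::vector<Sample>& X, const std::vector<int>& y, const Sample& theta)
{
    assert(X.size() == y.size() && !X.empty());
    double total = 0.0;
    for (std::size_t i = 0; i < X.size(); ++i)
    {
        const double h = hypothesis(theta, X[i]);
        total += (y[i] == 1) ? std::log(h) : std::log(1.0 - h);
    }
    return -total / static_cast<double>(X.size());
}

// Step 3: batch gradient descent,
// theta_j -= alpha * (1/m) * sum_i (h(x_i) - y_i) * x_ij.
Sample train(const std::vector<Sample>& X, const std::vector<int>& y,
             double alpha, int iterations)
{
    assert(X.size() == y.size() && !X.empty());
    Sample theta(X[0].size(), 0.0);
    for (int it = 0; it < iterations; ++it)
    {
        Sample grad(theta.size(), 0.0);
        for (std::size_t i = 0; i < X.size(); ++i)
        {
            const double err = hypothesis(theta, X[i]) - y[i];
            for (std::size_t j = 0; j < theta.size(); ++j)
                grad[j] += err * X[i][j];
        }
        for (std::size_t j = 0; j < theta.size(); ++j)
            theta[j] -= alpha * grad[j] / static_cast<double>(X.size());
    }
    return theta;
}
```

The `alpha` and `iterations` arguments play the same role as the learning rate and iteration count set on the OpenCV classifier in the sample. With all parameters at zero the hypothesis is 0.5 for every sample, so the cost starts at log 2 and should decrease as training proceeds.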