Mahout Source Code Analysis -- QR Matrix Decomposition
1. Algorithm Principles
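For the theory, see "Computing All Eigenvalues of a Matrix with the QR Method" (《QR方法求矩阵全部特征值》), which I wrote in college. It covers the principles, a worked example, and a C implementation: http://www.docin.com/p-114587383.html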
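As a short reminder of what the class in the next section computes, the relations can be written as below. This is my own summary rather than part of the referenced paper. It assumes A is m x n with m >= n and full column rank, and it uses the column-by-column (Gram-Schmidt style) view, where \tilde a_j denotes the current working copy of column j of A and q_i is column i of Q.

```latex
\begin{aligned}
A &= QR, \qquad Q^{\mathsf T} Q = I_n, \qquad R \in \mathbb{R}^{n \times n} \text{ upper triangular} \\
r_{ii} &= \lVert \tilde a_i \rVert_2, \qquad q_i = \tilde a_i / r_{ii} \\
r_{ij} &= q_i^{\mathsf T} \tilde a_j, \qquad \tilde a_j \leftarrow \tilde a_j - r_{ij}\, q_i \qquad (j > i) \\
\min_x \lVert A x - b \rVert_2 \;&\Longleftrightarrow\; R x = Q^{\mathsf T} b
\end{aligned}
```

The last line is why the decomposition is useful for least squares: once Q and R are known, the solution follows from a single back substitution on the triangular system.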
2. Source Code Analysis
The article "Steps for QR Decomposition with MapReduce" (《使用MapReduce进行QR分解的步骤》) is also worth reading.
/**
 For an <tt>m x n</tt> matrix <tt>A</tt> with <tt>m >= n</tt>, the QR decomposition is an <tt>m x n</tt>
 orthogonal matrix <tt>Q</tt> and an <tt>n x n</tt> upper triangular matrix <tt>R</tt> so that
 <tt>A = Q*R</tt>.
 <P>
 The QR decomposition always exists, even if the matrix does not have
 full rank, so the constructor will never fail. The primary use of the
 QR decomposition is in the least squares solution of non-square systems
 of simultaneous linear equations. This will fail if <tt>isFullRank()</tt>
 returns <tt>false</tt>.
 */
public class QRDecomposition implements QR {
  private final Matrix q;
  private final Matrix r;
  private final boolean fullRank;
  private final int rows;
  private final int columns;

  /**
   * Constructs and returns a new QR decomposition object; computed by Householder reflections; The
   * decomposed matrices can be retrieved via instance methods of the returned decomposition
   * object.
   *
   * @param a A rectangular matrix.
   * @throws IllegalArgumentException if <tt>A.rows() < A.columns()</tt>.
   */
  public QRDecomposition(Matrix a) {
    rows = a.rowSize(); // m
    int min = Math.min(a.rowSize(), a.columnSize());
    columns = a.columnSize(); // n

    Matrix qTmp = a.clone();

    boolean fullRank = true;

    r = new DenseMatrix(min, columns);

    for (int i = 0; i < min; i++) {
      // normalize column i; its norm becomes R[i,i]
      Vector qi = qTmp.viewColumn(i);
      double alpha = qi.norm(2);
      if (Math.abs(alpha) > Double.MIN_VALUE) {
        qi.assign(Functions.div(alpha));
      } else {
        if (Double.isInfinite(alpha) || Double.isNaN(alpha)) {
          throw new ArithmeticException("Invalid intermediate result");
        }
        fullRank = false;
      }
      r.set(i, i, alpha);

      // remove the component along qi from every later column (Gram-Schmidt style step)
      for (int j = i + 1; j < columns; j++) {
        Vector qj = qTmp.viewColumn(j);
        double norm = qj.norm(2);
        if (Math.abs(norm) > Double.MIN_VALUE) {
          double beta = qi.dot(qj);
          r.set(i, j, beta);
          if (j < min) {
            qj.assign(qi, Functions.plusMult(-beta));
          }
        } else {
          if (Double.isInfinite(norm) || Double.isNaN(norm)) {
            throw new ArithmeticException("Invalid intermediate result");
          }
        }
      }
    }
    // keep only the first min columns of Q (economy-sized factor)
    if (columns > min) {
      q = qTmp.viewPart(0, rows, 0, min).clone();
    } else {
      q = qTmp;
    }
    this.fullRank = fullRank;
  }

  /**
   * Generates and returns the (economy-sized) orthogonal factor <tt>Q</tt>.
   *
   * @return <tt>Q</tt>
   */
  @Override
  public Matrix getQ() {
    return q;
  }

  /**
   * Returns the upper triangular factor, <tt>R</tt>.
   *
   * @return <tt>R</tt>
   */
  @Override
  public Matrix getR() {
    return r;
  }

  /**
   * Returns whether the matrix <tt>A</tt> has full rank.
   *
   * @return true if <tt>R</tt>, and hence <tt>A</tt>, has full rank.
   */
  @Override
  public boolean hasFullRank() {
    return fullRank;
  }

  /**
   * Least squares solution of <tt>A*X = B</tt>; <tt>returns X</tt>.
   *
   * @param B A matrix with as many rows as <tt>A</tt> and any number of columns.
   * @return <tt>X</tt> that minimizes the two norm of <tt>Q*R*X - B</tt>.
   * @throws IllegalArgumentException if <tt>B.rows() != A.rows()</tt>.
   */
  @Override
  public Matrix solve(Matrix B) {
    if (B.numRows() != rows) {
      throw new IllegalArgumentException("Matrix row dimensions must agree.");
    }

    int cols = B.numCols();
    Matrix x = B.like(columns, cols);

    // this can all be done a bit more efficiently if we don't actually
    // form explicit versions of Q^T and R but this code isn't so bad
    // and it is much easier to understand
    Matrix qt = getQ().transpose();
    Matrix y = qt.times(B);

    Matrix r = getR();
    for (int k = Math.min(columns, rows) - 1; k >= 0; k--) {
      // X[k,] = Y[k,] / R[k,k], note that X[k,] starts with 0 so += is same as =
      x.viewRow(k).assign(y.viewRow(k), Functions.plusMult(1 / r.get(k, k)));

      // Y[0:(k-1),] -= R[0:(k-1),k] * X[k,]
      Vector rColumn = r.viewColumn(k).viewPart(0, k);
      for (int c = 0; c < cols; c++) {
        y.viewColumn(c).viewPart(0, k).assign(rColumn, Functions.plusMult(-x.get(k, c)));
      }
    }
    return x;
  }

  /**
   * Returns a rough string rendition of a QR.
   */
  @Override
  public String toString() {
    return String.format(Locale.ENGLISH, "QR(%d x %d,fullRank=%s)", rows, columns, hasFullRank());
  }
}
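To make the two key steps of the class easier to follow (the column-by-column orthogonalization in the constructor and the back substitution in solve), here is a minimal standalone sketch on plain double[][] arrays. It is not Mahout code: the class name QrSketch and its members are hypothetical, it assumes m >= n and full column rank, it handles a single right-hand-side vector, and it skips the NaN/Infinity checks of the original.

```java
import java.util.Arrays;

/** Hypothetical illustration only; assumes rows >= columns and full column rank. */
public class QrSketch {
  public final double[][] q; // m x n, columns are orthonormal
  public final double[][] r; // n x n, upper triangular

  public QrSketch(double[][] a) {
    int m = a.length;
    int n = a[0].length;
    q = new double[m][];
    for (int k = 0; k < m; k++) {
      q[k] = a[k].clone(); // work on a copy, like qTmp = a.clone()
    }
    r = new double[n][n];
    for (int i = 0; i < n; i++) {
      // alpha = 2-norm of column i; it becomes R[i][i]
      double alpha = 0;
      for (int k = 0; k < m; k++) {
        alpha += q[k][i] * q[k][i];
      }
      alpha = Math.sqrt(alpha);
      r[i][i] = alpha;
      for (int k = 0; k < m; k++) {
        q[k][i] /= alpha; // full rank assumed, so alpha > 0
      }
      // subtract the projection onto column i from every later column
      for (int j = i + 1; j < n; j++) {
        double beta = 0;
        for (int k = 0; k < m; k++) {
          beta += q[k][i] * q[k][j];
        }
        r[i][j] = beta;
        for (int k = 0; k < m; k++) {
          q[k][j] -= beta * q[k][i];
        }
      }
    }
  }

  /** Least squares solution of A*x = b: solve R*x = Q^T*b by back substitution. */
  public double[] solve(double[] b) {
    int m = q.length;
    int n = r.length;
    double[] y = new double[n]; // y = Q^T * b
    for (int i = 0; i < n; i++) {
      for (int k = 0; k < m; k++) {
        y[i] += q[k][i] * b[k];
      }
    }
    double[] x = new double[n];
    for (int k = n - 1; k >= 0; k--) {
      double s = y[k];
      for (int j = k + 1; j < n; j++) {
        s -= r[k][j] * x[j];
      }
      x[k] = s / r[k][k];
    }
    return x;
  }

  public static void main(String[] args) {
    // Fit y = c0 + c1 * t through (0,1), (1,3), (2,5); the exact fit is c0 = 1, c1 = 2.
    double[][] a = {{1, 0}, {1, 1}, {1, 2}};
    double[] b = {1, 3, 5};
    System.out.println(Arrays.toString(new QrSketch(a).solve(b)));
  }
}
```

Running main should print values close to 1 and 2, up to floating-point rounding. The Mahout version walks the same triangular system but updates whole rows of Y at once, so it can handle several right-hand sides in one pass.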