Sampling Matrix
这些天看了一些关于采样矩阵(大概是这么翻译的)的论文,简单做个总结。
- FAST MONTE CARLO ALGORITHMS FOR MATRICES I: APPROXIMATING MATRIX MULTIPLICATION
算法如下:
目的是为了毕竟矩阵的乘积AB, 以CR来替代。
其中右上角带有i_t的A表示A的第i_t列,右下角带有i_t的B表示B的第i_t行。
关于 c 的选择,以及误差的估计,请回看论文。
下面是一个小小的测试:
代码:
import numpy as np
def Generate_P(A, B): #生成概率P
try:
n1 = len(A[1,:])
n2 = len(B[:,1])
if n1 == n2:
n = n1
else:
print('Bad matrices')
return 0
except:
print('The matrices are not fit...')
A_New = np.square(A)
B_New = np.square(B)
P_A = np.array([np.sqrt(np.sum(A_New[:,i])) for i in range(n)])
P_B = np.array([np.sqrt(np.sum(B_New[i,:])) for i in range(n)])
P = P_A * P_B / (np.sum(P_A * P_B))
return P
def Generate_S(n, c, P): #生成采样矩阵S 简化了一下算法
S = np.zeros((n, c))
T = np.random.choice(np.array([i for i in range(n)]), size = c, replace = True, p = P)
for i in range(c):
S[T[i], i] = 1 / np.sqrt(c * P[T[i]])
return S
def Summary(times, n, c, P, A_F, B_F, AB): #总结和分析
print('{0:^15} {1:^15} {2:^15} {3:^15} {4:^15} {5:^15} {6:^15}'.format('A_F', 'B_F', 'NEW_F', 'A_F * B_F', 'AB_F', 'RATIO', 'RATIO2'))
print('{0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15}'.format(''))
A_F_B_F = A_F * B_F
AB_F = np.sqrt(np.sum(np.square(AB)))
Max = -1
Min = 99999999999
Max2 = -1
Min2 = 99999999999
Max_NEW_F = 0
Min_NEW_F = 0
Mean_NEW_F = 0
Mean_ratio = 0
Mean_ratio2 = 0
for i in range(times):
S = Generate_S(n, c, P)
CR = np.dot(A.dot(S), (S.T).dot(B))
NEW = AB - CR
NEW_F = np.sqrt(np.sum(np.square(NEW)))
ratio = NEW_F / A_F_B_F
ratio2 = NEW_F / AB_F
Mean_NEW_F += NEW_F
Mean_ratio += ratio
Mean_ratio2 += ratio2
if ratio > Max:
Max = ratio
Max2 = ratio2
Max_NEW_F = NEW_F
if ratio < Min:
Min = ratio
Min2 = ratio2
Min_NEW_F = NEW_F
print('{0:^15.5f} {1:^15.5f} {2:^15.5f} {3:^15.5f} {4:^15.5f} {5:^15.3%} {6:^15.3%}'.format(A_F, B_F, NEW_F, A_F_B_F, AB_F, ratio, ratio2))
Mean_NEW_F = Mean_NEW_F / times
Mean_ratio = Mean_ratio / times
Mean_ratio2 = Mean_ratio2 / times
print('{0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15}'.format(''))
print('{0:^15.5f} {1:^15.5f} {2:^15.5f} {3:^15.5f} {4:^15.5f} {5:^15.3%} {6:^15.3%}'.format(A_F, B_F, Mean_NEW_F, A_F_B_F, AB_F, Mean_ratio, Mean_ratio2))
print('{0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15}'.format(''))
print('Count: {0} times'.format(times))
print('Max_ratio: {0:<15.3%} Min_ratio: {1:<15.3%}'.format(Max, Min))
print('Max_ratio2: {0:<15.3%} Min_ratio2: {1:<15.3%}'.format(Max2, Min2))
print('Max_NEW_F: {0:<15.5f} Min_NEW_F: {1:<15.5f}'.format(Max_NEW_F, Min_NEW_F))
#下面是关于矩阵行列的一些参数,我是采用均匀分布产生的矩阵
m = 47
n = 120
p = 55
A = np.array([[np.random.rand() * 100 for j in range(n)] for i in range(m)])
B = np.array([[np.random.rand() * 100 for j in range(p)] for i in range(n)])
#构建c的一些参数 这个得参考论文
Thelta = 1/4
Belta = 1
Yita = 1 + np.sqrt((8/Belta * np.log(1/Thelta)))
e = 1/5
c = int(1 / (Belta * e ** 2)) + 1
P = Generate_P(A, B)
#结果分析
AB = A.dot(B)
A_F = np.sqrt(np.sum(np.square(A)))
B_F = np.sqrt(np.sum(np.square(B)))
times = 1000
Summary(times, n, c, P, A_F, B_F, AB)
粗略的结果:

用了原矩阵的一半的维度,代价是约17%的误差。
用正态分布生成矩阵的时候,发现,如果是标准正态分布,效果很差,我猜是由计算机舍入误差引起的,这样的采样的性能不好。当均值增加的时候,和”均匀分布“差不多,甚至更优(F范数的意义上)。
补充:
Sampling Matrix的更多相关文章
- 【NLP】Conditional Language Modeling with Attention
Review: Conditional LMs Note that, in the Encoder part, we reverse the input to the ‘RNN’ and it per ...
- Sampling Distributions and Central Limit Theorem in R(转)
The Central Limit Theorem (CLT), and the concept of the sampling distribution, are critical for unde ...
- [LeetCode] Random Flip Matrix 随机翻转矩阵
You are given the number of rows n_rows and number of columns n_cols of a 2D binary matrix where all ...
- 【RS】Sparse Probabilistic Matrix Factorization by Laplace Distribution for Collaborative Filtering - 基于拉普拉斯分布的稀疏概率矩阵分解协同过滤
[论文标题]Sparse Probabilistic Matrix Factorization by Laplace Distribution for Collaborative Filtering ...
- 470. Implement Rand10() Using Rand7() (拒绝采样Reject Sampling)
1. 问题 已提供一个Rand7()的API可以随机生成1到7的数字,使用Rand7实现Rand10,Rand10可以随机生成1到10的数字. 2. 思路 简单说: (1)通过(Rand N - 1) ...
- [Python] 01 - Number and Matrix
故事背景 一.大纲 如下,chapter4 是个概览,之后才是具体讲解. 二. 编译过程 Ref: http://www.dsf.unica.it/~fiore/LearningPython.pdf
- 目录:Matrix Differential Calculus with Applications in Statistics and Econometrics,3rd_[Magnus2019]
目录:Matrix Differential Calculus with Applications in Statistics and Econometrics,3rd_[Magnus2019] Ti ...
- 【论文笔记】SamWalker: Social Recommendation with Informative Sampling Strategy
SamWalker: Social Recommendation with Informative Sampling Strategy Authors: Jiawei Chen, Can Wang, ...
- angular2系列教程(十一)路由嵌套、路由生命周期、matrix URL notation
今天我们要讲的是ng2的路由的第二部分,包括路由嵌套.路由生命周期等知识点. 例子 例子仍然是上节课的例子:
随机推荐
- Java中 try--catch-- finally、throw、throws 的用法
一.try {..} catch {..}finally {..}用法 try { 执行的代码,其中可能有异常.一旦发现异常,则立即跳到catch执行.否则不会执行catch里面的内容 } catch ...
- c/c++ 数组和指针
c/c++ 数组和指针 知识点 1,数组就是指针,对应代码里的test1 2,用auto声明,得到的是指针,对应代码里的test2 3,用decltype声明,得到的不是指针 ,对应代码里的test3 ...
- c/c++ const this指针
const this指针 方法列表后面的const是什么含义呢?答案:不可以在方法里修改成员变量 class Test{ public: void fun()const{ //data = 10;// ...
- LeetCode算法题-Valid Perfect Square(Java实现-四种解法)
这是悦乐书的第209次更新,第221篇原创 01 看题和准备 今天介绍的是LeetCode算法题中Easy级别的第77题(顺位题号是367).给定正整数num,写一个函数,如果num是一个完美的正方形 ...
- June 10. 2018, Week 24th, Sunday
There is no friend as loyal as a book. 好书如挚友,情谊永不渝. From Ernest Miller Hemingway. Books are my frien ...
- IntelliJ IDEA 创建maven管理的webapp项目
因为使用框架时基本需要使用maven管理项目,所以单独写一个搭建maven项目的流程 第一步: File-->New--Project 第二步: 选择maven框架 第三步: 输入工程id ...
- Linux 基本操作 (day2)
一.用户的基本操作 1.添加和删除用户(管理员): useradd 用户名: useradd taibai passwd 用户名: passwd taibai [root@localhost ~] ...
- https 建立连接过程
http://blog.csdn.net/wangjun5159/article/details/51510594 思考问题的顺序 学技术时,总是会问什么?这里也不例外,https为什么会存在,它有什 ...
- dp 动态规划 蘑菇
蘑菇真的贵,友情价更高 Description 由于提莫为巡逻准备的蘑菇太多了,多余的蘑菇路上种不下,于是他精心挑选了一些蘑菇拜访他的好朋友小炮 提莫的蘑菇一共有n个,对于编号为i的蘑菇魔力值是a ...
- Linux三剑客-SED
1.Sed是什么 Sed:字符流编辑器,Stream Editor 2.Sed功能与版本 处理日志文件,日志,配置文件等 增加.删除.修改.查询 sed --version 可以通过man sed 来 ...