Sampling Matrix
Over the past few days I read several papers on sampling matrices; this is a brief summary.
- FAST MONTE CARLO ALGORITHMS FOR MATRICES I: APPROXIMATING MATRIX MULTIPLICATION
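The algorithm is as follows:
The goal is to approximate the matrix product AB, using CR in its place.
Here A with superscript i_t denotes the i_t-th column of A, and B with subscript i_t denotes the i_t-th row of B.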
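For reference, the sampling step can be written out as below. This is a sketch reconstructed to match the notation above and the test code in this post, assuming $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{n \times p}$, $c$ samples and sampling probabilities $p_1, \dots, p_n$; the paper remains the authoritative statement.

```latex
\begin{align*}
&\text{for } t = 1, \dots, c: \\
&\quad \text{draw } i_t \in \{1, \dots, n\} \text{ with } \Pr[i_t = k] = p_k,
  \text{ independently and with replacement;} \\
&\quad C^{(t)} = \frac{A^{(i_t)}}{\sqrt{c\, p_{i_t}}}, \qquad
  R_{(t)} = \frac{B_{(i_t)}}{\sqrt{c\, p_{i_t}}}; \\
&AB \approx CR = \sum_{t=1}^{c} \frac{1}{c\, p_{i_t}}\, A^{(i_t)} B_{(i_t)}.
\end{align*}
```

Each term of the sum has expectation $\sum_k p_k \cdot \frac{1}{c\, p_k} A^{(k)} B_{(k)} = \frac{1}{c} AB$, so $CR$ is an unbiased estimate of $AB$.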
For the choice of c and the error estimates, please refer to the paper.
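As a reminder of what those estimates look like, here is the main bound as I understand it from the paper. Treat this as a sketch and check the constants against the original. It assumes nearly optimal probabilities with a parameter $0 < \beta \le 1$ and a failure probability $\delta$.

```latex
\begin{align*}
p_k &\ge \beta\, \frac{|A^{(k)}|\,|B_{(k)}|}{\sum_{k'=1}^{n} |A^{(k')}|\,|B_{(k')}|}, \\
\mathbb{E}\left[\|AB - CR\|_F^2\right] &\le \frac{1}{\beta c}\, \|A\|_F^2\, \|B\|_F^2, \\
\|AB - CR\|_F &\le \frac{\eta}{\sqrt{\beta c}}\, \|A\|_F\, \|B\|_F
  \quad \text{with probability at least } 1 - \delta,
  \qquad \eta = 1 + \sqrt{\frac{8}{\beta} \log \frac{1}{\delta}}.
\end{align*}
```

Requiring $1/\sqrt{\beta c} \le \epsilon$ gives $c \ge 1/(\beta \epsilon^2)$, which is the form of c used in the test code in this post. The high-probability version would instead ask for $c \ge \eta^2/(\beta \epsilon^2)$.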
Below is a small test:
Code:
import numpy as np

def Generate_P(A, B):  # compute the sampling probabilities P
    # A is m x n and B is n x p, so the inner dimensions must agree
    if A.ndim != 2 or B.ndim != 2 or A.shape[1] != B.shape[0]:
        raise ValueError('The shapes of A and B do not match')
    # Euclidean norms of the columns of A and of the rows of B
    P_A = np.linalg.norm(A, axis=0)
    P_B = np.linalg.norm(B, axis=1)
    # p_k is proportional to |A^(k)| * |B_(k)|
    P = P_A * P_B / (np.sum(P_A * P_B))
    return P

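def Generate_S(n, c, P):  # build the n x c sampling matrix S (a simplified version of the algorithm)
    S = np.zeros((n, c))
    # draw c indices from {0, ..., n-1} with replacement, according to P
    T = np.random.choice(n, size=c, replace=True, p=P)
    for i in range(c):
        # column i has a single nonzero entry, in row T[i]
        S[T[i], i] = 1 / np.sqrt(c * P[T[i]])
    return S
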
def Summary(times, n, c, P, A, B, A_F, B_F, AB):  # run the trials and summarize the errors
    # A and B are now passed in explicitly instead of being read as globals
    row = '{0:^15.5f} {1:^15.5f} {2:^15.5f} {3:^15.5f} {4:^15.5f} {5:^15.3%} {6:^15.3%}'
    rule = '{0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15} {0:-<15}'.format('')
    print('{0:^15} {1:^15} {2:^15} {3:^15} {4:^15} {5:^15} {6:^15}'.format('A_F', 'B_F', 'NEW_F', 'A_F * B_F', 'AB_F', 'RATIO', 'RATIO2'))
    print(rule)
    A_F_B_F = A_F * B_F
    AB_F = np.sqrt(np.sum(np.square(AB)))
    Max, Min = -1, float('inf')
    Max2, Min2 = -1, float('inf')
    Max_NEW_F = Min_NEW_F = 0
    Mean_NEW_F = Mean_ratio = Mean_ratio2 = 0
    for i in range(times):
        S = Generate_S(n, c, P)
        CR = np.dot(A.dot(S), (S.T).dot(B))
        NEW = AB - CR
        NEW_F = np.sqrt(np.sum(np.square(NEW)))  # Frobenius norm of the error AB - CR
        ratio = NEW_F / A_F_B_F  # error relative to |A|_F * |B|_F
        ratio2 = NEW_F / AB_F  # error relative to |AB|_F
        Mean_NEW_F += NEW_F
        Mean_ratio += ratio
        Mean_ratio2 += ratio2
        if ratio > Max:
            Max, Max2, Max_NEW_F = ratio, ratio2, NEW_F
        if ratio < Min:
            Min, Min2, Min_NEW_F = ratio, ratio2, NEW_F
        print(row.format(A_F, B_F, NEW_F, A_F_B_F, AB_F, ratio, ratio2))
    Mean_NEW_F = Mean_NEW_F / times
    Mean_ratio = Mean_ratio / times
    Mean_ratio2 = Mean_ratio2 / times
    print(rule)
    print(row.format(A_F, B_F, Mean_NEW_F, A_F_B_F, AB_F, Mean_ratio, Mean_ratio2))
    print(rule)
    print('Count: {0} times'.format(times))
    print('Max_ratio: {0:<15.3%} Min_ratio: {1:<15.3%}'.format(Max, Min))
    print('Max_ratio2: {0:<15.3%} Min_ratio2: {1:<15.3%}'.format(Max2, Min2))
    print('Max_NEW_F: {0:<15.5f} Min_NEW_F: {1:<15.5f}'.format(Max_NEW_F, Min_NEW_F))

# Matrix dimensions; the entries are drawn from a uniform distribution on [0, 100)
m = 47
n = 120
p = 55
A = np.random.rand(m, n) * 100
B = np.random.rand(n, p) * 100
# Parameters used to choose c; see the paper
Thelta = 1/4  # delta: failure probability
Belta = 1  # beta: how close P is to the optimal probabilities
Yita = 1 + np.sqrt((8/Belta * np.log(1/Thelta)))  # eta from the high-probability bound (not used below)
e = 1/5  # epsilon: target relative error
c = int(1 / (Belta * e ** 2)) + 1
P = Generate_P(A, B)
# Analysis of the results
AB = A.dot(B)
A_F = np.sqrt(np.sum(np.square(A)))
B_F = np.sqrt(np.sum(np.square(B)))
times = 1000
Summary(times, n, c, P, A, B, A_F, B_F, AB)
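Because every column of S has a single nonzero entry, A·S and Sᵀ·B only select and rescale columns of A and rows of B, so S never has to be formed explicitly. Below is a minimal sketch of that shortcut. The function name `sample_cr`, the seed and the choice of 26 samples are my own illustrative assumptions, not taken from the paper. It also checks that averaging many draws moves CR towards AB.

```python
import numpy as np

def sample_cr(A, B, c, rng):
    """Return C (m x c) and R (c x p) such that C @ R approximates A @ B."""
    # Sampling probabilities proportional to |A^(k)| * |B_(k)|
    w = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = w / w.sum()
    idx = rng.choice(A.shape[1], size=c, replace=True, p=p)
    scale = 1.0 / np.sqrt(c * p[idx])
    C = A[:, idx] * scale           # rescale the selected columns of A
    R = B[idx, :] * scale[:, None]  # rescale the selected rows of B
    return C, R

rng = np.random.default_rng(0)
A = rng.random((47, 120)) * 100
B = rng.random((120, 55)) * 100
AB = A @ B
norm_prod = np.linalg.norm(A) * np.linalg.norm(B)  # |A|_F * |B|_F

C, R = sample_cr(A, B, 26, rng)
rel_err = np.linalg.norm(AB - C @ R) / norm_prod
print(f"relative error of one draw: {rel_err:.3%}")

# CR is an unbiased estimate of AB, so the mean of many draws should be closer.
mean_CR = np.mean([np.matmul(*sample_cr(A, B, 26, rng)) for _ in range(200)], axis=0)
avg_err = np.linalg.norm(AB - mean_CR) / norm_prod
print(f"relative error of the average: {avg_err:.3%}")
```

This avoids the n x c matrix S and two dense products per trial, which matters mostly when n is large.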
Rough results:
Using half of the original dimension costs an error of about 17%.
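When I generated the matrices from a normal distribution instead, I found that with the standard normal distribution the results were very poor. My guess is that this is caused by floating-point rounding error, so the sampling performs badly in that case. As the mean increases, the results become similar to the "uniform distribution" case, or even better (in the sense of the Frobenius norm).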
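One way to probe this observation is to compare ‖AB‖_F with ‖A‖_F‖B‖_F for each kind of input, since the error bound in the paper is stated relative to ‖A‖_F‖B‖_F rather than ‖AB‖_F. With zero-mean entries the terms of AB largely cancel, which may explain the poor relative error better than rounding does. The sketch below is only a diagnostic; the helper name `norm_ratio`, the seed and the mean of 10 are illustrative assumptions.

```python
import numpy as np

def norm_ratio(A, B):
    """Return |AB|_F / (|A|_F * |B|_F), a value between 0 and 1."""
    return np.linalg.norm(A @ B) / (np.linalg.norm(A) * np.linalg.norm(B))

rng = np.random.default_rng(0)
m, n, p = 47, 120, 55

# Standard normal entries (zero mean)
ratio_normal = norm_ratio(rng.standard_normal((m, n)), rng.standard_normal((n, p)))
# Uniform entries on [0, 100), as in the test above
ratio_uniform = norm_ratio(rng.random((m, n)) * 100, rng.random((n, p)) * 100)
# Normal entries shifted to mean 10
ratio_shifted = norm_ratio(rng.standard_normal((m, n)) + 10,
                           rng.standard_normal((n, p)) + 10)

print(f"standard normal:  {ratio_normal:.3f}")
print(f"uniform [0, 100): {ratio_uniform:.3f}")
print(f"normal, mean 10:  {ratio_shifted:.3f}")
```

A small ratio means that an error which is modest relative to ‖A‖_F‖B‖_F can still be large relative to ‖AB‖_F, which is the RATIO2 column in the test output.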