Let AB be two square matrices over a ring R. We want to calculate the matrix product C as

{\displaystyle \mathbf {C} =\mathbf {A} \mathbf {B} \qquad \mathbf {A} ,\mathbf {B} ,\mathbf {C} \in R^{2^{n}\times 2^{n}}}

If the matrices AB are not of type 2n × 2n we fill the missing rows and columns with zeros.

We partition AB and C into equally sized block matrices

{\displaystyle \mathbf {A} ={\begin{bmatrix}\mathbf {A} _{1,1}&\mathbf {A} _{1,2}\\\mathbf {A} _{2,1}&\mathbf {A} _{2,2}\end{bmatrix}}{\mbox{ , }}\mathbf {B} ={\begin{bmatrix}\mathbf {B} _{1,1}&\mathbf {B} _{1,2}\\\mathbf {B} _{2,1}&\mathbf {B} _{2,2}\end{bmatrix}}{\mbox{ , }}\mathbf {C} ={\begin{bmatrix}\mathbf {C} _{1,1}&\mathbf {C} _{1,2}\\\mathbf {C} _{2,1}&\mathbf {C} _{2,2}\end{bmatrix}}}

with

{\displaystyle \mathbf {A} _{i,j},\mathbf {B} _{i,j},\mathbf {C} _{i,j}\in R^{2^{n-1}\times 2^{n-1}}}

then

{\displaystyle \mathbf {C} _{1,1}=\mathbf {A} _{1,1}\mathbf {B} _{1,1}+\mathbf {A} _{1,2}\mathbf {B} _{2,1}}
{\displaystyle \mathbf {C} _{1,2}=\mathbf {A} _{1,1}\mathbf {B} _{1,2}+\mathbf {A} _{1,2}\mathbf {B} _{2,2}}
{\displaystyle \mathbf {C} _{2,1}=\mathbf {A} _{2,1}\mathbf {B} _{1,1}+\mathbf {A} _{2,2}\mathbf {B} _{2,1}}
{\displaystyle \mathbf {C} _{2,2}=\mathbf {A} _{2,1}\mathbf {B} _{1,2}+\mathbf {A} _{2,2}\mathbf {B} _{2,2}}

With this construction we have not reduced the number of multiplications. We still need 8 multiplications to calculate the Ci,j matrices, the same number of multiplications we need when using standard matrix multiplication.

Now comes the important part. We define new matrices

{\displaystyle \mathbf {M} _{1}:=(\mathbf {A} _{1,1}+\mathbf {A} _{2,2})(\mathbf {B} _{1,1}+\mathbf {B} _{2,2})}
{\displaystyle \mathbf {M} _{2}:=(\mathbf {A} _{2,1}+\mathbf {A} _{2,2})\mathbf {B} _{1,1}}
{\displaystyle \mathbf {M} _{3}:=\mathbf {A} _{1,1}(\mathbf {B} _{1,2}-\mathbf {B} _{2,2})}
{\displaystyle \mathbf {M} _{4}:=\mathbf {A} _{2,2}(\mathbf {B} _{2,1}-\mathbf {B} _{1,1})}
{\displaystyle \mathbf {M} _{5}:=(\mathbf {A} _{1,1}+\mathbf {A} _{1,2})\mathbf {B} _{2,2}}
{\displaystyle \mathbf {M} _{6}:=(\mathbf {A} _{2,1}-\mathbf {A} _{1,1})(\mathbf {B} _{1,1}+\mathbf {B} _{1,2})}
{\displaystyle \mathbf {M} _{7}:=(\mathbf {A} _{1,2}-\mathbf {A} _{2,2})(\mathbf {B} _{2,1}+\mathbf {B} _{2,2})}

only using 7 multiplications (one for each Mk) instead of 8. We may now express the Ci,j in terms of Mk, like this:

{\displaystyle \mathbf {C} _{1,1}=\mathbf {M} _{1}+\mathbf {M} _{4}-\mathbf {M} _{5}+\mathbf {M} _{7}}
{\displaystyle \mathbf {C} _{1,2}=\mathbf {M} _{3}+\mathbf {M} _{5}}
{\displaystyle \mathbf {C} _{2,1}=\mathbf {M} _{2}+\mathbf {M} _{4}}
{\displaystyle \mathbf {C} _{2,2}=\mathbf {M} _{1}-\mathbf {M} _{2}+\mathbf {M} _{3}+\mathbf {M} _{6}}

We iterate this division process n times (recursively) until the submatrices degenerate into numbers (elements of the ring R). The resulting product will be padded with zeroes just like A and B, and should be stripped of the corresponding rows and columns.

Practical implementations of Strassen's algorithm switch to standard methods of matrix multiplication for small enough submatrices, for which those algorithms are more efficient. The particular crossover point for which Strassen's algorithm is more efficient depends on the specific implementation and hardware. Earlier authors had estimated that Strassen's algorithm is faster for matrices with widths from 32 to 128 for optimized implementations. However, it has been observed that this crossover point has been increasing in recent years, and a 2010 study found that even a single step of Strassen's algorithm is often not beneficial on current architectures, compared to a highly optimized traditional multiplication, until matrix sizes exceed 1000 or more, and even for matrix sizes of several thousand the benefit is typically marginal at best (around 10% or less).

from Wikipedia

--------------------------------------------------------------------------------------------------------------------------------------------------------------

it substitude the 8th recursive invocation(multiplication) by the liner combination of the submatrices above(cause A4,4 and B4,4 has been used before).
like a*(b+c) can have less steps than a*b+a*c,it uses liner combination to simplify the tranditional multiplicate way.

Strassen algorithm(O(n^lg7))的更多相关文章

  1. strassen algorithm

    the explaination that is clear in my view is from wiki.

  2. Conquer and Divide经典例子之Strassen算法解决大型矩阵的相乘

    在通过汉诺塔问题理解递归的精髓中我讲解了怎么把一个复杂的问题一步步recursively划分了成简单显而易见的小问题.其实这个解决问题的思路就是算法中常用的divide and conquer, 这篇 ...

  3. [Algorithm] 如何正确撸<算法导论>CLRS

    其实算法本身不难,第一遍可以只看伪代码和算法思路.如果想进一步理解的话,第三章那些标记法是非常重要的,就算要花费大量时间才能理解,也不要马马虎虎略过.因为以后的每一章,讲完算法就是这样的分析,精通的话 ...

  4. Strassen优化矩阵乘法(复杂度O(n^lg7))

    按照算法导论写的 还没有测试复杂度到底怎么样 不过这个真的很卡内存,挖个坑,以后写空间优化 还有Matthew Anderson, Siddharth Barman写了一个关于矩阵乘法的论文 < ...

  5. [Algorithm] 面试题之犄角旮旯 第贰章

    闲下来后,需要讲最近涉及到的算法全部整理一下,有个indice,方便记忆宫殿的查找 MIT的算法课,地球上最好: Design and Analysis of Algorithms 本篇需要重新整理, ...

  6. 挑子学习笔记:两步聚类算法(TwoStep Cluster Algorithm)——改进的BIRCH算法

    转载请标明出处:http://www.cnblogs.com/tiaozistudy/p/twostep_cluster_algorithm.html 两步聚类算法是在SPSS Modeler中使用的 ...

  7. PE Checksum Algorithm的较简实现

    这篇BLOG是我很早以前写的,因为现在搬移到CNBLOGS了,经过整理后重新发出来. 工作之前的几年一直都在搞计算机安全/病毒相关的东西(纯学习,不作恶),其中PE文件格式是必须知识.有些PE文件,比 ...

  8. [异常解决] windows用SSH和linux同步文件&linux开启SSH&ssh client 报 algorithm negotiation failed的解决方法之一

    1.安装.配置与启动 SSH分客户端openssh-client和openssh-server 如果你只是想登陆别的机器的SSH只需要安装openssh-client(ubuntu有默认安装,如果没有 ...

  9. [Algorithm] 使用SimHash进行海量文本去重

    在之前的两篇博文分别介绍了常用的hash方法([Data Structure & Algorithm] Hash那点事儿)以及局部敏感hash算法([Algorithm] 局部敏感哈希算法(L ...

随机推荐

  1. STLC - 软件测试生命周期

    什么是软件测试生命周期(STLC)? 软件测试生命周期(STLC)定义为执行软件测试的一系列活动. 它包含一系列在方法上进行的活动,以帮助认证您的软件产品. 图 - 软件测试生命周期的不同阶段 每个阶 ...

  2. 第二类Stirling数

    第二类斯特林数 第二类Stirling数:S2(p, k) 1.组合意义:第二类Stirling数计数的是把p个互异元素划分为k个非空集合的方法数 2.递推公式: S2(0, 0) = 1 S2(p, ...

  3. python-flask基本应用模板

    1.模板继承 <!DOCTYPE html> <html lang="en"> <head> <meta charset="UT ...

  4. nyoj-1250-exgcd

    机器人 时间限制:1000 ms  |  内存限制:65535 KB 难度:4   描述 Dr. Kong 设计的机器人卡尔非常活泼,既能原地蹦,又能跳远.由于受软硬件设计所限,机器人卡尔只能定点跳远 ...

  5. Leetcode 109

    //这种也是空间复杂度为O(K)的解法,就是边界有点难写class Solution { public: vector<int> getRow(int rowIndex) { vector ...

  6. HTML5 <li> <ol> <ul> 用法

    定义和用法 <li> 标签定义列表项目. <li> 标签可用在有序列表 (<ol>) 和无序列表 (<ul>) 中. HTML 与 XHTML 之间的差 ...

  7. 二、工作中常用的SQL优化

    除了给table建立索引之外,保持良好的SQL语句编写. 1.通过变量的方式来设置参数 比如动态查询的时候,尽量这样写 好:string strSql=" SELECT * FROM PEO ...

  8. Mysql for Linux安装配置之—— rpm(bundle)安装

    1.准备及安装1)下载rpm安装包(或rpm bundle)  rpm安装包包括两个(bundle会更多),一个是client,另一个是server,例如:MySQL-client-5.5.44-1. ...

  9. Binary Analysis Tool安装使用教程

    Binary Analysis Tool(BAT)是一个用于检测二进制文件使用到的开源组件,协助及早发现程序发布后可能会面临的开源协议解执的开源免费检测工具. 一.安装BAT和bat-extratoo ...

  10. DDOS 攻击的防范

    ddos 攻击介绍 可以看下面的文章 http://www.ruanyifeng.com/blog/2018/06/ddos.html 下面转自:  http://www.escorm.com/arc ...