See the Wikipedia article on sparse matrices,
and SciPy's documentation on choosing among the sparse matrix types.

Sparse matrix storage keeps only the non-zero entries. Several data structures are possible, and they fall into 2 groups:

  • support efficient modification
    • DOK (dictionary of keys)
    • LIL (list of lists)
    • COO (coordinate list)
  • support efficient access and matrix operations
    • CSR/CSC (compressed sparse row/column)

Dictionary of Keys (DOK)

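  • a dictionary that maps each (row, col) pair to its value; missing keys are treated as zero;
  • good for building a matrix incrementally, in any order;
  • poor for iterating over the non-zero values in row/column order;
  • often used to build a matrix, which is then converted to another format for processing.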
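
The DOK idea can be sketched in a few lines of plain Python. This is a minimal illustration, not SciPy's implementation; the class name `DokMatrix` and its methods are made up for this example.

```python
class DokMatrix:
    """Hypothetical minimal DOK container (illustrative names only)."""

    def __init__(self, n_rows, n_cols):
        self.shape = (n_rows, n_cols)
        self.data = {}  # (row, col) -> non-zero value

    def set(self, row, col, value):
        # Storing a zero means "no entry", so drop the key if it exists.
        if value == 0:
            self.data.pop((row, col), None)
        else:
            self.data[(row, col)] = value

    def get(self, row, col):
        # Any key that is not stored is an implicit zero.
        return self.data.get((row, col), 0)


dok = DokMatrix(3, 4)
dok.set(0, 1, 5.0)   # entries can be added in any order
dok.set(2, 3, 7.0)
dok.set(0, 1, 0)     # overwriting with zero removes the entry
stored = len(dok.data)
```

Single-entry updates are cheap dictionary operations, but nothing keeps the keys in row or column order, which is why iteration is the weak point.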

List of Lists (LIL)

  • the matrix is a list of lists, one list per row;
  • each row list stores (col, val) pairs, typically kept sorted by column;
  • efficient for incremental creation/insertion.
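
A rough sketch of the LIL layout, assuming each row is kept sorted by column. The helper names `lil_new` and `lil_set` are hypothetical and not part of any library.

```python
import bisect


def lil_new(n_rows):
    """Create an empty LIL structure: one (initially empty) list per row."""
    return [[] for _ in range(n_rows)]


def lil_set(rows, row, col, value):
    """Insert or overwrite entry (row, col), keeping the row sorted by column."""
    entries = rows[row]
    # The 1-tuple (col,) sorts just before any (col, val) pair, so this
    # finds the first stored entry whose column is >= col.
    pos = bisect.bisect_left(entries, (col,))
    if pos < len(entries) and entries[pos][0] == col:
        entries[pos] = (col, value)          # overwrite existing entry
    else:
        entries.insert(pos, (col, value))    # insert new entry in order


lil = lil_new(3)
lil_set(lil, 0, 2, 4.0)
lil_set(lil, 0, 0, 1.0)   # lands before column 2
lil_set(lil, 2, 1, 9.0)
lil_set(lil, 0, 2, 6.0)   # overwrites the earlier value at (0, 2)
```

An insertion only touches the list of one row, so building a matrix row by row stays cheap.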

Coordinate List (COO)

  • store a list of (row, col, value) triplets, ideally sorted by row, then by col;
  • also known as the IJV or triplet format.
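
As a small illustration of the triplet layout (plain Python, with made-up sample values), entries can be appended in any order and sorted once at the end:

```python
# Each entry is a (row, col, value) triplet; insertion order does not matter.
triplets = []
triplets.append((2, 0, 3.0))
triplets.append((0, 1, 5.0))
triplets.append((1, 2, 7.0))
triplets.append((0, 0, 1.0))

# Sort by row first, then by column.
triplets.sort(key=lambda t: (t[0], t[1]))

# The same data as three parallel arrays, the "I, J, V" of the IJV name.
rows = [t[0] for t in triplets]
cols = [t[1] for t in triplets]
vals = [t[2] for t in triplets]
```

Once sorted by row and then column, the `cols` and `vals` arrays are already in the order that CSR expects, so only the row information remains to be compressed.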

Compressed Sparse Row (CSR)

  • an m*n matrix is represented as 3 vectors: vals, row_ptr, col_idx;
  • vals: all non-zero values in row-major order; its length is the number of non-zero matrix elements;
  • col_idx: the column index of each value, in the same row-major order; same length as vals;
  • row_ptr: has length m+1, with row_ptr[0] = 0 and row_ptr[k] = number of values in the first k rows; i.e. row_ptr[k+1] - row_ptr[k] is the number of non-zero elements in row k;
  • this is highly optimized for row-by-row iteration: processing a row touches only one contiguous slice of vals and col_idx, plus 2 elements of row_ptr to locate that slice, which is very cache friendly;
  • thus well suited to operations such as matrix-vector multiplication and matrix-matrix multiplication.
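
The three vectors and the row-by-row access pattern can be sketched as follows. This is a plain-Python illustration under the definitions above; the function names `dense_to_csr` and `csr_matvec` are hypothetical, and real libraries use typed arrays rather than lists.

```python
def dense_to_csr(dense):
    """Build (vals, col_idx, row_ptr) from a dense list-of-lists matrix."""
    vals, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                vals.append(x)
                col_idx.append(j)
        # After finishing row k, row_ptr[k + 1] = values stored so far.
        row_ptr.append(len(vals))
    return vals, col_idx, row_ptr


def csr_matvec(vals, col_idx, row_ptr, x):
    """Compute y = A @ x, touching one contiguous slice per row."""
    y = []
    for k in range(len(row_ptr) - 1):
        start, end = row_ptr[k], row_ptr[k + 1]
        y.append(sum(vals[p] * x[col_idx[p]] for p in range(start, end)))
    return y


dense = [
    [5, 0, 0, 0],
    [0, 8, 0, 6],
    [0, 0, 3, 0],
]
vals, col_idx, row_ptr = dense_to_csr(dense)
# vals    == [5, 8, 6, 3]
# col_idx == [0, 1, 3, 2]
# row_ptr == [0, 1, 3, 4]

y = csr_matvec(vals, col_idx, row_ptr, [1, 2, 3, 4])
# y == [5, 40, 9]
```

For row 1, `row_ptr[1]` and `row_ptr[2]` give the slice 1..3, so only `vals[1:3]` and `col_idx[1:3]` are read.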
