The Step-by-Step Approach

break down a tricky problem and to solve problems using what you do know.

Step 1: Make Believe

Pretend that the data can all fit on one machine and there are no memory limitations. Provide the general outline for your solution.

Step 2:  Get Real

figure out how to logically divide the data up, and how one machine would identify where to look up a different piece of data.

Step 3: Solve Problems

  Dividing Up Lots of Data:

By Order of Appearance:

By Hash Value: 1)pick some sort of key relating to the data 2)hash the key 3)mod the hash value by the number of machines 4)store data on the machine with that value

        there is no relationship between what the data represents and which machine stores data.

By Acutal Value: reduce system latency by using information about what the data represents.

Arbitrarily:

Good Example: Find all documents that contains a list of words.


10.1 build some sort of service that will be called by up to 1000 client applications to get simple end-of-day stock price information.

We want to start off by thinking about what the different aspects we should consider in a given proposal are:

1. Client Ease of Use: we want the service to be easy for the clients to implement and useful for them

2. Ease for Ourselves: consider in this not only the cost of implementing, but also the cost of maintenance

3. Flexibility for Future Demands:

4. Scalability and Efficiency: not to overly burden our service.

DataBase vs XML(json) P 343


10.2 good problem

10.7 LRU Cache

Chp10: Scalability and Memory Limits的更多相关文章

  1. is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.6 GB of 40 GB virtual memory used

    昨天使用hadoop跑五一的数据,发现报错: Container [pid=,containerID=container_1453101066555_4130018_01_000067] GB phy ...

  2. Memory Limits for Windows and Windows Server Releases

    来源:https://msdn.microsoft.com/en-us/library/windows/desktop/aa366778(v=vs.85).aspx Limits on memory ...

  3. [hadoop] - Container [xxxx] is running beyond physical/virtual memory limits.

    当运行mapreduce的时候,有时候会出现异常信息,提示物理内存或者虚拟内存超出限制,默认情况下:虚拟内存是物理内存的2.1倍.异常信息类似如下: Container [pid=13026,cont ...

  4. hive: insert数据时Error during job, obtaining debugging information 以及beyond physical memory limits

    insert overwrite table canal_amt1...... 2014-10-09 10:40:27,368 Stage-1 map = 100%, reduce = 32%, Cu ...

  5. hadoop is running beyond virtual memory limits问题解决

    单机搭建了2.6.5的伪分布式集群,写了一个tf-idf计算程序,分词用的是结巴分词,使用standalone模式运行没有任何问题,切换到伪分布式模式运行一直报错: hadoop is running ...

  6. hadoop的job执行在yarn中内存分配调节————Container [pid=108284,containerID=container_e19_1533108188813_12125_01_000002] is running beyond virtual memory limits. Current usage: 653.1 MB of 2 GB physical memory used

    实际遇到的真实问题,解决方法: 1.调整虚拟内存率yarn.nodemanager.vmem-pmem-ratio (这个hadoop默认是2.1) 2.调整map与reduce的在AM中的大小大于y ...

  7. [转载]Memory Limits for Windows and Windows Server Releases

    Memory Limits for Windows and Windows Server Releases This topic describes the memory limits for sup ...

  8. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(十三)kafka+spark streaming打包好的程序提交时提示虚拟内存不足(Container is running beyond virtual memory limits. Current usage: 119.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 G)

    异常问题:Container is running beyond virtual memory limits. Current usage: 119.5 MB of 1 GB physical mem ...

  9. Container [pid=6263,containerID=container_1494900155967_0001_02_000001] is running beyond virtual memory limits

    以Spark-Client模式运行,Spark-Submit时出现了下面的错误: User: hadoop Name: Spark Pi Application Type: SPARK Applica ...

随机推荐

  1. VS默认环境设置

    VS2010的工具菜单-->导入导出设置-->重置所有设置

  2. Vim保存文件命令 ":wq" 与 ":x" 的区别

    CSDN转载 [1] Vim是Unix/Linux系统最常用的编辑器之一,在保存文件时,我通常选择":wq",因为最开始学习vim的时候,就只记住了几个常用的命令:也没有细究命令的 ...

  3. Low-poly低面建模(低像素多边形)

    概念 继拟物化.扁平化(Flat Design).长阴影(Long Shadow)之后,低多边形(Low Poly)又火速掀起了最新设计风潮.这种设计风格在早期计算机建模和动效中就被广泛采用,在快要被 ...

  4. SQL语句执行顺寻

    SQL语句执行的时候是有一定顺序的.理解这个顺序对SQL的使用和学习有很大的帮助. 1.from 先选择一个表,或者说源头,构成一个结果集. 2.where 然后用where对结果集进行筛选.筛选出需 ...

  5. 真正理解KMP算法

    作者:jostree 转载请注明出处 http://www.cnblogs.com/jostree/p/4403560.html 所谓KMP算法,就是判断一个模式串是否是一个字符串的子串,通常的算法当 ...

  6. [转]AIX下调整分区大小

    AIX下调整文件系统大小 - [work] 版权声明:转载时请以超链接形式标明文章原始出处和作者信息及本声明http://wangsuiri.blogbus.com/logs/35448074.htm ...

  7. JS预览图像将本地图片显示到浏览器上的代码

    js代码实现: 从file域获取本地图片url并将本地图片显示到浏览器上. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitio ...

  8. PHP中include和require绝对路径、相对路径问题

    在写PHP程序时,经常要用到include或require包含其他文件,但是各文件里包含的文件多了之后,就会产生路径问题. 如下目录: <web>(网站根目录) ├<A>文件夹 ...

  9. PHP获取Cookie模拟登录CURL

    要提取google搜索的部分数据,发现google对于软件抓取它的数据屏蔽的厉害,以前伪造下 USER-AGENT 就可以抓数据,但是现在却不行了.利用抓包数据发现,Google 判断了 cookie ...

  10. java 语法糖

    package syntax.autoCase; import java.util.Arrays; import java.util.List; public class autoCase { pub ...