2. HBase in Action, Chapter 1: Introducing HBase (1.1 Data-management systems: a crash course)
Relational database systems have been around for a few decades and have been hugely successful at solving data storage, serving, and processing problems. Many large companies have built their systems on relational databases, including online transactional systems as well as back-end analytics applications.
Online transaction processing (OLTP) systems are used by applications to record transactional information in real time. They’re expected to return responses quickly, typically in milliseconds. For instance, the cash registers in retail stores record purchases and payments in real time as customers make them. Banks have large OLTP systems that record transactions between users, such as fund transfers. OLTP systems aren’t limited to money transactions. Web companies like LinkedIn also have such applications, for instance to record when users connect with other users. The term transaction in OLTP refers to transactions in the context of databases, not financial transactions.
Online analytical processing (OLAP) systems are used to answer analytical queries about the data stored in them. For retailers, this means systems that generate daily, weekly, and monthly sales reports and slice and dice the information so it can be analyzed from several different perspectives. OLAP falls in the domain of business intelligence, where data is explored, processed, and analyzed to glean information that can then be used to drive business decisions. For a company like LinkedIn, where establishing a connection counts as a transaction, analyzing the connectedness of the graph and generating reports on things like the average number of connections per user falls in the category of business intelligence; this kind of processing would likely be done using OLAP systems.
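The contrast between the two workloads can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module, not how a production OLTP or OLAP system is built; the `purchases` table, its columns, and the helper names are hypothetical and do not come from the book.

```python
import sqlite3

# Hypothetical retail schema; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, store TEXT NOT NULL, "
    "day TEXT NOT NULL, amount_cents INTEGER NOT NULL)"
)


def record_purchase(store, day, amount_cents):
    """OLTP-style work: one small write, committed as its own transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO purchases (store, day, amount_cents) VALUES (?, ?, ?)",
            (store, day, amount_cents),
        )


def daily_sales_report():
    """OLAP-style work: scan many rows and aggregate them per store and day."""
    rows = conn.execute(
        "SELECT store, day, SUM(amount_cents), COUNT(*) FROM purchases "
        "GROUP BY store, day ORDER BY store, day"
    )
    return rows.fetchall()


record_purchase("north", "2024-01-01", 1250)
record_purchase("north", "2024-01-01", 800)
record_purchase("south", "2024-01-01", 400)
record_purchase("north", "2024-01-02", 300)

report = daily_sales_report()
for store, day, total_cents, count in report:
    print(store, day, total_cents, count)
```

The write path touches one row at a time and must answer quickly; the report reads every row and summarizes it, which is the access pattern the OLAP paragraph describes.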
Relational databases, both open source and proprietary, have been successfully used at scale to solve both these kinds of use cases. This is clearly highlighted by the balance sheets of companies like Oracle, Vertica, Teradata, and others. Microsoft and IBM have their share of the pie too. All such systems provide full ACID guarantees. Some scale better than others; some are open source, and others require you to pay steep licensing fees. (Translator's note: in other words, these vendors make good money from it, and each system has its own strengths.)
The internal design of relational databases is driven by relational math, and these systems require an up-front definition of schemas and types that the data will thereafter adhere to. Over time, SQL became the standard way of interacting with these systems, and it has been widely used for several years. SQL is arguably a lot easier to write and takes far less time than coding up custom access code in programming languages. But it might not be the best way to express the access patterns in every situation, and that’s where issues like object-relational mismatch arose.
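As a rough illustration of both points, the sketch below declares a schema before storing any data, shows the database rejecting a row that violates it, and then computes the same average once declaratively in SQL and once with hand-written access code. It uses Python's built-in sqlite3 module; the `users` table and its columns are hypothetical. SQLite is lenient about column types by default, so the example relies on NOT NULL and CHECK constraints to show schema enforcement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema is declared before any data is stored; later rows must satisfy it.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "connections INTEGER NOT NULL CHECK (connections >= 0))"
)
conn.executemany(
    "INSERT INTO users (name, connections) VALUES (?, ?)",
    [("ana", 3), ("bo", 5), ("cy", 10)],
)

# A row that violates the declared schema (NULL name) is rejected.
try:
    conn.execute("INSERT INTO users (name, connections) VALUES (?, ?)", (None, 1))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Declarative: one SQL statement states what is wanted.
avg_sql = conn.execute("SELECT AVG(connections) FROM users").fetchone()[0]

# Hand-written access code: pull the raw values and compute the result manually.
values = [row[0] for row in conn.execute("SELECT connections FROM users")]
avg_manual = sum(values) / len(values)

print(rejected, avg_sql, avg_manual)
```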
Any problem in computer science can be solved with a level of indirection. Solving problems like object-relational mismatch was no different and led to frameworks being built to alleviate the pain.
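To make the "level of indirection" concrete, here is a minimal hand-rolled mapping layer, assuming a single `users` table. The `User` and `UserRepository` names are hypothetical and the code is not modeled on the API of any real object-relational framework; it only shows the idea that callers work with objects while the layer translates to and from rows.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


class UserRepository:
    """Hypothetical mapping layer: callers see User objects, never SQL rows."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add(self, name):
        # Object -> row: the INSERT is hidden behind a method call.
        cur = self.conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        return User(id=cur.lastrowid, name=name)

    def get(self, user_id):
        # Row -> object: the tuple from the database becomes a User, or None.
        row = self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return User(*row) if row else None


repo = UserRepository(sqlite3.connect(":memory:"))
created = repo.add("ana")
loaded = repo.get(created.id)
print(loaded)
```

Real frameworks add far more (relationships, caching, query builders), which is also where much of their complexity comes from.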
For those who don’t know (or don’t remember), ACID is an acronym standing for atomicity, consistency, isolation, and durability. These are fundamental principles used to reason about data systems. See http://en.wikipedia.org/wiki/ACID for an introduction.
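Atomicity and consistency are the easiest of the four properties to see in a small example. The sketch below, which assumes a hypothetical `accounts` table and uses Python's built-in sqlite3 module, transfers funds inside a transaction. When the second update would overdraw the source account, the CHECK constraint fails and the first update is rolled back as well. Isolation and durability are not demonstrated here, since the database is in memory and there is a single connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )


def transfer(src, dst, amount):
    """Move funds in one transaction; returns True if it committed."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer("alice", "bob", 30)
failed = transfer("alice", "bob", 500)  # would overdraw alice, so the CHECK fails
print(ok, failed, balances())
```

The credit is deliberately applied before the debit so that the failed transfer leaves a partial change behind for the rollback to undo.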