Recovery

Types of Failures

Wrong data entry

  • Prevent by having constraints in the database
  • Fix with data cleaning

Disk crashes

  • Prevent by using redundancy (RAID, archive)
  • Fix by using archives

Fire, theft, bankruptcy…

  • Buy insurance, change profession…

System failures: most frequent (e.g. power failure)

  • Use recovery

System Failures

Each transaction has internal state

When system crashes, internal state is lost

  • Don’t know which parts executed and which didn’t

Remedy: use a log

  • A file that records every single action of the transaction

Transactions

Assumption: the database is composed of elements

Usually 1 element = 1 block

Can be smaller (=1 record) or larger (=1 relation)

Assumption: each transaction reads/writes some elements

Correctness Principle

There exists a notion of correctness for the database

  • Explicit constraints (e.g. foreign keys)
  • Implicit conditions (e.g. sum of sales = sum of invoices)

Correctness principle: if a transaction starts in a correct database state, it ends in a correct database state

Consequence: we only need to guarantee that transactions are atomic, and the database will be correct forever

Primitive Operations of Transactions

INPUT(X)

  • read element X to memory buffer

READ(X,t)

  • copy element X to transaction local variable t

WRITE(X,t)

  • copy transaction local variable t to element X

OUTPUT(X)

  • write element X to disk
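A minimal sketch of the four primitives, assuming the disk and the memory buffer are plain dictionaries. The class name `BufferManager` and its attribute names are hypothetical, chosen only for illustration.

```python
class BufferManager:
    """Toy model: `disk` and `buffer` map element names to values."""

    def __init__(self, disk):
        self.disk = disk      # persistent storage (survives a crash)
        self.buffer = {}      # main-memory buffer (lost in a crash)

    def input(self, x):
        # INPUT(X): read element X from disk into the memory buffer
        self.buffer[x] = self.disk[x]

    def read(self, x):
        # READ(X,t): return the buffered copy of X for a local variable t,
        # performing INPUT(X) first if X is not buffered yet
        if x not in self.buffer:
            self.input(x)
        return self.buffer[x]

    def write(self, x, t):
        # WRITE(X,t): copy the local value t into the buffered copy of X
        # (assumption: X is simply created in the buffer if it is missing)
        self.buffer[x] = t

    def output(self, x):
        # OUTPUT(X): write the buffered copy of X back to disk
        self.disk[x] = self.buffer[x]


bm = BufferManager({"A": 8})
t = bm.read("A")        # local variable t holds the value of A
bm.write("A", t * 2)    # only the buffer changes; the disk is untouched
before = bm.disk["A"]
bm.output("A")          # now the change reaches the disk
after = bm.disk["A"]
```

The gap between `write` and `output` is exactly where a crash loses work, which is why the log is needed.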

The Log

An append-only file containing log records

Note: multiple transactions run concurrently, so their log records are interleaved

After a system crash, use log to:

  • Redo some transactions that did commit
  • Undo other transactions that didn’t commit

Undo Logging

Log records

<START T>: transaction T has begun

<COMMIT T>: T has committed

<ABORT T>: T has aborted

<T,X,v>: T has updated element X, and its old value was v

Undo-Logging Rules

U1: If T modifies X, then <T,X,v> must be written to disk before X is written to disk

U2: If T commits, then <COMMIT T> must be written to disk only after all changes by T are written to disk

Hence: OUTPUTs are done early
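The ordering imposed by U1 and U2 can be sketched as follows. This is an illustration under simplifying assumptions: `disk` is a dict, `log` is a list that stands in for the on-disk log (so appending models a flushed log write), and the function name is hypothetical.

```python
def run_with_undo_logging(disk, log, tid, updates):
    """Apply `updates` (element -> new value) for transaction `tid`."""
    log.append(("START", tid))
    buffer = {}
    for x, new_value in updates.items():
        log.append((tid, x, disk[x]))   # U1: old value is logged before X reaches disk
        buffer[x] = new_value           # WRITE(X,t): only the buffer changes here
    for x, value in buffer.items():
        disk[x] = value                 # OUTPUT(X): done early, before the commit record
    log.append(("COMMIT", tid))         # U2: commit record only after all OUTPUTs


disk = {"A": 8, "B": 8}
log = []
run_with_undo_logging(disk, log, "T", {"A": 16, "B": 16})
```

If a crash happened anywhere before the last line of the function, the log would contain the old values of every element that might already have been overwritten on disk.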

Recovery with Undo Log

After a system crash, run the recovery manager

Idea 1. Decide for each transaction T whether it is completed or not

Idea 2. Undo all modifications by incomplete transactions

Recovery manager:

Read log from the end; cases:

  • <COMMIT T>: mark T as completed
  • <ABORT T>: mark T as completed
  • <T,X,v>: if T is not completed
    then write X=v to disk
    else ignore
  • <START T>: ignore

Reading direction: from the bottom up (from the end of the log toward its beginning)

Note: all undo commands are idempotent. If we perform them a second time, no harm is done
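A minimal sketch of the undo recovery manager, assuming the log is a list of tuples: `("START", T)`, `("COMMIT", T)`, `("ABORT", T)` and `(T, X, old_value)`. This tuple encoding and the function name are assumptions made for the example.

```python
def undo_recover(log, disk):
    """Scan the log from the end and restore old values of incomplete transactions."""
    completed = set()
    for record in reversed(log):
        if record[0] in ("COMMIT", "ABORT"):
            completed.add(record[1])        # T finished: its updates are left alone
        elif record[0] == "START":
            continue                        # nothing to do for a start record
        else:
            tid, x, old_value = record
            if tid not in completed:
                disk[x] = old_value         # write X=v to disk
    return disk


log = [
    ("START", "T1"), ("T1", "A", 8), ("COMMIT", "T1"),
    ("START", "T2"), ("T2", "B", 5), ("T2", "A", 16),   # T2 never committed
]
disk = {"A": 32, "B": 10}                   # state found on disk after the crash
result = undo_recover(log, disk)
again = undo_recover(log, dict(result))     # a second run changes nothing
```

Running the function twice gives the same state, which is the idempotence property noted above: a crash during recovery is handled by simply starting recovery over.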

When do we stop reading the log?

  • We cannot stop until we reach the beginning of the log file
  • This is impractical
  • Better idea: use checkpointing

Checkpointing

Checkpoint the database periodically

  • Stop accepting new transactions
  • Wait until all current transactions complete
  • Flush log to disk
  • Write a <CKPT> log record, flush
  • Resume transactions
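A sketch of how a checkpoint shortens undo recovery. Because no transaction is active when the checkpoint record is written, the backward scan can stop as soon as it meets that record. The tuple encoding, including `("CKPT",)` for the checkpoint record, and the function name are assumptions made for the example.

```python
def undo_recover_with_checkpoint(log, disk):
    """Undo incomplete transactions, scanning backward only to the last checkpoint.

    Returns the number of log records that were examined.
    """
    completed = set()
    scanned = 0
    for record in reversed(log):
        scanned += 1
        if record[0] == "CKPT":
            break                           # everything earlier is already complete
        if record[0] in ("COMMIT", "ABORT"):
            completed.add(record[1])
        elif record[0] != "START":
            tid, x, old_value = record
            if tid not in completed:
                disk[x] = old_value
    return scanned


log = [
    ("START", "T1"), ("T1", "A", 1), ("COMMIT", "T1"),
    ("CKPT",),
    ("START", "T2"), ("T2", "A", 2),        # T2 was active at the crash
]
disk = {"A": 3}
scanned = undo_recover_with_checkpoint(log, disk)
```

Only the records after the checkpoint, plus the checkpoint record itself, are read; the three records of T1 are never visited.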

Redo Logging

Log records

<T,X,v>: T has updated element X, and its new value is v

R1: If T modifies X, then both <T,X,v> and <COMMIT T> must be written to disk before X is written to disk

Hence: OUTPUTs are done late

After a system crash, run the recovery manager

Step 1. Decide for each transaction T whether it is completed or not

Step 2. Read log from the beginning, redo all updates of committed transactions
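The two steps can be sketched as follows, assuming the log is a list of tuples `("START", T)`, `("COMMIT", T)`, `("ABORT", T)` and `(T, X, new_value)`. The encoding and the function name are assumptions made for the example.

```python
def redo_recover(log, disk):
    """Replay the updates of committed transactions, from the beginning of the log."""
    # Step 1: a transaction counts as completed if its commit record is in the log
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    # Step 2: scan forward and redo every update of a committed transaction
    for rec in log:
        if rec[0] in ("START", "COMMIT", "ABORT"):
            continue
        tid, x, new_value = rec
        if tid in committed:
            disk[x] = new_value
    return disk


log = [
    ("START", "T1"), ("T1", "A", 16), ("COMMIT", "T1"),
    ("START", "T2"), ("T2", "B", 7),        # T2 never committed
]
disk = {"A": 8, "B": 5}                     # T1's change had not reached disk yet
result = redo_recover(log, disk)
```

The update of T2 is ignored: under rule R1 it cannot have reached the disk, so there is nothing to undo.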

Undo/Redo Logging

Log records: only one change

<T,X,u,v>: T has updated element X, its old value was u, and its new value is v

Recovery with Undo/Redo Log

After a system crash, run the recovery manager

Redo all committed transactions, top-down

Undo all uncommitted transactions, bottom-up
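A sketch of the combined procedure, assuming update records are 4-tuples `(T, X, old_value, new_value)` and all other records are 2-tuples such as `("COMMIT", T)`. The encoding and the function name are assumptions made for the example.

```python
def undo_redo_recover(log, disk):
    """Redo committed transactions top-down, then undo the others bottom-up."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    for rec in log:                          # forward pass: redo
        if len(rec) == 4 and rec[0] in committed:
            _tid, x, _old, new = rec
            disk[x] = new
    for rec in reversed(log):                # backward pass: undo
        if len(rec) == 4 and rec[0] not in committed:
            _tid, x, old, _new = rec
            disk[x] = old
    return disk


log = [
    ("START", "T1"), ("T1", "A", 8, 16),
    ("START", "T2"), ("T2", "B", 5, 7),
    ("COMMIT", "T1"),
    ("T2", "A", 16, 20),                     # T2 never committed
]
disk = {"A": 20, "B": 7}                     # state found on disk after the crash
result = undo_redo_recover(log, disk)
```

Because each record carries both values, the changed elements may reach the disk either before or after the commit record, and recovery still works.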

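Summary

Undo logging: changed elements are written to disk before the commit record (OUTPUTs early), and recovery reads the log from the end to the beginning. Redo logging: the update and commit records are written to disk before the changed elements (OUTPUTs late), and recovery reads the log from the beginning to the end.

Conflict Serializability & Two-Phase Locking

Two operations from different transactions conflict if they use the same resource and at least one of them is a write. Put simply, a schedule is conflict serializable exactly when its precedence graph contains no cycle.

Under two-phase locking, each transaction acquires all of its locks first and only then releases them. Plotted over time, the number of locks it holds rises and then falls, and the slope never changes direction again.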
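The rise-then-fall shape of two-phase locking can be checked mechanically: once a transaction has released any lock, it must not acquire another one. A minimal sketch, assuming the lock activity of one transaction is given as a list of `("lock", item)` and `("unlock", item)` pairs (a hypothetical encoding):

```python
def is_two_phase(actions):
    """Return True if no lock is acquired after the first unlock."""
    released = False
    for kind, _item in actions:
        if kind == "unlock":
            released = True          # the shrinking phase has begun
        elif kind == "lock" and released:
            return False             # growing again after shrinking: not 2PL
    return True


ok = is_two_phase([("lock", "B"), ("lock", "D"), ("unlock", "B"), ("unlock", "D")])
bad = is_two_phase([("lock", "B"), ("unlock", "B"), ("lock", "D"), ("unlock", "D")])
```

The first sequence grows and then shrinks, so it follows the protocol; the second one releases B and then locks D, so it does not.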

E.g.

Consider the following schedule:
T1 STARTS
T1 reads item B
T1 writes item B with old value 11, new value 12
T2 STARTS
T2 reads item B
T2 writes item B with old value 12, new value 13
T3 STARTS
T3 reads item A
T3 writes item A with old value 29, new value 30
T2 reads item A
T2 writes item A with old value 30, new value 31
T2 COMMITS
T1 reads item D
T1 writes item D with old value 44, new value 45
T3 COMMITS
T1 COMMITS

(a) What serial schedule is this equivalent to? If none, then explain why.

The serializability graph for the above schedule is: T1 → T2 ← T3. Any order that complies with the
topological order of the graph, such as T1 → T3 → T2, is an equivalent serial schedule for our schedule.
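The graph for this schedule can be derived mechanically. The sketch below encodes each read or write of the schedule as a `(transaction, "r" or "w", item)` triple (an assumed encoding), adds an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, and uses the standard-library `graphlib` module (Python 3.9 or later) to find a topological order.

```python
from graphlib import TopologicalSorter, CycleError

schedule = [
    ("T1", "r", "B"), ("T1", "w", "B"),
    ("T2", "r", "B"), ("T2", "w", "B"),
    ("T3", "r", "A"), ("T3", "w", "A"),
    ("T2", "r", "A"), ("T2", "w", "A"),
    ("T1", "r", "D"), ("T1", "w", "D"),
]


def precedence_edges(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            # conflict: different transactions, same item, at least one write
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def serial_order(schedule):
    """Return one equivalent serial order, or None if the graph has a cycle."""
    predecessors = {t: set() for t, _, _ in schedule}
    for before, after in precedence_edges(schedule):
        predecessors[after].add(before)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


edges = precedence_edges(schedule)
order = serial_order(schedule)
```

The only edges are T1 → T2 (through B) and T3 → T2 (through A), so T2 must come last while T1 and T3 may appear in either order.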

(b) Is this schedule consistent with two phase locking? Explain why.

If we assume that all
transactions acquire each lock immediately before the operation and
release it immediately afterwards, the schedule is not consistent with
two-phase locking. This is because T1 releases its lock on B after its
second operation but acquires a lock on D for its last two operations.
Removing the last two operations of T1 makes the schedule 2PL.

If we assume that the
transactions acquire all the locks they need at the beginning of the
transaction, and release them after they finish the operation, this
schedule will be 2PL. The minimum operation that could be added to the
schedule is “T1 reads item A”. In this case, T1 has to acquire the
lock on A again after releasing its lock on A after its first
write. (This passage is too deep for me; I could not follow it even with Baidu Translate...)
