Recovery

Types of Failures

Wrong data entry

  • Prevent by having constraints in the database
  • Fix with data cleaning

Disk crashes

  • Prevent by using redundancy (RAID, archive)
  • Fix by using archives

Fire, theft, bankruptcy…

  • Buy insurance, change profession…

System failures: most frequent (e.g. power)

  • Use recovery

System Failures

Each transaction has internal state

When system crashes, internal state is lost

  • Don’t know which parts executed and which didn’t

Remedy: use a log

  • A file that records every single action of the transaction

Transactions

Assumption: the database is composed of elements

Usually 1 element = 1 block

Can be smaller (=1 record) or larger (=1 relation)

Assumption: each transaction reads/writes some elements

Correctness Principle

There exists a notion of correctness for the database

  • Explicit constraints (e.g. foreign keys)
  • Implicit conditions (e.g. sum of sales = sum of invoices)

Correctness principle: if a transaction starts in a correct database state, it ends in a correct database state

Consequence: we only need to guarantee that transactions are atomic, and the database will be correct forever

Primitive Operations of Transactions

INPUT(X)

  • read element X to memory buffer

READ(X,t)

  • copy element X to transaction local variable t

WRITE(X,t)

  • copy transaction local variable t to element X

OUTPUT(X)

  • write element X to disk

The Log

An append-only file containing log records

Note: multiple transactions run concurrently, log records are interleaved

After a system crash, use log to:

  • Redo some transaction that didn’t commit
  • Undo other transactions that didn’t commit

Undo Logging

Log records

transaction T has begun

T has committed

T has aborted

<T,X,v> T has updated element X, and its old value was v

Undo-Logging Rules

U1: If T modifies X, then <T,X,v> must be written to disk before X is written to disk

U2: If T commits, then must be written to disk only after all changes by T are written to disk

Hence: OUTPUTs are done early

Recovery with Undo Log

After system’s crash, run recovery manager

Idea 1. Decide for each transaction T whether it is completed or not

Idea 2. Undo all modifications by incompleted transactions

Recovery manager:

Read log from the end; cases:

  • : mark T as completed
  • : mark T as completed
  • <T,X,v>: if T is not completed
    then write X=v to disk
    else ignore
  • : ignore

阅读方向,从下向上

Note: all undo commands are idempotent, If we perform them a second time, no harm is done

stop reading the log:

  • We cannot stop until we reach the beginning of the log file
  • This is impractical
  • Better idea: use checkpointing

Checkpointing

Checkpoint the database periodically

  • Stop accepting new transactions
  • Wait until all curent transactions complete
  • Flush log to disk
  • Write a log record, flush
  • Resume transactions

Redo Logging

Log records

<T,X,v>= T has updated element X, and its new value is v

R1: If T modifies X, then both <T,X,v> and must be written to disk before X is written to disk

Hence: OUTPUTs are done late

After system’s crash, run recovery manager

Step 1. Decide for each transaction T whether it is completed or not

Step 2. Read log from the beginning, redo all updates of committed transactions

Undo/Redo Logging

Log records, only one change

<T,X,u,v>= T has updated element X, its old value was u, and its new value is v

Recovery with Undo/Redo Log

After system’s crash, run recovery manager

Redo all committed transaction, top-down

Undo all uncommitted transactions, bottom-up

总结

日志undo先写日志(从下向上读)redo先写磁盘(从上到下读)

冲突可串行 & 两阶锁

两个事物使用同一个资源并有一个是写就是冲突的,简单讲就是在冲突可串行并发操作的前驱图中是没有环路的,前驱图无环就是冲突可串行的。

每个事物在使用资源的时候都是先统一取再统一放的,也就是其图示先增后减,斜率不会出现其他变动。

E.g.

Consider the following schedule:
T1 STARTS
T1 reads item B
T1 writes item B with old value 11, new value 12
T2 STARTS
T2 reads item B
T2 writes item B with old value 12, new value 13
T3 STARTS
T3 reads item A
T3 writes item A with old value 29, new value 30
T2 reads item A
T2 writes item A with old value 30, new value 31
T2 COMMITS
T1 reads item D
T1 writes item D with old value 44, new value 45
T3 COMMITS
T1 COMMITS

(a) What serial schedule is this equivalent to? If none, then explain why.

The serializability graph for the above schedule is: T1T2  T3. Any order that complies with the
topological order of the graph like T1  T3  T2 is an equivalent serial schedule for our schedule

(b) Is this schedule consistent with two phase locking? Explain why.

If we assume that all
transactions get the locks exactly before the operation and release them
afterwards, it is not consistent with two phase locking. This is
because T1 releases its lock on B after its second operation while
acquiring a lock on D at its last two operations. By removing the last
two operations of T1 the schedule becomes 2PL.

If we assume that the
transactions get all the locks they need at the beginning of the
transaction, and release them after the finish the operation, this
schedule will be 2PL. The minimum operations that could be added to the
schedule will be “T1 reads item A”. In this case, T1 has to acquire the
lock on A again after releasing its lock on A after its first
write.(这段话太深奥了,我用百度翻译都没看懂。。。)

Data Management Technology(5) -- Recovery的更多相关文章

  1. Data Management Technology(1) -- Introduction

    1.Database concepts (1)Data & Information Information Is any kind of event that affects the stat ...

  2. Data Management Technology(3) -- SQL

    SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data ...

  3. Data Management Technology(2) -- Data Model

    1.Data Model Model Is the abstraction of real world Reveal the essence of objects, help people to lo ...

  4. Data Management Technology(4) -- 关系数据库理论

    规范化问题的提出 在规范化理论出现以前,层次和网状数据库的设计只是遵循其模型本身固有的原则,而无具体的理论依据可言,因而带有盲目性,可能在以后的运行和使用中发生许多预想不到的问题. 在关系数据库系统中 ...

  5. [Windows Azure] Data Management and Business Analytics

    http://www.windowsazure.com/en-us/develop/net/fundamentals/cloud-storage/ Managing and analyzing dat ...

  6. Intel Active Management Technology

    http://en.wikipedia.org/wiki/Intel_Active_Management_Technology Intel Active Management Technology F ...

  7. MySQL vs. MongoDB: Choosing a Data Management Solution

    原文地址:http://www.javacodegeeks.com/2015/07/mysql-vs-mongodb.html 1. Introduction It would be fair to ...

  8. 场景3 Data Management

    场景3 Data Management 数据管理 性能优化 OLTP OLAP 物化视图 :表的快照 传输表空间 :异构平台的数据迁移 星型转换 :事实表 OLTP : 在线事务处理 1. trans ...

  9. Data Management and Data Management Tools

    Data Management ObjectivesBy the end o this module, you should understand the fundamentals of data m ...

随机推荐

  1. 常见SQL编写和优化

    常见的SQL优化方式 对查询进行优化,应尽量避免全表扫描,首先应考虑在where及order by 涉及的列上建立索引. 应尽量避免在 where 子句中对字段进行null 值判断,否则将导致引擎放弃 ...

  2. Cortex-A7处理器算数运算指令和逻辑运算指令

      汇编中也可以进行算术运算, 比如加减乘除,常用的运算指令用法如表所示: 常用运算指令 在嵌入式开发中最常会用的就是加减指令,乘除基本用不到. 我们用 C 语言进行CPU 寄存器配置的时候常常需要用 ...

  3. git 版本检出checkout的方法笔记

    想检出指定版本,比如回退版本,将代码检出到老代码 git checkout 版本号 git reflog git checkout  标签名 1.git log 查看版本信息,复制版本号,执行git ...

  4. 非常详细的 (VMware安装Centos7超详细过程)

    本篇文章主要介绍了VMware安装Centos7超详细过程(图文),具有一定的参考价值,感兴趣的小伙伴们可以参考一下 1.软硬件准备 软件:推荐使用VMwear,我用的是VMwear 12 镜像:Ce ...

  5. 《Netty Zookeeper Redis 高并发实战》 图书简介

    <Netty Zookeeper Redis 高并发实战> 图书简介 本书为 高并发社群 -- 疯狂创客圈 倾力编著, 高度剖析底层原理,深度解读面试难题 疯狂创客圈 Java 高并发[ ...

  6. swoole加密可破解吗

    程序的执行和加解密过程合二唯一,无论是内部开发人员和外部黑客攻击,即使拿到了数据和私钥和服务器的root权限,也无法解密还原数据. Swoole将加解密分成了3部分(程序+算法+私钥),缺一不可解密. ...

  7. 计算几何 val.2

    目录 计算几何 val.2 几何单位结构体板子 旋转卡壳 基础概念 求法 模板 半平面交 前置芝士:线段交 S&I算法 模板 最小圆覆盖 随机增量法 时间复杂度 模板 后记 计算几何 val. ...

  8. 几种设计良好结构以提高.NET应用性能的方法

    写在前面 设计良好的系统,除了架构层面的优良设计外,剩下的大部分就在于如何设计良好的代码,.NET提供了很多的类型,这些类型非常灵活,也非常好用,比如List,Dictionary.HashSet.S ...

  9. Nlog配置

    初次使用nlog,里里外外找了好久,终于搞会了. 使用nlog建日志输出到txt文件.数据库.邮件 nlog配置,如图 码云dome

  10. winform删除dataGridView列报异常:System.IndexOutOfRangeException:“索引 7 没有值

    winform界面如下: using System; using System.Collections.Generic; using System.ComponentModel; using Syst ...