有时候,你会在ORACLE数据库的告警日志中发现“Thread <number> cannot allocate new log, sequence <number>  Checkpoint not complete”这类告警。具体案例如下所示:

Thread 1 cannot allocate new log, sequence 279334

Checkpoint not complete

Current log# 4 seq# 279333 mem# 0: /u01/oradata/GSP/redo04.log

Current log# 4 seq# 279333 mem# 1: /u03/oradata/GSP/redo04.log

当然Thread或sequence的数值可能有所不同,基本上是类似下面这样的告警信息

Thread <number> cannot allocate new log, sequence <number>

Checkpoint not complete

也有可能是因为在等待重做日志的归档,出现的是下面这类告警信息

ORACLE Instance <name> - Can not allocate log, archival required

Thread <number> cannot allocate new log, sequence <number>

那么出现这类告警的具体原因是什么呢? 以及要如何去解决这个问题呢?

原因分析:

通常来说是因为重做日志(redo log)在写满后就会切换日志组,这个时候就会触发一次检查点事件(checkpoint),检查点(checkpoint)激活时会触发数据库写进程(DBWR),将数据缓冲区里的脏数据块写回到磁盘的数据文件中,只要这个脏数据写回磁盘事件没结束,那么数据库就不会释放这个日志组。在归档模式下,还会伴随着ARCH进程将重做日志进行归档的过程。如果重做日志(redo log)产生的过快,当CPK或归档还没完成,LGWR已经把其余的日志组写满,又要往当前的日志组里面写redo log的时候,这个时候就会发生冲突,数据库就会被挂起。并且一直会往alert.log中写类似上面的错误信息。

另外,重做日志在不同业务时段的切换频率不一样,所以出现这个错误,一般是业务繁忙或者出现大量DML操作的时候。

解决方法:

 

1:增大REDO LOG FILE的大小

增大redo log file的大小容易操作,但是redo log file设置为多大才是合理的呢?

1:参考V$INSTANCE_RECOVERY中OPTIMAL_LOGFILE_SIZE字段值,但是这个字段有可能为Null值,除非你调整FAST_START_MTTR_TARGET参数的值大于0

Redo log file size (in megabytes) that is considered optimal based on the current setting of FAST_START_MTTR_TARGET. It is recommended that the user configure all online redo logs to be at least this value.

官方文档的建议如下:

You can use the V$INSTANCE_RECOVERY view column OPTIMAL_LOGFILE_SIZE to determine the size of your online redo logs. This field shows the redo log file size in megabytes that is considered optimal based on the current setting of FAST_START_MTTR_TARGET. If this field consistently shows a value greater than the size of your smallest online log, then you should configure all your online logs to be at least this size.

Note, however, that the redo log file size affects the MTTR. In some cases, you may be able to refine your choice of the optimal FAST_START_MTTR_TARGET value by re-running the MTTR Advisor with your suggested optimal log file size.

SQL> SELECT OPTIMAL_LOGFILE_SIZE FROM V$INSTANCE_RECOVERY;

2:根据重做日志切换次数和重做日志生成的量来判断

可以用awr_redo_size_history脚本统计分析一下,每个小时、每天生成的归档日志的大小,然后可以某些时间段(切换频繁的时间段)的归档日志大小和15~ 20分钟(如果某个时间段切换非常频繁,几乎无法使用这个规则,因为重组日志会非常大)切换一次计算重做日志大小。当然这个不是放之四海而皆准的规则,需要根据实际业务判断,大部分情况下还是可以参考这个

计算重做日志的一个脚本,仅供参考

SELECT

(SELECT ROUND(AVG(BYTES) / 1024 / 1024, 2) FROM V$LOG) AS "Redo size (MB)",

ROUND((20 / AVERAGE_PERIOD) * (SELECT AVG(BYTES)

FROM V$LOG) / 1024 / 1024, 2) AS "Recommended Size (MB)"

FROM (SELECT AVG((NEXT_TIME - FIRST_TIME) * 24 * 60) AS AVERAGE_PERIOD

FROM V$ARCHIVED_LOG

WHERE FIRST_TIME > SYSDATE - 3

    AND TO_CHAR(FIRST_TIME, 'HH24:MI') BETWEEN

    &START_OF_PEAK_HOURS AND &END_OF_PEAK_HOURS

);

2:增加REDO LOG Group的数量

增加日志组的数量,其实并不能解决“Thread <number> cannot allocate new log, sequence <number> Checkpoint not complete” 这个问题,但是他能解决下面这个问题:

ORACLE Instance <name> - Can not allocate log, archival required

Thread <number> cannot allocate new log, sequence <number>

这个是因为ARCH进程,尚未完成将重做日志文件复制到归档目标(需要存档),而此时由于重做日志切换太快或日志组过少,必须等待ARCR进程完成归档后,才能循环覆盖日志组。

3:Tune checkpoint

 

这个比较难,参考官方文档:Note 147468.1 Checkpoint Tuning and Troubleshooting Guide

4:Increase I/O speed for writing online REDO log/Archived REDO

This applies to Thread <number> cannot allocate new log, sequence <number>

Checkpoint not complete

- use ASYNC I/O if not already so

- use DBWR I/O slaves or multiple DBWR processes

Reference:

Oracle Database Performance Tuning Guide

Instance Tuning Using Performance Views

Consider Multiple Database Writer (DBWR) Processes or I/O Slaves

10.2 - http://docs.oracle.com/cd/B19306_01/server.102/b14211/instance_tune.htm#i42802

11.1 - http://docs.oracle.com/cd/B28359_01/server.111/b28274/instance_tune.htm#i42802

11.2 - http://docs.oracle.com/cd/E11882_01/server.112/e16638/instance_tune.htm#PFGRF94511

- consider the generic recommendations for REDO log files:

If the high I/O files are redo log files, then consider splitting the redo log files from the other files. Possible configurations can include the following:

1. Placing all redo logs on one disk without any other files. Also consider availability; members of the same group should be on different physical disks and controllers for recoverability purposes.

2. Placing each redo log group on a separate disk that does not store any other files.

3. Striping the redo log files across several disks, using an operating system striping tool. (Manual striping is not possible in this situation.)

4. Avoiding the use of RAID 5 for redo logs.

Reference:

Oracle Database Performance Tuning Guide

Redo Log Files

10.2 - http://docs.oracle.com/cd/B19306_01/server.102/b14211/iodesign.htm#sthref534

11.1 - http://docs.oracle.com/cd/B28359_01/server.111/b28274/iodesign.htm#CHDBCDHG

11.2 - http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#PFGRF94396

For

ORACLE Instance <name> - Can not allocate log, archival required

Thread <number> cannot allocate new log, sequence <number>

In the above document you may check section "Archived Redo Logs"

5: 找到产生大量重做日志的SQL,如果这个SQL有业务或逻辑上不合理的地方,就要修改,或者将相关表设置为NOLOGGING,减少重做日志的产生

关于如何定位那些SQL产生了大量的重做日志,可以使用LogMiner工具,也可以参考我这篇博客“如何定位那些SQL产生了大量的redo日志

参考资料:

https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:69012348056

Manual Log Switching Causing "Thread 1 Cannot Allocate New Log" Message in the Alert Log (文档 ID 435887.1)

Can Not Allocate Log (文档 ID 1265962.1)

https://gokhanatil.com/2009/08/optimum-size-of-the-online-redo-log-files.html

Thread <number> cannot allocate new log, sequence <number>浅析的更多相关文章

  1. Thread 1 cannot allocate new log, sequence 187398

    报错信息: Thread 1 cannot allocate new log, sequence 187398Checkpoint not complete 处理方法: 查看REDO日志组 selec ...

  2. Thread 1 cannot allocate new log的问题分析 (转载)

    Thread 1 cannot allocate new log的问题分析 发生oracle宕机事故,alert文件中报告如下错误: Fri Jan 12 04:07:49 2007Thread 1 ...

  3. Thread 1 cannot allocate new log 的处理办法

    ALTER SYSTEM ARCHIVE LOG Thread 1 cannot allocate new log, sequence 2594 Checkpoint not complete 这个实 ...

  4. InnoDB: The log sequence number in ibdata files does not match

    InnoDB: The log sequence number in ibdata files does not matchInnoDB的:在ibdata文件的日志序列号不匹配 可能ibdata文件损 ...

  5. mysql oom之后的page 447 log sequence number 292344272 is in the future

    mysql oom之后,重启时发生130517 16:00:10 InnoDB: Error: page 447 log sequence number 292344272InnoDB: is in ...

  6. Thread 1 cannot allocate new log的问题分析

    http://blog.csdn.net/zonelan/article/details/7613519 http://leoguan.blog.51cto.com/816378/584494 htt ...

  7. The log scan number (620023:3702:1) passed to log scan in database 'xxxx' is not valid

    昨天一台SQL Server 2008R2的数据库在凌晨5点多抛出下面告警信息: The log scan number (620023:3702:1) passed to log scan in d ...

  8. [crypto][ipsec] 简述ESP协议的sequence number机制

    预备 首先提及一个概念叫重放攻击,对应的机制叫做:anti-replay https://en.wikipedia.org/wiki/Anti-replay IPsec协议的anti-replay特性 ...

  9. Sequence Number

    1570: Sequence Number 时间限制: 1 Sec  内存限制: 1280 MB 题目描述 In Linear algebra, we have learned the definit ...

随机推荐

  1. React劲爆新特性Hooks 重构去哪儿网火车票PWA

    React劲爆新特性Hooks 重构去哪儿网火车票PWA 获取课程资料链接:点击这里获取 本课程先带你细数最近一年来React的新特性,如Hooks.Redux API,让你从头理解Hooks对传统R ...

  2. Java程序员月薪三万,需要技术达到什么水平?

    最近跟朋友在一起聚会的时候,提了一个问题,说 Java 程序员如何能月薪达到二万,技术水平需要达到什么程度?人回答说这只能是大企业或者互联网企业工程师才能拿到.也许是的,小公司或者非互联网企业拿二万的 ...

  3. C语言笔记 06_作用域&数组

    作用域 任何一种编程中,作用域是程序中定义的变量所存在的区域,超过该区域变量就不能被访问.C 语言中有三个地方可以声明变量: 在函数或块内部的局部变量 在所有函数外部的全局变量 在形式参数的函数参数定 ...

  4. 深度好文:PHP写时拷贝与垃圾回收机制(转)

    原文地址:http://www.php100.com/9/20/87255.html 写入拷贝(Copy-on-write,简称COW)是一种计算机程序设计领域的优化策略.其核心思想是,如果有多个调用 ...

  5. logback日志文件位置动态指定

    logback日志文件位置动态指定 参考:https://stackoverflow.com/questions/19518843/logback-configuration-via-jvm-argu ...

  6. Parallel.ForEach 使用多线遍历循环

    Parallel.ForEach相对于foreach是多线程,并行操作;foreach是单线程品德操作. static void Main(string[] args) { Console.Write ...

  7. layui table+复杂表头+合并单元格

    效果图: 问题:行hover效果感觉错乱  所以改为透明色 代码: <!DOCTYPE html> <html lang="en"> <head> ...

  8. Mac环境安装非APP STORE中下载的软件,运行报错:“XXX” is damaged and can’t be opened. You should move it to the Trash. 解决办法

    出现这个错误的大多数原因都是因为系统设置的问题,因为系统不信任你从其他地方下载的软件安装包,所以运行时就给你阻止了.具体的设置步骤如下: 1. 打开系统偏好设置 (System Preferences ...

  9. Data Guard Physical Standby - RAC Primary to RAC Standby 使用第二个网络 (Doc ID 1349977.1)

    Data Guard Physical Standby - RAC Primary to RAC Standby using a second network (Doc ID 1349977.1) A ...

  10. 检测服务器是否开启重协商功能(用于CVE-2011-1473漏洞检测)

    背景 由于服务器端的重新密钥协商的开销至少是客户端的10倍,因此攻击者可利用这个过程向服务器发起拒绝服务攻击.OpenSSL 1.0.2及以前版本受影响. 方法 使用OpenSSL(linux系统基本 ...