小知识:什么叫做workaround?
技术人当遇到具体问题,能给出的各种解决方案,有一种类型叫做workaround,翻译过来通常为“应变方法”、“变通方法”;
其实这种方式通常是没有找到根本的解决方案,但是为了快速恢复业务而采用的一种巧妙规避/跳过的方式。
举个具体的例子:我有测试需求要在主库创建一个新的PDB:
1.创建新的PDB
创建一个专门存放AWRDUMP的PDB
CREATE PLUGGABLE DATABASE awr
ADMIN USER awr IDENTIFIED BY awr
ROLES = (dba)
DEFAULT TABLESPACE tbs_awr
DATAFILE '/flash/oradata/DEMO/awr/awr01.dbf' SIZE 250M AUTOEXTEND ON maxsize 10G
FILE_NAME_CONVERT = ('/flash/oradata/DEMO/pdbseed/',
'/flash/oradata/DEMO/awr/')
STORAGE (MAXSIZE 2G)
PATH_PREFIX = '/flash/oradata/DEMO/awr/';
主库创建成功没问题。
2.发现DG备库出现同步问题
发现备库没有同步,排查alert日志发现因为这个awr的目录不会自动创建,导致报错,最终引发MRP进程终止,详细告警日志如下:
2023-03-15T18:03:53.650192+08:00
Recovery of Online Redo Log: Thread 1 Group 11 Seq 446 Reading mem 0
Mem# 0: +DATADG/DEMORAC/ONLINELOG/group_11.276.1127389115
PR00 (PID:11733): Media Recovery Waiting for T-1.S-447 (in transit)
2023-03-15T18:03:55.023213+08:00
Recovery of Online Redo Log: Thread 1 Group 13 Seq 447 Reading mem 0
Mem# 0: +DATADG/DEMORAC/ONLINELOG/group_13.278.1127389119
2023-03-15T18:38:28.999341+08:00
Recovery created pluggable database AWR
Automatic Copy of Standby datafiles for create pdb failed with error - 65169. Files need to be copied manually
2023-03-15T18:38:29.121847+08:00
Errors in file /u01/app/oracle/diag/rdbms/demorac/jydb1/trace/jydb1_pr00_11733.trc:
ORA-65169: error encountered while attempting to copy file +DATADG/DEMORAC/pdbseed/system01.dbf
ORA-19504: failed to create file "+DATADG/DEMORAC/awr/system01.dbf"
ORA-17502: ksfdcre:4 Failed to create file +DATADG/DEMORAC/awr/system01.dbf
ORA-15173: entry 'awr' does not exist in directory 'DEMORAC'
PR00 (PID:11733): MRP0: Background Media Recovery terminated with error 1274
2023-03-15T18:38:29.188919+08:00
Errors in file /u01/app/oracle/diag/rdbms/demorac/jydb1/trace/jydb1_pr00_11733.trc:
ORA-01274: cannot add data file that was originally created as '/flash/oradata/DEMO/awr/system01.dbf'
ORA-01565: error in identifying file '+DATADG/DEMORAC/awr/system01.dbf'
ORA-17503: ksfdopn:2 Failed to open file +DATADG/DEMORAC/awr/system01.dbf
ORA-15173: entry 'awr' does not exist in directory 'DEMORAC'
2023-03-15T18:38:29.195972+08:00
.... (PID:5407): Managed Standby Recovery not using Real Time Apply
2023-03-15T18:38:29.340321+08:00
Recovery interrupted!
IM on ADG: Start of Empty Journal
IM on ADG: End of Empty Journal
Recovered data files to a consistent state at change 23401386
2023-03-15T18:38:29.903857+08:00
Increasing priority of 2 RS
Reconfiguration started (old inc 6, new inc 8)
List of instances (total 2) :
1 2
My inst 1
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
2023-03-15T18:38:29.983551+08:00
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2023-03-15T18:38:29.986436+08:00
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
Set master node info
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Reconfiguration complete (total time 0.3 secs)
Decreasing priority of 2 RS
2023-03-15T18:38:30.492490+08:00
Stopping change tracking
2023-03-15T18:38:30.501521+08:00
Errors in file /u01/app/oracle/diag/rdbms/demorac/jydb1/trace/jydb1_pr00_11733.trc:
ORA-01274: cannot add data file that was originally created as '/flash/oradata/DEMO/awr/system01.dbf'
ORA-01565: error in identifying file '+DATADG/DEMORAC/awr/system01.dbf'
ORA-17503: ksfdopn:2 Failed to open file +DATADG/DEMORAC/awr/system01.dbf
ORA-15173: entry 'awr' does not exist in directory 'DEMORAC'
2023-03-15T18:38:30.524601+08:00
Background Media Recovery process shutdown (jydb1)
3.如何解决?
如果是找根本解决方案,那就是要研究为何无法自动创建成功,是否能够自动创建成功,具体的改进方案?
如果一时找不到方案,急于恢复同步,那其实手工创建这个目录,然后重新开启同步,就能快速解决,但这种解决方式就叫做workaround:
ASMCMD> pwd
+DATADG/DEMORAC
ASMCMD> mkdir AWR
SQL> recover managed standby database disconnect;
Media recovery complete.
SQL> set lines 180
SQL> col name for a22
SQL> col value for a22
SQL> select * from v$dataguard_stats;
SOURCE_DBID SOURCE_DB_ NAME VALUE UNIT TIME_COMPUTED DATUM_TIME CON_ID
----------- ---------- ---------------------- ---------------------- ------------------------------ ------------------------------ ------------------------------ ----------
0 transport lag +00 00:00:00 day(2) to second(0) interval 03/15/2023 20:23:51 03/15/2023 20:23:50 0
0 apply lag +00 00:00:00 day(2) to second(0) interval 03/15/2023 20:23:51 03/15/2023 20:23:50 0
0 apply finish time +00 00:00:00.000 day(2) to second(3) interval 03/15/2023 20:23:51 0
0 estimated startup time 18 second 03/15/2023 20:23:51 0
SQL> !ps -ef|grep mrp
oracle 27472 1 0 20:22 ? 00:00:00 ora_mrp0_jydb1
oracle 27677 27436 0 20:24 pts/2 00:00:00 /bin/bash -c ps -ef|grep mrp
oracle 27679 27677 0 20:24 pts/2 00:00:00 grep mrp
从alert日志也可以看到,快速解决了问题:
2023-03-15T20:22:38.685163+08:00
ALTER DATABASE RECOVER managed standby database disconnect
2023-03-15T20:22:38.720205+08:00
Attempt to start background Managed Standby Recovery process (jydb1)
Starting background process MRP0
2023-03-15T20:22:38.745104+08:00
MRP0 started with pid=76, OS id=27472
2023-03-15T20:22:38.747543+08:00
Background Managed Standby Recovery process started (jydb1)
2023-03-15T20:22:43.778604+08:00
Starting single instance redo apply (SIRA)
Started logmerger process
2023-03-15T20:22:43.859358+08:00
IM on ADG: Start of Empty Journal
IM on ADG: End of Empty Journal
2023-03-15T20:22:43.877044+08:00
.... (PID:5407): Managed Standby Recovery starting Real Time Apply
2023-03-15T20:22:43.940962+08:00
max_pdb is 5
2023-03-15T20:22:44.268320+08:00
Increasing priority of 2 RS
Reconfiguration started (old inc 8, new inc 10)
List of instances (total 2) :
1 2
My inst 1
Global Resource Directory frozen
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
2023-03-15T20:22:44.337010+08:00
LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2023-03-15T20:22:44.337017+08:00
LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
Set master node info
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
Reconfiguration complete (total time 0.2 secs)
Decreasing priority of 2 RS
2023-03-15T20:22:45.062909+08:00
Parallel Media Recovery started with 4 slaves
2023-03-15T20:22:45.260072+08:00
Stopping change tracking
PR00 (PID:27481): Media Recovery Waiting for T-1.S-447 (in transit)
2023-03-15T20:22:45.456197+08:00
Recovery of Online Redo Log: Thread 1 Group 13 Seq 447 Reading mem 0
Mem# 0: +DATADG/DEMORAC/ONLINELOG/group_13.278.1127389119
2023-03-15T20:22:45.548608+08:00
ALTER SYSTEM SET remote_listener='db01rac-scan:1521' SCOPE=MEMORY SID='jydb1';
2023-03-15T20:22:45.550060+08:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='jydb1';
2023-03-15T20:22:45.765532+08:00
Completed: ALTER DATABASE RECOVER managed standby database disconnect
2023-03-15T20:22:49.050096+08:00
Recovery copied files for tablespace SYSTEM
Recovery successfully copied file +DATADG/DEMORAC/awr/system01.dbf from +DATADG/DEMORAC/pdbseed/system01.dbf
AWR(5):Recovery created file +DATADG/DEMORAC/awr/system01.dbf
AWR(5):Successfully added datafile 21 to media recovery
AWR(5):Datafile #21: '+DATADG/DEMORAC/awr/system01.dbf'
2023-03-15T20:22:55.588650+08:00
Recovery copied files for tablespace SYSAUX
Recovery successfully copied file +DATADG/DEMORAC/awr/sysaux01.dbf from +DATADG/DEMORAC/pdbseed/sysaux01.dbf
AWR(5):Recovery created file +DATADG/DEMORAC/awr/sysaux01.dbf
AWR(5):Successfully added datafile 22 to media recovery
AWR(5):Datafile #22: '+DATADG/DEMORAC/awr/sysaux01.dbf'
2023-03-15T20:22:58.602144+08:00
Recovery copied files for tablespace UNDOTBS1
Recovery successfully copied file +DATADG/DEMORAC/awr/undotbs01.dbf from +DATADG/DEMORAC/pdbseed/undotbs01.dbf
AWR(5):Recovery created file +DATADG/DEMORAC/awr/undotbs01.dbf
AWR(5):Successfully added datafile 23 to media recovery
AWR(5):Datafile #23: '+DATADG/DEMORAC/awr/undotbs01.dbf'
2023-03-15T20:23:00.926349+08:00
AWR(5):Recovery created file +DATADG/DEMORAC/awr/awr01.dbf
AWR(5):Successfully added datafile 24 to media recovery
AWR(5):Datafile #24: '+DATADG/DEMORAC/awr/awr01.dbf'
最近高层给全员开会时提到要向前走一步,如果是落到技术岗位上,那就是要不满足workaround的解决方式,去找根本的解决方案。
实际情况当然要灵活变通,比如当生产环境需要快速恢复业务/功能时,可以使用各种workaround去应对确保快速恢复,但事后排查根本原因时还是要找到根本的解决方案。
所以,文中举的这个问题根本解决方案是?先留给大家一起思考吧。
小知识:什么叫做workaround?的更多相关文章
- 蓝牙Bluetooth技术小知识
蓝牙Bluetooth技术以及广泛的应用于各种设备,并将继续在物联网IoT领域担任重要角色.下面搜集整理了一些关于蓝牙技术的小知识,以备参考. 蓝牙Bluetooth技术始创于1994年,其名字来源于 ...
- HTML+CSS中的一些小知识
今天分享一些HTML.CSS的小知识,希望能够对大家有所帮助! 1.解决网页乱码的问题:最重要的是要保证各个环节的字符编码一致! (1)编辑器的编辑环境的字符集(默认字符集):Crtl+U 常见的编码 ...
- iOS APP开发的小知识(分享)
亿合科技小编发现从2007年第一款智能手机横空出世,由此开启了人们的移动智能时代.我们从一开始对APP的陌生,到现在的爱不释手,可见APP开发的出现对我们的生活改变有多巨大.而iOS AP ...
- Unix系统小知识(转)
Unix操作系统的小知识 2.VI添加行号/翻页/清屏 .在对话模式时(即输完Esc再输入: ),输入“:set number”可以将编辑的文本加上行号.跟玩俄罗斯方块一样方便的上下左右移动箭头的快捷 ...
- salesforce 零基础开发入门学习(十)IDE便捷小知识
在这里介绍两个IDE的便捷开发的小知识. 一) 本地调试 由于salesforce代码只能提交以后才能调试,所以很多时候调试代码很麻烦.新版增加了一个特性:即可以在本地调试相关的代码或者查看相关代码运 ...
- Jquery:小知识;
Jquery:小知识: jQuery学习笔记(二):this相关问题及选择器 上一节的遗留问题,关于this的相关问题,先来解决一下. this的相关问题 this指代的是什么 这个应该是比较好理 ...
- HTML小知识---Label
今天知道了一个html小知识: <input type="checkbox" id="chkVersion" /> ...
- Unicode和汉字编码小知识
Unicode和汉字编码小知识 将汉字进行UNICODE编码,如:“王”编码后就成了“\王”,UNICODE字符以\u开始,后面有4个数字或者字母,所有字符都是16进制的数字,每两位表示的256以内的 ...
- Java异常的一个小知识
有以下两个代码: package com.lk.A; public class Test3 { public static void main(String[] args) { try { int a ...
- 12个你未必知道的CSS小知识
虽然CSS并不是一种很复杂的技术,但就算你是一个使用CSS多年的高手,仍然会有很多CSS用法/属性/属性值你从来没使用过,甚至从来没听说过. 1.CSS的color属性并非只能用于文本显示 对于CSS ...
随机推荐
- vue-awesome-swiper swiper/dist/css/swiper.css 报not found错误
解决办法删除package-lock.json文件写死package.json版本号 "vue-awesome-swiper": "^3.1.3", 删除no ...
- slot-具名插槽
定义组件:NamedSlot组件 <div class=""> <header> <slot name="header">& ...
- COM组件开发-关于在开发环境下COM组件的(来自 HRESULT 的异常:0x80080005 (CO_E_SERVER_EXEC_FAILURE)) 以及 在CLR语言下可能报错 未能加载文件或程序集“Interop.xxx 的问题
1.关于在开发环境下COM组件的(来自 HRESULT 的异常:0x80080005 (CO_E_SERVER_EXEC_FAILURE)) 开发环境下,COM组件注册的文件 不一定是你自己现在程序调 ...
- /etc/profile,/etc/bashrc,~/.profile,~/.bashrc 的区别及使用
转载请注明出处: /etc/profile 为系统的全局环境变量设置,此文件为系统的每个用户设置环境信息 /etc/bashrc 为每一个运行bash shell的用户执行此文件.当bash ...
- kafka 性能优化与常见问题优化处理方案
本文为博主原创,未经允许不得转载: 1. JVM参数优化设置 kafka是scala语言开发,运行在JVM上,需要对JVM参数合理设置,修改bin/kafka-start-server.sh中的jv ...
- Postman 接口测试配置 Pre-request Script
本文为博主原创,转载请注明出处: Pre-request Script 为Postman预置脚本,用于在postman 发送请求之前执行,封装计算或获取某些请求参数. 1. postman 脚本提供 ...
- docker容器中执行GPU环境中的tensorflow和pytorch任务
1. 背景 (1) 业务方提供了一台有GPU的服务器,且已经安装了显卡等组件,cuda版本10.2,具体信息如下 (2) 在裸机上部署anaconda.pytorch.tensorflow较为麻烦,因 ...
- 12-异步FIFO
1.异步FIFO的应用 跨时钟域 批量数据 传输效率高 2.异步FIFO结构 FIFO深度 - 双端口RAM设计 3.异步FIFO深度计算 4.异步FIFO读写地址的编码 5.异步FIFO读写时钟域的 ...
- P1914 小书童——凯撒密码
1.题目介绍 小书童--凯撒密码 题目背景 某蒟蒻迷上了 "小书童",有一天登陆时忘记密码了(他没绑定邮箱 or 手机),于是便把问题抛给了神犇你. 题目描述 蒟蒻虽然忘记密码,但 ...
- [转帖]MySQL Decimal 的实现方法
码: 背景 数字运算在数据库中是很常见的需求, 例如计算数量.重量.价格等, 为了满足各种需求, 数据库系统通常支持精准的数字类型和近似的数字类型. 精准的数字类型包含 int, decimal 等, ...