今天在看OGG的日志时。发现例如以下OGG-01224 Bad file number错误。查阅资料才知道port不可用,看了一下mgr的參数,发现是设置的DYNAMICPORTLIST 动态port。为什么还不可用。看看MOS上面咋说的:

OGG GoldenGate Extract | Pump
Abends with: "TCP/IP Error 9 (Bad File Number)" (文档 ID 1359087.1)
转究竟部

In this Document

_afrLoop=3534080417780040&id=1359087.1&_adf.ctrl-state=z21n9swk4_33#SYMPTOM" style="">Symptoms

 

_afrLoop=3534080417780040&id=1359087.1&_adf.ctrl-state=z21n9swk4_33#CAUSE" style="">Cause

  Solution
  References

APPLIES TO:

Oracle GoldenGate – Version 11.1.1.1.0 and later

Information in this document applies to any platform.

***Checked for relevance on 24-Mar-2013***

SYMPTOMS

Extract or pump abends with "TCP/IP error 9 (Bad file number)" when starting.

The exact error number may vary:

e.g.,

1. Version 11.1.1.1 12771498

ERROR OGG-01224 TCP/IP error 9 (Bad file number).

ERROR OGG-01668 PROCESS ABENDING.

2. pre GA release version:  v11_1_1_1_024

GGS ERROR 150 TCP/IP error 9 (Bad file number). 

GGS ERROR 190 PROCESS ABENDING.

Let's say you only have 4 Extract pump processes communicating with this target system, and have specified 21 dynamic ports on the target system. Why can't your Extract connect
to the remote system?

CAUSE

The cause is that all the ports listed in DYNAMICPORTLIST in the manager parameter at downstream server are in use.

This error message is usually due to port allocation failure or some orphan collector processes on target preventing the startup of new collectors.

In the GoldenGate environment on the target system, check the ports that Manager is using with this command:

GGSCI (ggdb1) 21> send mgr getportinfo detail

Sending GETPORTINFO, request to MANAGER …

Dynamic Port List

Starting Index 21

Reassign Delay 3 seconds

Entry Port Error Process Assigned Program

—– —– —– ———- ——————- ——-

0 7810 98

1 7811 98

2 7812 0

3 7813 0 18713 2011/07/28 20:39:12 Server

4 7814 0

5 7815 0 3662 2011/07/29 22:11:02 Server

6 7816 0 27070 2011/07/30 02:16:11 Server

7 7817 0 7789 2011/07/31 16:56:10 Server

8 7818 0 14116 2011/07/31 17:36:18 Server

9 7819 0 10900 2011/07/31 17:59:59 Server

10 7820 0 28045 2011/08/01 04:26:01 Server

11 7821 98

12 7822 0 31379 2011/08/01 05:29:31 Server

13 7823 0 23538 2011/08/02 10:21:54 Server

14 7824 0 23593 2011/08/02 10:22:01 Server

15 7825 0 6687 2011/08/03 07:17:50 Server

16 7826 0 17339 2011/08/03 07:22:01 Server

17 7827 0 24905 2011/08/03 08:51:51 Server

18 7828 98

19 7829 0 1881 2011/08/03 08:55:45 Server

20 7830 0 5884 2011/08/03 08:57:03 Server

It is reporting error 98 on the open ports from 7810-7828. The error 98 is "address in use".

You can also check the Manager report file on the target system (dirrpt/MGR.rpt) for an error message.

GGS INFO 301 Command received from EXTRACT on host 10.30.113.28 (START SERVER CPU -1 PRI -1 PARAMS ).

GGS INFO 302 No Dynamic Ports Available.

This confirms that the Manager cannot allocate anymore dynamic ports. Since there should be plenty of ports available, this indicates that there may be "orphaned" server collector
processes.

SOLUTION

The workaround is to increase the port numbers in DYNAMICPORTLIST in mgr.prm.

Bounce the manager afterwards.

If the source Extract dies without communicating to the target server collector, that server will be orphaned and must be killed. Development plans to improve this behavior in a future release (tracked via enhancement Bug 10430342), but until that time, they
must be handled manually in this manner.

You can determine which processes are orphans by stopping the upstream pumps and then seeing what Server processes are still running, and killing them. Once these servers are killed, you should be able restart all the pumps.

In ggsci on the target system:

send mgr childstatus debug

This will retrieve status information about processes started by Manager, and the corresponding port numbers that have been allocated by Manager.

GGSCI (ggdb1) 23> send mgr childstatus debug

Sending CHILDSTATUS, request to MANAGER …

Child Process Status – 14 Entries

ID Group Process Retry Retry Time Start Time Port

—- ——– ———- —– —————— ———– —-

0 PEESIS 27760 0 None 2011/08/04 09:30:45 7843

1 PPESIS 27767 0 None 2011/08/04 09:30:45 7845

2 SRESIP 31149 1 2011/08/04 09:44:32 2011/08/04 09:30:45 8003

3 PEASPIS 3177 2 2011/08/04 10:12:07 2011/08/04 09:30:45 8002

4 PPASPIS 27784 0 None 2011/08/04 09:30:45 7854

5 SRASPIP 27792 0 None 2011/08/04 09:30:45 7860

6 PEIPATIS 27798 0 None 2011/08/04 09:30:46 7861

8 SRIPATIP 27800 0 None 2011/08/04 09:30:46 8000

9 PEDTIS 28879 0 None 2011/08/04 09:30:53 8001

13 PETIBCOS 28959 0 None 2011/08/04 09:30:58 8004

19 SRDTIP 29051 0 None 2011/08/04 09:31:07 8005

20 SREEXIP 29060 0 None 2011/08/04 09:31:08 8006

21 SREXCIP 29097 0 None 2011/08/04 09:31:09 8007

22 SRIPIP 29098 0 None 2011/08/04 09:31:10 8008

You can also use this command to determine what server collector processes are running:

ps -ef | grep server

{ggate}sintegoradb1.aeso.ca:/usr/ggate/product/10.4.0/ggs >ps -ef | grep server

ggate 1881 7278 0 08:55 ? 00:00:02 ./server -p 7829 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 3556 1 0 Jul25 ?

00:01:57 ./server -p 7877 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 3558 1 0 Jul20 ?

00:03:51 ./server -p 7828 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 5884 7278 0 08:57 ?

00:00:02 ./server -p 7830 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

root 6312 1 0 Jul05 ? 00:00:00 /usr/bin/hidd –server

ggate 6687 7278 0 07:17 ? 00:00:04 ./server -p 7825 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 9121 1 0 Jul06 ? 00:07:34 ./server -p 7953 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

root 10924 1 0 Jul05 ?

00:00:00 /usr/libexec/gam_server

ggate 13060 1 0 Jul22 ? 00:02:35 ./server -p 7867 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 14345 1382 0 13:32 pts/4 00:00:00 grep server

ggate 17339 7278 0 07:22 ? 00:00:03 ./server -p 7826 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 18528 1 0 Jul06 ? 00:05:54 ./server -p 7967 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 23233 1 0 Jul06 ? 00:05:55 ./server -p 7965 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 23538 7278 0 Aug02 ? 00:00:14 ./server -p 7823 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 23593 7278 0 Aug02 ? 00:00:15 ./server -p 7824 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 24905 7278 0 08:51 ? 00:00:02 ./server -p 7827 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

ggate 31379 7278 0 Aug01 ? 00:00:28 ./server -p 7822 -k -l /usr/ggate/product/10.4.0/ggs/ggserr.log

In summary:

1) Stop the Manager on the target system, and all of the upstream GoldenGate pump Extract processes.

2) Kill any remaining server processes.

3) Restart the Manager, and then restart the Extracts.

REFERENCES

OGG-01224 Bad file number的更多相关文章

  1. 解决 TortoiseGit 诡异的 Bad file number 问题

    http://blog.csdn.net/renfufei/article/details/41648061 问题描述 昨天,以及今天(2014-11-29),使用 TortoiseGit 时碰到了一 ...

  2. GitHub上传不了的解决 ssh: connect to host github.com port 22: Bad file number git did not exit cleanly (exit code 128)

    问题情况 本来一直用的是github的客户端,结果现在上传的时候出问题了,去网站上看,新项目已经创建,但是代码却怎么都上传不上去.于是只好用命令行的方式解决. Tortoisegit上是这样说的: g ...

  3. 解决 TortoiseGit 诡异的 Bad file number 问题(转)

    问题描述 昨天,以及今天(2014-11-29),使用 TortoiseGit 时碰到了一个诡异的问题. 卸载,清理注册表,重装,重启,各种折腾以后,还是不能解决. 但是23.45分一过,突然灵光一闪 ...

  4. (转)解决 TortoiseGit 诡异的 Bad file number 问题

    此问题,请不要使用 rebase, 下载最新的 TortoiseGit 即可: TortoiseGit-2.3中文版与Git安装包_手册: http://download.csdn.net/detai ...

  5. Unable to open the physical file xxxx. Operating system error 2

    在新UAT服务器上,需要将tempdb放置在SSD(固态硬盘)上.由于SSD(固态硬盘)特性,所以tempdb的文件只能放置在D盘下面,而不能是D盘下的某一个目录下面. ALTER  DATABASE ...

  6. InnoDB: Error number 24 means ‘Too many open files’.--转载

    一.问题的描述 备份程序 执行前滚的时候报错.(-apply-log) InnoDB: Errornumber 24 means 'Too many open files'. InnoDB: Some ...

  7. Oracle中 根据 file# 和 block# 找到对象

    我们在10046生产的trace 文件里经常看到下面的信息. 表示系统在等待散列读取某个文件号的某个块开始的8个块. WAIT #6: nam='db file scattered read' ela ...

  8. 左右v$datafile和v$tempfile中间file#

    v$datafile关于存储在文件中的数据视图的信息,v$tempfile查看存储在一个临时文件中的信息. 有两种观点file#现场,首先来看看官方文件的定义: V$DATAFILE This vie ...

  9. Natas Wargame Level 13 Writeup(文件上传漏洞,篡改file signature,Exif)

    aaarticlea/png;base64,iVBORw0KGgoAAAANSUhEUgAAAqMAAADDCAYAAAC29BgbAAAABHNCSVQICAgIfAhkiAAAIABJREFUeF

随机推荐

  1. 定位- CLGeoencoder - 反编码

    #import "ViewController.h" #import "MBProgressHUD+MJ.h" #import <CoreLocation ...

  2. htm、html、shtml区别

    htm.html.shtml都是静态网页的后缀,三者也可以说都是只是扩展名不同,其他一样,都是静态的网页. htm和html是完全静态的网页不通过服务器编译解释直接送出给浏览器读取的静态网页,以htm ...

  3. 小Z的创业经历 谢谢支持

    写这篇文章的目的是跟大家分享下创业的一些想法,经历.希望对你有所帮助或有所思考.  我想用6篇文章介绍下前期创业经历 1.怎么创业了? 2.万事开头难,怎么开始呢? 3.我们的系统详情(上) 4.我们 ...

  4. JavaScript 语言基础知识点总结

    网上找到的一份JavaScript 语言基础知识点总结,还不错,挺全面的. (来自:http://t.cn/zjbXMmi @刘巍峰 分享 )  

  5. 【简译】JavaScript闭包导致的闭合变量问题以及解决方法

    本文是翻译此文 预先阅读此文:闭合循环变量时被认为有害的(closing over the loop variable considered harmful) JavaScript也有同样的问题.考虑 ...

  6. 同一個Loader對象傳入不同參數時,从数据库中查询的結果每次都一樣

    發現問題: LoaderManager().initLoader()方法調用時會根據第一個參數ID去判斷是否已經存在一個Loader加載器,如果存在則複 用,不存在則建一個新的加載器.由於我第一次已經 ...

  7. matlab使用reshape时按照列优先原则取元素和摆放元素

    参考 http://blog.csdn.net/superdont/article/details/3992033 a=[1  2 3  4] 如果使用b=reshape(a,1,4),则得到的结果是 ...

  8. WCF 托管在IIS中遇到Http的错误

    IIS8中部署WCF服务出错:HTTP 错误 404.3 - Not Found http://www.cnblogs.com/xwgli/archive/2013/03/15/2961022.htm ...

  9. Android webview中cookie增加/修改

    最近项目需求中,需要满足往webview传递cookie,而且cookie需要增加修改: public class MainActivity extends Activity { private We ...

  10. 提升你的Java应用性能:改善数据处理

    许多应用程序在压力测试阶段或在生产环境中都会遇到性能问题.如果我们看一下性能问题背后的原因,会发现很多是由数据处理不当造成.数据处理在应用面对大数据量时是非常关键的.这里有一些实用的数据处理技巧可以帮 ...