在ELK的数据库报警系统中,发现有台机器报出了下面的错误:

2018-12-04 18:55:26.842 CST,"XXX","XXX",21106,"XXX",5c065c3d.5272,4,"idle",2018-12-04 18:51:41 CST,117/0,0,ERROR,54000,"out of memory","Cannot enlarge string buffer containing 0 bytes by 1342177281 more bytes.",,,,,,,"enlargeStringInfo, stringinfo.c:268",""

当看到是发生了OOM时,以为是整个数据库实例存在了问题,线上检查发现数据库正常,后查阅资料了解到,pg对于一次执行的查询语句长度是有限制的,如果长度超过了1G,则会报出上面的错误。

上面日志中的1342177281 bytes是查询的长度。

在使用copy的时候,也常会报出类似的问题,此时就要根据报错,查看对应的行数是不是由于引号或转义问题导致了对应行没有恰当的结束,或者是一整行的内容大于了1G。

下面是翻阅pg9.6源码找到的相关内容:

结合注释,pg的源码很容易看懂。

src/include/utils/memutils.h

/*
* MaxAllocSize, MaxAllocHugeSize
* Quasi-arbitrary limits on size of allocations.
*
* Note:
* There is no guarantee that smaller allocations will succeed, but
* larger requests will be summarily denied.
*
* palloc() enforces MaxAllocSize, chosen to correspond to the limiting size
* of varlena objects under TOAST. See VARSIZE_4B() and related macros in
* postgres.h. Many datatypes assume that any allocatable size can be
* represented in a varlena header. This limit also permits a caller to use
* an "int" variable for an index into or length of an allocation. Callers
* careful to avoid these hazards can access the higher limit with
* MemoryContextAllocHuge(). Both limits permit code to assume that it may
* compute twice an allocation's size without overflow.
*/
#define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */

src/backend/lib/stringinfo.c

/*
* enlargeStringInfo
*
* Make sure there is enough space for 'needed' more bytes
* ('needed' does not include the terminating null).
*
* External callers usually need not concern themselves with this, since
* all stringinfo.c routines do it automatically. However, if a caller
* knows that a StringInfo will eventually become X bytes large, it
* can save some palloc overhead by enlarging the buffer before starting
* to store data in it.
*
* NB: because we use repalloc() to enlarge the buffer, the string buffer
* will remain allocated in the same memory context that was current when
* initStringInfo was called, even if another context is now current.
* This is the desired and indeed critical behavior!
*/
void
enlargeStringInfo(StringInfo str, int needed)
{
int newlen; /*
* Guard against out-of-range "needed" values. Without this, we can get
* an overflow or infinite loop in the following.
*/
if (needed < 0) /* should not happen */
elog(ERROR, "invalid string enlargement request size: %d", needed);
if (((Size) needed) >= (MaxAllocSize - (Size) str->len))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("out of memory"),
errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
str->len, needed))); needed += str->len + 1; /* total space required now */ /* Because of the above test, we now have needed <= MaxAllocSize */ if (needed <= str->maxlen)
return; /* got enough space already */ /*
* We don't want to allocate just a little more space with each append;
* for efficiency, double the buffer size each time it overflows.
* Actually, we might need to more than double it if 'needed' is big...
*/
newlen = 2 * str->maxlen;
while (needed > newlen)
newlen = 2 * newlen; /*
* Clamp to MaxAllocSize in case we went past it. Note we are assuming
* here that MaxAllocSize <= INT_MAX/2, else the above loop could
* overflow. We will still have newlen >= needed.
*/
if (newlen > (int) MaxAllocSize)
newlen = (int) MaxAllocSize; str->data = (char *) repalloc(str->data, newlen); str->maxlen = newlen;
}

src/include/lib/stringinfo.h

下面是字符串存储用到的结构体:

/*-------------------------
* StringInfoData holds information about an extensible string.
* data is the current buffer for the string (allocated with palloc).
* len is the current string length. There is guaranteed to be
* a terminating '\0' at data[len], although this is not very
* useful when the string holds binary data rather than text.
* maxlen is the allocated size in bytes of 'data', i.e. the maximum
* string size (including the terminating '\0' char) that we can
* currently store in 'data' without having to reallocate
* more space. We must always have maxlen > len.
* cursor is initialized to zero by makeStringInfo or initStringInfo,
* but is not otherwise touched by the stringinfo.c routines.
* Some routines use it to scan through a StringInfo.
*-------------------------
*/
typedef struct StringInfoData
{
char *data;
int len;
int maxlen;
int cursor;
} StringInfoData; typedef StringInfoData *StringInfo;

从存放字符串或二进制的结构体StringInfoData中,可以看出pg字符串类型不支持\u0000的原因,因为在pg中的字符串形式是C strings,是以\0结束的字符串,\0在ASCII中叫做NUL,Unicode编码表示为\u0000,八进制则为0x00,如果字符串中包含\0,pg会当做字符串的结束符。

pg中的字符串不支持其中包含NULL(\0x00),这个很明显是不同于NULL值的,NULL值pg是支持的。

在具体的使用中,可以将\u0000替换掉再导入pg数据库。

在其他数据库导入pg时,可以使用下面方式替换:

regexp_replace(stringWithNull, '\\u0000', '', 'g')

java程序中替换:

str.replaceAll('\u0000', '')

vim替换:

s/\x00//g;

参考:

src/backend/lib/stringinfo.c

src/include/lib/stringinfo.h

src/include/utils/memutils.h

https://en.wikipedia.org/wiki/Null-terminated_string

https://stackoverflow.com/questions/1347646/postgres-error-on-insert-error-invalid-byte-sequence-for-encoding-utf8-0x0?rq=1

Cannot enlarge string buffer containing XX bytes by XX more bytes的更多相关文章

  1. ORA-06502:PL/SQL :numberic or value error: character string buffer too small

    今天遇到一个错误提示:ORA-06502:PL/SQL :numberic or value error: character string buffer too small,一般对应的中文信息为:O ...

  2. String、String Buffer、String Builder

    对于String.String Buffer.String Builder:我一直都只知道String是字符串常量,后两者是字符串变量: String和String Buffer是线程安全的,Stri ...

  3. String Buffer和String Builder(String类深入理解)

      String在Java里面JDK1.8后它属于一个特殊的类,在创建一个String基本对象的时候,String会向“ 字符串常量池(String constant pool)” 进行检索是否有该数 ...

  4. 中文转unicode,中文转bytes,unicode转bytes java实现

    utf-8 utf-8格式的中文由三位字节组成. UTF-8的编码规则很简单,只有二条: 1)对于单字节的符号,字节的第一位设为0,后面7位为这个符号的unicode码.因此对于英语字母,UTF-8编 ...

  5. Exception: Operation xx of contract xx specifies multiple request body parameters to be serialized without any wrapper elements.

    Operation 'CreateProductCodeStock' of contract 'IChileService' specifies multiple request body param ...

  6. Linux内核中的Kconfig、xx.defconfig、xx.config、Makefile

    什么是Kconfig.xx.defconfig.xx.config.Makefile Kconfig: 一个文本形式的文件,其中主要作用是在内核配置时候,作为配置选项. xx.deconfig: Li ...

  7. (转)JVM内存分配 -Xms128m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=512m

    在linux环境下配置项目运行环境时,部署的人员都会分配一下内存,以保证程序正常的运行.其实在开发的时候(window系统),就已经涉及到内存分配了,只是这些参数有默认值,因此一直没有去重视它. 以M ...

  8. eclipse不能自动编译XX.java为XX.classs

    问题描述:eclipse不能自动编译XX.java为XX.classs 原因:今天下午写代码,因为需要引入jstl包,引入后发现原来项目中已经引入了,然后我又把包删除了,忘记删除java build ...

  9. Python3.x:报错POST data should be bytes, an iterable of bytes

    Python3.x:报错POST data should be bytes, an iterable of bytes 问题: python3.x:报错 POST data should be byt ...

随机推荐

  1. (第十一周)Beta—review阶段成员贡献分

    项目名:食物链教学工具 组名:奋斗吧兄弟 组长:黄兴 组员:李俞寰.杜桥.栾骄阳.王东涵 个人贡献分=基础分+表现分 基础分=5*5*0.5/5=2.5 成员得分如下: 成员 基础分 表现分 个人贡献 ...

  2. (第十周)评论Beta发布

    本人所在组:奋斗吧兄弟 按课上展示的顺序对每组进行点评: 1.  飞天小女警 项目:礼物挑选工具 相对于alpha发布时有了很大的进步.项目的界面很漂亮,这个项目的想法很新颖,我很喜欢.礼物的挑选给出 ...

  3. java 实验1

    北京电子科技学院(BESTI) 实     验    报     告 课程:Java程序设计 班级:1352 姓名:杨光 学号:20135233 成绩:            指导教师:娄嘉鹏  实验 ...

  4. Javascript面向对象二

    Javascript面向对象二 可以通过指定原型属性来对所有的对象指定属性, Object.prototype.name="zhangsan"; Object.prototype. ...

  5. Beta Scrum Day 7 — 听说

    7#听说

  6. Android笔记-1

    1.点击按钮出现小窗口(响应事件) 配置方式: Activity_main.xml文件中:<Button (输入)android: onClick=”test1” /> MainActiv ...

  7. 微信小程序倒计时实现

    思路:跟一般js倒计时一样,主要在于this的变相传递. 实现效果: wxml文件部分代码: common.js文件 : 引用页JS文件: PS: 1.在data里初始化时间格式,是避免时间加载的第1 ...

  8. PAT 甲级 1151 LCA in a Binary Tree

    https://pintia.cn/problem-sets/994805342720868352/problems/1038430130011897856 The lowest common anc ...

  9. dstat 监控时,无颜色显示

    linux:Centos 6.6 dstat:0.7.0 注意,有这个提醒:Color support is disabled, python-curses is not installed good ...

  10. 使用docker-compose编排django、mysql实战

    背景: 本萌最近在部署自己开发的项目的时候发现同一套代码上传到服务器上后,部分功能莫名其妙的有点问题,服务器的各项配置都没有做过变动,所以想把项目转战到docker. 奈何刚接触docker,很多地方 ...