在ELK的数据库报警系统中,发现有台机器报出了下面的错误:

2018-12-04 18:55:26.842 CST,"XXX","XXX",21106,"XXX",5c065c3d.5272,4,"idle",2018-12-04 18:51:41 CST,117/0,0,ERROR,54000,"out of memory","Cannot enlarge string buffer containing 0 bytes by 1342177281 more bytes.",,,,,,,"enlargeStringInfo, stringinfo.c:268",""

当看到是发生了OOM时,以为是整个数据库实例存在了问题,线上检查发现数据库正常,后查阅资料了解到,pg对于一次执行的查询语句长度是有限制的,如果长度超过了1G,则会报出上面的错误。

上面日志中的1342177281 bytes是查询的长度。

在使用copy的时候,也常会报出类似的问题,此时就要根据报错,查看对应的行数是不是由于引号或转义问题导致了对应行没有恰当的结束,或者是一整行的内容大于了1G。

下面是翻阅pg9.6源码找到的相关内容:

结合注释,pg的源码很容易看懂。

src/include/utils/memutils.h

/*
* MaxAllocSize, MaxAllocHugeSize
* Quasi-arbitrary limits on size of allocations.
*
* Note:
* There is no guarantee that smaller allocations will succeed, but
* larger requests will be summarily denied.
*
* palloc() enforces MaxAllocSize, chosen to correspond to the limiting size
* of varlena objects under TOAST. See VARSIZE_4B() and related macros in
* postgres.h. Many datatypes assume that any allocatable size can be
* represented in a varlena header. This limit also permits a caller to use
* an "int" variable for an index into or length of an allocation. Callers
* careful to avoid these hazards can access the higher limit with
* MemoryContextAllocHuge(). Both limits permit code to assume that it may
* compute twice an allocation's size without overflow.
*/
#define MaxAllocSize ((Size) 0x3fffffff) /* 1 gigabyte - 1 */

src/backend/lib/stringinfo.c

/*
* enlargeStringInfo
*
* Make sure there is enough space for 'needed' more bytes
* ('needed' does not include the terminating null).
*
* External callers usually need not concern themselves with this, since
* all stringinfo.c routines do it automatically. However, if a caller
* knows that a StringInfo will eventually become X bytes large, it
* can save some palloc overhead by enlarging the buffer before starting
* to store data in it.
*
* NB: because we use repalloc() to enlarge the buffer, the string buffer
* will remain allocated in the same memory context that was current when
* initStringInfo was called, even if another context is now current.
* This is the desired and indeed critical behavior!
*/
void
enlargeStringInfo(StringInfo str, int needed)
{
int newlen; /*
* Guard against out-of-range "needed" values. Without this, we can get
* an overflow or infinite loop in the following.
*/
if (needed < 0) /* should not happen */
elog(ERROR, "invalid string enlargement request size: %d", needed);
if (((Size) needed) >= (MaxAllocSize - (Size) str->len))
ereport(ERROR,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("out of memory"),
errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
str->len, needed))); needed += str->len + 1; /* total space required now */ /* Because of the above test, we now have needed <= MaxAllocSize */ if (needed <= str->maxlen)
return; /* got enough space already */ /*
* We don't want to allocate just a little more space with each append;
* for efficiency, double the buffer size each time it overflows.
* Actually, we might need to more than double it if 'needed' is big...
*/
newlen = 2 * str->maxlen;
while (needed > newlen)
newlen = 2 * newlen; /*
* Clamp to MaxAllocSize in case we went past it. Note we are assuming
* here that MaxAllocSize <= INT_MAX/2, else the above loop could
* overflow. We will still have newlen >= needed.
*/
if (newlen > (int) MaxAllocSize)
newlen = (int) MaxAllocSize; str->data = (char *) repalloc(str->data, newlen); str->maxlen = newlen;
}

src/include/lib/stringinfo.h

下面是字符串存储用到的结构体:

/*-------------------------
* StringInfoData holds information about an extensible string.
* data is the current buffer for the string (allocated with palloc).
* len is the current string length. There is guaranteed to be
* a terminating '\0' at data[len], although this is not very
* useful when the string holds binary data rather than text.
* maxlen is the allocated size in bytes of 'data', i.e. the maximum
* string size (including the terminating '\0' char) that we can
* currently store in 'data' without having to reallocate
* more space. We must always have maxlen > len.
* cursor is initialized to zero by makeStringInfo or initStringInfo,
* but is not otherwise touched by the stringinfo.c routines.
* Some routines use it to scan through a StringInfo.
*-------------------------
*/
typedef struct StringInfoData
{
char *data;
int len;
int maxlen;
int cursor;
} StringInfoData; typedef StringInfoData *StringInfo;

从存放字符串或二进制的结构体StringInfoData中,可以看出pg字符串类型不支持\u0000的原因,因为在pg中的字符串形式是C strings,是以\0结束的字符串,\0在ASCII中叫做NUL,Unicode编码表示为\u0000,八进制则为0x00,如果字符串中包含\0,pg会当做字符串的结束符。

pg中的字符串不支持其中包含NULL(\0x00),这个很明显是不同于NULL值的,NULL值pg是支持的。

在具体的使用中,可以将\u0000替换掉再导入pg数据库。

在其他数据库导入pg时,可以使用下面方式替换:

regexp_replace(stringWithNull, '\\u0000', '', 'g')

java程序中替换:

str.replaceAll('\u0000', '')

vim替换:

s/\x00//g;

参考:

src/backend/lib/stringinfo.c

src/include/lib/stringinfo.h

src/include/utils/memutils.h

https://en.wikipedia.org/wiki/Null-terminated_string

https://stackoverflow.com/questions/1347646/postgres-error-on-insert-error-invalid-byte-sequence-for-encoding-utf8-0x0?rq=1

Cannot enlarge string buffer containing XX bytes by XX more bytes的更多相关文章

  1. ORA-06502:PL/SQL :numberic or value error: character string buffer too small

    今天遇到一个错误提示:ORA-06502:PL/SQL :numberic or value error: character string buffer too small,一般对应的中文信息为:O ...

  2. String、String Buffer、String Builder

    对于String.String Buffer.String Builder:我一直都只知道String是字符串常量,后两者是字符串变量: String和String Buffer是线程安全的,Stri ...

  3. String Buffer和String Builder(String类深入理解)

      String在Java里面JDK1.8后它属于一个特殊的类,在创建一个String基本对象的时候,String会向“ 字符串常量池(String constant pool)” 进行检索是否有该数 ...

  4. 中文转unicode,中文转bytes,unicode转bytes java实现

    utf-8 utf-8格式的中文由三位字节组成. UTF-8的编码规则很简单,只有二条: 1)对于单字节的符号,字节的第一位设为0,后面7位为这个符号的unicode码.因此对于英语字母,UTF-8编 ...

  5. Exception: Operation xx of contract xx specifies multiple request body parameters to be serialized without any wrapper elements.

    Operation 'CreateProductCodeStock' of contract 'IChileService' specifies multiple request body param ...

  6. Linux内核中的Kconfig、xx.defconfig、xx.config、Makefile

    什么是Kconfig.xx.defconfig.xx.config.Makefile Kconfig: 一个文本形式的文件,其中主要作用是在内核配置时候,作为配置选项. xx.deconfig: Li ...

  7. (转)JVM内存分配 -Xms128m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=512m

    在linux环境下配置项目运行环境时,部署的人员都会分配一下内存,以保证程序正常的运行.其实在开发的时候(window系统),就已经涉及到内存分配了,只是这些参数有默认值,因此一直没有去重视它. 以M ...

  8. eclipse不能自动编译XX.java为XX.classs

    问题描述:eclipse不能自动编译XX.java为XX.classs 原因:今天下午写代码,因为需要引入jstl包,引入后发现原来项目中已经引入了,然后我又把包删除了,忘记删除java build ...

  9. Python3.x:报错POST data should be bytes, an iterable of bytes

    Python3.x:报错POST data should be bytes, an iterable of bytes 问题: python3.x:报错 POST data should be byt ...

随机推荐

  1. Python基础_内置函数

        Built-in Functions     abs() delattr() hash() memoryview() set() all() dict() help() min() setat ...

  2. JS特效@缓动框架封装及应用

    | 版权声明:本文为博主原创文章,未经博主允许不得转载. 一.变量CSS样式属性获取/赋值方法 给属性赋值:(既能获取又能赋值) 1)div.style.width 单个赋值:点语法,这个方法比较固定 ...

  3. Daily Scrum (2015/10/28)

    昨天DEV们完成了一部分代码风格的修整.今晚在与其他组进行交流时我们发现我们的代码是需要在服务器上运行的,而且服务器是需要配置的,而且据说需要花一些时间.所以在编写代码之前PM提出我们应该先把服务器搭 ...

  4. Linux第一二章笔记

    第一章 Linux内核简介 1. Unix内核的特点 简洁:仅提供系统调用并有一个非常明确的设计目的 抽象:几乎所有东西都被当做文件 可移植性:使用C语言编写,使得其在各种硬件体系架构面前都具备令人惊 ...

  5. javascript中的call(),apply(),bind()方法的区别

    之前一直迷惑,记不住call(),apply(),bind()的区别.不知道如何使用,一直处于懵懂的状态.直到有一天面试被问到了这三个方法的区别,所以觉得很有必要总结一下. 如果有不全面的地方,后续再 ...

  6. 关于Target=" "的一些属性

    总结目前所知的Target=""的几个属性,对其它博客进行整理和归纳. Target="_blank":浏览器总在一个新打开.未命名的窗口中载入目标文档. 说通 ...

  7. ASP.NET中实现封装与策略模式

    首先把运算方法封装起来,这样在网页界面中直接就可以调用了,不过是换张脸而已! using System; using System.Collections.Generic; using System. ...

  8. 软工实践-Beta 冲刺 (2/7)

    队名:起床一起肝活队 组长博客:博客链接 作业博客:班级博客本次作业的链接 组员情况 组员1(队长):白晨曦 过去两天完成了哪些任务 描述: 1.界面的修改与完善 展示GitHub当日代码/文档签入记 ...

  9. 项目Beta冲刺(团队)第三天

    1.昨天的困难 记住密码打勾之后点击登录记住密码这四个字会变成省略号 点赞点击以后本应该呈现的爱心形状变成了方块 2.今天解决的进度 成员 进度 陈家权 私信模块探索ing,回复详情界面设计 赖晓连 ...

  10. 剑指offer:用两个栈实现队列

    题目描述: 用两个栈来实现一个队列,完成队列的Push和Pop操作. 队列中的元素为int类型. 思路: 可以用stack1来存所有入队的数.在出队操作中,首先将stack1中的元素清空,转移到sta ...