Introduction and man pages

Text version: http://www.pcre.org/original/pcre.txt

HTML version: http://www.pcre.org/original/doc/html/pcre.html

In addition to the Perl-compatible matching function, PCRE contains an alternative function that matches the same compiled patterns in a different way. 
In certain circumstances, the alternative function has some advantages. For a discussion of the two matching algorithms, see the pcrematching page.

pcrematching:

http://www.pcre.org/original/doc/html/pcrematching.html

Summary:

  0.  Does this mean batch processing, i.e. one pattern applied to several subjects?

The set of strings that are matched by a regular expression can be represented as a tree structure.

  1.  Jeffrey Friedl's book "Mastering Regular Expressions"

  Chinese edition (精通正则表达式): https://book.douban.com/subject/2154713/

  English edition (PDF): https://doc.lagout.org/Others/O%27Reilly%20-%20Mastering%20Regular%20Expressions.pdf

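  2.  PCRE offers two kinds of matching interface: the standard one (the pcre_exec(), pcre16_exec() and pcre32_exec() functions) and the alternative one (the pcre_dfa_exec(), pcre16_dfa_exec() and pcre32_dfa_exec() functions).

    The former returns a single match for a given subject string; the latter can return several matches from the same subject in one call, all starting at the same position.

    The match returned by the standard interface may be the longest, the shortest, or one of intermediate length, depending on whether the quantifiers in the pattern are greedy or ungreedy.

    The standard interface uses what Friedl calls an "NFA algorithm": it searches the tree depth-first, and the search order is controlled by greedy and ungreedy quantifiers.

    The alternative interface searches the tree breadth-first. In Friedl's terminology, this is a kind of "DFA algorithm", though it is not implemented as a traditional finite state machine (it keeps multiple states active simultaneously). The subject is scanned until its end is reached or there are no more paths left to follow. Every path that has terminated represents a match, so together they give all the matches, which are returned in order of decreasing length. The PCRE_DFA_SHORTEST option stops the scan at the first match found, which is necessarily the shortest.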

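  3.  Advantages of the alternative algorithm:

    a. It finds all possible matches, and in particular the longest one.

    b. A very long subject can be matched in several passes, one block of data at a time.

     Disadvantages of the alternative algorithm:

    a. It is slower than the standard algorithm.

    b. It does not support capturing substrings.

    c. Although atomic groups are supported, their use does not provide the performance advantage that it does for the standard algorithm.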
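
To make the difference between the two algorithms concrete, here is a minimal sketch in Python. It is not PCRE: Python's re module stands in for a backtracking (standard-style) matcher, and all_matches_from() is a hypothetical helper that simulates a hand-built NFA for the single pattern <.*> breadth-first, keeping every live state active at once. The subject string is made up for illustration.

```python
import re

SUBJECT = "see <a> and <b> here"

# Backtracking matcher: exactly one result, and which one depends on greediness.
greedy = re.search(r"<.*>", SUBJECT).group(0)   # longest candidate
lazy = re.search(r"<.*?>", SUBJECT).group(0)    # shortest candidate


def all_matches_from(subject, start):
    """Breadth-first simulation of a hand-built NFA for the pattern <.*>.

    Returns every match that begins at `start`, longest first.
    (Hypothetical helper for illustration; it handles only this one pattern.)
    """
    OPEN, INSIDE, ACCEPT = 0, 1, 2
    active = {OPEN}          # all states that are alive at the current position
    ends = []                # end offsets of every match seen so far
    for pos in range(start, len(subject)):
        ch = subject[pos]
        nxt = set()
        for state in active:
            if state == OPEN and ch == "<":
                nxt.add(INSIDE)
            elif state == INSIDE:
                if ch != "\n":           # '.' does not match a newline
                    nxt.add(INSIDE)
                if ch == ">":            # this path terminates: a match
                    nxt.add(ACCEPT)
        if ACCEPT in nxt:
            ends.append(pos + 1)
            nxt.discard(ACCEPT)          # ACCEPT has no outgoing transitions
        if not nxt:                      # no path left to follow
            break
        active = nxt
    return [subject[start:end] for end in reversed(ends)]


matches = all_matches_from(SUBJECT, SUBJECT.index("<"))
print(greedy)    # <a> and <b>
print(lazy)      # <a>
print(matches)   # ['<a> and <b>', '<a>']
```

The backtracking search commits to one path and reports one result, while the breadth-first scan never backtracks and collects every terminated path in a single pass over the subject.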

pcrejit:

http://www.pcre.org/original/doc/html/pcrejit.html

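Summary:

  JIT compiling is a heavyweight optimization: it spends extra processing up front in exchange for faster matching. It suits applications that compile a pattern once and then match with it many times.

  1. Only the standard PCRE interface is supported; the DFA matching mode is not.

  2. JIT support is not built into PCRE by default; add the --enable-jit option when configuring the build.

  3.  Only certain hardware platforms are supported: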

    ARM v5, v7, and Thumb2
    Intel x86 32-bit and 64-bit
    MIPS 32-bit
    Power PC 32-bit and 64-bit
    SPARC 32-bit (experimental)

  4.  The pcre_jit_exec() function was not available at all before release 8.32.

  5.  The JIT compiler generates different optimized code for each of the three modes (normal, soft partial, hard partial). When pcre_exec() is called, the appropriate code is run if it is available. Otherwise, the pattern is matched using interpretive code.

  6.  There are some pcre_exec() options that are not supported for JIT execution. There are also some pattern items that JIT cannot handle. Details are given below. In both cases, execution automatically falls back to the interpretive code.

  7.  Once a pattern has been studied, with or without JIT, it can be used as many times as you like for matching different subject strings.

  8.   The code that is generated by the JIT compiler is architecture-specific, and is also position dependent. For those reasons it cannot be saved (in a file or database) and restored later like the bytecode and other data of a compiled pattern.

  more info: http://www.pcre.org/original/doc/html/pcreprecompile.html

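  9.  Sometimes the JIT machine code is not generated successfully, yet pcre_exec() still works because it falls back to the interpretive code. In high-performance scenarios where we do not want to run the interpretive code, use the pcre_jit_exec() API instead.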

Because the API described above falls back to interpreted execution when JIT is not available, it is convenient for programs that are written 
for general use in many environments. However, calling JIT via pcre_exec() does have a performance impact. Programs that are written for use
where JIT is known to be available, and which need the best possible performance, can instead use a "fast path" API to call JIT execution directly
instead of calling pcre_exec() (obviously only for patterns that have been successfully studied by JIT).

  10.  pcre_exec() checks that its arguments are valid. pcre_jit_exec() skips these checks for speed, so if the arguments are invalid the result is undefined.
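
The fallback behaviour in points 5, 6 and 9 can be pictured with a toy model. This is only a sketch, not PCRE code: the class and method names are hypothetical, Python's re module plays the part of the interpretive code, and a plain substring search plays the part of the JIT code, which this model pretends is only available for patterns made of literal characters.

```python
import re


class StudiedPattern:
    """Toy model (hypothetical names): the interpreter is always present,
    the fast path only exists for some patterns."""

    def __init__(self, pattern):
        self.interpreter = re.compile(pattern)      # always available
        # Assumption of this model: only purely literal patterns get a fast path.
        is_literal = re.escape(pattern) == pattern
        self.fast_literal = pattern if is_literal else None

    def match_with_fallback(self, subject):
        """In the spirit of pcre_exec(): use the fast path if it exists,
        otherwise silently fall back to the interpreter."""
        if self.fast_literal is not None:
            return subject.find(self.fast_literal)
        m = self.interpreter.search(subject)
        return m.start() if m else -1

    def match_fast_only(self, subject):
        """In the spirit of pcre_jit_exec(): fast path only, no fallback.
        This model raises; it does not reproduce PCRE's actual error handling."""
        if self.fast_literal is None:
            raise RuntimeError("no fast-path code was generated for this pattern")
        return subject.find(self.fast_literal)


literal = StudiedPattern("needle")
wildcard = StudiedPattern("n..dle")

print(literal.match_fast_only("hay needle"))        # 4
print(wildcard.match_with_fallback("hay needle"))   # 4, via the interpreter
```

A program written for many environments would call only the fallback entry point; a program that knows the fast path exists can call it directly and skip the dispatch.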

API:

http://www.pcre.org/original/doc/html/pcreapi.html

Summary:

  1,

The functions pcre_compile(), pcre_compile2(), pcre_study(), and pcre_exec() are used for compiling and matching regular expressions in a Perl-compatible manner.

  2, compile a pattern

    http://www.pcre.org/original/doc/html/pcreapi.html#SEC11

  3,    studying a pattern

Studying a pattern does two things: first, a lower bound for the length of subject string that is needed to match the pattern is computed. This does not mean 
that there are any strings of that length that match, but it does guarantee that no shorter strings match. The value is used to avoid wasting time by trying
to match strings that are shorter than the lower bound.
Studying a pattern is also useful for non-anchored patterns that do not have a single fixed starting character. A bitmap of possible starting bytes is created.
This speeds up finding a position in the subject at which to start matching.

  4,   matching a pattern

However, it is possible to save compiled patterns and study data, and then use them later in different processes, possibly even on different hosts. 
For a discussion about this, see the pcreprecompile documentation.
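
The two study optimizations described in item 3 above (a lower bound on the subject length, and a bitmap of possible starting bytes) can be sketched for a very restricted case. The code below is an illustration under stated assumptions, not how PCRE computes them: the "pattern" is just a list of non-empty literal alternatives, a Python set stands in for the bitmap, and study() and search() are hypothetical names.

```python
def study(alternatives):
    """Return (min_len, start_bytes) for an alternation of non-empty literals.

    min_len     : no subject shorter than this can match
    start_bytes : every match must begin with one of these byte values
    """
    min_len = min(len(alt) for alt in alternatives)
    start_bytes = {alt[0] for alt in alternatives}
    return min_len, start_bytes


def search(alternatives, subject, studied):
    """Return the offset of the first match, or -1 if there is none."""
    min_len, start_bytes = studied
    if len(subject) < min_len:
        return -1                        # too short: do not even try
    # A match needs at least min_len bytes, so later start positions are useless.
    for pos in range(len(subject) - min_len + 1):
        if subject[pos] not in start_bytes:
            continue                     # cheap rejection of this start position
        if any(subject.startswith(alt, pos) for alt in alternatives):
            return pos
    return -1


ALTERNATIVES = [b"cat", b"horse", b"dog"]
STUDIED = study(ALTERNATIVES)            # done once, reused for every subject

print(search(ALTERNATIVES, b"hot dog", STUDIED))   # 4
print(search(ALTERNATIVES, b"ox", STUDIED))        # -1
```

As in the quoted text, the lower bound only guarantees that shorter subjects cannot match; a subject of that length may still fail, as b"hot" would here.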

Compare with PCRE2:

[development][PCRE] PCRE

trie:

https://zh.wikipedia.org/zh-hans/Trie

-------------------

Blog by 黑哥 (Heige): http://www.cnblogs.com/zzqcn/p/3525636.html

A very good write-up comparing PCRE, PCRE-JIT and Hyperscan: https://mp.weixin.qq.com/s?__biz=MzI3NDA4ODY4MA==&mid=2653334341&idx=1&sn=bf10ca6d8ca1452723b84a62f7fc436d&chksm=f0cb5cc2c7bcd5d4f423af8d78aeb58dd6d9494c1562b1e775579321df3b9f59a951656100d0&scene=21#wechat_redirect
