21. Regular Expressions--from Apache
Reposted from:
http://jmeter.apache.org/usermanual/regular_expressions.html
21.1 Overview
JMeter includes the pattern matching software Apache Jakarta ORO. There is some documentation for this on the Jakarta web-site, for example a summary of the pattern matching characters.
There is also documentation on an older incarnation of the product at OROMatcher User's guide, which might prove useful.
The pattern matching is very similar to the pattern matching in Perl. A full installation of Perl will include plenty of documentation on regular expressions - look for perlrequick, perlretut, perlre and perlreref.
It is worth stressing the difference between "contains" and "matches", as used on the Response Assertion test element:
- "contains"
- means that the regular expression matched at least some part of the target, so 'alphabet' "contains" 'ph.b.' because the regular expression matches the substring 'phabe'.
- "matches"
- means that the regular expression matched the whole target. So 'alphabet' is "matched" by 'al.*t'.
In this case, it is equivalent to wrapping the regular expression in ^ and $, viz '^al.*t$'.
However, this is not always the case. For example, the regular expression 'alp|.lp.*' is "contained" in 'alphabet', but does not "match" 'alphabet'.
Why? Because when the pattern matcher finds the sequence 'alp' in 'alphabet', it stops trying any other combinations - and 'alp' is not the same as 'alphabet', as it does not include 'habet'.
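The "contains" versus "matches" distinction can be tried out with a minimal sketch. It uses Python's re module as a stand-in for ORO (an assumption; the source names no programming language), with re.search playing the role of "contains" and re.fullmatch the role of "matches". Python's fullmatch backtracks into later alternatives, so it is not a faithful model of the 'alp|.lp.*' case described above and that case is left out.

```python
import re

target = "alphabet"

# "contains": the pattern only has to match some substring of the target.
contains = re.search(r"ph.b.", target)
print(contains.group(0))  # phabe

# "matches": the pattern has to account for the whole target.
whole = re.fullmatch(r"al.*t", target)
print(whole is not None)  # True

# A pattern can be "contained" without "matching" the whole target.
partial = re.fullmatch(r"ph.b.", target)
print(partial is None)  # True
```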
Unlike in Perl, the regular expression is not enclosed in //. So how does one use the modifiers i, s, m, x etc. if there is no trailing /? The solution is to use extended regular expressions, i.e. /abc/i becomes (?i)abc. See also Placement of modifiers below.
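As a rough illustration of the embedded modifier form, the sketch below uses Python's re module (an assumption, standing in for ORO), which accepts the same (?i) syntax at the start of a pattern. The sample text is made up.

```python
import re

text = "xxABCxx"  # made-up sample text

# Without a modifier the match is case sensitive, so nothing is found.
case_sensitive = re.search(r"abc", text)

# With the embedded (?i) modifier the same pattern ignores case.
case_insensitive = re.search(r"(?i)abc", text)

print(case_sensitive)             # None
print(case_insensitive.group(0))  # ABC
```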
21.2 Examples
Extract single string
Suppose you want to match the following portion of a web-page:
name="file" value="readme.txt">
and you want to extract readme.txt. A suitable regular expression would be:
name="file" value="(.+?)">
The special characters above are:
- ( and )
- these enclose the portion of the match string to be returned
- .
- match any character
- +
- one or more times
- ?
- don't be greedy, i.e. stop when first match succeeds
Note: without the ?, the .+ would continue past the first "> until it found the last possible "> - which is probably not what was intended.
Note: although the above expression works, it's more efficient to use the following expression:
name="file" value="([^"]+)">
where [^"] means match anything except ". In this case, the matching engine can stop looking as soon as it sees the first ", whereas in the previous case the engine has to check that it has found "> rather than, say, " >.
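The difference between the greedy, non-greedy and negated-class forms can be seen in a small sketch. It uses Python's re module in place of ORO (an assumption), and the html string is invented sample data with a second "> later on the line.

```python
import re

# Invented sample: the fragment from the text, followed by another tag.
html = '<input name="file" value="readme.txt"> <input name="other" value="x">'

# Non-greedy: stops at the first "> after the value.
lazy = re.search(r'name="file" value="(.+?)">', html).group(1)

# Greedy: runs on to the last possible "> on the line.
greedy = re.search(r'name="file" value="(.+)">', html).group(1)

# Negated character class: cannot run past the closing quote.
negated = re.search(r'name="file" value="([^"]+)">', html).group(1)

print(lazy)     # readme.txt
print(greedy)   # readme.txt"> <input name="other" value="x
print(negated)  # readme.txt
```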
Extract multiple strings
Suppose you want to match the following portion of a web-page:
name="file.name" value="readme.txt"
and you want to extract both file.name and readme.txt. A suitable regular expression would be:
name="([^"]+)" value="([^"]+)"
This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as $1$ and $2$.
The JMeter Regex Extractor saves the values of the groups in additional variables.
For example, assume:
- Reference Name: MYREF
- Regex: name="(.+?)" value="(.+?)"
- Template: $1$$2$
The following variables would be set:
- MYREF
- file.namereadme.txt
- MYREF_g0
- name="file.name" value="readme.txt"
- MYREF_g1
- file.name
- MYREF_g2
- readme.txt
These variables can be referred to later on in the JMeter test plan, as ${MYREF}, ${MYREF_g1} etc.
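The variable naming above can be imitated in a few lines. This is only a sketch in Python, not JMeter's implementation: the extract function is a hypothetical helper, and it covers just the variables listed above.

```python
import re

def extract(ref_name, regex, template, text):
    """Hypothetical helper that imitates the variables listed above."""
    m = re.search(regex, text)
    if m is None:
        return {}
    # Replace each $n$ in the template with the text of group n.
    value = re.sub(r"\$(\d+)\$", lambda t: m.group(int(t.group(1))), template)
    variables = {ref_name: value}
    # Group 0 is the whole match; groups 1..n are the parenthesised parts.
    for i in range(m.re.groups + 1):
        variables[f"{ref_name}_g{i}"] = m.group(i)
    return variables

page = 'name="file.name" value="readme.txt"'
result = extract("MYREF", r'name="(.+?)" value="(.+?)"', "$1$$2$", page)
for key, val in result.items():
    print(key, "=", val)
```

With the sample input, result["MYREF"] is file.namereadme.txt and result["MYREF_g1"] is file.name, matching the list above.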
21.3 Line mode
The pattern matching behaves in various slightly different ways, depending on the setting of the multi-line and single-line modifiers. Note that the single-line and multi-line operators have nothing to do with each other; they can be specified independently.
Single-line mode
Single-line mode only affects how the '.' meta-character is interpreted.
Default behaviour is that '.' matches any character except newline. In single-line mode, '.' also matches newline.
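A short sketch of single-line mode, using Python's re module as a stand-in for ORO (an assumption). The (?s) modifier has the same meaning there; the sample text is made up.

```python
import re

text = "first line\nsecond line"  # made-up two-line sample

# Default: '.' does not match the newline, so the pattern cannot span lines.
default = re.search(r"first.*second", text)

# Single-line mode: '.' also matches the newline.
single_line = re.search(r"(?s)first.*second", text)

print(default)                     # None
print(repr(single_line.group(0)))  # 'first line\nsecond'
```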
Multi-line mode
Multi-line mode only affects how the meta-characters '^' and '$' are interpreted.
Default behaviour is that '^' and '$' only match at the very beginning and end of the string. When Multi-line mode is used, the '^' metacharacter matches at the beginning of every line, and the '$' metacharacter matches at the end of every line.
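Multi-line mode can be sketched the same way, again with Python's re module standing in for ORO (an assumption) and a made-up three-line sample.

```python
import re

text = "alpha\nbeta\ngamma"  # made-up three-line sample

# Default: '^' and '$' only match at the very beginning and end of the string.
default_starts = re.findall(r"^\w+", text)
default_ends = re.findall(r"\w+$", text)

# Multi-line mode: '^' and '$' match at the start and end of every line.
multi_starts = re.findall(r"(?m)^\w+", text)
multi_ends = re.findall(r"(?m)\w+$", text)

print(default_starts, default_ends)  # ['alpha'] ['gamma']
print(multi_starts)                  # ['alpha', 'beta', 'gamma']
print(multi_ends)                    # ['alpha', 'beta', 'gamma']
```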
21.4 Meta characters
Regular expressions use certain characters as meta characters - these characters have a special meaning to the RE engine. Such characters must be escaped by preceding them with \ (backslash) in order to treat them as ordinary characters. Here is a list of the meta characters and their meaning (please check the ORO documentation if in doubt).
- ( and )
- grouping
- [ and ]
- character classes
- { and }
- repetition
- *, + and ?
- repetition
- .
- wild-card character
- \
- escape character
- |
- alternatives
- ^ and $
- start and end of string or line
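The effect of escaping a meta character can be sketched as follows, using Python's re module in place of ORO (an assumption). The sample strings are made up, and re.escape is a Python convenience, not something the source mentions.

```python
import re

# Unescaped, '.' is a wild-card, so it also matches the 'x' in "3x50".
wildcard = re.search(r"3.50", "3x50")

# Escaped with a backslash, '.' is an ordinary full stop.
literal_miss = re.search(r"3\.50", "3x50")
literal_hit = re.search(r"3\.50", "3.50")

# Python's re.escape adds the backslashes for every meta character in a string.
escaped = re.escape("a+b(c)")
whole = re.fullmatch(escaped, "a+b(c)")

print(wildcard is not None)     # True
print(literal_miss is None)     # True
print(literal_hit is not None)  # True
print(whole is not None)        # True
```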
The following Perl5 extended regular expressions are supported by ORO.
- (?#text)
- An embedded comment causing text to be ignored.
- (?:regexp)
- Groups things like "()" but doesn't cause the group match to be saved.
- (?=regexp)
- A zero-width positive lookahead assertion. For example, \w+(?=\s) matches a word followed by whitespace, without including whitespace in the MatchResult.
- (?!regexp)
- A zero-width negative lookahead assertion. For example foo(?!bar) matches any occurrence of "foo" that isn't followed by "bar". Remember that this is a zero-width assertion, which means that a(?!b)d will match ad because a is followed by a character that is not b (the d) and a d follows the zero-width assertion.
- (?imsx)
- One or more embedded pattern-match modifiers. i enables case insensitivity, m enables multiline treatment of the input, s enables single line treatment of the input, and x enables extended whitespace comments.
Note that (?<=regexp) - lookbehind - is not supported.
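The extended constructs listed above can be tried in a short sketch. It uses Python's re module as a stand-in for ORO (an assumption); Python accepts the same syntax for these constructs, and the sample strings are made up.

```python
import re

# (?=regexp): match a word followed by whitespace, without the whitespace.
word = re.search(r"\w+(?=\s)", "hello world").group(0)

# (?!regexp): only the "foo" that is not followed by "bar" is matched.
foo_positions = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# The assertion is zero-width: the 'd' both satisfies (?!b) and is then matched.
ad = re.search(r"a(?!b)d", "ad")

# (?:regexp): groups without saving, so only (c) is returned as a group.
saved_groups = re.search(r"(?:ab)+(c)", "ababc").groups()

# (?#text): an embedded comment that is ignored by the matcher.
commented = re.search(r"ab(?#this text is ignored)c", "abc")

print(word)           # hello
print(foo_positions)  # [7]
print(ad.group(0))    # ad
print(saved_groups)   # ('c',)
```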
21.5 Placement of modifiers
Modifiers can be placed anywhere in the regex, and apply from that point onwards. [A bug in ORO means that they cannot be used at the very end of the regex. However they would have no effect there anyway.]
The single-line (?s) and multi-line (?m) modifiers are normally placed at the start of the regex.
The ignore-case modifier (?i) may be usefully applied to just part of a regex, for example:
Match ExAct case or (?i)ArBiTrARY(?-i) case
would match Match ExAct case or arbitrary case as well as Match ExAct case or ARBitrary case, but not Match exact case or ArBiTrARY case.
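A sketch of the same idea in Python's re module, which stands in for ORO here (an assumption). Python does not accept (?i) ... (?-i) in the middle of a pattern, so the sketch uses Python's scoped form (?i:...) instead. It is a swapped-in equivalent, not the ORO syntax shown above.

```python
import re

# Only the part inside (?i:...) ignores case; the rest is matched exactly.
pattern = r"Match ExAct case or (?i:ArBiTrARY) case"

ok_lower = re.fullmatch(pattern, "Match ExAct case or arbitrary case")
ok_mixed = re.fullmatch(pattern, "Match ExAct case or ARBitrary case")
wrong_prefix = re.fullmatch(pattern, "Match exact case or ArBiTrARY case")

print(ok_lower is not None)  # True
print(ok_mixed is not None)  # True
print(wrong_prefix is None)  # True
```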
21.6 Testing Regular Expressions
Since JMeter 2.4, the View Results Tree listener includes a RegExp Tester to test regular expressions directly on sampler response data.
There is also a website for testing Java regular expressions.
Another approach is to use a simple test plan to test the regular expressions. The Java Request sampler can be used to generate a sample, or the HTTP Sampler can be used to load a file. Add a Debug Sampler and a Tree View Listener and changes to the regular expression can be tested quickly, without needing to access any external servers.