from :https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-what-does-a-question-mark-followed-by-a-colon

fter reading some tutorials I still don't get it.

Could someone explain how ?: is used and what it's good for?

Let me try to explain this with an example.

Consider the following text:

https://stackoverflow.com/
https://stackoverflow.com/questions/tagged/regex

Now, if I apply the regex below over it...

(http|ftp)://([^/\r\n]+)(/[^\r\n]*)?

... I would get the following result:

Match "https://stackoverflow.com/"
Group 1: "http"
Group 2: "stackoverflow.com"
Group 3: "/" Match "https://stackoverflow.com/questions/tagged/regex"
Group 1: "http"
Group 2: "stackoverflow.com"
Group 3: "/questions/tagged/regex"

But I don't care about the protocol -- I just want the host and path of the URL. So, I change the regex to include the non-capturing group (?:).

(?:http|ftp)://([^/\r\n]+)(/[^\r\n]*)?

Now, my result looks like this:

Match "https://stackoverflow.com/"
Group 1: "stackoverflow.com"
Group 2: "/" Match "https://stackoverflow.com/questions/tagged/regex"
Group 1: "stackoverflow.com"
Group 2: "/questions/tagged/regex"

See? The first group has not been captured. The parser uses it to match the text, but ignores it later, in the final result.


EDIT:

As requested, let me try to explain groups too.

Well, groups serve many purposes. They can help you to extract exact information from a bigger match (which can also be named), they let you rematch a previous matched group, and can be used for substitutions. Let's try some examples, shall we?

Ok, imagine you have some kind of XML or HTML (be aware that regex may not be the best tool for the job, but it is nice as an example). You want to parse the tags, so you could do something like this (I have added spaces to make it easier to understand):

   \<(?<TAG>.+?)\> [^<]*? \</\k<TAG>\>
or
\<(.+?)\> [^<]*? \</\1\>

The first regex has a named group (TAG), while the second one uses a common group. Both regexes do the same thing: they use the value from the first group (the name of the tag) to match the closing tag. The difference is that the first one uses the name to match the value, and the second one uses the group index (which starts at 1).

Let's try some substitutions now. Consider the following text:

Lorem ipsum dolor sit amet consectetuer feugiat fames malesuada pretium egestas.

Now, let's use the this dumb regex over it:

\b(\S)(\S)(\S)(\S*)\b

This regex matches words with at least 3 characters, and uses groups to separate the first three letters. The result is this:

Match "Lorem"
Group 1: "L"
Group 2: "o"
Group 3: "r"
Group 4: "em"
Match "ipsum"
Group 1: "i"
Group 2: "p"
Group 3: "s"
Group 4: "um"
... Match "consectetuer"
Group 1: "c"
Group 2: "o"
Group 3: "n"
Group 4: "sectetuer"
...

So, if we apply the substitution string...

$1_$3$2_$4

... over it, we are trying to use the first group, add an underscore, use the third group, then the second group, add another underscore, and then the fourth group. The resulting string would be like the one below.

L_ro_em i_sp_um d_lo_or s_ti_ a_em_t c_no_sectetuer f_ue_giat f_ma_es m_la_esuada p_er_tium e_eg_stas.

You can use named groups for substitutions too, using ${name}.

To play around with regexes, I recommend http://regex101.com/, which offers a good amount of details on how the regex works; it also offers a few regex engines to choose from.

javascript reg 不加入分组的更多相关文章

  1. JavaScript正则表达式模式匹配(2)——分组模式匹配

    var pattern=/google{4,8}$/; // {4,8}$表示匹配结尾4-8次 var str='googleeeeeeeee'; // 表示e的4-8次 alert(pattern. ...

  2. javascript正则表达式(一)

    元字符 ( [ { \ ^ $ | ) ? * + . 预定义的特殊字符 字符 正则 描述 \t /\t/ 制表符 \n /\n/ 制表符 \r /\r/ 回车符 \f /\f/ 换页符 \a /\a ...

  3. 正则表达式(javascript)学习总结

    正则表达式在jquery.linux等随处可见,已经无孔不入.因此有必要对这个工具认真的学习一番.本着认真.严谨的态度,这次总结我花了近一个月的时间.但本文无任何创新之处,属一般性学习总结. 一.思考 ...

  4. JS正则表达式---分组

    JS正则表达式---分组 之前写了一篇关于正则新手入门的文章,本以为对正则表达式相对比较了解 但是今天我又遇到了一个坑,可能是自己不够细心的原因吧,今天就着重和大家分享一下javascript正则表达 ...

  5. javascript的正则表达式总结

    网上正则表达式的教程够多了,但由于javascript的历史比较悠久,也比较古老,因此有许多特性是不支持的.我们先从最简单地说起,文章所演示的正则基本都是perl方式. 元字符 ( [ { \ ^ $ ...

  6. javascript:正则大全

    :replace函数,为写自己的js模板做准备 待完善 function 1,声明&用法 //数组: var arr=[];//字面量 var arr=new Array();//构造函数 / ...

  7. JavaScript探秘系列

    此文章所在专题列表如下: 我们应该如何去了解JavaScript引擎的工作原理 JavaScript探秘:编写可维护的代码的重要性 JavaScript探秘:谨慎使用全局变量 JavaScript探秘 ...

  8. 系列文章--JavaScript教程文章

    JavaScript教程文章专题列表如下: 我们应该如何去了解JavaScript引擎的工作原理 JavaScript探秘:编写可维护的代码的重要性 JavaScript探秘:谨慎使用全局变量 Jav ...

  9. 温故知新 javascript 正则表达式

    很长时间没看 正则表达式了,碰巧今天用到,温故知新了一把 看书学习吧 50% 的举一反三练习中的原创.   一 javascript正则表达式的基本知识 1     javascript 正则对象创建 ...

随机推荐

  1. macbook 蓝牙关闭按钮灰色的

    PRAM重置:1.关机,拔掉所有外设,只连接电源线2.按开机键,接着马上同时按住 command+option+p+r 键,按住后听到三次dang的开机音后放开3.电脑会自行启动,进到登录页面后点关机 ...

  2. 2016-ccf-data-mining-competition 搜狗用户画像构建

    想法1:   分成147(3*7*7)类, 后来觉得这样效果不好,后来看了看竞赛要求的也是分别预测,分别评分,而不是一次就把3类的标签都给出   所有后来我们改进了当时的想法,决定对年龄,性别,学历进 ...

  3. 企业级服务元年:iClap高效解决手游更新迭代问题

    2006年至今,手游市场经历了不少变革,从WAP站到2009年智能手机时代来临,2012大量资本涌入国内手游行业,到2014年手游市场趋于成熟,细分市场成为追逐热门,在2015年优胜劣汰的资本寒冬浪潮 ...

  4. 2017-2018 ACM-ICPC Latin American Regional Programming Contest Solution

    A - Arranging tiles 留坑. B - Buggy ICPC 题意:给出一个字符串,然后有两条规则,如果打出一个辅音字母,直接接在原字符串后面,如果打出一个元音字母,那么接在原来的字符 ...

  5. CCPC-Wannafly Winter Camp Day2 (Div2, onsite)

    Class $A_i = a \cdot i \% n$ 有 $A_i = k \cdot gcd(a, n)$ 证明: $A_0 = 0, A_x = x \cdot a - y \cdot n$ ...

  6. 前端学习笔记之HTML/CSS 速写神器 Emmet

    HTML/CSS 速写神器:Emmet 在前端开发的过程中,一个最繁琐的工作就是写 HTML.CSS 代码.数量繁多的标签.属性.尖括号.标签闭合等,让前端们甚是苦恼.于是,我向大家推荐 Emmet, ...

  7. Python3.x:生成器简介

    Python3.x:生成器简介 概念 任何使用yield的函数都称之为生成器:使用yield,可以让函数生成一个序列,该函数返回的对象类型是"generator",通过该对象连续调 ...

  8. Java回顾之网络通信

    在这篇文章里,我们主要讨论如何使用Java实现网络通信,包括TCP通信.UDP通信.多播以及NIO. TCP连接 TCP的基础是Socket,在TCP连接中,我们会使用ServerSocket和Soc ...

  9. Java回顾之多线程

    在这篇文章里,我们关注多线程.多线程是一个复杂的话题,包含了很多内容,这篇文章主要关注线程的基本属性.如何创建线程.线程的状态切换以及线程通信,我们把线程同步的话题留到下一篇文章中. 线程是操作系统运 ...

  10. Java中HashMap 初始化时容量(参数)如何设置合适?

    问题引入 注:本文代码源自java 9. 阿里的插件对于初始化HashMap时,调用无参构造方法,提示如下: 那么问题来了,如果已知需要向 map 中 put n次,那么需要设定初始容量为多少? 单纯 ...