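Introduction

Pattern matching usually comes in two flavors. One is regular-expression matching, where the pattern contains special characters. For example, '*' directly follows a character in the pattern and means that character may appear any number of times (including zero).

The other is wildcard matching, which is what we use when searching for files in an operating system. In "*.pdf", for example, '*' no longer denotes a repetition count; it is a wildcard that matches any string of any length. So "*.pdf" means: find all PDF files.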
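The difference between the two meanings of '*' can be checked with the standard library. The sketch below uses std::regex from C++11; the helper names regexFullMatch and checkStarSemantics are made up for this illustration. It relies on the fact that the wildcard "*.pdf" corresponds to the regular expression ".*\.pdf".

```cpp
#include <cassert>
#include <regex>
#include <string>

// Whole-string regular-expression match: '*' repeats the element before it.
bool regexFullMatch(const std::string &text, const std::string &pattern) {
    return std::regex_match(text, std::regex(pattern));
}

// Hypothetical helper collecting the examples from the two paragraphs above.
void checkStarSemantics() {
    assert(regexFullMatch("aaa", "a*"));          // 'a' repeated three times
    assert(regexFullMatch("", "a*"));             // zero repetitions also match
    assert(!regexFullMatch("report.pdf", "a*"));  // regex '*' is not a wildcard
    // The wildcard "*.pdf" written as a regular expression is ".*\.pdf".
    assert(regexFullMatch("report.pdf", ".*\\.pdf"));
    assert(!regexFullMatch("report.txt", ".*\\.pdf"));
}
```

In the regular expression, the "any string" part has to be spelled ".*" (any character, repeated), whereas the wildcard '*' expresses it on its own.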

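Algorithm problems often ask you to simulate this kind of matching. To keep the solution implementable on the spot, they cut the set of wildcards or regex special characters down to just a few. Even so, these problems count as fairly hard.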

Regular expression matching

The example problem:

Regular Expression Matching

http://basicalgos.blogspot.com/2012/03/10-regular-expression-matching.html

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true

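I was asked this problem in a Facebook interview.

There are two special characters to handle, '*' and '.'. The second is easy; the first is the troublesome one.

Because '*' can stand for any number of repetitions, when *(p+1) == '*' we have two choices: skip the character before the '*' together with the '*' itself (p += 2), or, as long as *s == *p or *p == '.', consume one more character of s. Handling '*' therefore splits into subproblems according to how many characters of s are consumed, and I use a recursive function to solve them. My code at the time was not this concise; this is the version after I cleaned it up: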

bool isMatch(const char *s, const char *p){
    if(*p == '\0')
        return *s == '\0';  // pattern exhausted: match only if s is exhausted too
    if(*(p+1) == '*'){
        // let "x*" consume 0, 1, 2, ... characters of s, one at a time
        while(*s != '\0' && (*p == *s || *p == '.')){
            if(isMatch(s++, p+2))
                return true;
        }
        return isMatch(s, p+2); // *s and *p differ (or s is exhausted): skip "x*" entirely
    }
    if(*s != '\0' && (*s == *p || *p == '.'))
        return isMatch(s+1, p+1);
    return false;
}
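The recursion above can revisit the same pair of positions in s and p many times. As an alternative, here is a minimal bottom-up dynamic-programming sketch of the same rules. It is not the code from the interview; the name isMatchDP and the use of std::string are choices made for this illustration.

```cpp
#include <string>
#include <vector>

// dp[i][j] is true when the suffix s[i..] matches the pattern suffix p[j..].
bool isMatchDP(const std::string &s, const std::string &p) {
    const size_t n = s.size(), m = p.size();
    std::vector<std::vector<bool>> dp(n + 1, std::vector<bool>(m + 1, false));
    dp[n][m] = true;  // empty string against empty pattern
    for (size_t i = n + 1; i-- > 0;) {      // i = n, n-1, ..., 0
        for (size_t j = m; j-- > 0;) {      // j = m-1, ..., 0
            // Does the current pattern character match the current character of s?
            const bool first = i < n && (p[j] == s[i] || p[j] == '.');
            if (j + 1 < m && p[j + 1] == '*') {
                // Either skip "x*" entirely, or consume one character and stay on "x*".
                dp[i][j] = dp[i][j + 2] || (first && dp[i + 1][j]);
            } else {
                dp[i][j] = first && dp[i + 1][j + 1];
            }
        }
    }
    return dp[0][0];
}
```

Each table entry is filled once, so the running time is proportional to the product of the two lengths. For instance, isMatchDP("aab", "c*a*b") returns true and isMatchDP("aaa", "aa") returns false, in line with the examples in the problem statement.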

Wildcard matching

We take a LeetCode problem as the example.

Wildcard Matching

Implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.

'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:

bool isMatch(const char *s, const char *p)

Some examples:

isMatch("aa","a") → false

isMatch("aa","aa") → true

isMatch("aaa","aa") → false

isMatch("aa", "*") → true

isMatch("aa", "a*") → true

isMatch("ab", "?*") → true

isMatch("aab", "c*a*b") → false

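There are two wildcards: "?" and "*".

Since '*' can match any string, the problem again splits into subproblems. My first idea was to handle them recursively on encountering '*', just as in the previous problem.

Code: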

class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        if(*s == '\0'){
            if(*p == '\0') return true;
            if(*p != '*') return false;   // s is exhausted; only '*' can still match
        }
        if(*p == '?') return isMatch(++s, ++p);
        else if(*p == '*'){
            while(*(++p) == '*');         // collapse consecutive '*'
            for(; *s != '\0'; ++s){       // let '*' absorb 0, 1, 2, ... characters
                if(isMatch(s, p)) return true;
            }
            return isMatch(s, p);         // '*' absorbs all of s
        }else{
            if(*p == *s) return isMatch(++s, ++p);
            return false;
        }
    }
};

But this exceeds the time limit.

To save time I traded space for time, using rec[][] to record the comparison results.

class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        int lens = 0, lenp = 0;
        const char *s1 = s, *p1 = p;
        for(; *s1 != '\0'; ++s1, ++lens);
        for(; *p1 != '\0'; ++p1, ++lenp);
        // rec[i][j]: -1 = not computed yet, 0 = no match, 1 = match, for s+i against p+j
        rec = new int*[lens+1];
        for(int i = 0; i <= lens; ++i){
            rec[i] = new int[lenp+1];
            for(int j = 0; j <= lenp; ++j){
                rec[i][j] = -1;
            }
        }
        bool result = isMatchCore(s, s, p, p);
        for(int i = 0; i <= lens; ++i) delete[] rec[i];
        delete[] rec;
        return result;
    }
private:
    int** rec;
    bool isMatchCore(const char *oris, const char *s, const char *orip, const char *p) {
        int &memo = rec[s-oris][p-orip];
        if(memo >= 0) return memo;        // already computed
        bool res = false;
        if(*p == '\0') res = (*s == '\0');
        else if(*p == '*'){
            const char *next = p;
            while(*(++next) == '*');      // collapse consecutive '*'
            for(const char *t = s; ; ++t){  // try every suffix of s, including the empty one
                if(isMatchCore(oris, t, orip, next)){ res = true; break; }
                if(*t == '\0') break;
            }
        }else{
            res = *s != '\0' && (*p == '?' || *p == *s)
                  && isMatchCore(oris, s+1, orip, p+1);
        }
        memo = res;                       // record the result before returning
        return res;
    }
};

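The result still exceeds the time limit.

The reason is that even with memoized recursion, every '*' in p still forces us to consider every way the characters after the '*' could match. Take p = "c*ab*c" and s = "cddabbac": at the first '*', the recursion has to match the rest of p, "ab*c", against every suffix of the rest of s, "ddabbac". That is, "ab*c" against "ddabbac", "ab*c" against "dabbac", "ab*c" against "abbac", ..., "ab*c" against "c", and "ab*c" against the empty string.

The second '*' is the same. Every '*' means matching the rest of p against every suffix of the rest of s.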
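To get a feel for how this work piles up, one can count the calls made by the plain (unmemoized) recursion. The sketch below is an instrumented rewrite for illustration only; the counter g_calls and the helper countCalls are hypothetical names, not part of any submitted solution.

```cpp
#include <string>

// Number of calls made by the plain recursive matcher (illustration only).
static long long g_calls = 0;

bool naiveMatch(const char *s, const char *p) {
    ++g_calls;
    if (*p == '\0') return *s == '\0';
    if (*p == '*') {
        while (*(p + 1) == '*') ++p;   // collapse consecutive '*'
        for (;; ++s) {                 // try every suffix of s, including the empty one
            if (naiveMatch(s, p + 1)) return true;
            if (*s == '\0') return false;
        }
    }
    if (*s != '\0' && (*p == '?' || *p == *s)) return naiveMatch(s + 1, p + 1);
    return false;
}

// Runs the matcher once and reports how many calls it took.
long long countCalls(const std::string &s, const std::string &p) {
    g_calls = 0;
    naiveMatch(s.c_str(), p.c_str());
    return g_calls;
}
```

Comparing, say, countCalls(std::string(12, 'a'), "*a*a*a*b") with the same pattern on a string of 24 'a' characters shows the count rising much faster than the length of s, because every '*' multiplies the number of suffixes that have to be tried.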

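On closer thought, however, when p contains more than one '*' we do not actually need to match against every suffix as above.

Again take p = "c*ab*c" and s = "cddabbac".

For p = "c*ab*c", any s it matches must look like "c....ab.....c", where the dots stand for zero or more characters. The only troublesome part is the "ab" in the middle of p, which must be matched by an "ab" in s. So once an "ab" is found in the middle of s, everything after it can be left to the following '*'.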
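This observation can be turned into code quite literally: split p at every '*', pin the first piece to the start of s and the last piece to the end, and look for each middle piece at its leftmost possible position. The following is a rough sketch of that idea, not the solution given later in this post; segmentAt and matchBySegments are names invented here.

```cpp
#include <string>
#include <vector>

// True when seg matches s starting at pos ('?' matches any single character).
bool segmentAt(const std::string &s, size_t pos, const std::string &seg) {
    if (pos + seg.size() > s.size()) return false;
    for (size_t k = 0; k < seg.size(); ++k)
        if (seg[k] != '?' && seg[k] != s[pos + k]) return false;
    return true;
}

bool matchBySegments(const std::string &s, const std::string &p) {
    // Split p on '*': "c*ab*c" becomes {"c", "ab", "c"}.
    std::vector<std::string> segs(1);
    for (char c : p) {
        if (c == '*') segs.emplace_back();
        else segs.back().push_back(c);
    }
    if (segs.size() == 1)  // no '*' at all: the lengths must agree exactly
        return s.size() == p.size() && segmentAt(s, 0, p);

    const std::string &head = segs.front();
    const std::string &tail = segs.back();
    if (head.size() + tail.size() > s.size()) return false;
    if (!segmentAt(s, 0, head)) return false;                       // fixed prefix
    if (!segmentAt(s, s.size() - tail.size(), tail)) return false;  // fixed suffix

    // Middle pieces: take the leftmost occurrence of each, in order.
    size_t pos = head.size();
    const size_t limit = s.size() - tail.size();  // middle pieces must end before the tail
    for (size_t i = 1; i + 1 < segs.size(); ++i) {
        const std::string &seg = segs[i];
        while (pos + seg.size() <= limit && !segmentAt(s, pos, seg)) ++pos;
        if (pos + seg.size() > limit) return false;
        pos += seg.size();
    }
    return true;
}
```

Taking the leftmost occurrence is safe because whatever comes after a middle piece is absorbed by a '*', so finding the piece earlier never rules out a match that a later occurrence would allow.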

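So, as we compare p and s character by character, when we reach the first '*' in p all we really need to do is keep searching the rest of s for a part that matches "ab".

In other words, when we meet a '*' we record the positions in p and s as presp (the position just after the '*') and press, and then keep comparing p and s one character at a time. On a mismatch, *p != *s, we backtrack: ++press, then p = presp and s = press. This continues until we reach the end or meet the next '*'. Meeting the next '*' means the "ab" part is settled and the rest is left to the second '*'. If p and s both reach the end, return true. If no new '*' has appeared, a mismatch remains, and press has also reached the end of s, return false.

Compared with the recursion above, the main differences are:

On a '*', we only consider the subproblem up to the next '*', not the one extending all the way to the end, which avoids a large number of subproblem computations.

By recording presp and press and backtracking to them, we avoid recursion altogether.

Code: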

class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        const char *presp = NULL, *press = NULL;    //previous starting comparison place after * in s and p.
        bool startFound = false;
        while(*s != '\0'){
            if(*p == '?'){++s; ++p;}
            else if(*p == '*'){
                presp = ++p;
                press = s;
                startFound = true;
            }else{
                if(*p == *s){
                    ++p;
                    ++s;
                }else if(startFound){
                    p = presp;
                    s = (++press);
                }else return false;
            }
        }
        while(*p == '*') ++p;
        return *p == '\0';
    }
};
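For experimenting outside the LeetCode harness, the same backtracking idea can be written with std::string and indices. This is a sketch under that assumption; wildcardMatch, starP and starS are names chosen here, with starP and starS playing the roles of presp and press.

```cpp
#include <string>

bool wildcardMatch(const std::string &s, const std::string &p) {
    size_t i = 0, j = 0;                // current positions in s and p
    size_t starP = std::string::npos;   // position in p just after the last '*' seen
    size_t starS = 0;                   // position in s where that '*' currently stops
    while (i < s.size()) {
        if (j < p.size() && p[j] == '*') {
            starP = ++j;                // remember where to resume in p
            starS = i;                  // so far the '*' absorbs nothing
        } else if (j < p.size() && (p[j] == '?' || p[j] == s[i])) {
            ++i;
            ++j;
        } else if (starP != std::string::npos) {
            j = starP;                  // retry the part after the last '*'
            i = ++starS;                // with that '*' absorbing one more character
        } else {
            return false;               // mismatch and no '*' to fall back on
        }
    }
    while (j < p.size() && p[j] == '*') ++j;  // trailing '*' can match the empty string
    return j == p.size();
}
```

On the running example, wildcardMatch("cddabbac", "c*ab*c") returns true: the first '*' absorbs "dd", "ab" is matched, and the second '*' absorbs "ba" before the final 'c'. The examples from the problem statement behave as listed, for instance wildcardMatch("aab", "c*a*b") returns false.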
