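Overview

CharMatcher provides a variety of methods for working with strings. It serves two main purposes:

1. Finding the characters that match

2. Processing the characters that match

Internally, CharMatcher consists of two main parts:

1. A large number of shared internal classes that make matching convenient: for example, JAVA_DIGIT matches digits, JAVA_LETTER matches letters, and so on.

2. A large number of string-processing methods. With a given CharMatcher you can process the matched characters in many ways, for example removeFrom(), replaceFrom(), trimFrom(), retainFrom(), and so on.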

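CharMatcher itself is an abstract class whose core method, matches(char), is abstract. It relies on internal subclasses that extend CharMatcher to implement that abstract method and to override some of the operation methods, because different matching rules call for different implementations of those operations.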
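To make the design concrete, here is a minimal sketch of the same pattern. It is not Guava's code: SketchMatcher, ANY and is() are hypothetical names used only for illustration. It shows one abstract matching method, a generic operation built on top of it, and a subclass that overrides the operation because its rule allows a cheaper answer.

```java
// Hypothetical sketch of the pattern described above; not Guava's source.
abstract class SketchMatcher {
  /** The single rule every subclass must supply. */
  abstract boolean matches(char c);

  /** Generic operation: works for any rule, one matches() call per character. */
  int countIn(CharSequence sequence) {
    int count = 0;
    for (int i = 0; i < sequence.length(); i++) {
      if (matches(sequence.charAt(i))) {
        count++;
      }
    }
    return count;
  }

  /** A rule that matches everything can answer countIn() without scanning. */
  static final SketchMatcher ANY = new SketchMatcher() {
    @Override boolean matches(char c) {
      return true;
    }

    @Override int countIn(CharSequence sequence) {
      return sequence.length();
    }
  };

  /** A rule that only supplies matches() and inherits the generic countIn(). */
  static SketchMatcher is(final char match) {
    return new SketchMatcher() {
      @Override boolean matches(char c) {
        return c == match;
      }
    };
  }
}
```

Callers see the same countIn() on both matchers; only the cost of answering differs.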

Common methods

Default implementations

CharMatcher ships with many ready-made CharMatcher implementations, listed below:

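ANY: matches any character

ASCII: matches any ASCII character

BREAKING_WHITESPACE: matches whitespace at which a line may break (excluding non-breaking whitespace such as "\u00a0")

DIGIT: matches digits according to the Unicode definition; to match only ASCII digits, use inRange('0', '9')

INVISIBLE: matches all invisible characters

JAVA_DIGIT: matches digits according to Java's definition, implemented with Character.isDigit()

JAVA_ISO_CONTROL: matches ISO control characters, implemented with Character.isISOControl()

JAVA_LETTER: matches letters, implemented with Character.isLetter()

JAVA_LETTER_OR_DIGIT: matches letters or digits

JAVA_LOWER_CASE: matches lowercase characters

JAVA_UPPER_CASE: matches uppercase characters

NONE: matches no character

SINGLE_WIDTH: matches single-width characters; Chinese characters, for example, are double-width

WHITESPACE: matches all whitespace characters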
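The difference between an ASCII-only digit rule and Java's digit rule is easy to miss, so here is a small stdlib-only illustration. DigitRules and its two methods are hypothetical helper names; only Character.isDigit() is a real JDK call, the same one the JAVA_DIGIT entry refers to.

```java
// Hypothetical helper contrasting two digit rules; uses only the JDK.
final class DigitRules {
  /** ASCII-only rule: the ten characters '0' through '9'. */
  static boolean isAsciiDigit(char c) {
    return c >= '0' && c <= '9';
  }

  /** Java's rule, as used by the JAVA_DIGIT entry in the list above. */
  static boolean isJavaDigit(char c) {
    return Character.isDigit(c);
  }

  public static void main(String[] args) {
    char arabicIndicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE
    // An ASCII digit satisfies both rules.
    System.out.println(isAsciiDigit('7') + " " + isJavaDigit('7'));
    // A non-ASCII digit satisfies only Java's rule.
    System.out.println(isAsciiDigit(arabicIndicThree) + " " + isJavaDigit(arabicIndicThree));
  }
}
```

If the input may contain non-ASCII digits and only 0 to 9 should be accepted, the range-based rule is the safer choice.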

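Common operation methods

CharMatcher is(char match): returns a matcher that matches only the given character

CharMatcher isNot(char match): returns a matcher that matches every character except the given one

CharMatcher anyOf(CharSequence sequence): returns a matcher that matches any character contained in sequence

CharMatcher noneOf(CharSequence sequence): returns a matcher that matches any character not contained in sequence

CharMatcher inRange(char startInclusive, char endInclusive): returns a matcher that matches any character in the given range (both ends inclusive)

CharMatcher forPredicate(Predicate<? super Character> predicate): returns a matcher that decides matches by calling the predicate's apply()

CharMatcher negate(): returns a matcher whose rule is the opposite of this matcher's

CharMatcher and(CharMatcher other): returns a matcher that matches only when both this matcher and other match (logical AND)

CharMatcher or(CharMatcher other): returns a matcher that matches when either this matcher or other matches (logical OR)

boolean matchesAnyOf(CharSequence sequence): returns true if at least one character in sequence matches

boolean matchesAllOf(CharSequence sequence): returns true if every character in sequence matches

boolean matchesNoneOf(CharSequence sequence): returns true if no character in sequence matches

int indexIn(CharSequence sequence): returns the index of the first matching character in sequence, or -1 if there is none

int indexIn(CharSequence sequence, int start): returns the index of the first matching character in sequence at or after start, or -1 if there is none

int lastIndexIn(CharSequence sequence): returns the index of the last matching character in sequence, or -1 if there is none

int countIn(CharSequence sequence): returns the number of matching characters in sequence

String removeFrom(CharSequence sequence): returns sequence with all matching characters removed

String retainFrom(CharSequence sequence): returns only the matching characters of sequence, in order

String replaceFrom(CharSequence sequence, char replacement): returns sequence with every matching character replaced by replacement

String trimFrom(CharSequence sequence): returns sequence with matching characters removed from both ends

String trimLeadingFrom(CharSequence sequence): returns sequence with matching characters removed from the start

String trimTrailingFrom(CharSequence sequence): returns sequence with matching characters removed from the end

String collapseFrom(CharSequence sequence, char replacement): replaces each group (a run of consecutive matching characters) with a single replacement character

String trimAndCollapseFrom(CharSequence sequence, char replacement): trims first, then collapses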
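How the factory methods, the combinators and an operation fit together can be sketched without Guava on the classpath. The CharRule interface below is a hypothetical stand-in, not Guava's API; it borrows the method names from the list above, but the bodies are the simplest ones that behave as described.

```java
// Hypothetical stand-in for illustration; this is not Guava's CharMatcher.
interface CharRule {
  boolean matches(char c);

  /** Factory: matches any character between the two bounds, inclusive. */
  static CharRule inRange(char startInclusive, char endInclusive) {
    return c -> c >= startInclusive && c <= endInclusive;
  }

  /** Factory: matches any character contained in the given sequence. */
  static CharRule anyOf(CharSequence chars) {
    String s = chars.toString();
    return c -> s.indexOf(c) >= 0;
  }

  /** Combinators: each returns a new rule built from existing ones. */
  default CharRule negate() {
    return c -> !matches(c);
  }

  default CharRule and(CharRule other) {
    return c -> matches(c) && other.matches(c);
  }

  default CharRule or(CharRule other) {
    return c -> matches(c) || other.matches(c);
  }

  /** Operation: keeps only the matching characters, in order. */
  default String retainFrom(CharSequence sequence) {
    StringBuilder kept = new StringBuilder();
    for (int i = 0; i < sequence.length(); i++) {
      char c = sequence.charAt(i);
      if (matches(c)) {
        kept.append(c);
      }
    }
    return kept.toString();
  }
}

final class CharRuleDemo {
  static final CharRule LOWER = CharRule.inRange('a', 'z');
  static final CharRule ASCII_DIGIT = CharRule.inRange('0', '9');

  public static void main(String[] args) {
    CharRule slugChars = LOWER.or(ASCII_DIGIT).or(CharRule.anyOf("-"));
    System.out.println(slugChars.retainFrom("user-42 (admin)")); // prints "user-42admin"
    System.out.println(LOWER.negate().retainFrom("aBcD"));       // prints "BD"
    System.out.println(LOWER.and(CharRule.anyOf("aeiou")).retainFrom("guava")); // prints "uaa"
  }
}
```

Because every combinator returns another rule, a composed rule can be stored once in a constant and reused.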

Selected source code walkthrough

The following walks through the implementations of CharMatcher's commonly used operation methods.

    /**
     * Returns a matcher whose matching rule is the opposite of this matcher's.
     */
    public CharMatcher negate() {
      final CharMatcher original = this;
      return new CharMatcher(original + ".negate()") {
        @Override public boolean matches(char c) {
          return !original.matches(c);
        }

        @Override public boolean matchesAllOf(CharSequence sequence) {
          return original.matchesNoneOf(sequence);
        }

        @Override public boolean matchesNoneOf(CharSequence sequence) {
          return original.matchesAllOf(sequence);
        }

        @Override public int countIn(CharSequence sequence) {
          return sequence.length() - original.countIn(sequence);
        }

        @Override public CharMatcher negate() {
          return original;
        }
      };
    }

    /**
     * Returns a matcher that chains this matcher's rule with another matcher's rule.
     */
    public CharMatcher and(CharMatcher other) {
      return new And(this, checkNotNull(other));
    }

    /**
     * And is implemented the same way as Ordering's Compound: an internal
     * subclass extends CharMatcher and holds the combined matchers by
     * composition. When an operation is invoked, it calls the same-named
     * method on each of the held matchers in turn.
     */
    private static class And extends CharMatcher {
      final CharMatcher first;
      final CharMatcher second;

      And(CharMatcher a, CharMatcher b) {
        this(a, b, "CharMatcher.and(" + a + ", " + b + ")");
      }

      And(CharMatcher a, CharMatcher b, String description) {
        super(description);
        first = checkNotNull(a);
        second = checkNotNull(b);
      }

      @Override
      public CharMatcher and(CharMatcher other) {
        return new And(this, other);
      }

      @Override
      public boolean matches(char c) {
        return first.matches(c) && second.matches(c);
      }

      @Override
      CharMatcher withToString(String description) {
        return new And(first, second, description);
      }
    }

    /**
     * Or is implemented the same way as And, so it is not described again.
     */
    public CharMatcher or(CharMatcher other) {
      return new Or(this, checkNotNull(other));
    }

    private static class Or extends CharMatcher {
      final CharMatcher first;
      final CharMatcher second;

      Or(CharMatcher a, CharMatcher b, String description) {
        super(description);
        first = checkNotNull(a);
        second = checkNotNull(b);
      }

      Or(CharMatcher a, CharMatcher b) {
        this(a, b, "CharMatcher.or(" + a + ", " + b + ")");
      }

      @Override
      public CharMatcher or(CharMatcher other) {
        return new Or(this, checkNotNull(other));
      }

      @Override
      public boolean matches(char c) {
        return first.matches(c) || second.matches(c);
      }

      @Override
      CharMatcher withToString(String description) {
        return new Or(first, second, description);
      }
    }

    /**
     * Returns a {@code char} matcher functionally equivalent to this one, but which may be faster to
     * query than the original; your mileage may vary. Precomputation takes time and is likely to be
     * worthwhile only if the precomputed matcher is queried many thousands of times.
     *
     * <p>This method has no effect (returns {@code this}) when called in GWT: it's unclear whether a
     * precomputed matcher is faster, but it certainly consumes more memory, which doesn't seem like a
     * worthwhile tradeoff in a browser.
     */
    public CharMatcher precomputed() {
      return Platform.precomputeCharMatcher(this);
    }

    /**
     * Returns every character in the full char range that this matcher matches,
     * using the slowest possible approach: a brute-force scan of all 65536 values.
     * The slowest approach?!
     */
    char[] slowGetChars() {
      char[] allChars = new char[65536];
      int size = 0;
      for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
        if (matches((char) c)) {
          allChars[size++] = (char) c;
        }
      }
      char[] retValue = new char[size];
      System.arraycopy(allChars, 0, retValue, 0, size);
      return retValue;
    }

    /**
     * Returns true if at least one character in sequence matches this matcher.
     */
    public boolean matchesAnyOf(CharSequence sequence) {
      return !matchesNoneOf(sequence);
    }

    /**
     * Returns true if every character in sequence matches this matcher.
     */
    public boolean matchesAllOf(CharSequence sequence) {
      for (int i = sequence.length() - 1; i >= 0; i--) {
        if (!matches(sequence.charAt(i))) {
          return false;
        }
      }
      return true;
    }

    /**
     * Returns true if no character in sequence matches this matcher.
     */
    public boolean matchesNoneOf(CharSequence sequence) {
      return indexIn(sequence) == -1;
    }

    /**
     * Returns the index of the first character in sequence that this matcher
     * matches, or -1 if nothing matches.
     */
    public int indexIn(CharSequence sequence) {
      int length = sequence.length();
      for (int i = 0; i < length; i++) {
        if (matches(sequence.charAt(i))) {
          return i;
        }
      }
      return -1;
    }

    /**
     * Returns the index of the first matching character in sequence, searching
     * from start onwards, or -1 if nothing matches.
     */
    public int indexIn(CharSequence sequence, int start) {
      int length = sequence.length();
      Preconditions.checkPositionIndex(start, length);
      for (int i = start; i < length; i++) {
        if (matches(sequence.charAt(i))) {
          return i;
        }
      }
      return -1;
    }

    /**
     * Returns the index of the last character in sequence that this matcher
     * matches, or -1 if nothing matches.
     */
    public int lastIndexIn(CharSequence sequence) {
      for (int i = sequence.length() - 1; i >= 0; i--) {
        if (matches(sequence.charAt(i))) {
          return i;
        }
      }
      return -1;
    }

    /**
     * Returns the number of characters in sequence that this matcher matches.
     */
    public int countIn(CharSequence sequence) {
      int count = 0;
      for (int i = 0; i < sequence.length(); i++) {
        if (matches(sequence.charAt(i))) {
          count++;
        }
      }
      return count;
    }

    /**
     * Removes every matching character from sequence and returns the result.
     */
    @CheckReturnValue
    public String removeFrom(CharSequence sequence) {
      String string = sequence.toString();
      int pos = indexIn(string);
      if (pos == -1) {
        return string;
      }

      char[] chars = string.toCharArray();
      int spread = 1;

      // This unusual loop comes from extensive benchmarking
      // Shift-and-delete algorithm: a nested loop with a labeled break (break OUT)
      OUT: while (true) {
        pos++;
        while (true) {
          if (pos == chars.length) {
            break OUT;
          }
          if (matches(chars[pos])) {
            break;
          }
          chars[pos - spread] = chars[pos];
          pos++;
        }
        spread++;
      }
      return new String(chars, 0, pos - spread);
    }

    /**
     * Keeps only the characters that match this matcher and returns the result.
     * Implemented by calling removeFrom() on the negated matcher.
     */
    @CheckReturnValue
    public String retainFrom(CharSequence sequence) {
      return negate().removeFrom(sequence);
    }

    /**
     * Replaces every matching character with the given character.
     */
    @CheckReturnValue
    public String replaceFrom(CharSequence sequence, char replacement) {
      String string = sequence.toString();
      int pos = indexIn(string);
      if (pos == -1) {
        return string;
      }
      char[] chars = string.toCharArray();
      chars[pos] = replacement;
      for (int i = pos + 1; i < chars.length; i++) {
        if (matches(chars[i])) {
          chars[i] = replacement;
        }
      }
      return new String(chars);
    }

    /**
     * Replaces every matching character with the given string.
     * Unlike the char overload, this one is built on indexIn() and a StringBuilder.
     */
    @CheckReturnValue
    public String replaceFrom(CharSequence sequence, CharSequence replacement) {
      int replacementLen = replacement.length();
      if (replacementLen == 0) {
        return removeFrom(sequence);
      }
      if (replacementLen == 1) {
        return replaceFrom(sequence, replacement.charAt(0));
      }

      String string = sequence.toString();
      int pos = indexIn(string);
      if (pos == -1) {
        return string;
      }

      int len = string.length();
      StringBuilder buf = new StringBuilder((len * 3 / 2) + 16);

      int oldpos = 0;
      do {
        buf.append(string, oldpos, pos);
        buf.append(replacement);
        oldpos = pos + 1;
        pos = indexIn(string, oldpos);
      } while (pos != -1);

      buf.append(string, oldpos, len);
      return buf.toString();
    }

    /**
     * Removes all characters matched by this matcher from both ends of sequence.
     */
    @CheckReturnValue
    public String trimFrom(CharSequence sequence) {
      int len = sequence.length();
      int first;
      int last;

      for (first = 0; first < len; first++) {
        if (!matches(sequence.charAt(first))) {
          break;
        }
      }
      for (last = len - 1; last > first; last--) {
        if (!matches(sequence.charAt(last))) {
          break;
        }
      }

      return sequence.subSequence(first, last + 1).toString();
    }

    /**
     * Removes all characters matched by this matcher from the start of sequence.
     */
    @CheckReturnValue
    public String trimLeadingFrom(CharSequence sequence) {
      int len = sequence.length();
      int first;

      for (first = 0; first < len; first++) {
        if (!matches(sequence.charAt(first))) {
          break;
        }
      }

      return sequence.subSequence(first, len).toString();
    }

    /**
     * Removes all characters matched by this matcher from the end of sequence.
     */
    @CheckReturnValue
    public String trimTrailingFrom(CharSequence sequence) {
      int len = sequence.length();
      int last;

      for (last = len - 1; last >= 0; last--) {
        if (!matches(sequence.charAt(last))) {
          break;
        }
      }

      return sequence.subSequence(0, last + 1).toString();
    }

    /**
     * Replaces each group (a run of consecutive matching characters) with the
     * given character.
     */
    @CheckReturnValue
    public String collapseFrom(CharSequence sequence, char replacement) {
      int first = indexIn(sequence);
      if (first == -1) {
        return sequence.toString();
      }

      // TODO(kevinb): see if this implementation can be made faster
      StringBuilder builder = new StringBuilder(sequence.length())
          .append(sequence.subSequence(0, first))
          .append(replacement);
      boolean in = true;
      for (int i = first + 1; i < sequence.length(); i++) {
        char c = sequence.charAt(i);
        if (matches(c)) {
          if (!in) {
            builder.append(replacement);
            in = true;
          }
        } else {
          builder.append(c);
          in = false;
        }
      }
      return builder.toString();
    }

    /**
     * Trims first, then collapses.
     */
    @CheckReturnValue
    public String trimAndCollapseFrom(CharSequence sequence, char replacement) {
      int first = negate().indexIn(sequence);
      if (first == -1) {
        return ""; // everything matches. nothing's left.
      }
      StringBuilder builder = new StringBuilder(sequence.length());
      boolean inMatchingGroup = false;
      for (int i = first; i < sequence.length(); i++) {
        char c = sequence.charAt(i);
        if (matches(c)) {
          inMatchingGroup = true;
        } else {
          if (inMatchingGroup) {
            builder.append(replacement);
            inMatchingGroup = false;
          }
          builder.append(c);
        }
      }
      return builder.toString();
    }

    // Predicate interface

    /**
     * An alias for matches().
     */
    @Override public boolean apply(Character character) {
      return matches(character);
    }
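The shifting loop in removeFrom() is the least obvious part of the source above, so here is a simplified restatement of the same idea. It is a sketch, not Guava's code: ShiftRemoveSketch and removeMatching are hypothetical names, the rule is passed as a java.util.function.IntPredicate rather than a CharMatcher, and the labeled-break loop is replaced by a plain read/write index pair. The gap between the two indexes is the number of characters removed so far, which is the role spread plays in the original.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch of in-place compaction; not Guava's benchmarked loop.
final class ShiftRemoveSketch {
  /** Returns input without the characters accepted by rule. */
  static String removeMatching(String input, IntPredicate rule) {
    char[] chars = input.toCharArray();
    int write = 0; // next slot for a character that is kept
    for (int read = 0; read < chars.length; read++) {
      if (!rule.test(chars[read])) {
        // Shift the kept character left over the removed ones.
        chars[write++] = chars[read];
      }
    }
    // Only the first 'write' slots hold the result.
    return new String(chars, 0, write);
  }

  public static void main(String[] args) {
    System.out.println(removeMatching("a1b22c", Character::isDigit)); // prints "abc"
  }
}
```

Both versions copy the input into one char array and compact it in place, so no second buffer is needed.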

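To be added:

1. The default CharMatcher implementations and what each one does

2. Operation method signatures and a summary of what they do

3. Usage code examples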
