JAVA中如何高效的实现SQL的like语法?
/** SQL {@code LIKE} function. */
public static boolean like(String s,String pattern){
final String regex = Like.sqlToRegexLike(pattern, null);
return Pattern.matches(regex, s);
}
/** Translates a SQL LIKE pattern to Java regex pattern.*/
static String sqlToRegexLike(String sqlPattern,char escapeChar) {
int i;
final int len = sqlPattern.length();
final StringBuilder javaPattern = new StringBuilder(len + len);
for (i = 0; i < len; i++) {
char c = sqlPattern.charAt(i);
if (JAVA_REGEX_SPECIALS.indexOf(c) >= 0) {
javaPattern.append('\\');
}
if (c == escapeChar) {
if (i == (sqlPattern.length() - 1)) {
throw invalidEscapeSequence(sqlPattern, i);
}
char nextChar = sqlPattern.charAt(i + 1);
if ((nextChar == '_')
|| (nextChar == '%')
|| (nextChar == escapeChar)) {
javaPattern.append(nextChar);
i++;
} else {
throw invalidEscapeSequence(sqlPattern, i);
}
} else if (c == '_') {
javaPattern.append('.');
} else if (c == '%') {
javaPattern.append("(?s:.*)");
} else {
javaPattern.append(c);
}
}
return javaPattern.toString();
}
...
try {
Pattern pattern = patterns.get(buildKey(right, escTmp), new Callable<Pattern>() {
public Pattern call() throws Exception {
return Pattern.compile(buildPattern(right, escTmp), Pattern.CASE_INSENSITIVE);
}
});
Matcher m = pattern.matcher(left);
return m.matches() ? 1l : 0l;
} catch (ExecutionException e) {
throw new FunctionException(e.getCause());
}
...
到此,综上来看,不少项目是基于正则表达式来完成的,接下来我整理了下我最近实现的几种方式:
public static boolean like(final String dest, final String pattern) {
String regex = regexParse(pattern);
regex = regex.replace("_",".").replace("%",".*?");
Pattern p = Pattern.compile(regex,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
return p.matcher(dest).matches();
}
这种方式在代码层面简单明了,但是性能非常差,多次replace的使用就已经进行了多次遍历,这里有个可以优化的点,对于单个字符做替换可以选择用replaceChars(str, searchChar, replaceChar)这个方案。
贪婪模式
懒惰模式
独占模式
public static boolean like(final String dest, final String pattern) {
int destPointer = 0, patternPointer = 0;
int destRecall = -1, patternRecall = -1;
final int patternLen = pattern.length();
final int destLen = dest.length();
while( destPointer < destLen) {
......
......
}
while(patternPointer < patternLen && pattern.chatAt(patternPointer) == '%') {
patternPointer++;
}
return patternPointer == patternLen;
}
有个场景我们不得不去考虑,那就是回溯的情况,举个例子:
穷举法
查表法
状态模式
public void compile(final String pattern) {
...
LikeStateMachine machine = LikeStateMachine.build(pattern);
...
}
构建的过程就是我们把pattern解析加载的过程,我采用的方式是构建链表的方式。实现就是遍历构建的过程,compile时间复杂度O(n)
public LikeParserResult compile(final String pattern) {
return parseLikeExpress(pattern);
}
....
public boolean match(final String dest, LikeParserResult likeParserResult) {
switch (likeParserResult.getMatchResult()) {
case LikeMatchResult.TYPE.ALLMATCH:
return true;
case LikeMatchResult.TYPE.EQUALS:
return doEquals(dest, likeParserResult.getFinalPattern());
case LikeMatchResult.TYPE.STARTSWITH:
return doStartsWith(dest, likeParserResult.getFinalPattern());
case LikeMatchResult.TYPE.ENDSWITH:
return doEndsWith(dest, likeParserResult.getFinalPattern());
case LikeMatchResult.TYPE.CONTAINS:
return doContain(dest, likeParserResult.getFinalPattern());
default:
//或者别的实现
return likeParserResult.getLikeStateMachine().match(dest);
}
}
上面给出的代码是为了清楚的看到里面运行,最终根据自己擅长的代码风格(函数式Function,接口等),还可以进一步优化和精简。
...
public LikeParserResult compile(final String pattern) {
return parseLikeExpress(pattern);
}
public boolean match(final String dest, LikeParserResult likeParserResult) {
return likeParserResult.getMatcher().match(dest);
}
...
public class StartsWithMatcher implements Matcher {
...
public Boolean match(String dest) {
return dest.startsWith(this.pattern);
}
}
...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
||
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
微信赞赏支付宝扫码领红包
声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。侵权投诉:375170667@qq.com