I am writing a piece of code (a custom EntityProcessor for Solr, though that isn't very relevant) that breaks input into lines on a string delimiter (toMatch below). Characters are read from a stream and passed one at a time to addChar. Look-ahead isn't allowed: the break has to happen as soon as the last character of a match is passed in. I wrote the two implementations below, addChar and addChar2. addChar2 is the simple, naive one: it keeps a buffer of the most recent characters and compares it against the delimiter on every call. addChar is a simple state machine that tracks the partial matches in progress, and it is slightly faster. Does anyone have better ideas for optimizing this code? Thanks!
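import java.util.Iterator;
import java.util.LinkedList;

public class CharStreamMatcher {
    protected char[] toMatch = null;

    // Sliding window of the last toMatch.length characters (used by addChar2).
    LinkedList<Character> li = new LinkedList<Character>();

    // Progress of every partial match currently in flight (used by addChar).
    LinkedList<MutableInteger> correspondences = new LinkedList<MutableInteger>();

    public CharStreamMatcher(char[] match) {
        toMatch = match;
        // Pre-fill the window with NUL characters so it always holds toMatch.length entries.
        for (int i = 0; i < match.length; i++) li.add((char) 0);
    }

    public void clear() {
        correspondences.clear();
        // Reset the window too, so stale characters cannot complete a match in addChar2.
        li.clear();
        for (int i = 0; i < toMatch.length; i++) li.add((char) 0);
    }

    // Buffer approach: shift the new character into the window and compare it to toMatch.
    public boolean addChar2(char c) {
        li.addLast(c);
        li.removeFirst();
        int i = 0;
        for (Character ch : li) {
            if (ch.charValue() != toMatch[i++])
                return false;
        }
        return true;
    }

    // Mutable counter: index of the last delimiter character matched by one partial match.
    private static class MutableInteger {
        public int i;

        public MutableInteger(int i) {
            this.i = i;
        }
    }

    // State-machine approach: advance every partial match by one character.
    public boolean addChar(char c) {
        boolean result = false;
        // A character equal to the first delimiter character starts a new candidate match.
        if (c == toMatch[0])
            correspondences.add(new MutableInteger(-1));
        Iterator<MutableInteger> it = correspondences.iterator();
        while (it.hasNext()) {
            MutableInteger mi = it.next();
            mi.i++;
            // check the match
            if (c != toMatch[mi.i]) {
                it.remove();
            }
            // are we done?
            else if (mi.i == toMatch.length - 1) {
                result = true;
                it.remove();
            }
        }
        return result;
    }
}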
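
For context, here is a minimal sketch of how a matcher like this might be driven to split a character stream, with the break happening as soon as the last delimiter character arrives. It is self-contained, so it does not use CharStreamMatcher itself: PrefixMatcher is a hypothetical stand-in that follows the same idea as addChar but tracks the partial matches in a boolean array, and DelimiterSplitDemo and split are names made up for this illustration. It assumes a non-empty delimiter and makes no claim about relative speed.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DelimiterSplitDemo {

    // Hypothetical stand-in for the state-machine matcher (not the class above).
    static final class PrefixMatcher {
        private final char[] toMatch;
        // alive[k] is true when the last k characters seen equal toMatch[0..k).
        private final boolean[] alive;

        PrefixMatcher(String delimiter) {
            if (delimiter.isEmpty()) {
                throw new IllegalArgumentException("delimiter must not be empty");
            }
            toMatch = delimiter.toCharArray();
            alive = new boolean[toMatch.length + 1];
            alive[0] = true; // the empty prefix always matches
        }

        // Returns true exactly when c completes an occurrence of the delimiter.
        boolean addChar(char c) {
            // Walk downwards so alive[k - 1] still holds the value from the previous call.
            for (int k = toMatch.length; k >= 1; k--) {
                alive[k] = alive[k - 1] && c == toMatch[k - 1];
            }
            return alive[toMatch.length];
        }

        // Forget all partial matches, e.g. after a break has been emitted.
        void clear() {
            Arrays.fill(alive, 1, alive.length, false);
        }
    }

    // Reads one character at a time (no look-ahead) and cuts a record at each delimiter.
    public static List<String> split(Reader in, String delimiter) {
        PrefixMatcher matcher = new PrefixMatcher(delimiter);
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            int ch;
            while ((ch = in.read()) != -1) {
                current.append((char) ch);
                if (matcher.addChar((char) ch)) {
                    // Drop the delimiter from the tail of the record.
                    current.setLength(current.length() - delimiter.length());
                    records.add(current.toString());
                    current.setLength(0);
                    // Clear so an overlapping occurrence cannot reuse consumed characters.
                    matcher.clear();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current.length() > 0) {
            records.add(current.toString()); // trailing record without a delimiter
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(split(new StringReader("one||two||three"), "||"));
        // prints [one, two, three]
    }
}
```

Calling clear() after each break matters in this sketch: without it, a delimiter such as "aa" would match again on the third character of "aaa", using a character that already belongs to the previous break.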