Following the brilliant presentation of “Damian Conway” on how regular expression work, I have taken his pseudo-code and used my own example to try to understand it.
(see for reference the following 42:56) on
https://www.infoq.com/presentations/regex/
He conceptualises the 0 or more loop /X*/
as the following pseudocode
maxreps = Infinity;
loop {
repcount = 0;
for 1..maxreps {
str[matchpos] == 'X' or exit for;
matchpos++;
repcount++;
}
try {
// the rest of regex's commands here
}
catch Backtracking {
maxreps = repcount - 1;
maxreps >= 0 or throw Backtracking;
next loop;
}
finally {
exit loop;
}
}
For that I have taken my example /X*i/
on the string deXXXixia ho
Transcribing it to Javascript
const Backtracking = new Error("Backtracking exception");
function executeRegex(str, startPos) {
var returnState = false;
matchPos = startPos
while( matchPos <= str.length ) {
maxReps = Infinity;
while(true) {
var repCount = 0;
debugger;
for (let j = 0; j < maxReps; j++ ) {
if ( !(str[matchPos] == 'X') ) break ;
matchPos++ ;
repCount ++;
} try {
if ( !(str[matchPos] == 'i') ) throw Backtracking ;
matchPos++ ;
returnState = true;
} catch (e) {
if (! (e == Backtracking)) throw e; // Only catch Backtracking errors
maxReps = repCount - 1;
continue;
} finally {
matchPos++;
break;
}
}
}
return returnState; // Exhausting the whole string
}
console.log(executeRegex('deXXXoxia ho', 0));
//console.log(executeRegex('deXXXixia ho', 0));
Running the Node debugger with the following variables to watch
watch('startPos'); watch('matchPos'); watch('str[matchPos]'); watch('repCount'); watch('returnState')
I have an issue with the finally incrementing matchPos. Even if there is no match this will get incremented.
I know Damian didn’t add that matchPos++
after the finnally , but if I don’t add it it simply dont process the string if there is no initial Match.