Given two strings A and B and a set N of strings, I need to write a regular expression that tests whether a given input string W contains a substring S satisfying all three of the following conditions: 1. S starts with A; 2. S ends with B; 3. no element of N occurs in the part between A and B (this part does not overlap A or B).
For example, I chose "ab" as A, "bc" as B, and ["a", "cb", "cd"] as N. If "ec" is the inner part, then "abecbc" is the string that satisfies all of the three conditions: if W contains such a substring, the regex must return true. My first variant is the following regex:
var T = /(?=ab.*bc)(?=(?!ab.*a.*bc))(?=(?!ab.*cb.*bc))(?=(?!ab.*cd.*bc))/;
I chose W = S = "abecbc". This regex works as expected:
T.test("abecbc");
// true
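To try the first variant with other values of A, B and N, the same pattern can be assembled programmatically. This is only a minimal sketch: the helper names escapeRegExp and buildLookaheadRegex are hypothetical, and the code simply mirrors the structure of T; it is not claimed to be correct for every input.

```javascript
// Hypothetical helper: escape regex metacharacters so that A, B and the
// elements of N are matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a regex with the same shape as T, i.e. one
// positive lookahead for "A ... B" plus one negative lookahead per
// forbidden string. (T wraps each negative lookahead in (?= ... );
// (?=(?!X)) asserts the same thing as (?!X), so the wrapper is dropped.)
function buildLookaheadRegex(a, b, forbidden) {
  var A = escapeRegExp(a);
  var B = escapeRegExp(b);
  var source = "(?=" + A + ".*" + B + ")";
  forbidden.forEach(function (n) {
    source += "(?!" + A + ".*" + escapeRegExp(n) + ".*" + B + ")";
  });
  return new RegExp(source);
}

var T2 = buildLookaheadRegex("ab", "bc", ["a", "cb", "cd"]);
console.log(T2.test("abecbc")); // true
console.log(T2.test("abacbc")); // false: "a" occurs between "ab" and "bc"
```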
But I am interested in the following problem: how to write a functionally equivalent regex without using the positive lookahead (?=) as the AND operator?
So my second variant is the following:
var R = /ab(?!.*?(?:a|cb|cd).*)bc/;
But R.test("abecbc") evaluates to false. So let us split R into three parts:
/ab(.*)/.test("abecbc") returns true.
Then /(.*)bc/.test("abecbc") returns true.
The inner part (i.e. the part between "ab" and "bc") is "ec". And /(?!.*?(?:a|cb|cd).*)/.test("ec") returns true, which is expected. So each of the three parts tests true on its own, and there are no more parts in R. Then why does /ab(?!.*?(?:a|cb|cd).*)bc/.test("abecbc") evaluate to false? And how can I write a correct regex that solves the problem described in the first paragraph of the post without using the positive lookahead (?=) as the AND operator?
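As a possible starting point for narrowing this down, the sketch below probes R with one more input. It relies on one general property of regexes: a lookahead is a zero-width assertion, so it inspects what follows the current position without consuming any characters. The extra input "abbc" is an assumption chosen for illustration.

```javascript
var R = /ab(?!.*?(?:a|cb|cd).*)bc/;

// The lookahead consumes nothing, so after matching "ab" the literal
// "bc" is attempted at the very next position.
console.log(R.test("abbc"));   // true: "bc" directly follows "ab"
console.log(R.test("abecbc")); // false, as reported in the question
```

If this reading is right, the negative lookahead and the literal bc are both tested at the same position rather than one after the other, which may be why testing the three parts separately gives a different picture from testing R as a whole.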