I am trying to take a logical matching criterion, for example:
(("Foo" OR "Foo Bar" OR FooBar) AND ("test" OR "testA" OR "TestB")) OR TestZ
and apply it as a match against a file in Pig using
result = FILTER inputfields BY (text MATCHES '<some regex here>');
The problem is that I have no idea how to translate the boolean expression above into a regular expression for MATCHES.
I have tried a few different things, and the closest I have come is something like this:
((?=.*?\bFoo\b | \bFoo Bar\b))(?=.*?\bTestZ\b)
Any ideas? I would also like to do this conversion programmatically, if possible.
Some examples:
a - A quick brown Foo jumped over a lazy test (this should pass because it contains Foo and test)
b - something happens in TestZ (this should also pass because it contains TestZ)
c - a quick brown Foo jumped over a lazy dog (this should fail because it contains Foo, but not test, testA or TestB)
Thanks.
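To make the intended behavior concrete, here is a rough Python sketch of one way the conversion might be done. It assumes the boolean expression has already been parsed into nested tuples (the tuple format and the names to_regex, build_pattern and passes are made up for illustration), that terms must match as whole words, and that matching is case-sensitive. Each term becomes a lookahead, AND becomes concatenation, and OR becomes alternation. The trailing .* is there on the assumption that Pig's MATCHES, like Java's String.matches, has to match the whole string.

```python
import re

def to_regex(node):
    """Translate a nested boolean expression into a zero-width regex fragment."""
    if isinstance(node, str):
        # A term becomes a lookahead that finds the whole word or phrase anywhere.
        return r"(?=.*\b" + re.escape(node) + r"\b)"
    op, *children = node
    parts = [to_regex(child) for child in children]
    if op == "AND":
        # Consecutive lookaheads must all succeed at the same position.
        return "".join(parts)
    if op == "OR":
        # Alternation of zero-width fragments: any one of them may succeed.
        return "(?:" + "|".join(parts) + ")"
    raise ValueError(f"unknown operator: {op!r}")

def build_pattern(expr):
    # (?s) lets '.' cross newlines; the trailing '.*' consumes the text so
    # the pattern also works when the whole string has to match.
    return "(?s)" + to_regex(expr) + ".*"

# Hypothetical parsed form of:
# (("Foo" OR "Foo Bar" OR FooBar) AND ("test" OR "testA" OR "TestB")) OR TestZ
EXPR = ("OR",
        ("AND",
         ("OR", "Foo", "Foo Bar", "FooBar"),
         ("OR", "test", "testA", "TestB")),
        "TestZ")

PATTERN = build_pattern(EXPR)

def passes(text):
    """Return True if the whole text satisfies the boolean expression."""
    return re.fullmatch(PATTERN, text) is not None

if __name__ == "__main__":
    print(PATTERN)
    for line in ("A quick brown Foo jumped over a lazy test",
                 "something happens in TestZ",
                 "a quick brown Foo jumped over a lazy dog"):
        print(passes(line), "-", line)
```

With this sketch, examples a and b are accepted and example c is rejected. The generated pattern would still need to be checked against Pig itself, since Python's regex dialect and escaping rules are not identical to Java's.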