When the input exceeds a certain length, some rex matches fail while others still succeed. Consider the following emulation:
index=none |stats count
| eval count=mvappend("short","long") | mvexpand count
| eval emul="Begin:foo}} bar "
+ if(count="short",mvjoin(mvrange(10000,15000),"-"),
mvjoin(mvrange(10000,20000),"-"))
| rex field=emul "Begin:(?<DATA>.*)"
| rex field=emul "Begin:.+}} (?<DATA>.+)"
| eval DATA=replace(DATA,"[\d-]+","/stuff")
| fields - emul
Note: emul is populated using numerals between 10000 and 20000. The first event contains a string of roughly 5,000 x 6 bytes = 30,000 bytes, whereas the second event contains a string of roughly 10,000 x 6 bytes = 60,000 bytes. Because the two events are structurally identical, I expect both to produce bar /stuff. That is the case when the second mvrange() goes up to mvrange(10000,19000), or ~9,000 x 6 bytes = 54,000 bytes. Not much longer than that, the output becomes:
count DATA
short bar /stuff
long foo}} bar /stuff
In other words, the second rex fails to take effect. (The last two commands simply shorten the output and have no effect on whether the second rex fails.) I see no error in the job inspector or anywhere else. In my real-world search, complicated subsequent commands, including rex, do not appear to be affected, even after the equivalent of "Begin:.+}} (?<DATA>.+)" fails.
Is this a bug, or is there some parameter I need to tweak? What is special about "Begin:.+}} (?<DATA>.+)"?
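For anyone who wants to double-check the sizes quoted in the note above without running Splunk, here is a minimal Python sketch. It assumes that mvrange(start, end) excludes end and that every character is one byte; the helper name emul_length is made up for illustration and is not a Splunk function.

```python
PREFIX = "Begin:foo}} bar "

def emul_length(start: int, end: int) -> int:
    """Length of the emulated field for mvjoin(mvrange(start, end), "-")."""
    # range() excludes `end`, mirroring the assumed mvrange() behaviour.
    joined = "-".join(str(n) for n in range(start, end))
    return len(PREFIX + joined)

for label, end in [("short", 15000), ("last working", 19000), ("long", 20000)]:
    print(label, emul_length(10000, end))
```

Under those assumptions the three lengths come out within a few dozen characters of the 30,000, 54,000 and 60,000 byte estimates, so the rough figures are good enough for locating the threshold.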
So it is unrelated to the curly brackets, but is triggered by left-aggressiveness (a greedy .+) before the field extraction. If I add left-aggressiveness to the first rex, the first rex fails on the long string, too. E.g.,
| rex field=emul "Begi.+:(?<DATA>.*)"
| rex field=emul "Begin:.+}} (?<DATA>.+)"
Output becomes
count DATA
short bar /stuff
long
If the first rex is removed, the second rex indeed extracts no data with the long string.
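One way to picture what the failing patterns have in common: a greedy .+ first runs to the end of the field and then gives characters back one at a time until the text that follows it in the pattern can match. The Python sketch below only counts how far that give-back has to travel for the emulated strings. It is a simplified model of backtracking, not Splunk's regex engine, and whether Splunk caps this kind of work is an assumption that is not confirmed here. The function name giveback_distance is hypothetical.

```python
def giveback_distance(text: str, literal: str) -> int:
    """Characters a greedy .+ must give back before `literal` can match.

    Simplified model: the greedy part first consumes everything up to the
    end of `text`, then retreats to the last occurrence of `literal`.
    """
    last = text.rfind(literal)
    if last < 0:
        raise ValueError("literal not found in text")
    return len(text) - last

prefix = "Begin:foo}} bar "
short = prefix + "-".join(str(n) for n in range(10000, 15000))
long_ = prefix + "-".join(str(n) for n in range(10000, 20000))

# "}} " is the literal that follows .+ in "Begin:.+}} (?<DATA>.+)"
print("short:", giveback_distance(short, "}} "))
print("long: ", giveback_distance(long_, "}} "))
```

In this model the distance grows with the length of the field, because the literal sits near the start of the string. That is at least consistent with the failure depending on input size and on having a greedy .+ ahead of the capture group.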