Splunk Search

Combining two regex searches into a single one breaks the output | Shows the complete event instead of the regex-filtered text

sysamit
Engager

I have an index, cloud_stats, for which I need to create a daily report of error counts by source, so that we can work on the errors with the highest counts first and then the others. I have four error keywords to search for:

  1. DailyErrorLogtTrigger
  2. Check if cloud return error or not
  3. Tracking Error
  4. Maualerrormessage

For 1 and 2, I created the search below, which works perfectly.

index="cloud_stats" "*ERROR*" 
| search (  "*DailyErrorLogtTrigger*" OR "*check if Cloud return error or not*" ) 
| rex field=_raw "INFO(.+(?=Step2)|[\s>]+)(?<Error>.+)"
| table index source Error
| stats count(Error) AS COUNT BY index source Error
| sort -COUNT


Output :

sysamit_0-1610471612055.png

For 3 and 4, I created the search below, which also works perfectly.

index="cloud_stats" "*ERROR*" 
| search ( "*Tracking Error*" OR "*Maualerrormessage*" )
| rex field=_raw "(?<Error>DisplayName[^,]+Tracking[^,]+)"
| rex field=_raw "ManualStatsInfo.*,(?<Error>.*)"
| table index source Error
| stats count(Error) AS COUNT BY index source Error
| sort -COUNT

Output :

sysamit_1-1610471795413.png

The issue arises when I combine both of the above searches into one, like below.

index="cloud_stats" "*ERROR*" 
| search ( "*Tracking Error*" OR "*Maualerrormessage*" 
            OR  "*DailyErrorLogtTrigger*" OR "*check if Cloud return error or not*"
        ) 
| rex field=_raw "(?<Error>DisplayName[^,]+Tracking[^,]+)"
| rex field=_raw "ManualStatsInfo.*,(?<Error>.*)"
| rex field=_raw "INFO(.+(?=Step2)|[\s>]+)(?<Error>.+)"
| table index source Error
| stats count(Error) AS COUNT BY index source Error
| sort -COUNT

 
What happens after combining the two searches:

  1. Warning part - I initially get an error from the rex command, pointing to the rex taken from the first search.
  2. Frustrating part - For some reason, in some cases the second search (on Tracking Error) starts to output the complete event (red box in the output) instead of only the filtered keywords (green box in the output).

Output :

sysamit_3-1610473571682.png

Those events are a page long, and I don't want to build a report over hundreds of events where each one is a page long. That would make the error trend analysis more cumbersome. Can anyone please help me get only the text matched by the regex (as in the green box) instead of the complete event (as in the red box)?


scelikok
Champion

Hi @sysamit,

When you combine both searches, the rex commands run against the whole dataset. I believe both problems are caused by the rex limit error. I suggest optimizing your regexes so that they match in fewer steps. It would be best to collect a larger sample of all four log types and tune the regexes against that sample. Since there is no sample event data, I cannot make more specific suggestions.

If this reply helps you, an upvote is appreciated.


sysamit
Engager

Thanks @scelikok for your suggestions. I also figured out that the garbled output was due to combining the two searches: the regexes from both searches now run on the whole dataset.
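
For anyone hitting the same thing, here is a minimal sketch of one way to keep the extractions from interfering with each other. It assumes the full events show up because a later rex overwrites Error whenever its pattern also matches an event that an earlier rex was meant for. The field names TrackingError, ManualError, and InfoError are hypothetical, and the regexes are the ones from the searches above, unchanged and not verified against real events.

```
index="cloud_stats" "*ERROR*"
| search ( "*Tracking Error*" OR "*Maualerrormessage*"
            OR "*DailyErrorLogtTrigger*" OR "*check if Cloud return error or not*" )
``` Each rex writes to its own (hypothetical) field so none of them can overwrite another ```
| rex field=_raw "(?<TrackingError>DisplayName[^,]+Tracking[^,]+)"
| rex field=_raw "ManualStatsInfo.*,(?<ManualError>.*)"
| rex field=_raw "INFO(.+(?=Step2)|[\s>]+)(?<InfoError>.+)"
``` Take the first extraction that matched; the broad INFO pattern is tried last ```
| eval Error=coalesce(TrackingError, ManualError, InfoError)
| stats count AS COUNT BY index source Error
| sort -COUNT
```

coalesce returns its first non-null argument, so the argument order decides which extraction wins when more than one pattern matches the same event. This only addresses the full-event output. The rex limit warning would still need the regex tuning suggested above. The triple-backtick inline comments need a reasonably recent Splunk version and can simply be removed on older ones.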
