Splunk Search

Why is subsearch not working with regex?

asaphappy
New Member

I'm attempting to find file downloads within a 2 minute timespan following a browser being spawned from Outlook (my subsearch). Everything works fine (the search and subsearch) until I add the regex command limiting the file path to the Downloads folder.

I'm getting the error "Error in 'SearchOperator:regex': Usage: regex <field> (=|!=) <regex>."

Can anyone help me understand why the regex command is throwing it off? I think it's because it's taking the subsearch as part of the regex syntax but I don't know how to separate the two. 

Search:

index=random_index event_simpleName=*FileWritten
| regex TargetFileName="^[\WD]\w*\S*\W(?:Users)\W\w+\.\w+\W(?:Downloads)\W\w+"
[search index=random_index* sourcetype=stuff event_simpleName=ProcessRollup* ParentBaseFileName=OUTLOOK.EXE ImageFileName IN (*firefox* *chrome* *edge*) CommandLine IN (*sharepoint.com*) NOT CommandLine IN (*vendor*)
| rename _time AS earliest
| eval latest=relative_time(_time,"+5min@min")
| table aid earliest latest
| format]
| table _time aid TargetFileName 

0 Karma

woodcock
Esteemed Legend

The subsearch, as written, must be an argument to "| search", so try this:

index=random_index event_simpleName=*FileWritten
| regex TargetFileName="^[\WD]\w*\S*\W(?:Users)\W\w+\.\w+\W(?:Downloads)\W\w+"
| search [search index=random_index* sourcetype=stuff event_simpleName=ProcessRollup* ParentBaseFileName=OUTLOOK.EXE ImageFileName IN (*firefox* *chrome* *edge*) CommandLine IN (*sharepoint.com*) NOT CommandLine IN (*vendor*)
| rename _time AS earliest
| eval latest=relative_time(earliest,"+5min@min")
| table aid earliest latest
| format]
| table _time aid TargetFileName
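
To see why the original placement triggers the usage error, it can help to picture what the bracketed subsearch turns into before the outer search runs. The following is a minimal Python sketch, not Splunk code: `format_rows` is a hypothetical helper that approximates the default `| format` output (fields joined with AND, rows joined with OR), the `aid` and epoch values are made up, and the exact quoting Splunk produces may differ.

```python
def format_rows(rows):
    """Approximate the default `| format` output: fields AND-ed, rows OR-ed."""
    groups = []
    for row in rows:
        terms = " AND ".join(f'{field}="{value}"' for field, value in row.items())
        groups.append(f"( {terms} )")
    return "( " + " OR ".join(groups) + " )"


# Hypothetical subsearch result: one host and a five-minute window (epoch seconds).
rows = [{"aid": "abc123", "earliest": "1681248000", "latest": "1681248300"}]
subsearch_text = format_rows(rows)

# Original layout: the expanded text is appended to the regex command's arguments.
broken = '| regex TargetFileName="<pattern>" ' + subsearch_text

# Suggested layout: the expanded text becomes the argument of its own search command.
fixed = '| regex TargetFileName="<pattern>" | search ' + subsearch_text

print(broken)
print(fixed)
```

In the first string the expanded conditions follow the regex argument, which does not fit the `regex <field> (=|!=) <regex>` usage from the error message. In the second they are handed to `search`, which accepts field comparisons.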

yeahnah
Motivator

OK cool, I did not know that.

0 Karma

yeahnah
Motivator

Hi @asaphappy 

The regex command only filters results: it keeps events that match (=) or do not match (!=) the regular expression. Try removing the non-capturing group syntax and see if it helps, i.e.

| regex TargetFileName="^[\WD]\w*\S*\WUsers\W\w+\.\w+\WDownloads\W\w+"

If you are looking to use capture groups to pull fields out, then use the rex command instead.

Hope that helps

0 Karma

yeahnah
Motivator

Ah yes, I had a closer look at your SPL query and see what you mean (hint: use the Insert/Edit code sample button when adding SPL, as it helps readability).


Anyway, as you suspected, the regex should come after the subsearch, which I suspect is supposed to be a filter for the base search. So something like this:

index=random_index event_simpleName=*FileWritten [search index=random_index* sourcetype=stuff event_simpleName=ProcessRollup* ParentBaseFileName=OUTLOOK.EXE ImageFileName IN (*firefox* *chrome* *edge*) CommandLine IN (*sharepoint.com*) NOT CommandLine IN (*vendor*)
  | rename _time AS earliest
  | eval latest=relative_time(earliest,"+5min@min")
  | table aid earliest latest
  | format ]
| regex TargetFileName="^[\WD]\w*\S*\W(?:Users)\W\w+\.\w+\W(?:Downloads)\W\w+"
| table _time aid TargetFileName

 

0 Karma

asaphappy
New Member

Sorry, this is my first time posting. I'll make sure to do that next time. 

I had already tried your suggestion (moving the regex to after the subsearch), and the search returned only the base search results, without the subsearch filter applied. So what I see is all of the downloaded files for every user, when it should only be for the small subset of hosts that were seen spawning a browser from Outlook.

0 Karma

ITWhisperer
SplunkTrust

Can you share some anonymised examples of events you would expect to keep and events you would expect the regex to exclude? Please share them in a code block </> so we can copy them to test solutions with.

0 Karma

asaphappy
New Member

Sure! 

 

Events to keep:

\Device\HarddiskVolume3\Users\jill.michaels\Downloads\46.pdf

\Device\HarddiskVolume3\Users\funny.bunny\Downloads\randomclientform.pdf

\Device\HarddiskVolume3\Users\miley.cyrus\Downloads\data\uber.jar

 

Events to filter out:

\Device\HarddiskVolume3\Users\random.user\AppData\Local\Temp\screenshot11913941210533618901.png

\Device\HarddiskVolume3\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroTextExtractor.exe
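
As a quick sanity check outside Splunk, the pattern from the search can be run against the sample paths above. This is a minimal Python sketch, assuming Python's `re` module behaves like Splunk's PCRE engine for this particular pattern; the names `PATTERN`, `keep`, `drop`, and `matches` are illustrative only.

```python
import re

# Pattern copied from the search in this thread; the raw string keeps backslashes intact.
PATTERN = re.compile(r"^[\WD]\w*\S*\W(?:Users)\W\w+\.\w+\W(?:Downloads)\W\w+")

# Sample paths that should be kept.
keep = [
    r"\Device\HarddiskVolume3\Users\jill.michaels\Downloads\46.pdf",
    r"\Device\HarddiskVolume3\Users\funny.bunny\Downloads\randomclientform.pdf",
    r"\Device\HarddiskVolume3\Users\miley.cyrus\Downloads\data\uber.jar",
]

# Sample paths that should be filtered out.
drop = [
    r"\Device\HarddiskVolume3\Users\random.user\AppData\Local\Temp\screenshot11913941210533618901.png",
    r"\Device\HarddiskVolume3\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroTextExtractor.exe",
]


def matches(path: str) -> bool:
    """Return True if the pattern is found in the path."""
    return PATTERN.search(path) is not None


for path in keep + drop:
    print("keep" if matches(path) else "drop", path)
```

Under that assumption, the pattern separates the two groups as intended, which suggests the error comes from where the regex command sits in the search rather than from the expression itself.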
0 Karma

ITWhisperer
SplunkTrust

These events seem to be missing a number of significant fields: event_simpleName, ParentBaseFileName, ImageFileName, CommandLine, _time, aid

0 Karma

asaphappy
New Member

Thanks for the reply. 

That regex string actually works -- I tried the primary search alone and it did pull back all the results I was looking for. I did attempt to change the regex to the method you suggested but that still gave me the same error. 

0 Karma