Splunk Search

How can I ignore additional data from a field?

iMarko
Engager

Hi, I'm writing a Splunk query to find emails with specific file types attached.

I have the regex working: it pulls the file names and also extracts the file extensions, which I'll use for data collection later. I then want to use the extracted file extension to search for and return specific emails containing files with that extension (hope that makes sense).

The problem is that when I use

|where FileExtension=".doc"

I get events that contain a .doc file, which is fine, but each event also shows all the other attached files, which I do not want.

For example, I want my output to be:

sender    recipient    file.doc

But what I am getting is:

sender    recipient    file.doc
                       file.a
                       file.b
                       file.c
                       file.d

Is there any way to do some kind of exclusive search that ignores the values in the file field that are not .doc files, as they are of no interest to me at the moment?


tshah-splunk
Splunk Employee

Hey @iMarko,

You can try the mvexpand command as shown below and then filter on the FileExtension field. I'm assuming the third column is the file name and that the extension is extracted into a separate field, which is the one your where condition uses. A reference query follows:

| mvexpand fileName
| where FileExtension=".doc"
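
To illustrate what mvexpand does here, the following is a minimal, self-contained sketch built on makeresults. The addresses, file names, and the Files and Extension field names are made-up stand-ins, not taken from the original data. It splits one event with a multivalue field into one event per file, extracts the extension after the expansion so that each event carries exactly one file and its own extension, and then filters:

```
| makeresults
| eval sender="alice@example.com", recipient="bob@example.com"
| eval Files=split("file.doc,file.a,file.b,file.c,file.d", ",")
| mvexpand Files
| rex field=Files "(?<Extension>\.[^.]+)$"
| where Extension=".doc"
| table sender recipient Files
```

If the extension field was already extracted before the mvexpand, it stays multivalued on every expanded event, so extracting it after the expansion (as assumed above) keeps the where comparison simple.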

 

---
If you find the answer helpful, an upvote/karma is appreciated

iMarko
Engager

Hi, thanks for your help. This nearly worked and got me 99% of the way to the solution I needed.

I had to mvexpand both my Files and Extension fields and then use search instead of where. In case anyone comes across this while searching online (I spent a good while searching before posting), here is the solution:

| rex field=filenameandurl "(?<Files>mess of regex)"  -- to get a list of all files in the email
| rex field=Files "(?<Extension>mess of regex)" -- to get just the file extensions (for counting purposes)
| mvexpand Extension
| mvexpand Files
| search Extension=".doc" AND Files="*.doc"

which worked perfectly
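
A note for later readers: expanding Extension and Files independently produces every Files/Extension combination for an event before the search narrows it down, so an email with several .doc attachments can come back as repeated rows. One possible alternative, sketched here with made-up sample values and not tested against the original data, is to filter the multivalue field in place with mvfilter and skip the expansion entirely:

```
| makeresults
| eval sender="alice@example.com", recipient="bob@example.com"
| eval Files=split("file.doc,file.a,file.b,file.c,file.d", ",")
| eval Files=mvfilter(match(Files, "\.doc$"))
| where mvcount(Files) > 0
| table sender recipient Files
```

The where line drops events that have no matching file left. Because the pattern is anchored with $, it matches names ending in .doc but not .docx; adjust the regex if both are wanted.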
