Splunk Search

Extremely slow search

erga00
Path Finder

I have an extremely slow search and I cannot understand why. I'd appreciate any pointers.

Hardware is not a problem, nor is the data volume. The search reaches 100% (in the web GUI) in 5-10 minutes, then sits at 100% for approximately 1 hour before it changes to the Finished state.

I suspect the culprit is the mvexpand command, because when I check the dispatch folder for the search I find ~20 files named mvexpand_1, mvexpand_2, etc., each approx 100-200MB in size. From what I can tell, Splunk is reading from and writing to these files the whole time. The longer the search duration, the more of these files are present and the longer this stage takes.

Is this normal for mvexpand or am I doing things in an inefficient manner?

Sample data (all on a single line):

<date> senderA@myCompany.com recipientA@myCompany.com;recipientB@myCompany.com;recipient@outside.com 1048576183

Search command:

... | makemv delim=";" recipient_address
| mvexpand recipient_address
| eval sender_msgtype=if(match(sender_address,"@myCompany.com$"),"internal","external")
| search sender_msgtype="internal"
| eval recipient_msgtype=if(match(recipient_address,"@myCompany.com$"),"internal","external")
| eval msgtype=recipient_msgtype
| bucket total_bytes span=1048576
| timechart span=1d usenull=f useother=t count(eval(msgtype="external")) by total_bytes where total_bytes>0

Running Splunk 4.1.6 on Windows 2008 R2

alacercogitatus
SplunkTrust

If it were me, I'd throw away the data I don't want before I start expanding things. I'm assuming you want a count over time of external emails sent by internal users. I don't think you need mvexpand: if there is even one external address in the recipient list, the email counts as "external", and expanding the results may count the same email several times (for example, when it goes to two or more external addresses).

sourcetype=WHATEVER sender_address="*myCompany.com*"
| makemv delim=";" recipient_address
| eval recipient_msgtype=if(match(recipient_address, "@myCompany.com"),"internal","external")
| bucket total_bytes span=1048576
| timechart span=1d usenull=f useother=t count(eval(recipient_msgtype="external")) by total_bytes where total_bytes>0

OR

sourcetype=WHATEVER
| eval sender_msgtype=if(match(sender_address,"@myCompany.com$"),"internal","external")
| search sender_msgtype="internal"
| makemv delim=";" recipient_address
| eval recipient_msgtype=if(match(recipient_address, "@myCompany.com"),"internal","external")
| bucket total_bytes span=1048576
| timechart span=1d usenull=f useother=t count(eval(recipient_msgtype="external")) by total_bytes where total_bytes>0
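
A further variation, offered as an untested sketch rather than a verified query: it keeps one event per message (so no mvexpand) but checks each recipient value explicitly with mvfilter, instead of relying on how match() treats a multivalue field. It assumes the field names from the question (sender_address, recipient_address, total_bytes) and that your Splunk version provides the mvfilter eval function. external_recipients is a field name made up for illustration, and WHATEVER is a placeholder sourcetype as above.

```spl
sourcetype=WHATEVER sender_address="*@myCompany.com"
| makemv delim=";" recipient_address
| eval external_recipients=mvfilter(NOT match(recipient_address, "@myCompany\.com$"))
| where isnotnull(external_recipients) AND total_bytes > 0
| bucket total_bytes span=1048576
| timechart span=1d usenull=f useother=t count by total_bytes
```

With this shape, a message with at least one external recipient should be counted once, however many external addresses it has. If you actually need a per-recipient count, mvexpand is still required, but filtering on the sender first should at least reduce the number of events it has to expand.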
