I'm troubleshooting problems with DB Connect and noticed something I'm trying to understand. I have a very basic dbquery running. On my prod system it takes anywhere from 15-20 seconds; on my dev system, which is much smaller, it takes about 2 seconds. They are physically in the same datacenter, and the environments are similar (both use SHP). I noticed these lines in my search.log:
07-30-2017 22:56:12.129 INFO script - Writing search results info to /searchPool/var/run/splunk/dispatch/1501469770.1373.lrtp449/externSearchResultsInfo.csv
07-30-2017 22:56:27.029 INFO script - Invoked script dbquery with 724 input bytes (0 events). Returned 3159 output bytes in 14898 ms.
It appears to take about 15 seconds to write out the results, which are tiny (about 3 KB). I spoke to my NAS engineers, and they informed me that, from what they could see (attached image), everything is performing very well; IOPS are well beyond requirements. I've compared the configs of the two DB Connect installs, and they are identical. Baffled. The issue appears to be isolated to DB Connect. I haven't received any complaints otherwise, but my customers notice the difference between the two systems.
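For what it's worth, the gap between those two timestamps works out to almost exactly the 14898 ms the script reports, which you can verify with a couple of lines of Python (the timestamps are copied from the log lines quoted above):

```python
from datetime import datetime

# Timestamps copied from the two search.log lines above.
wrote_info = datetime.strptime("07-30-2017 22:56:12.129", "%m-%d-%Y %H:%M:%S.%f")
script_done = datetime.strptime("07-30-2017 22:56:27.029", "%m-%d-%Y %H:%M:%S.%f")

gap_ms = (script_done - wrote_info).total_seconds() * 1000
print(f"{gap_ms:.0f} ms")  # 14900 ms, in line with the reported 14898 ms
```

So nearly the entire 15-second runtime sits between the "Writing search results info" message and the script-completion message.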
I'm not privy to the technical underpinnings, but I have noticed that certain types of queries switch to status "finalizing" long before they are actually done with the search.
It may be that the "Writing search results" message has no bearing on when the actual writing of the results begins; it might just be logged when the system decides where it is going to put the eventual results.
You could test that theory on your dev system: run a quick query that creates a large number of results, and a more arduous query that creates a single result, and see which has the bigger lag between the message and completion. That would tell you whether to focus on the search side or the OS/write side of the system.
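One way to run that comparison: measure the gap in each job's own search.log. A minimal sketch, assuming the dispatch layout and log messages shown in the question (the marker strings and dispatch paths are assumptions to adjust for your environment):

```python
import os
from datetime import datetime

def dispatch_lag(dispatch_dir,
                 start_marker="Writing search results info",
                 end_marker="Returned"):
    """Read search.log in a dispatch directory and return the seconds between
    the 'writing results' message and the script-completion message,
    or None if either line is missing."""
    def parse(line):
        # search.log lines start with 'MM-DD-YYYY HH:MM:SS.mmm'
        return datetime.strptime(" ".join(line.split()[:2]),
                                 "%m-%d-%Y %H:%M:%S.%f")
    start = end = None
    with open(os.path.join(dispatch_dir, "search.log")) as log:
        for line in log:
            if start is None and start_marker in line:
                start = parse(line)
            elif end is None and end_marker in line:
                end = parse(line)
    if start is None or end is None:
        return None
    return (end - start).total_seconds()

# Compare the two test jobs (dispatch paths here are hypothetical placeholders):
# for job in ("/searchPool/var/run/splunk/dispatch/<many_results_sid>",
#             "/searchPool/var/run/splunk/dispatch/<one_result_sid>"):
#     print(job, dispatch_lag(job))
```

If the lag tracks result volume, look at the write path; if it tracks search effort, look at the search side.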
Is the search itself getting slammed trying to apply a lot more config (a bigger bundle)? In other words, maybe the lab has fewer apps and global knowledge objects to parse, while the prod environment carries more of that kludge.
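A quick way to check whether prod really carries much more config than dev is to tally the apps and .conf files on each. A rough sketch, assuming a standard etc/apps layout (the path is an assumption; point it at your own install):

```python
import os

def conf_footprint(apps_dir):
    """Walk a Splunk etc/apps-style directory and tally apps, .conf files,
    and total .conf bytes, as a rough proxy for how much config the
    search has to carry."""
    apps = set()
    conf_files = 0
    conf_bytes = 0
    for root, _dirs, files in os.walk(apps_dir):
        rel = os.path.relpath(root, apps_dir)
        if rel != ".":
            apps.add(rel.split(os.sep)[0])
        for name in files:
            if name.endswith(".conf"):
                conf_files += 1
                conf_bytes += os.path.getsize(os.path.join(root, name))
    return {"apps": len(apps), "conf_files": conf_files, "conf_bytes": conf_bytes}

# Run the same tally on prod and dev and compare (path is hypothetical):
# print(conf_footprint("/opt/splunk/etc/apps"))
```

A big difference in the totals would support the "more cruft in prod" theory; similar totals would point you back at the search or write path.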
I'd check the Job Inspector for the searches people complain about and see if anything in there looks offensive. Also, this type of performance issue can take enough back-and-forth to pinpoint that it might be worth a support case, where someone can WebEx with you and find out for sure.