Hello,
I am facing challenges with search queries in a Splunk 6.4.1 environment; search performance is very slow.
We have 1 search head, 2 indexers, 1 deployment server, and 1 license master server.
Please help with how we can improve performance and make search queries faster.
Thanks,
Sahil
There have been some great comments and work by aakwah, adonio and cmerriman. Allow me to recap for the audience, then just take a stab at a solution (or at least problem identification).
Your searches are slow. The search
index=_internal earliest=-60m | stats count by sourcetype
takes a minute or more to return when run over a one hour period. That's about 250,000 events.
For reference, my setup at work is similar in topology (I have several SHs, but otherwise only a pair of clustered indexers) and seems fast enough to me. It does that same search in the following amount of time when I run it in verbose mode:
This search has completed and has returned 33 results by scanning 734,337 events in 41.769 seconds
When I run it in fast mode (upper right corner of the search window, just under the time selector), it reports
This search has completed and has returned 33 results by scanning 734,720 events in 3.227 seconds
That's more than an order of magnitude faster.
So that's one simple optimization: if you are running all searches in verbose mode, try switching to fast or smart mode. (Smart mode is somewhere between the two - often it's nearly as fast as fast mode.)
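As an aside, if any of the slow searches are saved/scheduled ones, the search level can also be pinned per search in savedsearches.conf. A minimal sketch with a made-up saved search name; I believe the relevant setting is dispatch.adhoc_search_level, but check the savedsearches.conf spec for your version:

[Hourly sourcetype counts]
search = index=_internal earliest=-60m | stats count by sourcetype
# assumption: "fast" trades field discovery for speed, same as the UI toggle
dispatch.adhoc_search_level = fast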
Secondly, I'd GUESS that you are still hitting a system bottleneck. You haven't provided enough information to know which bottleneck, but usually it's one of two things: IOPS on the indexer or CPU on the various machines.
IOPS is usually the culprit. What disks do you have under your indexers? If it's marginal (under 800 or 1000 IOPS) then that's likely the rest of your problem. In order for spinning disks to provide 1000 IOPS, you'd be looking at more than 8x 15,000 RPM disks in RAID 10, and probably 20 or more 7200 RPM disks in R10. If you have fewer than that per server, or if anything's in R5 or god forbid R6, it's highly likely that's the issue. If you have SSDs, well, you shouldn't have any serious IOPS issues. But "shouldn't have" doesn't mean "don't have" and I'd check stuff anyway.
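If you want to put an actual number on your IOPS rather than inferring it from disk counts, a random-read test is one way. Here's a sketch using fio, assuming it is installed and that your indexes live under /opt/splunk/var/lib/splunk (adjust the directory to your index volume, and don't run it while users are searching, since it will hammer the disks):

# 4k random reads against the index volume for 60 seconds
fio --name=splunk-randread --directory=/opt/splunk/var/lib/splunk \
    --rw=randread --bs=4k --size=1g --numjobs=1 --iodepth=32 \
    --ioengine=libaio --direct=1 --runtime=60 --time_based
# compare the reported read "iops" figure against the
# 800-1000 IOPS guideline above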
Otherwise, spend some time watching the 'top' utility on the indexers while you run searches. Watch the CPU and disk times. The iostat utility from the sysstat package can also be very helpful.
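For example, run something like this on each indexer while one of the slow searches executes (exact columns vary a little between sysstat versions):

# extended per-device statistics, every 2 seconds, 10 samples
iostat -x 2 10
# watch %util (pegged near 100% means the device is saturated) and
# await (sustained double-digit milliseconds points at an I/O bottleneck)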
If you find you have an issue and would like help fixing it (or confirming what you should do about anything you've found), please help us by compiling the information you've found into a nice summary and pasting it in. If all you explain is "I ran iostat and it says my disks are slow", then we'll probably only be able to say "buy faster disks". If you instead tell us you have six 7200 RPM 4 TB disks in R5 on an HP 840 controller with 2 GB of RAM, then we can give you specific advice. The answer still may be "buy more/faster disks", but at least then we can suggest "intermediate" solutions that may help significantly without being too expensive.
Can you give any detail about how much data you are searching through? Are there ways we can help make the searches more efficient, such as syntax or changing the time window? Is it one search, a few searches, or all searches?
Is it Linux or Windows?
They are Linux servers.
There are plenty of things to check here: machine specs, THP, ulimits. Also check internally: what does your CPU usage look like?
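For reference, here is roughly what I would check, run as the user splunkd runs as (Splunk's docs recommend THP disabled and a much higher open-files limit than the typical 1024 default):

# Transparent Huge Pages - both should show [never] for Splunk
# (path may be /sys/kernel/mm/redhat_transparent_hugepage on older RHEL)
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
# ulimits for the splunk user - open files and max user processes
ulimit -n
ulimit -u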
I have the Splunk on Splunk app, which I am using to check CPU usage, and disk space is fine.
MemTotal: 32871212 kB
MemFree: 9258220 kB
Buffers: 1049452 kB
Cached: 13493864 kB
Hello,
Do you have the same behavior with all sourcetypes?
I had a similar case before with the Bluecoat default app, and after a lot of troubleshooting I found that the regexes used for field extractions at search time were the reason.
After I switched to delimiter-based field extractions (the delimiter was a space in the case of Bluecoat logs), the slow search performance disappeared.
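A minimal sketch of what that change looks like, with a made-up sourcetype name and field list (the DELIMS/FIELDS mechanism in transforms.conf is what replaces the regex):

In props.conf:

[bluecoat]
REPORT-bluecoat_fields = bluecoat_delims

In transforms.conf:

[bluecoat_delims]
# split on spaces instead of running a regex per field
DELIMS = " "
FIELDS = date, time, time_taken, c_ip, cs_method, sc_status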
Hope this helps.
Regards
Hello,
It is for all search queries and sourcetypes. Please let me know how I can remove the regex extractions for all fields.
Thanks,
Sahil
Field extraction configurations live in props.conf and transforms.conf on the search head; you will find all the regexes there, if any. Each sourcetype should have its own stanza in props.conf.
But as long as this issue is affecting all sourcetypes, it is a global issue and not related to any one sourcetype's field extractions.
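A quick way to see every extraction that applies to a given sourcetype, and which app it comes from, is btool (yoursourcetype below is a placeholder):

$SPLUNK_HOME/bin/splunk btool props list yoursourcetype --debug
# --debug prefixes each line with the file it came from, so you can
# spot expensive EXTRACT-/REPORT- regexes and where they are defined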
Regards
Try to concentrate on what happened 3 days ago...
Try also searching for warnings and errors in the _internal index.
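Something like this should surface the noisy components (log_level and component are extracted fields on Splunk's own splunkd logs):

index=_internal (log_level=ERROR OR log_level=WARN) earliest=-24h
| stats count by component, log_level
| sort - count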
Can you try to be more specific? What exactly is slow? How long does a basic search take?
Try: index=_internal | stats count by sourcetype
Run it in fast mode over the last 60 minutes.
It takes more than a minute to return results, which is longer than a normal search:
sourcetype                          count
aws:cloudtrail:log                   1007
aws:cloudwatch:log                  26088
aws:cloudwatchlogs:log              79024
aws:config:log                        734
aws:description:log                  1955
aws:s3:log                           6124
mongod                                129
nfs0000000009669cb                    162
scheduler                             174
splunk-powershell.ps-2               1932
splunk-powershell.ps-too_small        722
splunk_ta_aws_proxy_conf-2            182
splunk_user_realnames                  86
splunkd                           6838386
splunkd_access                     124312
splunkd_conf                            2
splunkd_remote_searches               424
splunkd_stderr                          5
ta_box-3                                1
ta_box-4                              490
ta_frmk-5                             288
Thanks,
Sahil
Again, there are plenty of points to check here. Did you look at ulimits and THP?
Was Splunk working fine in the past? Did you change anything lately? If it was always slow, what are the specs for the indexers and search head: CPU (cores) and memory?
How many forwarders do you have sending data? How much data do you index every day on each indexer?
Hello,
This started 3 days back. We didn't change anything. We have 3 forwarders sending data.
Thanks,
Sahil
Hello,
It is for all searches, and the data is only in MBs. I changed the time period as well, but it is still very slow.
Can you please suggest something?
Thanks,
Sahil