Splunk Search

Why am I seeing poor search performance for simple searches and high CPU on one core?

Communicator

Ad-hoc queries on our Splunk deployment have been slow for the past few months.

1- We upgraded our server from 2 cores to 12 cores (single server).
2- We upgraded from Splunk 5 to 6 (in-place upgrade, not a fresh install).

The system now has 16 GB of RAM and the disk is 84% full.
I have followed the monitoring advice:
http://wiki.splunk.com/Community:PerformanceTroubleshooting

There is no I/O bottleneck: while queries run, iotop shows only sporadic spikes of activity.
While a search is running, one process sits at 100% CPU the whole time.

To check CPU usage, I queried the internal index:

index=_internal source=*metrics.log group=pipeline | timechart sum(cpu_seconds) by name

The index pipeline spikes at 5.455 on rare occasions; all searches are below 1, whatever that means. The link above mentions abnormal usage when the index value is over 30.
Memory was at a constant 81% usage on the box. After restarting Splunk it dropped to 15%, but performance remained the same.
To test, I created a brand-new index and ingested 745 log4j events.
There is no data model (I later set up a data model and accessed it through Pivot, but it was slow too).
Basic default settings.

I performed a very simple search:
index="mytestindex" | head 1

I ran the search from the command line; it takes 2 minutes 14 seconds to return.

Other indexes on the same box, totaling 360 M events (a few GB of data), are slow too.

Bottom line: every query is constantly slow.
The job inspector says 96% of the time (120 s) was spent in Dispatch.evaluate.search;
all other categories are below 1 second, most under 0.5 seconds.

1 Solution

Splunk Employee

I just found the problem...

The issue was a bloated props.conf: the one at $SPLUNK_HOME/etc/apps/learned/local/ contained 50,000+ sourcetypes.

It was caused by having the source type set to automatic on one of our inputs. For each CSV file that was indexed, a new source type was created, leading to a bloated props.conf. I also removed unused entries in transforms.conf.
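A quick way to check whether you are hitting this is to count the stanzas in the learned app's props.conf. The sketch below is self-contained: a temporary directory and fabricated stanza names stand in for a real $SPLUNK_HOME, since every sourcetype in props.conf is one [stanza] header.

```shell
# Hypothetical stand-in for $SPLUNK_HOME/etc/apps/learned/local
LEARNED="$(mktemp -d)"
# Simulate a props.conf where each auto-learned CSV got its own stanza
printf '[csv-1]\nSHOULD_LINEMERGE=false\n[csv-2]\nSHOULD_LINEMERGE=false\n[csv-3]\nSHOULD_LINEMERGE=false\n' \
  > "$LEARNED/props.conf"
# Each sourcetype is a [stanza] header; tens of thousands here means the bug
STANZAS=$(grep -c '^\[' "$LEARNED/props.conf")
echo "learned sourcetypes: $STANZAS"
```

On a real install, point the grep at $SPLUNK_HOME/etc/apps/learned/local/props.conf instead of the temporary directory.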

So it looks like the job-parsing phase reads props.conf/transforms.conf before the search starts.

That explains why the "Parsing Job..." message would stay up for 2 minutes before the search itself completed in 1 second.


New Member

Hello,

We are facing the same issue now. Could you advise how to resolve it? Thanks.


Splunk Employee

The problem is caused by an input whose sourcetype is set to automatic, most likely a file or directory input.

http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Bypassautomaticsourcetypeassignment

However, in 6.2+ I am not seeing a way to set the sourcetype to automatic when setting up new data inputs. The best thing I can suggest is to do two things:

  1. Run $SPLUNK_HOME/bin/splunk cmd btool inputs list --debug to display all inputs, try to identify erroneous data sources, and disable or modify them where necessary.
  2. Move $SPLUNK_HOME/etc/apps/learned/local/props.conf and $SPLUNK_HOME/etc/apps/learned/local/sourcetypes.conf to a temp directory and restart Splunk.

This was successful for me today. Hope this helps - Good Luck!
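Step 2 above can be sketched as follows. This is a self-contained illustration, not the exact procedure: a temporary directory stands in for a real $SPLUNK_HOME, and the restart is left as a comment since it requires a live install.

```shell
# Hypothetical layout standing in for a real Splunk install
SPLUNK_HOME="$(mktemp -d)"
LEARNED="$SPLUNK_HOME/etc/apps/learned/local"
BACKUP="$(mktemp -d)"
mkdir -p "$LEARNED"
# Simulate the two auto-learned configuration files
printf '[auto-1]\n' > "$LEARNED/props.conf"
printf '[auto-1]\n' > "$LEARNED/sourcetypes.conf"
# Quarantine the bloated files rather than deleting them outright,
# so they can be restored if anything else depended on them
mv "$LEARNED/props.conf" "$LEARNED/sourcetypes.conf" "$BACKUP/"
# Then restart so Splunk re-reads its configuration:
# "$SPLUNK_HOME/bin/splunk" restart
```

Moving rather than deleting the files is the safer choice: if search behavior changes unexpectedly, the originals can be moved back before the next restart.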