Performance issue with large number of sources - even after cleaning the index

grundsch · ‎08-22-2013

Hi,

this is a tricky question about the internals of Splunk.

We had an issue with our installation, which is basically a single Splunk instance on a syslog server consuming the logs. Due to some garbage syslog input, the syslog daemon spat out tens of thousands of files with just one line each, and each file got indexed as an individual source.

After that, search performance was ridiculously slow.

We decided to delete the index and not reindex the old data, i.e. just start indexing fresh data with the garbage properly filtered out.

Still, search performance was very slow.

Finally, we deleted all indexes (incl. internal ones), and now everything is amazingly fast.

Does anybody have an explanation for why cleaning just the faulty index was not enough?

Thanks,

Steph

yannK · ‎08-22-2013

Question: did you have millions of hosts, or millions of sources?

Are you on an old version (3.x, 4.1.x, 4.2.x)? Those were sensitive to large amounts of metadata (stored in $SPLUNK_HOME/var/lib/splunk//db/.meta in old versions).
However, deleting the data and the global metadata should have helped.
The fishbucket keeps track of file sources, not of syslog TCP inputs.
Finally, you may have a large number of learned sourcetypes in your learned app. Please check and clean $SPLUNK_HOME/etc/apps/learned/local/
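
For anyone who wants a quick number before cleaning the learned app, here is a minimal sketch that counts the stanzas in a .conf file. It assumes the learned sourcetypes live in a props.conf under the directory above and that /opt/splunk is the fallback when SPLUNK_HOME is not set; `count_stanzas` is a hypothetical helper, not a Splunk tool.

```python
import os
from pathlib import Path


def count_stanzas(conf_text: str) -> int:
    """Count [stanza] header lines in .conf-style text."""
    count = 0
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Comment lines start with '#', so they never match here.
        if stripped.startswith("[") and stripped.endswith("]"):
            count += 1
    return count


if __name__ == "__main__":
    # Assumption: default install path when SPLUNK_HOME is unset.
    splunk_home = os.environ.get("SPLUNK_HOME", "/opt/splunk")
    props = Path(splunk_home) / "etc" / "apps" / "learned" / "local" / "props.conf"
    if props.is_file():
        text = props.read_text(errors="replace")
        print(f"{props}: {count_stanzas(text)} stanzas")
    else:
        print(f"{props} not found")
```

A count in the thousands would suggest the learned app is worth cleaning; a handful of stanzas probably rules it out.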

yannK · ‎08-22-2013

This is strange. Maybe, as kristian said, the number of files to tail is high and it is the tailing processor that is busy.

grundsch · ‎08-22-2013

Lots of sources. That could actually also have created lots of hosts, as the hosts were extracted from the file path...
Splunk version is the latest one, 5.x
syslog was written to files, and the files indexed, no direct TCP input
sourcetype was fixed (syslog)
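
To illustrate why garbage file paths can inflate the host count as well as the source count, here is a rough sketch. It assumes the host is taken from a fixed, 1-based path segment (similar in spirit to the `host_segment` setting in inputs.conf; the exact mechanism used on this system is not stated above). The paths and function names are made up for the example.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def host_from_path(source: str, segment: int) -> str:
    """Return the 1-based path segment of `source` to use as the host name."""
    parts = [p for p in PurePosixPath(source).parts if p != "/"]
    return parts[segment - 1]


def cardinality(sources: list[str], segment: int) -> tuple[int, int]:
    """Return (distinct sources, distinct hosts) for a list of file paths."""
    hosts = {host_from_path(s, segment) for s in sources}
    return len(set(sources)), len(hosts)


if __name__ == "__main__":
    # Hypothetical paths: two normal files plus two garbage ones.
    paths = [
        "/var/log/syslog/web01/messages.log",
        "/var/log/syslog/web01/auth.log",
        "/var/log/syslog/x7%f/garbage1.log",
        "/var/log/syslog/q!!2/garbage2.log",
    ]
    print(cardinality(paths, 4))
```

Every garbage directory name becomes a new host, so both metadata lists grow with the garbage.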

kristian_kolb · ‎08-22-2013

Hmm, if you [monitor] a directory with tens of thousands of files, it will surely tax the tailing processor (which is responsible for checking whether a file has been updated or not).

When you cleaned the data in your index, did you also clean out all those files from /var/log/syslog (or wherever they are created)?
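
A quick way to see how many files are still sitting in the monitored directory, and how many of them are one-liners, is a small walk over the tree. This is only a sketch: `scan_log_dir` is a hypothetical helper, and /var/log/syslog is just the example location mentioned above.

```python
from __future__ import annotations

import os
import sys


def scan_log_dir(root: str) -> tuple[int, int]:
    """Walk `root` and return (total files, files with at most one line)."""
    total = 0
    tiny = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += 1
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Two readline() calls are enough to tell
                    # "at most one line" from "more than one".
                    fh.readline()
                    second = fh.readline()
            except OSError:
                continue  # unreadable file: counted in total only
            if not second:
                tiny += 1
    return total, tiny


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"
    total, tiny = scan_log_dir(root)
    print(f"{root}: {total} files, {tiny} with at most one line")
```

If the total is still in the tens of thousands, the tailing processor has that many files to keep checking, whatever happened to the index.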

Also, the internal fishbucket index will keep track of all files it has seen, and it will not be deleted when you remove the source files or clean out the data index. That could also play a part.

The sheer amount of indexed bytes probably has little to do with it.

/K

grundsch · ‎08-22-2013

yeah, thought about it too late...

kristian_kolb · ‎08-22-2013

Well, it's hard to say now that you've cleaned it. I guess you didn't make a diag dump before you started cleaning?

But I agree with you, a large fishbucket could not really affect searching, could it? Unless it also degrades general performance in some way.

grundsch · ‎08-22-2013

Yes, we also cleaned out all the files to avoid reindexing. I was also thinking about the fishbucket, but had difficulty understanding how it would be related to a search...

Another possibility was that actually it was still the indexer that had a problem, and blocked some resources from the searches...
