
Performance issue with large number of sources - even after cleaning the index

Communicator

Hi,

this is a tricky question about the internals of Splunk.

We had an issue with our installation: basically a single Splunk instance on a syslog server consuming the logs. Due to some garbage syslog traffic, the syslog daemon spat out some 50,000 files with just one line each, and these got indexed as individual sources.

After that, search performance was ridiculously slow.

We decided to delete the index and not reindex the old data, i.e. just start indexing fresh data with the garbage properly filtered out.

Still, search performance was very slow.

Finally, we deleted all indexes (incl. internal ones), and now everything is amazingly fast.
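
For context, by "delete the index" and "delete all indexes" I mean the CLI clean commands, roughly along these lines (the index name is just an example, and splunk clean requires the instance to be stopped):

    # stop Splunk before cleaning
    $SPLUNK_HOME/bin/splunk stop
    # first attempt: wipe only the faulty index (example name "syslog")
    $SPLUNK_HOME/bin/splunk clean eventdata -index syslog
    # what finally made the difference: wipe everything, including the internal indexes
    $SPLUNK_HOME/bin/splunk clean all
    $SPLUNK_HOME/bin/splunk start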

Does anybody have an explanation for why cleaning just the faulty index was not enough?

Thanks,

Steph


Re: Performance issue with large number of sources - even after cleaning the index

Ultra Champion

Hmm, if you monitor a directory with tens of thousands of files, it will surely tax the tailing processor (which is responsible for checking whether each file has been updated or not).
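
If you want to see what the tailing processor is busy with, there is a REST endpoint for its file status; something like this against the management port should list the files it is tracking (host, port and credentials are just placeholders):

    # show the tailing processor's view of the monitored files
    curl -k -u admin:changeme https://localhost:8089/services/admin/inputstatus/TailingProcessor:FileStatus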

When you cleaned the data in your index, did you also clean out all those files from /var/log/syslog (or wherever they were created)?

Also, the internal fishbucket index keeps track of all files Splunk has seen, and it is not deleted when you remove the source files or clean out the data index. That could also play a part.
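
You can peek into the fishbucket with the btprobe tool, or clean it out entirely; roughly like this (the file path is just an example, and I would try this on a test instance first):

    # look up what the fishbucket has recorded about one of the garbage files
    $SPLUNK_HOME/bin/splunk cmd btprobe -d $SPLUNK_HOME/var/lib/splunk/fishbucket/splunk_private_db --file /var/log/garbage/somefile.log
    # or wipe the whole fishbucket (anything still on disk will be re-indexed)
    $SPLUNK_HOME/bin/splunk clean eventdata -index _thefishbucket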

The sheer amount of indexed bytes probably has little to do with it.

/K


Re: Performance issue with large number of sources - even after cleaning the index

Communicator

Yes, we also cleaned out all the files to avoid reindexing. I was thinking about the fishbucket as well, but had difficulty understanding how it would be related to search...

Another possibility is that the indexer itself still had a problem and was blocking some resources from the searches...
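
Something along these lines in the search bar should show how busy the individual indexing pipeline processors are, if anyone wants to check that theory on a similar setup (span and time range are arbitrary):

    index=_internal source=*metrics.log* group=pipeline
    | timechart span=5m sum(cpu_seconds) by processor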


Re: Performance issue with large number of sources - even after cleaning the index

Ultra Champion

Well, it's hard to say now that you've cleaned it. I guess you didn't make a diag dump before you started cleaning?

But I agree with you, a large fishbucket could not really affect searching, could it? Unless it also degrades general performance in some way.
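
For reference, a diag can be created from the CLI before touching anything; it bundles up logs and configuration (not the indexed data itself):

    $SPLUNK_HOME/bin/splunk diag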


Re: Performance issue with large number of sources - even after cleaning the index

Communicator

yeah, thought about it too late...


Re: Performance issue with large number of sources - even after cleaning the index

Splunk Employee

Question: did you have millions of hosts, or millions of sources?

  • Are you on an old version (3.*, 4.1.*, 4.2.*)? Those were sensitive to a large amount of metadata, stored in $SPLUNK_HOME/var/lib/splunk/*/db/*.meta in old versions. However, deleting the data and the global metadata should have helped.

  • The fishbucket keeps track of file sources, not of syslog TCP inputs.

  • Finally, you may have a large number of learned sourcetypes in your learned app; please check and clean $SPLUNK_HOME/etc/apps/learned/local/ (see the quick checks below).
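
A couple of quick checks for those points (the index name is just an example; the first two lines go in the search bar, the last one on the indexer's shell):

    | metadata type=sources index=main | stats count
    | metadata type=hosts index=main | stats count

    # learned sourcetypes end up here
    ls $SPLUNK_HOME/etc/apps/learned/local/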


Re: Performance issue with large number of sources - even after cleaning the index

Communicator
  • Lots of sources; it could actually also have produced lots of hosts, as the host was extracted from the file path (see the inputs.conf sketch below)...
  • The Splunk version is the latest one, 5.x
  • syslog was written to files and the files were indexed; no direct TCP input
  • the sourcetype was fixed (syslog)
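
Host-from-path extraction like that is typically configured in inputs.conf with host_segment (or host_regex); a sketch of what such a stanza could look like, with made-up paths and segment number rather than our actual config:

    [monitor:///var/log/syslog-hosts]
    sourcetype = syslog
    index = syslog
    # take the host name from the 4th path segment, e.g. /var/log/syslog-hosts/<host>/messages
    host_segment = 4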

Re: Performance issue with large number of sources - even after cleaning the index

Splunk Employee

This is strange. Maybe, as Kristian said, the number of files to tail is high and it is the tailing processor that is busy.
