We have an app that writes one file per transaction, and unfortunately part of the useful information is in the filename itself.
But Sources.data has grown so large that it is causing performance issues, because it is constantly rewritten. It is currently 2.3 GB.
What's the most effective way to essentially disable Sources.data? I have no use for it on this index.
It has been possible to disable the global metadata since 4.3.3.
In Splunk 5.0 the global metadata is deprecated.
To disable the global metadata and restore full-speed indexing:
Add option to indexes.conf that disables global metadata generation to handle deployments with rapidly growing sources.data file. (SPL-47689)
The side effect is that the default summary page will no longer display values for the "last source/sourcetype/host".
To apply it, edit indexes.conf and add:
disableGlobalMetadata = true
see http://docs.splunk.com/Documentation/Splunk/4.3.4/Admin/Indexesconf
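For context, here is a minimal sketch of where that line might sit in indexes.conf. The index name `transactions` is hypothetical, and placing the setting in a per-index stanza is an assumption on my part; check the indexes.conf spec linked above to confirm the scope for your version.

```
# indexes.conf (sketch; the stanza name "transactions" is a hypothetical index)
[transactions]
# Assumption: the setting is honored per index. Available from 4.3.3.
disableGlobalMetadata = true
```

A restart of splunkd is normally needed for indexes.conf changes to take effect.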
The eventual answer was to do a number of things...
This effectively short circuits Sources.meta. It would be nice if there was a setting in props.conf to do all of this for you. Basically, a setting that says "I still want to be able to search this field, but I don't care if it works for the metadata command."
You can't. The Sources.data file is critical to Splunk's operation - part of its use is to help Splunk at search time to select buckets that may be relevant to a search. (If you did a search on source="/var/log/httpd/access*" then Splunk could quickly determine if a particular index bucket had any matches at all by just scanning the sources list.)
A better solution might be to rewrite the value of "source" using props.conf and transforms.conf. You would then want to pull out the "useful information" from the filename and put it into an otherwise indexed field. Basically, making the source somewhat less specific but still making the useful information available to you.
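As a rough sketch of that approach, the fragments below first copy part of the filename into an indexed field and then trim the source down to its directory. Everything specific here is an assumption for illustration: the path /data/txn, the txn_<id>.log naming pattern, the field name txn_id, and the stanza names are all hypothetical. The regex only captures IDs without spaces, so filenames containing spaces would need different handling.

```
# props.conf (hypothetical path; adjust to the real monitored files)
[source::/data/txn/*.log]
# Order matters: extract the ID first, then generalize the source.
TRANSFORMS-txn = txn_extract_id, txn_generalize_source

# transforms.conf
[txn_extract_id]
# Read the original source value, which is prefixed with "source::"
SOURCE_KEY = MetaData:Source
REGEX = /txn_(\w+)\.log$
# Write the captured ID as an index-time field
FORMAT = txn_id::$1
WRITE_META = true

[txn_generalize_source]
SOURCE_KEY = MetaData:Source
# Keep everything up to the last path separator, drop the filename
REGEX = ^source::(.+)/[^/]+$
DEST_KEY = MetaData:Source
FORMAT = source::$1

# fields.conf
[txn_id]
# Tell search that txn_id is an indexed field
INDEXED = true
```

With something like this in place, every event from that directory shares one source value, so the list of distinct sources stops growing per transaction, while a search such as txn_id=12345 can still find a single transaction.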
I was under the impression those metadata files are only used by the |metadata commands, and the actual field values needed for search are in the tsidx files.
I was going down the route of copying source into orig_source and then trying to use FIELDALIAS to make it available as source, but the filenames have spaces in them, and I can't for the life of me make an indexed field work properly with spaces in the value.