Hi,
I've got ~15,000 events where FieldA exists (out of 20,000,000 events in total). I want to filter for these events and I'm wondering about the performance of the different approaches.
Why is
sourcetype=X AND FieldA=*
so slow compared to this
sourcetype=X AND FieldA
BR
Heinz
In short, the first search takes much longer because it has to search through far more data, even though the results are the same. Splunk reads the searches as follows:
sourcetype AND X AND FieldA AND *
sourcetype AND X AND FieldA
As for pulling thousands of events out of an index of millions, I have found that creating a summary index of the _time and _raw data of the events I want to keep makes life a lot easier. Depending on your situation, you may find that accelerated searches work better than summary searches.
Thanks a lot, now I've got it.
The first search doesn't even look for "FieldA", so the first "translated" search there should read:
sourcetype AND X AND *
This means Splunk won't look for the field name until after it has found all the values, at which point it checks whether a matched value belongs to a field called "FieldA". This is well worth remembering when constructing searches: the more you can narrow them down up front, the better.
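To make the difference concrete, here is a toy model in Python. It is only a sketch of the idea and not how Splunk is actually implemented: the tokenizer, the `build_index` and `candidates` helpers, and the four sample events are all made up for illustration. It assumes the literal text "FieldA" appears in the raw event, so that a bare term can be looked up in the index, while a lone `*` matches everything and narrows nothing.

```python
from collections import defaultdict

def build_index(events):
    """Map each lowercase token to the set of event ids that contain it."""
    index = defaultdict(set)
    for event_id, raw in enumerate(events):
        # Hypothetical tokenizer: split on whitespace and "=".
        for token in raw.replace("=", " ").split():
            index[token.lower()].add(event_id)
    return index

def candidates(index, terms, total):
    """Return the event ids that still have to be read from disk."""
    result = set(range(total))
    for term in terms:
        if term == "*":
            continue  # a bare wildcard matches every event, so it narrows nothing
        result &= index.get(term.lower(), set())
    return result

events = [
    "sourcetype=X FieldA=foo",
    "sourcetype=X other=bar",
    "sourcetype=X other=baz",
    "sourcetype=Y FieldA=qux",
]
index = build_index(events)

# sourcetype AND X AND FieldA: the index narrows the candidates first
with_term = candidates(index, ["sourcetype", "X", "FieldA"], len(events))

# sourcetype AND X AND *: every sourcetype=X event must be read and checked
wildcard_only = candidates(index, ["sourcetype", "X", "*"], len(events))

print(len(with_term), len(wildcard_only))
```

In this model the wildcard version leaves every sourcetype=X event as a candidate, and the check for a field called FieldA can only happen after those events have been read. The bare term cuts the candidate set down before any event is read.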