I've been looking at the "Search Job Inspector" recently and noticing that command.search.typer is often showing up at the top of the list. It's not uncommon for it to be using nearly 50% (sometimes more) of the total command.search time. My searches are not performing unacceptably yet, but I anticipate the number of eventtypes growing as we add more and more sources (as will the search load), so I can't imagine this will magically improve. I would like to look at this now, before it becomes a bigger problem.
Based on general optimization principles, I'm starting with the following assumptions:
typer is to be expected. So reducing the total number of eventtypes should improve performance.
"user!=joe bytes>=1000" would be less efficient than an eventtype defined as "sourcetype=ftp UPLOAD OK")
If I'm missing something or have any of this wrong so far, please say so.
The answer to the Eventtypes' numbers limits question suggested that the total number of eventtypes should ideally be limited to a few hundred. However, I'm not sure that's very realistic. (The answer wasn't clear, but I'm thinking that a "few hundred" means somewhere between 200 and 400?)
I looked at my system and I currently have over 340 eventtypes defined that are shared across all apps. Of those, 111 of them come from the windows app. I have the eventtypes in the "unix" app set to application-only sharing, or that would add another 133 eventtypes globally. (I did this because the "unix" eventtypes generally seem to be too loosely defined and rather unhelpful. To be honest, the quality seems pretty poor. For example, as of Splunk 4.1.3, the Unix app contains 17 eventtypes (e.g. "df", "cpu", ...) that don't even have a "search" defined in the config file. They show up as "None" in the UI. Also, the eventtype tags are pretty inconsistent. So I chose to ignore them rather than try to deal with them.)
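For anyone wanting to run the same kind of audit, here is a minimal sketch that counts the stanzas in the text of an eventtypes.conf file and lists those with no "search" setting. It assumes the usual stanza layout ([name] followed by key = value lines) and ignores multi-line values; the function name audit_eventtypes and the sample stanzas are made up for illustration.

```python
def audit_eventtypes(conf_text):
    """Return (total_stanzas, names_without_search) for eventtypes.conf text.

    Assumption: one setting per line; continuation lines are not handled.
    """
    stanzas = {}      # stanza name -> set of setting keys seen under it
    current = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            stanzas.setdefault(current, set())
        elif current is not None and "=" in line:
            key = line.split("=", 1)[0].strip()
            stanzas[current].add(key)
    missing = sorted(name for name, keys in stanzas.items() if "search" not in keys)
    return len(stanzas), missing


# Hypothetical sample content, not copied from any real app.
sample = """
[ftp_upload]
search = sourcetype=ftp UPLOAD OK

[df]
disabled = 0

[cpu]
"""

total, missing = audit_eventtypes(sample)
print(total, missing)
```

Running it over each app's default and local eventtypes.conf gives a rough per-app count, though it says nothing about sharing permissions, which live in the app metadata.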
I have an app with nearly 100 app-level eventtypes. It's fairly self-contained, and it would be nice to "block" out the eventtypes of the other apps to improve performance within that app, but that's not possible as far as I know.
Again, it seems inevitable that the number of eventtypes will only grow as splunk usage increases. So other than doing some cleanup, it doesn't seem possible to reduce this dramatically.
This is where I would really like to focus my efforts. The problem is, I haven't come across any recommendations/suggestions/guidelines as to how to write more-efficient eventtypes, and I would really appreciate some input from the people who know this stuff.
Without a good place to start, I've done what I always do: Ask lots of questions!
If these can be answered directly, that would be great, but even starting with some general principles would be a great help. Even a never-do-this list would be helpful.
Here are some specific eventtype performance questions:
What's the impact of...
index=? (Old docs said you shouldn't do this, but newer docs say any search expression is fine. If I have a bunch of firewall events that only occur in index=firewall, will they be faster if I add that to the eventtype definition?)
field=value in an eventtype? Or is it better to use a literal string (like "EventCode=538") than using the field lookup (EventCode=538)? (Does using an eventtype with fields prevent the field extraction engine from automatically disabling extractors when Splunk detects that the fields being output are not needed by the search? I know some non-interactive searches try to disable extractors for efficiency when possible; can eventtypes get in the way of this?)
typer re-evaluate the raw events anyways?)
typer more quickly eliminate the derived eventtypes too? Or does it cause more work? Or is it more like a macro-expansion thing where the eventtype gets unrolled before it's evaluated, so it doesn't make much difference in performance in any case?
I'm guessing there are lots of corner cases here. An eventtype definition can go across tons of layers, which is what makes them so powerful, and I'm sure that also means they can be quite expensive at times too. So any hints would be appreciated, and some kind of "profiler tool" would be amazing (I'll even consider naming my firstborn after you).
Thanks in advance!
My understanding is:
I'm not aware of the full algorithm for what extractions are run in the case that a field is requested.
Generally speaking, none of this matters much unless the data quantity being processed is very large.
So if that's true, then what if that same named field is extracted by different extractions? It's not uncommon for two different sourcetypes to extract the same named field using different regexes. I'm guessing Splunk must maintain some sort of fieldname-to-extractor mapping (and, of course, the "$1::$2" extractions must really add some complexity).
Thank you. Another very helpful answer! Help me out with point #2. Am I understanding this correctly: if a given field is used in the definition of one eventtype, then there is no (or very minimal) additional cost incurred by using that same field in additional eventtypes. Is that right?
There are very few suggestions about general eventtype optimization:
There are two modes in which the splunk UI executes searches:
exploratory mode - searches in the flashtimeline view are run this way
optimized mode - searches in the Advanced Charting view are run this way (the scheduler runs searches in this mode too)
There is one neat trick to avoid the eventtyping even when running searches from the flashtimeline view: simply add "| fields - eventtype" to your search, for example:
"search * | fields - eventtype | stats count" - no eventtyping even in exploratory mode
Note that the number of eventtypes a search has to consider does not correlate linearly with the performance of eventtyping. The reason is that many eventtypes share terms, phrases, or field comparisons, which we evaluate only once.
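To illustrate why shared terms keep the cost from growing linearly, here is a toy model, not Splunk's actual typer implementation. It assumes each eventtype is just a list of terms that must all appear as substrings of the event text, and it caches each term's result per event so a term shared by several eventtypes is tested only once. The names type_events, eventtypes, and events are hypothetical.

```python
def type_events(events, eventtypes):
    """Return (matches_per_event, number_of_term_evaluations).

    Toy assumption: a term matches when it is a substring of the event.
    """
    evaluations = 0
    results = []
    for event in events:
        cache = {}        # term -> bool, reset for every event
        matched = []
        for name, terms in eventtypes.items():
            ok = True
            for term in terms:
                if term not in cache:
                    cache[term] = term in event
                    evaluations += 1   # count only real evaluations
                if not cache[term]:
                    ok = False
                    break              # stop at the first failing term
            if ok:
                matched.append(name)
        results.append(matched)
    return results, evaluations


eventtypes = {
    "ftp_upload": ["sourcetype=ftp", "UPLOAD", "OK"],
    "ftp_download": ["sourcetype=ftp", "DOWNLOAD", "OK"],
    "ftp_failed": ["sourcetype=ftp", "FAILED"],
}
events = ["sourcetype=ftp UPLOAD OK file=a.txt"]

results, evaluations = type_events(events, eventtypes)
print(results, evaluations)
```

The three definitions reference eight terms in total, but only five evaluations happen for this event, because "sourcetype=ftp" is tested once and failing eventtypes stop early.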
Also note that sometimes it's necessary to use the fields command to only include the fields you want to see, rather than trying to exclude the eventtype field. For more info see: http://answers.splunk.com/answers/95595/disabling-eventtypes-on-a-per-query-basis/96678
Interesting performance tip: For a reasonably optimized eventtype definition (in my test case it was just sourcetype=foo), you get nearly the same performance running eventtype=foo | fields - eventtype as sourcetype=foo | fields - eventtype, in exploratory mode. It makes sense: as a primary search term, the eventtype just expands like a macro; the post-processing "what eventtypes does this event match?" is the time sink.
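The "expands like a macro" idea can be pictured with a small sketch. This is only a mental model under the assumption that an eventtype=NAME term is textually swapped for its definition; it is not how Splunk's search parser is actually written, and expand_eventtypes and definitions are hypothetical names.

```python
import re


def expand_eventtypes(search, definitions):
    """Replace each eventtype=NAME term with its parenthesized definition."""
    def substitute(match):
        name = match.group(1)
        if name in definitions:
            return "(" + definitions[name] + ")"
        return match.group(0)   # leave unknown eventtypes untouched

    return re.sub(r"\beventtype=(\w+)", substitute, search)


definitions = {"foo": "sourcetype=foo"}
expanded = expand_eventtypes("eventtype=foo | fields - eventtype", definitions)
print(expanded)
```

After expansion the two searches from the tip are effectively the same string, which fits the observation that they perform almost identically once typer is out of the picture.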
For anyone following along at home: I can confirm that using "| fields - eventtype" does stop the evaluation of eventtypes. This didn't originally work for me because I had field extractions set up in props.conf based on eventtype (a feature I really loved, btw). However, that is no longer a supported configuration in 4.x (well, technically it works, but you get a bunch of warning messages), and it prevented typer from being disabled. Now that I've removed all my eventtype-based field extractions, this trick works. Thanks Ledion.
Thanks for your response; it was helpful. Can you confirm that using | fields - eventtype actually does disable the typer? I've tried this from both flashtimeline and charting and I still see command.search.typer on the search job inspector with the same amount of time associated with it. And search.log still contains:
INFO SearchParser - PARSING: typer | tags