performance considerations for eventtypes?

Super Champion

I've been looking at the "Search Job Inspector" recently and noticing that command.search.typer often shows up at the top of the list. It's not uncommon for it to use nearly 50% (sometimes more) of the total command.search time. My searches are not performing unacceptably yet, but I anticipate the number of eventtypes growing as we add more and more sources (as will the search load), so I can't imagine this will magically improve. I would like to look at this now, before it becomes a bigger problem.

Based on general optimization principles, I'm starting with the following assumptions:

  1. The more eventtypes defined, the more effort required to match events with eventtypes, and therefore a longer execution of typer is to be expected. So reducing the total number of eventtypes should improve performance.
  2. Poorly defined eventtypes will be more expensive than a well-defined eventtype. (For example, I'm assuming that an eventtype defined by the search "user!=joe bytes>=1000" would be less efficient than an eventtype defined as "sourcetype=ftp UPLOAD OK")

If I'm missing something or have any of this wrong so far, please say so.

#1: Reduce number of eventtypes:

The answer to the "Eventtypes' numbers limits" question suggested that the total number of eventtypes should ideally be limited to a few hundred. However, I'm not sure that's very realistic. (The answer wasn't clear, but I'm thinking that a "few hundred" means somewhere between 200 and 400?)

I looked at my system and I currently have over 340 eventtypes defined that are shared across all apps. Of those, 111 of them come from the windows app. I have the eventtypes in the "unix" app set to application-only sharing, or that would add another 133 eventtypes globally. (I did this because the "unix" eventtypes generally seem to be too loosely defined and rather unhelpful. To be honest, the quality seems pretty poor. For example, as of Splunk 4.1.3, the Unix app contains 17 eventtypes (e.g. "df", "cpu", ...) that don't even have a "search" defined in the config file. They show up as "None" in the UI. Also, the eventtype tags are pretty inconsistent. So I chose to ignore them rather than try to deal with them.)

I have an app with nearly 100 app-level eventtypes. It's fairly self-contained, and it would be nice to "block" out the eventtypes of the other apps to improve performance within that app, but that's not possible as far as I know.

Again, it seems inevitable that the number of eventtypes will only grow as splunk usage increases. So other than doing some cleanup, it doesn't seem possible to reduce this dramatically.

#2 Optimize eventtype definitions:

This is where I would really like to focus my efforts. The problem is, I haven't come across any recommendations/suggestions/guidelines as to how to write more-efficient eventtypes, and I would really appreciate some input from the people who know this stuff.

Without a good place to start, I've done what I always do: Ask lots of questions!

If these can be answered directly, that would be great, but even starting with some general principles would be a great help. Even a never-do-this list would be helpful.

Here are some specific eventtype performance questions:

What's the impact of...

  • Using the core indexed fields (source/sourcetype/host)? It seems eventtypes based on sourcetype can be included/excluded faster than eventtypes based on simple search terms, is that true?
  • Using index=? (Old docs said you shouldn't do this, but newer docs say any search expression is fine. If I have a bunch of firewall events that only occur in index=firewall will they be faster if I add that to the eventtype definition?)
  • Using splunk_server=?
  • Using field=value in an eventtype? Or is it better to use a literal string (like "EventCode=538") than the field lookup (EventCode=538)? (Does using an eventtype with fields prevent the field extraction engine from automatically disabling extractors when Splunk detects that the fields being output are not needed by the search? I know some non-interactive searches try to disable extractors for efficiency when possible; can eventtypes get in the way of this?)
  • Using lookup fields? (Example: where an automatic extraction is based on a sourcetype, and that sourcetype is included in the eventtype definition)
  • Using source/host/sourcetype tags as part of an eventtype's criteria?
  • Using indexed fields vs extracted fields? (indexed fields like "punct")
  • Using quoted strings. (Can indexed terms alone be matched faster than a quoted expression? Is there any concept of segmentation here, or does typer re-evaluate the raw events anyways?)
  • Using wildcards (e.g. term*)
  • Nested eventtypes. Say you have a "base" eventtype that is used in the definition of several other eventtypes (essentially creating a simple way to extend the "base" eventtype to cover a more specific scenario). If the base eventtype doesn't match, can typer more quickly eliminate the derived eventtypes too? Or does it cause more work? Or is it more like a macro expansion, where the eventtype gets unrolled before it's evaluated, so it doesn't make much difference in performance in any case?

I'm guessing there are lots of corner cases here. An eventtype definition can go across tons of layers, which is what makes them so powerful, and I'm sure that also means they can be quite expensive at times too. So any hints would be appreciated, and some kind of "profiler tool" would be amazing (I'll even consider naming my first born after you.)

Thanks in advance!

Splunk Employee

My understanding is:

  • the cost of eventtypes is not linear, but something more like logarithmic, because we use techniques to fold together the work for the possible eventtypes
  • The fields you base your eventtypes on are only important inasmuch as they affect whether splunk has to determine those fields at all. Thus having zero eventtypes matching against extracted fields might save you from having to ever extract, but if a single eventtype needs each extracted field, there is no additional cost of more eventtypes on extracted fields.
  • index should work in an eventtype. Index used to be handled in a very special way and the results of a given combination were hard to predict. index is now handled much more like other terms in a search expression
  • "field=value" vs field=value is general to all searches, including eventtypes. "field=value" could be a win if the "field" string is rare in your events. It could also be a win if somehow this allows AutoKV to not need to run, but that's very hard to arrange for/predict.
  • Using lookup fields may force the lookup to be loaded.
  • tags are not intrinsically expensive at all, they're just mapped (once per search) to a list of OR expressions. If the OR list is very very large, there is some cost.
  • The raw bucket data is not available to the event typer, thus indexed vs nonindexed fields are almost entirely out of scope.
  • I didn't even know we supported nested eventtypes. The implementation author probably could say more.
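To picture the point about tags being mapped once per search to a list of OR expressions, here is a minimal sketch. It is a toy model, not Splunk's implementation: the `TAGS` table, the `expand_tags` helper, and the sourcetype/host names are all hypothetical.

```python
# Hypothetical tag table: tag name -> field=value pairs that carry the tag.
TAGS = {
    "firewall": ["sourcetype=fw_a", "sourcetype=fw_b", "host=fw01"],
}

def expand_tags(search, tags):
    """Rewrite each tag=<name> term into a parenthesized OR list (toy model)."""
    expanded = []
    for term in search.split():
        values = tags.get(term[len("tag="):]) if term.startswith("tag=") else None
        if values:
            expanded.append("(" + " OR ".join(values) + ")")
        else:
            # Not a tag term, or an unknown tag: leave the term untouched.
            expanded.append(term)
    return " ".join(expanded)

print(expand_tags("tag=firewall action=blocked", TAGS))
```

In this model the cost of a tag grows only with the length of its OR list, which matches the remark that tags are cheap unless that list is very large.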

Splunk Employee

I'm not aware of the full algorithm for what extractions are run in the case that a field is requested.
Generally speaking, none of this matters much unless the data quantity being processed is very large.

Splunk Employee

re: your first comment. yes.

Super Champion

So if that's true, then what if that same named field is extracted by different extractions? It's not uncommon for two different sourcetypes to extract the same named field, but using different regexes. I'm guessing Splunk must maintain some sort of fieldname-to-extractor mapping (which, of course, the "$1::$2" extractions must really complicate).

Super Champion

Thank you. Another very helpful answer! Help me out with point #2; am I understanding this correctly? If a given field is used in the definition of one eventtype, then there is no (or very minimal) additional cost incurred by using that same field in additional eventtypes. Is that right?

Splunk Employee

There are very few suggestions about general eventtype optimization:

  • use app scoping to limit the number of eventtypes a search has to consider
  • eventtypes using just terms/phrases/wildcarded terms are sometimes computationally cheaper than eventtypes with fields in them
  • since eventtyping is done at search time, all fields (indexed, search time, looked up) are treated the same

There are two modes in which the splunk UI executes searches:

  1. exploratory mode - searches in the flashtimeline view are run this way

    • in this mode all fields, including eventtype, are required. This enables users to view the field picker, field summary, etc.
  2. optimized mode - searches in the Advanced Charting view are run this way (the scheduler runs searches in this mode too)

    • in this mode we analyze the search to determine the set of required fields, and in most cases the eventtype field is not required (unless, of course, the search uses eventtype), so no eventtyping is done

There is one neat trick to avoid the eventtyping even when running searches from the flashtimeline view: simply add "| fields - eventtype" to your search, for example:

"search * | fields - eventtype | stats count"  - no eventtyping even in exploratory mode
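The two modes and the "fields - eventtype" trick can be thought of as a question about required fields. The sketch below is a deliberately simplified toy model of that decision, not how Splunk analyzes a search; the `needs_typer` function and its string-splitting rules are assumptions made purely for illustration.

```python
def needs_typer(search, exploratory):
    """Toy model: would the eventtype field have to be computed for this search?"""
    stages = [stage.strip() for stage in search.split("|")]
    # Skip the first stage: per this thread, eventtype=... used as a primary
    # search term is expanded up front and does not itself need the typer.
    for stage in stages[1:]:
        tokens = stage.replace(",", " ").split()
        if tokens[:2] == ["fields", "-"] and "eventtype" in tokens[2:]:
            return False  # eventtype explicitly dropped before anything used it
        if "eventtype" in tokens:
            return True   # a later command (e.g. stats ... by eventtype) needs it
    # No explicit mention: exploratory mode requires all fields, optimized does not.
    return exploratory

print(needs_typer("search * | fields - eventtype | stats count", exploratory=True))
print(needs_typer("search * | stats count", exploratory=True))
```

Under this model the first search skips eventtyping even in exploratory mode, while the second does not, which is the behavior described above.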

Note that the number of eventtypes a search has to consider will not correlate linearly with the performance of eventtyping. The reason is that many eventtypes share terms, phrases, or field comparisons, which we evaluate only once.
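A rough way to see why shared terms keep the cost from growing linearly: compare a matcher that re-evaluates every term for every eventtype with one that caches each term's result per event. This is only a sketch under stated assumptions; the eventtype definitions, the `holds` predicate, and both matcher functions are hypothetical and say nothing about the real typer internals.

```python
# Hypothetical eventtypes: each matches when all of its terms hold for the event.
EVENTTYPES = {
    "ftp_upload": ["sourcetype=ftp", "UPLOAD", "OK"],
    "ftp_download": ["sourcetype=ftp", "DOWNLOAD", "OK"],
    "ftp_failed": ["sourcetype=ftp", "FAILED"],
}

def holds(term, event):
    """Evaluate one term: field=value against fields, bare words against _raw."""
    if "=" in term:
        field, value = term.split("=", 1)
        return event.get(field) == value
    return term in event["_raw"].split()

def typer_naive(event):
    """Evaluate every eventtype independently; returns (matches, evaluations)."""
    evaluations, matched = 0, []
    for name, terms in EVENTTYPES.items():
        ok = True
        for term in terms:
            evaluations += 1
            if not holds(term, event):
                ok = False
                break
        if ok:
            matched.append(name)
    return matched, evaluations

def typer_shared(event):
    """Same matching, but each distinct term is evaluated at most once per event."""
    cache, evaluations = {}, 0

    def check(term):
        nonlocal evaluations
        if term not in cache:
            evaluations += 1
            cache[term] = holds(term, event)
        return cache[term]

    matched = [n for n, terms in EVENTTYPES.items() if all(check(t) for t in terms)]
    return matched, evaluations

event = {"sourcetype": "ftp", "_raw": "UPLOAD OK file.txt"}
print(typer_naive(event))   # (['ftp_upload'], 7)
print(typer_shared(event))  # (['ftp_upload'], 5)
```

Both versions return the same matches, but the shared version evaluates "sourcetype=ftp" once instead of three times. The more terms eventtypes have in common, the larger that gap becomes.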

Super Champion

Also note that sometimes it's necessary to use the fields command to only include the fields you want to see, rather than trying to exclude the eventtype field. For more info see: http://answers.splunk.com/answers/95595/disabling-eventtypes-on-a-per-query-basis/96678

Path Finder

Interesting performance tip: For a reasonably optimized eventtype definition (in my test case it was just sourcetype=foo), you get nearly the same performance running eventtype=foo | fields - eventtype as sourcetype=foo | fields - eventtype, in exploratory mode. It makes sense -- as a primary search term, the eventtype just expands like a macro; the post-processing "what eventtypes does this event match?" is the time sink.
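As a sketch of the "expands like a macro" idea, the snippet below substitutes an eventtype's defining search for an eventtype=<name> term before anything is searched. The `EVENTTYPE_SEARCHES` table and the `expand_eventtypes` helper are hypothetical, and the single-pass regex substitution is an assumption for illustration, not Splunk's actual parser.

```python
import re

# Hypothetical definitions, in the spirit of an eventtype's "search" setting.
EVENTTYPE_SEARCHES = {
    "foo": "sourcetype=foo",
    "ftp_upload": "sourcetype=ftp UPLOAD OK",
}

def expand_eventtypes(search, definitions):
    """Replace each eventtype=<name> term with its parenthesized definition."""
    def substitute(match):
        name = match.group(1)
        if name in definitions:
            return "(" + definitions[name] + ")"
        return match.group(0)  # unknown eventtype: leave the term as written
    return re.sub(r"\beventtype=(\w+)", substitute, search)

print(expand_eventtypes("eventtype=foo | fields - eventtype", EVENTTYPE_SEARCHES))
# (sourcetype=foo) | fields - eventtype
```

After expansion the search is no different from one written with sourcetype=foo directly, which fits the observation that the two perform nearly the same once the post-processing typer step is removed.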

Communicator

For me, I also had to exclude the tags on eventtypes that come with the windows app:
|fields - eventtype,tag::eventtype|

Super Champion

For anyone following along at home: I can confirm that using "| fields - eventtype" does stop the evaluation of eventtypes. This didn't originally work for me because I had field extractions set up in props.conf based on eventtype (a feature I really loved, btw). However, that is no longer a supported configuration in 4.x (well, technically it works, but you get a bunch of warning messages), and it prevented typer from being disabled. Now that I've removed all my eventtype-based field extractions, this trick works. Thanks Ledion.

Super Champion

Thanks for your response; it was helpful. Can you confirm that using | fields - eventtype actually does disable the typer? I've tried this from both flashtimeline and charting, and I still see command.search.typer in the search job inspector with the same amount of time associated with it. And search.log still contains: INFO SearchParser - PARSING: typer | tags

Motivator

i liked the question a lot, and i've poked some people to see if they can help. 🙂

Super Champion

Thanks. But I'm not sure if that means that you liked the question, or if you can help get me some answers. 😉

Motivator

dude, you are amazing. thank you!
