<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do optimizations for field-based searches work? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153907#M43216</link>
    <description>&lt;P&gt;I don't know if we support acquiring source, sourcetype, and host  via  autoKV.  I think effectively we don't because it will always be superseded by the built-in values, and thus I expect we construct the search string around this assumption.&lt;/P&gt;

&lt;P&gt;As for the behavior of source, sourcetype, and host fields and searching on them, that's a bit out of scope for a comment, but they are potentially more efficient than most fields or keywords.&lt;/P&gt;</description>
    <pubDate>Mon, 03 Nov 2014 18:48:00 GMT</pubDate>
    <dc:creator>jrodman</dc:creator>
    <dc:date>2014-11-03T18:48:00Z</dc:date>
    <item>
      <title>How do optimizations for field-based searches work?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153904#M43213</link>
      <description>&lt;P&gt;When I search using key-value pairs as terms, what kind of optimizations does Splunk perform to retrieve the events that match my terms in the smallest amount of time?&lt;/P&gt;</description>
      <pubDate>Tue, 07 Oct 2014 05:49:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153904#M43213</guid>
      <dc:creator>hexx</dc:creator>
      <dc:date>2014-10-07T05:49:59Z</dc:date>
    </item>
    <item>
      <title>Re: How do optimizations for field-based searches work?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153905#M43214</link>
      <description>&lt;P&gt;For a given search, such as &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;gt; index=myindex myfield=yak
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Splunk does not normally retrieve every event from myindex, run&lt;BR /&gt;
field extraction on each one, and only then check whether myfield&lt;BR /&gt;
exists with the value yak. That would be far too slow for typical&lt;BR /&gt;
use.&lt;/P&gt;

&lt;P&gt;Instead, Splunk uses search-time configuration information to&lt;BR /&gt;
determine what possible origins could exist for the field, and&lt;BR /&gt;
generates a much more constrained search that returns only events&lt;BR /&gt;
that could possibly generate a field called myfield with the value&lt;BR /&gt;
"yak".&lt;/P&gt;

&lt;P&gt;In the simplest case, there are no lookups, extractions, calculated&lt;BR /&gt;
fields, etc. that can ever produce the field called myfield.  In&lt;BR /&gt;
this case the search effectively becomes:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;gt; index=myindex yak myfield=yak
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Splunk assumes that if the field is going to come from something like&lt;BR /&gt;
autoKV, the text of the field's value will be present in the keyword&lt;BR /&gt;
index. Thus, via index traversal, Splunk can return only the events that&lt;BR /&gt;
are referenced by this keyword.  Later, after autoKV has done its work&lt;BR /&gt;
on the returned events, we can test whether myfield has come to exist&lt;BR /&gt;
on those events and, if so, whether it contains the value "yak".&lt;/P&gt;

&lt;P&gt;Thus an event such as&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TIMESTAMP yak yak yak myfield=ox
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;will be returned by the index, autoKV'ed and then filtered out,&lt;BR /&gt;
because myfield will be ox.&lt;/P&gt;

&lt;P&gt;However, an event such as&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TIMESTAMP cow cow cow myfield=ox
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;will not be returned from the index at all, and we can skip all the&lt;BR /&gt;
work of autoKV and other implicit steps prior to postfiltering.&lt;/P&gt;

&lt;P&gt;This becomes more complicated when additional potential sources of the&lt;BR /&gt;
field may exist.  For example, you may have a regex-based extraction such as:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;props.conf:
REPORT-myfield = myfield-extractor

transforms.conf:
[myfield-extractor]
REGEX = chicken chicken chicken (\w+)\=
FORMAT = $1::yak
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Here we have hardcoded the value yak in the transform, so it need&lt;BR /&gt;
not appear in the event text. Consider:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TIMESTAMP chicken chicken chicken myfield=goats
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This event would produce a field called myfield with the value yak.  The&lt;BR /&gt;
default optimization won't work, because 'yak' may not be in the&lt;BR /&gt;
event. In this case you must give Splunk a hint in fields.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[myfield]
INDEXED_VALUE = false
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Now Splunk knows it cannot make this optimization, and must simply&lt;BR /&gt;
retrieve all the events for this index and test each one for the&lt;BR /&gt;
field and its value.&lt;/P&gt;
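
&lt;P&gt;If you know a literal term that always appears in the events the&lt;BR /&gt;
transform matches, you can still narrow the search yourself. A minimal&lt;BR /&gt;
sketch, assuming every event of interest contains the word chicken (as&lt;BR /&gt;
the REGEX above requires):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;gt; index=myindex chicken myfield=yak
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The keyword chicken can be resolved through the index, so only those&lt;BR /&gt;
events are fetched and tested for myfield=yak. Note that this would also&lt;BR /&gt;
exclude any event that gets myfield=yak some other way without containing&lt;BR /&gt;
chicken.&lt;/P&gt;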

&lt;P&gt;Of course, there are many other ways a field can come to exist.&lt;BR /&gt;&lt;BR /&gt;
Take for example a lookup such as country_animals.csv:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Country, Animal
USA, chicken
France, rooster
Nepal, yak
Bhutan, yak
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Now, this might be configured to be used for your sourcetype, like so:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;transforms.conf:
[country_animals]
filename = country_animals.csv

props.conf:
[country_data]
LOOKUP-animals = country_animals Country OUTPUT Animal as myfield
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;At this point, the lookup could generate our field!  So Splunk does a&lt;BR /&gt;
reverse mapping in this lookup during search startup, and determines&lt;BR /&gt;
that if Country=Nepal or Country=Bhutan, then myfield would be yak.&lt;BR /&gt;
Thus the resulting search will look like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;gt; index=myindex (yak OR
   (sourcetype=country_data AND (Country=Nepal OR Country=Bhutan)))
   myfield=yak
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Or something along these lines.&lt;/P&gt;
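
&lt;P&gt;To see by hand which Country values map to yak, you can query the&lt;BR /&gt;
lookup file directly. A rough sketch, assuming the CSV header yields&lt;BR /&gt;
fields named exactly Country and Animal, that the values carry no stray&lt;BR /&gt;
spaces after the commas, and that the file is visible in your current&lt;BR /&gt;
app context:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;gt; | inputlookup country_animals.csv | search Animal=yak | fields Country
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Under those assumptions, this should list Nepal and Bhutan, which are&lt;BR /&gt;
the values the reverse mapping turns into Country terms.&lt;/P&gt;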

&lt;P&gt;And so on.  The more possible ways the field could come to exist,&lt;BR /&gt;
the more elaborate the resulting search passed down to the&lt;BR /&gt;
optimization and fetch layers may be.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Oct 2014 05:53:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153905#M43214</guid>
      <dc:creator>jrodman</dc:creator>
      <dc:date>2014-10-07T05:53:49Z</dc:date>
    </item>
    <item>
      <title>Re: How do optimizations for field-based searches work?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153906#M43215</link>
      <description>&lt;P&gt;In regard to the search optimisations you describe above, do source, sourcetype and host qualify as autoKV fields, or are they something even more optimal than autoKV?&lt;BR /&gt;
See this other &lt;A href="http://answers.splunk.com/comments/174575/view.html"&gt;question&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 16 Oct 2014 11:08:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153906#M43215</guid>
      <dc:creator>manus</dc:creator>
      <dc:date>2014-10-16T11:08:25Z</dc:date>
    </item>
    <item>
      <title>Re: How do optimizations for field-based searches work?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153907#M43216</link>
      <description>&lt;P&gt;I don't know if we support acquiring source, sourcetype, and host  via  autoKV.  I think effectively we don't because it will always be superseded by the built-in values, and thus I expect we construct the search string around this assumption.&lt;/P&gt;

&lt;P&gt;As for the behavior of source, sourcetype, and host fields and searching on them, that's a bit out of scope for a comment, but they are potentially more efficient than most fields or keywords.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Nov 2014 18:48:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153907#M43216</guid>
      <dc:creator>jrodman</dc:creator>
      <dc:date>2014-11-03T18:48:00Z</dc:date>
    </item>
    <item>
      <title>Re: How do optimizations for field-based searches work?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153908#M43217</link>
      <description>&lt;P&gt;What does &lt;CODE&gt;::&lt;/CODE&gt; really do (e.g., &lt;CODE&gt;country::Japan&lt;/CODE&gt;)?  Does it have any extra value, or does the internal optimizer convert &lt;CODE&gt;=&lt;/CODE&gt; to &lt;CODE&gt;::&lt;/CODE&gt; automatically?&lt;/P&gt;</description>
      <pubDate>Sat, 23 Jun 2018 16:46:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-optimizations-for-field-based-searches-work/m-p/153908#M43217</guid>
      <dc:creator>Kenshiro70</dc:creator>
      <dc:date>2018-06-23T16:46:06Z</dc:date>
    </item>
  </channel>
</rss>

