<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What is the most efficient approach to create a new field from the last portion of the source field's value? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197117#M56914</link>
    <description>&lt;P&gt;Search-time field extractions are preferred over index-time field extractions. More details at the links below:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://answers.splunk.com/answers/2535/search-time-vs-index-time-field-extraction.html"&gt;http://answers.splunk.com/answers/2535/search-time-vs-index-time-field-extraction.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.1/Indexer/Indextimeversussearchtime"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.1/Indexer/Indextimeversussearchtime&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 21 Jan 2015 18:35:17 GMT</pubDate>
    <dc:creator>somesoni2</dc:creator>
    <dc:date>2015-01-21T18:35:17Z</dc:date>
    <item>
      <title>What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197116#M56913</link>
      <description>&lt;P&gt;I need to create a 'site' field from the 'source' field by grabbing the last fragment of source, such as:&lt;BR /&gt;
/var/logs/dir/subdomain1.domainA.com -&amp;gt; subdomain1.domainA.com&lt;BR /&gt;
/var/logs/dir/domainB.com -&amp;gt; domainB.com&lt;/P&gt;

&lt;P&gt;Every search query filters on 'site' extensively, so my idea was to use either index-time extractions or search-time extractions via props/transforms.&lt;/P&gt;

&lt;P&gt;Considering that the data comes to the indexer via a universal forwarder, which approach is the most efficient?&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jan 2015 18:06:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197116#M56913</guid>
      <dc:creator>gesman</dc:creator>
      <dc:date>2015-01-21T18:06:50Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197117#M56914</link>
      <description>&lt;P&gt;Search-time field extractions are preferred over index-time field extractions. More details at the links below:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://answers.splunk.com/answers/2535/search-time-vs-index-time-field-extraction.html"&gt;http://answers.splunk.com/answers/2535/search-time-vs-index-time-field-extraction.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.1/Indexer/Indextimeversussearchtime"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.1/Indexer/Indextimeversussearchtime&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jan 2015 18:35:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197117#M56914</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2015-01-21T18:35:17Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197118#M56915</link>
      <description>&lt;P&gt;[Updated]&lt;BR /&gt;
You can do search-time extraction of a field from another field. BUT - you can also do a calculated field! Calculated fields are also search time artifacts, and are preferred over index-time extractions. I strongly advise you to avoid index-time field extractions if you possibly can. They are &lt;STRONG&gt;not&lt;/STRONG&gt; more efficient, they are less flexible and they consume more disk space.&lt;/P&gt;

&lt;P&gt;Test this eval command. If it works, use it to create a calculated field on the indexer (or search head if you have one):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=*.com 
| eval site = replace(source,".*/(.*?)$", "\1")
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 21 Jan 2015 20:24:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197118#M56915</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2015-01-21T20:24:17Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197119#M56916</link>
      <description>&lt;P&gt;Is there a reason why you couldn't just use rex? &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=* | rex field=source ".*/(?&amp;lt;end_of_source&amp;gt;.*)"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 21 Jan 2015 21:56:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197119#M56916</guid>
      <dc:creator>aljohnson_splun</dc:creator>
      <dc:date>2015-01-21T21:56:50Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197120#M56917</link>
      <description>&lt;P&gt;I am not going to do index-time extractions, but:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;EM&gt;You can't do search-time extraction of a field from another field&lt;/EM&gt;&lt;BR /&gt;
props.conf DOC says that I &lt;EM&gt;can&lt;/EM&gt; though, like this in my case:&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;/BLOCKQUOTE&gt;

&lt;PRE&gt;&lt;CODE&gt;[access_combined]
EXTRACT-site = [/\\](?&amp;lt;site&amp;gt;[^/\\]+)$ in source
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I thought I'd be able to use it like above especially making it sourcetype-specific. Shouldn't it work? &lt;BR /&gt;
Your eval certainly will work (not sure why double slashes though), could you elaborate please on the difference between EXTRACT-site and EVAL-site?&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jan 2015 23:37:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197120#M56917</guid>
      <dc:creator>gesman</dc:creator>
      <dc:date>2015-01-21T23:37:46Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197121#M56918</link>
      <description>&lt;P&gt;I have tons of queries and don't want to inject the same thing into every single query, knowing that it is needed for each of them.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Jan 2015 23:58:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197121#M56918</guid>
      <dc:creator>gesman</dc:creator>
      <dc:date>2015-01-21T23:58:50Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197122#M56919</link>
      <description>&lt;P&gt;I ended up putting this into &lt;STRONG&gt;/splunk/etc/apps/MY_APP/local/props.conf&lt;/STRONG&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[access_combined]
EVAL-site = replace(source, "^.*?/([^/]+)$", "\1")
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 22 Jan 2015 00:01:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197122#M56919</guid>
      <dc:creator>gesman</dc:creator>
      <dc:date>2015-01-22T00:01:29Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197123#M56920</link>
      <description>&lt;P&gt;If you're going to search on &lt;CODE&gt;site=foo&lt;/CODE&gt; then both are going to be terrible.&lt;/P&gt;

&lt;P&gt;The calculated field (props.conf &lt;CODE&gt;EVAL-site&lt;/CODE&gt;) and extracted fragment (props.conf &lt;CODE&gt;EXTRACT-site ... in source&lt;/CODE&gt;) are both going to be terribly slow to filter on because Splunk cannot build an efficient source selector based on them. Effectively, both these searches should do the same:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;site=foo
source=*foo
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, to the machine those two are not identical. The first asks for a field called &lt;CODE&gt;site&lt;/CODE&gt;, which technically could come from anywhere. The second asks for specific sources ending in foo, so Splunk can look up matching sources and then load only those.&lt;/P&gt;

&lt;P&gt;To put numbers to the theory, I defined both eval'd and extracted fields on my PC's &lt;CODE&gt;splunkd&lt;/CODE&gt; sourcetype and ran these three searches:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=_internal sourcetype=splunkd site_eval="license_usage.log"
index=_internal sourcetype=splunkd site_extract="license_usage.log"
index=_internal sourcetype=splunkd source="*license_usage.log"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The first two take five seconds, scanning 100k events and returning 2 events. The third takes 0.3 seconds, scanning 2 events and returning 2 events. In terms of events scanned, that's a 50000x speedup.&lt;/P&gt;

&lt;P&gt;My take-away from this is as follows: For searching, use &lt;CODE&gt;source=*yoursite&lt;/CODE&gt; instead of &lt;CODE&gt;site=yoursite&lt;/CODE&gt;. For reporting, create the &lt;CODE&gt;site&lt;/CODE&gt; field using calculated fields or search-time extractions (doesn't matter much) to get &lt;CODE&gt;... | stats count by site&lt;/CODE&gt;.&lt;/P&gt;
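
&lt;P&gt;A rough sketch of that split, assuming the calculated &lt;CODE&gt;site&lt;/CODE&gt; field from earlier in the thread is defined on &lt;CODE&gt;access_combined&lt;/CODE&gt; and using &lt;CODE&gt;domainA.com&lt;/CODE&gt; from the question as the example. The &lt;CODE&gt;source&lt;/CODE&gt; wildcard decides which events get loaded (every source ending in domainA.com), and &lt;CODE&gt;site&lt;/CODE&gt; is only used to break the results down:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=access_combined source="*domainA.com"
| stats count by site
&lt;/CODE&gt;&lt;/PRE&gt;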

&lt;P&gt;You &lt;EM&gt;could&lt;/EM&gt; define an index-time field &lt;CODE&gt;site&lt;/CODE&gt; for searching, but there's no speed advantage over &lt;CODE&gt;source=*site&lt;/CODE&gt; to outweigh the disadvantages.&lt;BR /&gt;
If you absolutely need the prettier search for &lt;CODE&gt;site=yoursite&lt;/CODE&gt; without an indexed field you &lt;EM&gt;could&lt;/EM&gt; fiddle with fields.conf to teach Splunk how to use the &lt;CODE&gt;site&lt;/CODE&gt; field, something along the lines of the &lt;CODE&gt;source::&lt;/CODE&gt; example in &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/fieldsconf"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/fieldsconf&lt;/A&gt; - here be dragons though, less than careful settings in fields.conf can muck up a lot of things.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2015 05:15:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197123#M56920</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2015-01-22T05:15:34Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197124#M56921</link>
      <description>&lt;P&gt;Thanks Martin. That's a great idea about &lt;CODE&gt;source=*site.com&lt;/CODE&gt; to prefilter on sources.&lt;BR /&gt;
I still need to add &lt;CODE&gt;site=site.com&lt;/CODE&gt; because there could also be &lt;CODE&gt;site=subdomain.site.com&lt;/CODE&gt;.&lt;BR /&gt;
I assume that even if I have 10,000 different sources, &lt;CODE&gt;source=*site.com&lt;/CODE&gt; will still be faster than loading &lt;EM&gt;everything&lt;/EM&gt; and then post-filtering on the 'site' field?&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2015 13:43:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197124#M56921</guid>
      <dc:creator>gesman</dc:creator>
      <dc:date>2015-01-22T13:43:18Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197125#M56922</link>
      <description>&lt;P&gt;Ha! You are right and I had forgotten that you could do this (&lt;CODE&gt;EXTRACT-site = [/\\](?&amp;lt;site&amp;gt;[^/\\]+)$ in source&lt;/CODE&gt;). I used &lt;CODE&gt;//&lt;/CODE&gt; because I can't type. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; &lt;BR /&gt;
I fixed my answer.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2015 20:39:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197125#M56922</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2015-01-22T20:39:31Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197126#M56923</link>
      <description>&lt;P&gt;Thank you.&lt;BR /&gt;
I tried EXTRACT-* but it didn't work for some reason. The EVAL-* approach did, so I went with it.&lt;BR /&gt;
Also keep in mind that we cannot do field aliasing on EVAL-*-ed fields, because aliasing is done before EVAL-*-ing.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2015 22:03:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197126#M56923</guid>
      <dc:creator>gesman</dc:creator>
      <dc:date>2015-01-22T22:03:11Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197127#M56924</link>
      <description>&lt;P&gt;Yes. On search-time extracted fields, &lt;CODE&gt;field=*suffix&lt;/CODE&gt; is typically bad because it means you have to load everything, extract, and then filter. On index-time extracted fields such as source, &lt;CODE&gt;field=*value&lt;/CODE&gt; is typically not bad because you only need to look at all the extracted values, select the matches, and then load matching raw data.&lt;/P&gt;
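
&lt;P&gt;For your case, one way to sketch it is to keep both terms, so the &lt;CODE&gt;source&lt;/CODE&gt; wildcard limits what gets loaded and &lt;CODE&gt;site&lt;/CODE&gt; only post-filters that smaller set. This assumes &lt;CODE&gt;site&lt;/CODE&gt; is defined at search time as discussed earlier in the thread, and &lt;CODE&gt;site.com&lt;/CODE&gt; is just a placeholder:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source="*site.com" site="site.com"
&lt;/CODE&gt;&lt;/PRE&gt;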

&lt;P&gt;To judge how bad your filters are, compare the scanCount in the job inspector with the returned results.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 08:26:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197127#M56924</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2015-01-23T08:26:31Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197128#M56925</link>
      <description>&lt;P&gt;Another thought - if you do really have thousands or even millions of unique sources, it'd be worth a thought to split source and site up at index time. Let source be the common path, and site the site-specific part. That way you don't duplicate high cardinality at index time by adding a site field on top of the full source field but rather move the cardinality elsewhere. When that starts to make sense depends on your data.&lt;/P&gt;

&lt;P&gt;As for &lt;CODE&gt;site=subdomain.site.com&lt;/CODE&gt; vs &lt;CODE&gt;site=site.com&lt;/CODE&gt;, a filter on &lt;CODE&gt;source=*/site.com&lt;/CODE&gt; should fix that.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 08:29:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197128#M56925</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2015-01-23T08:29:44Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197129#M56926</link>
      <description>&lt;P&gt;A macro, maybe?&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 09:37:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197129#M56926</guid>
      <dc:creator>DavidHourani</dc:creator>
      <dc:date>2015-01-23T09:37:37Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197130#M56927</link>
      <description>&lt;P&gt;I thought about it - this won't work in a Windows environment with backslashes.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 13:16:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197130#M56927</guid>
      <dc:creator>gesman</dc:creator>
      <dc:date>2015-01-23T13:16:28Z</dc:date>
    </item>
    <item>
      <title>Re: What is the most efficient approach to create a new field from the last portion of the source field's value?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197131#M56928</link>
      <description>&lt;P&gt;Use &lt;CODE&gt;source="*\\site.com"&lt;/CODE&gt; then.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 14:41:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/What-is-the-most-efficient-approach-to-create-a-new-field-from/m-p/197131#M56928</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2015-01-23T14:41:34Z</dc:date>
    </item>
  </channel>
</rss>

