<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Force index-time host extraction to lower-case in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Force-index-time-host-extraction-to-lower-case/m-p/38859#M7183</link>
    <description>&lt;P&gt;I don't believe this is possible. There is certainly a case to be made for allowing simple transforms (e.g., simple string operations like yours, or basic arithmetic) that cannot be accomplished with PCRE, but that would have to be an enhancement to the product, and it would have some other repercussions on searching for such transformed fields.&lt;/P&gt;

&lt;P&gt;I suppose in your particular case, for search purposes it's not necessary (as search is case-insensitive), and for reporting and display you can still use the &lt;CODE&gt;eval&lt;/CODE&gt; &lt;CODE&gt;lower()&lt;/CODE&gt; function. It does mess up &lt;CODE&gt;metadata&lt;/CODE&gt; a bit, but you &lt;EM&gt;could&lt;/EM&gt; resolve that by, e.g., changing the metadata search on the dashboards from &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| metadata type=hosts
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;to &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| metadata type=hosts 
| eval host=lower(host) 
| stats 
    sum(totalCount) as totalCount
    min(firstTime) as firstTime
    max(lastTime) as lastTime
    max(recentTime) as recentTime
    first(type) as type
  by host
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(This might actually get &lt;CODE&gt;recentTime&lt;/CODE&gt; wrong, though I doubt that's a problem in practice.)&lt;/P&gt;

&lt;P&gt;If you're looking at a few specific hosts and specific ways they are capitalized, you could also construct a lookup table and set a combination of automatic &lt;CODE&gt;FIELDALIAS&lt;/CODE&gt; and &lt;CODE&gt;LOOKUP&lt;/CODE&gt; to overwrite the original &lt;CODE&gt;host&lt;/CODE&gt; field. You could do it with a scripted lookup too I guess, if it's more complicated than that. This seems a little wrong to me though.&lt;/P&gt;</description>
    <pubDate>Sun, 29 Aug 2010 01:47:05 GMT</pubDate>
    <dc:creator>gkanapathy</dc:creator>
    <dc:date>2010-08-29T01:47:05Z</dc:date>
    <item>
      <title>Force index-time host extraction to lower-case</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Force-index-time-host-extraction-to-lower-case/m-p/38858#M7182</link>
      <description>&lt;P&gt;Is there a way to extract the hostname from an event, but force it to lower-case in the process?&lt;/P&gt;

&lt;P&gt;Extracting the hostname is easy enough (DEST_KEY in transforms.conf, etc.), but this doesn't account for the case.&lt;/P&gt;

&lt;P&gt;The SEDCMD option in props.conf would appear to be an option, but it's not clear whether 'y/[A-Z]/[a-z]/' style replacements are supported.  Even if they are, using SEDCMD would modify the original event text, which is undesirable.&lt;/P&gt;

&lt;P&gt;The goal is to normalize hostnames so that they are consistent for all events from that machine, without modifying the actual event text.&lt;/P&gt;</description>
      <pubDate>Sat, 28 Aug 2010 00:35:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Force-index-time-host-extraction-to-lower-case/m-p/38858#M7182</guid>
      <dc:creator>southeringtonp</dc:creator>
      <dc:date>2010-08-28T00:35:53Z</dc:date>
    </item>
    <item>
      <title>Re: Force index-time host extraction to lower-case</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Force-index-time-host-extraction-to-lower-case/m-p/38859#M7183</link>
      <description>&lt;P&gt;I don't believe this is possible. There is certainly a case to be made for allowing simple transforms (e.g., simple string operations like yours, or basic arithmetic) that cannot be accomplished with PCRE, but that would have to be an enhancement to the product, and it would have some other repercussions on searching for such transformed fields.&lt;/P&gt;

&lt;P&gt;I suppose in your particular case, for search purposes it's not necessary (as search is case-insensitive), and for reporting and display you can still use the &lt;CODE&gt;eval&lt;/CODE&gt; &lt;CODE&gt;lower()&lt;/CODE&gt; function. It does mess up &lt;CODE&gt;metadata&lt;/CODE&gt; a bit, but you &lt;EM&gt;could&lt;/EM&gt; resolve that by, e.g., changing the metadata search on the dashboards from &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| metadata type=hosts
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;to &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| metadata type=hosts 
| eval host=lower(host) 
| stats 
    sum(totalCount) as totalCount
    min(firstTime) as firstTime
    max(lastTime) as lastTime
    max(recentTime) as recentTime
    first(type) as type
  by host
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;(This might actually get &lt;CODE&gt;recentTime&lt;/CODE&gt; wrong, though I doubt that's a problem in practice.)&lt;/P&gt;

&lt;P&gt;If you're looking at a few specific hosts and specific ways they are capitalized, you could also construct a lookup table and set a combination of automatic &lt;CODE&gt;FIELDALIAS&lt;/CODE&gt; and &lt;CODE&gt;LOOKUP&lt;/CODE&gt; to overwrite the original &lt;CODE&gt;host&lt;/CODE&gt; field. You could do it with a scripted lookup too I guess, if it's more complicated than that. This seems a little wrong to me though.&lt;/P&gt;</description>
      <pubDate>Sun, 29 Aug 2010 01:47:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Force-index-time-host-extraction-to-lower-case/m-p/38859#M7183</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2010-08-29T01:47:05Z</dc:date>
    </item>
    <item>
      <title>Re: Force index-time host extraction to lower-case</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Force-index-time-host-extraction-to-lower-case/m-p/38860#M7184</link>
      <description>&lt;P&gt;gkanapathy is correct here. Although SEDCMD can perform y///g substitutions, it operates only on _raw and not on any other fields.&lt;/P&gt;</description>
      <pubDate>Sun, 29 Aug 2010 04:49:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Force-index-time-host-extraction-to-lower-case/m-p/38860#M7184</guid>
      <dc:creator>Stephen_Sorkin</dc:creator>
      <dc:date>2010-08-29T04:49:39Z</dc:date>
    </item>
  </channel>
</rss>

