<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic extract a field from event source filename in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36029#M7964</link>
    <description>&lt;P&gt;How can I configure Splunk to extract some fields from the source filename?&lt;/P&gt;

&lt;P&gt;I already specify a host_regex and that works great. I also understand that if there is a date in the filename, Splunk will find it automatically. The field can be extracted at index time if it must be.&lt;/P&gt;

&lt;P&gt;I have Splunk watch a lot of files and directories. For some source types, there are fields in the filename that are neither the 'host' nor a 'date' field. Furthermore, these fields aren't repeated in the event data themselves (i.e. not in the file content, only in the filename).&lt;/P&gt;

&lt;P&gt;Here's an example from a host collecting Oracle alert logs:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;logdir&amp;gt;/&amp;lt;host&amp;gt;.&amp;lt;sid&amp;gt;.log

/tmp/splunk_alert_logs/db01.TOOL.log
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This might have been hit already, but I'm having some difficulty finding an answer that doesn't involve an automatically located field.&lt;/P&gt;</description>
    <pubDate>Tue, 24 Aug 2010 15:07:54 GMT</pubDate>
    <dc:creator>jstillwell</dc:creator>
    <dc:date>2010-08-24T15:07:54Z</dc:date>
    <item>
      <title>extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36029#M7964</link>
      <description>&lt;P&gt;How can I configure Splunk to extract some fields from the source filename?&lt;/P&gt;

&lt;P&gt;I already specify a host_regex and that works great. I also understand that if there is a date in the filename, Splunk will find it automatically. The field can be extracted at index time if it must be.&lt;/P&gt;

&lt;P&gt;I have Splunk watch a lot of files and directories. For some source types, there are fields in the filename that are neither the 'host' nor a 'date' field. Furthermore, these fields aren't repeated in the event data themselves (i.e. not in the file content, only in the filename).&lt;/P&gt;

&lt;P&gt;Here's an example from a host collecting Oracle alert logs:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;logdir&amp;gt;/&amp;lt;host&amp;gt;.&amp;lt;sid&amp;gt;.log

/tmp/splunk_alert_logs/db01.TOOL.log
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This might have been hit already, but I'm having some difficulty finding an answer that doesn't involve an automatically located field.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Aug 2010 15:07:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36029#M7964</guid>
      <dc:creator>jstillwell</dc:creator>
      <dc:date>2010-08-24T15:07:54Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36030#M7965</link>
      <description>&lt;P&gt;You should be able to just define a stanza in transforms.conf with SOURCE_KEY set to "source" and a REGEX whose named capture group defines your field name. Something like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[a_transform]  
SOURCE_KEY = source  
REGEX = (?i)[\/A-Za-z]+\/(?&amp;lt;give_it_a_fieldname&amp;gt;\w+)(?=\.\w+)  
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;In your props.conf, you reference "a_transform" like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[a_sourcetype]  
REPORT-transform = a_transform  
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You'll probably also have to define the field name in fields.conf, since the field value will not have been indexed, for example:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[give_it_a_fieldname]  
INDEXED_VALUE = false
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 24 Aug 2010 18:15:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36030#M7965</guid>
      <dc:creator>ayme</dc:creator>
      <dc:date>2010-08-24T18:15:58Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36031#M7966</link>
      <description>&lt;P&gt;Hi,
As a rule of thumb, avoid having Splunk index new fields unless it is really necessary (it puts a higher burden on the indexer, and so on). What you most likely need is a search-time field extraction, which you can configure like this.&lt;/P&gt;

&lt;P&gt;Suppose your Oracle alert logs have the sourcetype "oracle_alert". Then, in local/props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[oracle_alert]
EXTRACT-sourcefields = (?&amp;lt;logdir&amp;gt;[\w\W/]+)/(?&amp;lt;host_2&amp;gt;[^\.]+)\.(?&amp;lt;sid&amp;gt;[^\.]+)\.log in source
# (double check the regex) (edit: the "in source" is what tells splunk to look into the source field)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;That would instruct Splunk to extract three fields: logdir (everything before the last /), host_2 (renamed so that it does not override the original "host" field), and sid.&lt;/P&gt;

&lt;P&gt;You don't need to modify fields.conf for this. &lt;/P&gt;

&lt;P&gt;Another method would be to use transforms.conf as well.&lt;/P&gt;
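
&lt;P&gt;As a rough sketch of that transforms.conf route (a search-time REPORT rather than an inline EXTRACT), something along these lines should behave the same way. The stanza name oracle_source_fields is a made-up placeholder, and the regex assumes the &amp;lt;host&amp;gt;.&amp;lt;sid&amp;gt;.log filename layout from the question, so double check it against your own paths.&lt;/P&gt;

&lt;P&gt;In local/transforms.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# hypothetical stanza name; reads the source field instead of the raw event
[oracle_source_fields]
SOURCE_KEY = source
REGEX = /(?&amp;lt;host_2&amp;gt;[^/.]+)\.(?&amp;lt;sid&amp;gt;[^/.]+)\.log$
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and in local/props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[oracle_alert]
REPORT-sourcefields = oracle_source_fields
&lt;/CODE&gt;&lt;/PRE&gt;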

&lt;P&gt;For further info on the alternative methods, you can write a comment here or refer to:
&lt;A href="http://www.splunk.com/base/Documentation/latest/Admin/Propsconf" rel="nofollow"&gt;Props.conf documentation&lt;/A&gt; and search for the keyword "EXTRACT".&lt;/P&gt;

&lt;P&gt;If you want to test the regex before applying the configuration, you can use the &lt;A href="http://www.splunk.com/base/Documentation/latest/SearchReference/Rex" rel="nofollow"&gt;rex command&lt;/A&gt; on the search bar; in this case, you could run a search like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=oracle_alert | rex field=source max_match=10 "(?&amp;lt;logdir&amp;gt;[\w\W/]+)/(?&amp;lt;host_2&amp;gt;[^\.]+)\.(?&amp;lt;sid&amp;gt;[^\.]+)\.log"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and check that the three fields appear on the left field-picker menu.&lt;/P&gt;
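
&lt;P&gt;If you have no matching events handy, a sketch like the following can exercise the regex against a made-up event. It assumes a Splunk version that has the makeresults command, and the source value is simply the sample path from the question:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults
| eval source="/tmp/splunk_alert_logs/db01.TOOL.log"
| rex field=source "(?&amp;lt;logdir&amp;gt;[\w\W/]+)/(?&amp;lt;host_2&amp;gt;[^\.]+)\.(?&amp;lt;sid&amp;gt;[^\.]+)\.log"
| table source logdir host_2 sid
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For that sample path you would expect logdir=/tmp/splunk_alert_logs, host_2=db01 and sid=TOOL.&lt;/P&gt;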

&lt;P&gt;Hope that helped a bit,
Paolo&lt;/P&gt;</description>
      <pubDate>Tue, 24 Aug 2010 23:15:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36031#M7966</guid>
      <dc:creator>Paolo_Prigione</dc:creator>
      <dc:date>2010-08-24T23:15:40Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36032#M7967</link>
      <description>&lt;P&gt;Note that if you search for this field alone, because it's marked as a non-indexed value, Splunk will perform a full table scan to find matches. To get around this performance issue, you could extract the field at index time, set up a lookup table that maps all sources to your fields, or set up a set of eventtypes.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Aug 2010 00:03:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36032#M7967</guid>
      <dc:creator>Stephen_Sorkin</dc:creator>
      <dc:date>2010-08-25T00:03:47Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36033#M7968</link>
      <description>&lt;P&gt;Thanks. I know to avoid indexing fields; I just knew that 'host' was indexed, so I wasn't sure how fields from the filename were going to work out. I understand now, though. Thanks to both of you.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Aug 2010 02:56:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36033#M7968</guid>
      <dc:creator>jstillwell</dc:creator>
      <dc:date>2010-08-25T02:56:50Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36034#M7969</link>
      <description>&lt;P&gt;I was looking to do a similar thing, and ran into this thread.&lt;BR /&gt;
Thought I'd post what I did with the "rex field=source", if it helps anyone who wants to do something similar:&lt;/P&gt;

&lt;P&gt;If the "hidden" source field is something like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;source=/home/MyName/logs/Area_310_LosAngeles/2012/07/log.txt
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You could use:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(splunk search expression here) | rex field=source "Area_(?&amp;lt;AREA_CODE&amp;gt;.{3})"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;to extract the 3-character area code out of the source path name, etc.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 13:44:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36034#M7969</guid>
      <dc:creator>NK_1</dc:creator>
      <dc:date>2020-09-28T13:44:43Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36035#M7970</link>
      <description>&lt;P&gt;props.conf instructions worked perfectly, thanks.&lt;BR /&gt;
Only thing I had to add were quotes around the regex.&lt;/P&gt;</description>
      <pubDate>Mon, 19 Oct 2015 21:06:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36035#M7970</guid>
      <dc:creator>nmungel_splunk</dc:creator>
      <dc:date>2015-10-19T21:06:54Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36036#M7971</link>
      <description>&lt;P&gt;I had a very similar situation, where pertinent information was in the filenames - this solution worked perfectly for me as well.  Thank You!&lt;/P&gt;</description>
      <pubDate>Thu, 21 Dec 2017 07:03:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36036#M7971</guid>
      <dc:creator>JoeIII</dc:creator>
      <dc:date>2017-12-21T07:03:38Z</dc:date>
    </item>
    <item>
      <title>Re: extract a field from event source filename</title>
      <link>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36037#M7972</link>
      <description>&lt;P&gt;Paolo,&lt;/P&gt;

&lt;P&gt;I was trying to create field extractions from source for multiple sourcetypes.&lt;/P&gt;

&lt;P&gt;Is there a way to create a single extraction for multiple sourcetypes?&lt;/P&gt;

&lt;P&gt;Ex: sourcetypes: (The source has the serverName)&lt;BR /&gt;
soa:access:log&lt;BR /&gt;
soa:server:log&lt;/P&gt;

&lt;P&gt;Trying to create Field extraction as type=inline&lt;BR /&gt;
Name:EXTRACT-SOA-ServerName&lt;BR /&gt;
sourcetype:  soa:.*:log&lt;BR /&gt;
Extraction / Transformation: (?SOA[0-9]+) in source&lt;/P&gt;

&lt;P&gt;However, the above is not working&lt;BR /&gt;
when I search: index="soa" sourcetype="soa:server:log"&lt;/P&gt;

&lt;P&gt;Please let me know what I missed.&lt;/P&gt;

&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Mon, 06 Apr 2020 06:27:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/extract-a-field-from-event-source-filename/m-p/36037#M7972</guid>
      <dc:creator>3es</dc:creator>
      <dc:date>2020-04-06T06:27:37Z</dc:date>
    </item>
  </channel>
</rss>

