<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to configure Splunk to split a single large file into 2 sourcetypes based on a keyword in the log file? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116120#M24252</link>
    <description>&lt;P&gt;Yes, it is possible, but it has to happen before the indexing phase of the data pipeline, because a sourcetype override is applied at parse time.&lt;BR /&gt;
I hope this helps: &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Advancedsourcetypeoverrides"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Advancedsourcetypeoverrides&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 26 Mar 2015 15:44:32 GMT</pubDate>
    <dc:creator>stephanefotso</dc:creator>
    <dc:date>2015-03-26T15:44:32Z</dc:date>
    <item>
      <title>How to configure Splunk to split a single large file into 2 sourcetypes based on a keyword in the log file?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116119#M24251</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I have a single source with a huge number of events. The events fall into two broad groups, and all of them are in the same file. My requirement is to index the file into a single index called "myindex" with two different sourcetypes, "group1" and "group2". The two groups are distinguished by the keywords XXX and YYY in the log file: XXX denotes group1 and YYY denotes group2.&lt;/P&gt;

&lt;P&gt;Here is a sample of the log file.&lt;/P&gt;

&lt;P&gt;// mylog_sample.txt&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;24-08-2014 10:23:34  12e,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:35  12e,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:36  1w2,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:37  12e,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:39  122,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
25-08-2014 10:23:34  12e,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:35  12e,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:36  1w2,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:37  12e,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:39  122,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;All the data is in the same file. I want to split it into two different sourcetypes, "group1" and "group2", in a single index.&lt;/P&gt;

&lt;P&gt;So if I search the data with:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="myindex" sourcetype="group1"  
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;it should list the following data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;24-08-2014 10:23:34  12e,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:35  12e,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:36  1w2,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:37  12e,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
24-08-2014 10:23:39  122,34,56,67,87,90,123, 34,545,45,XXX,56,5768,342,34456
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and if I search with the following:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index="myindex" sourcetype="group2"  
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;it should list the following data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;25-08-2014 10:23:34  12e,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:35  12e,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:36  1w2,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:37  12e,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
25-08-2014 10:23:39  122,34,56,67,87,90,123, 34,545,45,YYY,56,5768,342,34456
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Any help with this use case would be appreciated. I tried using transforms.conf, but had no luck separating the events. Please post a configuration that meets this requirement.&lt;/P&gt;

&lt;P&gt;Many thanks.&lt;BR /&gt;
Rakesh.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Mar 2015 13:02:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116119#M24251</guid>
      <dc:creator>rakesh_498115</dc:creator>
      <dc:date>2015-03-26T13:02:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure Splunk to split a single large file into 2 sourcetypes based on a keyword in the log file?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116120#M24252</link>
      <description>&lt;P&gt;Yes, it is possible, but it has to happen before the indexing phase of the data pipeline, because a sourcetype override is applied at parse time.&lt;BR /&gt;
I hope this helps: &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Advancedsourcetypeoverrides"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Advancedsourcetypeoverrides&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Mar 2015 15:44:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116120#M24252</guid>
      <dc:creator>stephanefotso</dc:creator>
      <dc:date>2015-03-26T15:44:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure Splunk to split a single large file into 2 sourcetypes based on a keyword in the log file?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116121#M24253</link>
      <description>&lt;P&gt;Thanks for the update, stephan, but this does not seem to be working. Below is my configuration.&lt;/P&gt;

&lt;P&gt;// inputs.conf&lt;/P&gt;

&lt;P&gt;[monitor:///opt/splunk/splunkInput/mylog_sample.txt]&lt;BR /&gt;
disabled = false&lt;BR /&gt;
followTail = 0&lt;BR /&gt;
recursive = false&lt;BR /&gt;
sourcetype = temp&lt;BR /&gt;
index = myindex&lt;/P&gt;

&lt;P&gt;// transforms.conf&lt;/P&gt;

&lt;P&gt;[set_group1_routing]&lt;BR /&gt;
REGEX = XXX&lt;BR /&gt;
FORMAT = sourcetype::group1&lt;BR /&gt;
DEST_KEY = MetaData:Sourcetype&lt;/P&gt;

&lt;P&gt;[set_group2_routing]&lt;BR /&gt;
REGEX = YYY&lt;BR /&gt;
FORMAT = sourcetype::group2&lt;BR /&gt;
DEST_KEY = MetaData:Sourcetype&lt;/P&gt;

&lt;P&gt;// props.conf&lt;/P&gt;

&lt;P&gt;[group1]&lt;BR /&gt;
TRANSFORMS-350_routing=set_group1_routing&lt;BR /&gt;
DATETIME_CONFIG = CURRENT&lt;BR /&gt;
MAX_TIMESTAMP_LOOKAHEAD = 150&lt;BR /&gt;
NO_BINARY_CHECK = 1&lt;BR /&gt;
SHOULD_LINEMERGE = false&lt;/P&gt;

&lt;P&gt;[group2]&lt;BR /&gt;
TRANSFORMS-350_routing=set_group2_routing&lt;BR /&gt;
DATETIME_CONFIG = CURRENT&lt;BR /&gt;
MAX_TIMESTAMP_LOOKAHEAD = 150&lt;BR /&gt;
NO_BINARY_CHECK = 1&lt;BR /&gt;
SHOULD_LINEMERGE = false&lt;/P&gt;

&lt;P&gt;Please let me know if I am missing something. Thanks in advance &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 19:21:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116121#M24253</guid>
      <dc:creator>rakesh_498115</dc:creator>
      <dc:date>2020-09-28T19:21:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure Splunk to split a single large file into 2 sourcetypes based on a keyword in the log file?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116122#M24254</link>
      <description>&lt;P&gt;Are you using a heavy forwarder? Where did you put this configuration? Is it on your indexer?&lt;/P&gt;</description>
      <pubDate>Fri, 27 Mar 2015 16:22:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116122#M24254</guid>
      <dc:creator>vincenteous</dc:creator>
      <dc:date>2015-03-27T16:22:01Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure Splunk to split a single large file into 2 sourcetypes based on a keyword in the log file?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116123#M24255</link>
      <description>&lt;P&gt;It looks like your data will initially be sent with a sourcetype of temp, so your props should probably look more like this:&lt;/P&gt;

&lt;P&gt;[temp]&lt;BR /&gt;
DATETIME_CONFIG = CURRENT&lt;BR /&gt;
MAX_TIMESTAMP_LOOKAHEAD = 150&lt;BR /&gt;
NO_BINARY_CHECK = 1&lt;BR /&gt;
SHOULD_LINEMERGE = false&lt;BR /&gt;
TRANSFORMS-350_routing=set_group1_routing, set_group2_routing&lt;/P&gt;

&lt;P&gt;[group1]&lt;BR /&gt;
(Search Time settings)&lt;/P&gt;

&lt;P&gt;[group2]&lt;BR /&gt;
(Search Time settings)&lt;/P&gt;

&lt;P&gt;The data will come in with a sourcetype of "temp" and match the [temp] stanza in your props. Along with the timestamp and line-breaking settings, your transforms will then be applied, and they set the new sourcetype accordingly.&lt;/P&gt;
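
&lt;P&gt;As a rough sketch of how the pieces fit together (not a tested configuration), the two files could look like the following. It assumes the input keeps sourcetype = temp and that both files sit on the instance that parses the data, which is the indexer in this thread. The comma-anchored regexes are an assumption based on the sample lines; a plain XXX or YYY also matches.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# transforms.conf
# Rewrite the sourcetype of any event that has XXX as a comma-delimited field.
[set_group1_routing]
REGEX = ,XXX,
FORMAT = sourcetype::group1
DEST_KEY = MetaData:Sourcetype

# Same idea for YYY.
[set_group2_routing]
REGEX = ,YYY,
FORMAT = sourcetype::group2
DEST_KEY = MetaData:Sourcetype

# props.conf
# Keyed on the sourcetype the data arrives with, not on the new ones.
[temp]
DATETIME_CONFIG = CURRENT
MAX_TIMESTAMP_LOOKAHEAD = 150
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRANSFORMS-350_routing = set_group1_routing, set_group2_routing
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The override only affects events indexed after the change and a restart, so data that is already indexed keeps sourcetype "temp". Something like index="myindex" | stats count by sourcetype should show whether new events are landing in group1 and group2.&lt;/P&gt;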

&lt;P&gt;Also, if you decide to create field extractions or other search-time settings, configure them in the stanzas for the new sourcetypes you created, group1 and group2.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 19:18:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116123#M24255</guid>
      <dc:creator>maciep</dc:creator>
      <dc:date>2020-09-28T19:18:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to configure Splunk to split a single large file into 2 sourcetypes based on a keyword in the log file?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116124#M24256</link>
      <description>&lt;P&gt;No, vincenteous, I am using this configuration on the indexer.&lt;/P&gt;</description>
      <pubDate>Mon, 30 Mar 2015 09:08:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-split-a-single-large-file-into-2/m-p/116124#M24256</guid>
      <dc:creator>rakesh_498115</dc:creator>
      <dc:date>2015-03-30T09:08:22Z</dc:date>
    </item>
  </channel>
</rss>

