<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to split data into separate sourcetypes with transforms in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311787#M58509</link>
    <description>&lt;P&gt;OK, I updated the original post with the new testing configs. Everything is working EXCEPT sourcetyping. It's not breaking out the sourcetypes; it just uses the one set in the input, BUT if I remove that, it uses the "too_small" sourcetype. What am I missing? It has to be something simple.&lt;/P&gt;

&lt;P&gt;Thanks again!&lt;/P&gt;</description>
    <pubDate>Thu, 07 Sep 2017 14:15:11 GMT</pubDate>
    <dc:creator>tkwaller</dc:creator>
    <dc:date>2017-09-07T14:15:11Z</dc:date>
    <item>
      <title>How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311781#M58503</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;

&lt;P&gt;I have an input that monitors a file. The file contains data in multiple formats, including different timestamp formats. It's bad, but I was thinking I could use a transform to set the sourcetype, and then use props for each sourcetype to format the data.&lt;BR /&gt;
So I did this in inputs.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[monitor:///var/log/this_log/*.ec]
index = main
sourcetype=momlog
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Then I created a transforms.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[momlog_json_sourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = \{\"msys\"
FORMAT = sourcetype::momlog:json


[momlog_basic_sourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = .*
FORMAT = sourcetype::momlog:basic
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I also have a props.conf that looks like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[momlog:basic]
TIME_FORMAT = %s
TIME_PREFIX = ^
LINE_BREAKER = ([\r\n]+)
TRANSFORMS-basic = momlog_basic_sourcetype

[momlog:json]
TIME_FORMAT = %s
TIME_PREFIX = "timestamp":"
INDEXED_EXTRACTIONS = JSON
TRANSFORMS-json = momlog_json_sourcetype
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;My question is this:&lt;BR /&gt;
What would the regex be for the NON-JSON data? Do the inputs and props look correct? I'm testing locally, so I can break things all day long.&lt;/P&gt;

&lt;P&gt;Thanks for the assistance.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 16:39:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311781#M58503</guid>
      <dc:creator>tkwaller</dc:creator>
      <dc:date>2017-08-31T16:39:18Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311782#M58504</link>
      <description>&lt;P&gt;How can we possibly know what REGEX will work if you do not post sample data?  In any case, the PaloAlto TA does this so you can download that app and check it all out.  It gets stuff from syslog that is supposed to come in as &lt;CODE&gt;sourcetype=pan:log&lt;/CODE&gt; and then it splits it out into 5 or 6 different sourcetypes based on RegEx patterns, just like what you are doing.&lt;/P&gt;</description>
      <pubDate>Sat, 02 Sep 2017 16:23:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311782#M58504</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-09-02T16:23:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311783#M58505</link>
      <description>&lt;P&gt;Well, really, all it has to do is match anything that isn't in JSON format, meaning anything that ISN'T&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;TIME_PREFIX = "timestamp":"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;which is why I didn't add the data samples. I can take a look at the app, but I don't think it should really be that difficult.&lt;/P&gt;

&lt;P&gt;But JUST IN CASE&lt;BR /&gt;
(this is actually data from several files):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;1503626401@N@/tmp/12354@@user@1
1503664701@@@@M1
1503664761@@@@M1
1503664821@@@@M1
1503664881@@@@M1
1503664941@@@@M1
1503665001@@@@M1
1503665061@@@@M1
1503665121@@@@M1
1503665181@@@@M1
1503665241@@@@M1
1503665301@@@@M1
1503665361@@@@M1
1503665421@@@@M1
1503665481@@@@M1
{"msys":{"message_event":{"origination":"unauthorized_attempt","conn_name":"stuff","recv_method":"esmtp","remote_addr":"10.0.0.0:12345","raw_reason":"500 5.5.2 unrecognized command","node_name":"host@domain.com","scope_name":"scriptlet","pathway_group":"default","error_code":"500","msg_proc_state":"awaiting mailfrom","tenant_id":"__unauthorized__","reason":"500 5.5.2 unrecognized command","pathway":"default","local_addr":"10.0.0.0:12345","timestamp":"1503524959","customer_id":"0","event_id":"1234512354","type":"rejection"}}}
{"msys":{"message_event":{"timestamp":"1503527383","customer_id":"1","msg_proc_state":"awaiting mailfrom","pathway_group":"default","remote_addr":"10.0.0.0:12345","raw_reason":"500 5.5.2 unrecognized command","conn_name":"11/22-12345-1D10E111","event_id":"1234512345","reason":"500 5.5.2 unrecognized command","tenant_id":"__unauthorized__","type":"rejection","error_code":"500","local_addr":"10.0.0.0:12345","recv_method":"esmtp","node_name":"host.domain.com","origination":"unauthorized_attempt","pathway":"default","scope_name":"scriptlet"}}}
{"msys":{"track_event":{"rcpt_to":"user@domain.com","type":"open","rcpt_meta":{ "userMessageId": "123456789" },"campaign_id":"test_campaign","node_name":"host.domain.com","ip_address":"10.0.0.0:12345","customer_id":"1","template_id":"template_1234512345","transmission_id":"1234512345","event_id":"12345122345","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/601.7.8 (KHTML, like Gecko)","message_id":"000074029e597f538c00","accept_language":"en-us","rcpt_tags":[ "testTag" ],"delv_method":"esmtp","template_version":"0","timestamp":"1503527606"}}}
1503676342@@@@M1
1503676402@@@@M1
1503676462@@@@M1
1503676522@@@@M1
1503676582@@@@M1
1503676642@@@@M1
1503676702@@@@M1
1503676402: Marker 1
1503676462: Marker 1
1503676522: Marker 1
1503676582: Marker 1
1503676642: Marker 1
1503676702: Marker 1
1503676762: Marker 1
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 05 Sep 2017 12:53:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311783#M58505</guid>
      <dc:creator>tkwaller</dc:creator>
      <dc:date>2017-09-05T12:53:53Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311784#M58506</link>
      <description>&lt;P&gt;Clearly, something is wrong with the props TIME_PREFIX not having a closed quote.  &lt;/P&gt;

&lt;P&gt;I would expect that anything that doesn't match the json would therefore be non-json, so you would just use &lt;CODE&gt;.*&lt;/CODE&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 05 Sep 2017 18:34:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311784#M58506</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-09-05T18:34:47Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311785#M58507</link>
      <description>&lt;P&gt;Timestamps are correct. Why would the time prefix need a closing quote? It's the prefix of the epoch timestamp.&lt;/P&gt;

&lt;P&gt;I tried &lt;CODE&gt;.*&lt;/CODE&gt; to match, but my config must still be incorrect in the props or inputs, as I got only ONE of the JSON logs and none of the sourcetyping was correct.&lt;/P&gt;

&lt;P&gt;I tried several different variations of inputs and props, just not quite right yet. Close though.&lt;/P&gt;

&lt;P&gt;I updated the original post to reflect all changes made&lt;/P&gt;</description>
      <pubDate>Tue, 05 Sep 2017 21:27:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311785#M58507</guid>
      <dc:creator>tkwaller</dc:creator>
      <dc:date>2017-09-05T21:27:33Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311786#M58508</link>
      <description>&lt;P&gt;I would escape all 3 double-quotes (can't hurt).&lt;/P&gt;</description>
      <pubDate>Wed, 06 Sep 2017 01:50:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311786#M58508</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2017-09-06T01:50:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311787#M58509</link>
      <description>&lt;P&gt;OK, I updated the original post with the new testing configs. Everything is working EXCEPT sourcetyping. It's not breaking out the sourcetypes; it just uses the one set in the input, BUT if I remove that, it uses the "too_small" sourcetype. What am I missing? It has to be something simple.&lt;/P&gt;

&lt;P&gt;Thanks again!&lt;/P&gt;</description>
      <pubDate>Thu, 07 Sep 2017 14:15:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311787#M58509</guid>
      <dc:creator>tkwaller</dc:creator>
      <dc:date>2017-09-07T14:15:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311788#M58510</link>
      <description>&lt;P&gt;Hi there @tkwaller&lt;/P&gt;

&lt;P&gt;Try adding this to your &lt;CODE&gt;props.conf&lt;/CODE&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[momlog]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
TIME_PREFIX =\"timestamp\":\"
TRANSFORMS-sourcetype_routing = momlog_basic_sourcetype, momlog_json_sourcetype

[momlog:basic]
TIME_FORMAT = %s
TIME_PREFIX = ^
LINE_BREAKER = ([\r\n]+)

[momlog:json]
TIME_FORMAT = %s
TIME_PREFIX = \"timestamp\":\"
INDEXED_EXTRACTIONS = JSON
&lt;/CODE&gt;&lt;/PRE&gt;
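
&lt;P&gt;A short sketch of why the routing above is expected to work, based on my understanding of how Splunk applies index-time transforms rather than on anything verified against this data. Events arrive with the sourcetype set in inputs.conf, so the TRANSFORMS line has to sit under &lt;CODE&gt;[momlog]&lt;/CODE&gt;, not under the target sourcetypes. Transforms in one class run left to right, and a later match overwrites the key written by an earlier one, which is why the catch-all is listed first. The stanzas below are the ones from the original post, only reordered and commented:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# transforms.conf, shown in the order the props.conf class applies them

# 1) Runs first: the catch-all tags every event as momlog:basic
[momlog_basic_sourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = .*
FORMAT = sourcetype::momlog:basic

# 2) Runs second: overwrites the sourcetype only for events containing {"msys"
[momlog_json_sourcetype]
DEST_KEY = MetaData:Sourcetype
REGEX = \{\"msys\"
FORMAT = sourcetype::momlog:json
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If the two names were swapped in the TRANSFORMS line, the catch-all would presumably run last and every event would end up as momlog:basic.&lt;/P&gt;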

&lt;P&gt;EDITED: Added a few things on the main sourcetype and fixed TIME_PREFIX regex for momlog:json sourcetype.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Sep 2017 14:23:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311788#M58510</guid>
      <dc:creator>alemarzu</dc:creator>
      <dc:date>2017-09-07T14:23:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311789#M58511</link>
      <description>&lt;P&gt;I added that, but when I did, it broke formatting: JSON isn't recognized and the sourcetype is still momlog.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Sep 2017 15:02:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311789#M58511</guid>
      <dc:creator>tkwaller</dc:creator>
      <dc:date>2017-09-07T15:02:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311790#M58512</link>
      <description>&lt;P&gt;Please try the above to see if it works now that I've added a few more things.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Sep 2017 15:55:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311790#M58512</guid>
      <dc:creator>alemarzu</dc:creator>
      <dc:date>2017-09-07T15:55:10Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311791#M58513</link>
      <description>&lt;P&gt;Yes, that was exactly it. Sourcetype now splits properly, and the formatting is correct as well. Thanks, everyone, for the help!&lt;/P&gt;</description>
      <pubDate>Thu, 07 Sep 2017 19:01:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311791#M58513</guid>
      <dc:creator>tkwaller</dc:creator>
      <dc:date>2017-09-07T19:01:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311792#M58514</link>
      <description>&lt;P&gt;Glad it worked out, happy splunking!&lt;/P&gt;</description>
      <pubDate>Thu, 07 Sep 2017 21:07:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311792#M58514</guid>
      <dc:creator>alemarzu</dc:creator>
      <dc:date>2017-09-07T21:07:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to split data into separate sourcetypes with transforms</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311793#M58515</link>
      <description>&lt;P&gt;Hi guys,&lt;BR /&gt;
I used this and it worked, thanks.&lt;/P&gt;

&lt;P&gt;One small question. The JSON I have has characters before it, so I need to get rid of them before I can get to the pure JSON. I have done the following; however, it is taking in the whole line, not just the JSON. Is there a way to get it to take in only the JSON?&lt;/P&gt;

&lt;P&gt;Example - 2018-01-10 15:52:03 [metrics-application-1-thread-1] INFO  METRIC:41 - {"v":"1.0","t":"MTR","ts":"2018-01-10T15:52:03.700Z","h":"mx7654vm","pid" ....etc..&lt;/P&gt;

&lt;P&gt;Transform&lt;BR /&gt;
[AMBER_RAW_json_METRIC]&lt;BR /&gt;
DEST_KEY = MetaData:Sourcetype&lt;BR /&gt;
REGEX = {"v":"1.0\"&lt;BR /&gt;
FORMAT = sourcetype::AMBER_RAW:METRIC&lt;/P&gt;

&lt;P&gt;Props&lt;BR /&gt;
[AMBER_RAW:METRIC]&lt;BR /&gt;
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N&lt;BR /&gt;
TIME_PREFIX = \"ts\":\"&lt;BR /&gt;
INDEXED_EXTRACTIONS = JSON&lt;/P&gt;

&lt;P&gt;So it takes the full line, not just the JSON&lt;/P&gt;
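
&lt;P&gt;One approach that might apply here, offered as an untested sketch: a &lt;CODE&gt;SEDCMD&lt;/CODE&gt; in props.conf can rewrite the raw event at index time, so it could strip everything in front of the JSON. The stanza name &lt;CODE&gt;[AMBER_RAW]&lt;/CODE&gt; is a hypothetical stand-in for whatever sourcetype is set in inputs.conf, the class name is made up, and the search-time &lt;CODE&gt;KV_MODE&lt;/CODE&gt; setting is an assumed alternative to INDEXED_EXTRACTIONS, not something confirmed in this thread:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf (sketch only; stanza and class names are assumptions)
[AMBER_RAW]
# On lines containing {"v": drop the leading text before it and keep the JSON
SEDCMD-strip_json_prefix = s/^[^{]*(\{"v":)/\1/

[AMBER_RAW:METRIC]
# Parse the remaining JSON at search time
KV_MODE = json
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Lines that do not contain &lt;CODE&gt;{"v":&lt;/CODE&gt; are left unchanged by this expression.&lt;/P&gt;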

&lt;P&gt;Thanks in Advance:)&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 18:52:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-to-split-data-into-separate-sourcetypes-with-transforms/m-p/311793#M58515</guid>
      <dc:creator>robertlynch2020</dc:creator>
      <dc:date>2020-09-29T18:52:49Z</dc:date>
    </item>
  </channel>
</rss>

