<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Loading syslog-prefixed JSON in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112528#M23580</link>
    <description>&lt;P&gt;Please note that the greediness did need to be removed by using ".*?".  Also, this technique works for &lt;CODE&gt;KV_MODE=json&lt;/CODE&gt; but not &lt;CODE&gt;INDEXED_EXTRACTIONS=json&lt;/CODE&gt;, which I need to use.  I've opened a new question for that: &lt;A href="http://answers.splunk.com/answers/145388/indexed_extractionsjson-with-transform"&gt;http://answers.splunk.com/answers/145388/indexed_extractionsjson-with-transform&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 15 Jul 2014 17:27:50 GMT</pubDate>
    <dc:creator>kamermans</dc:creator>
    <dc:date>2014-07-15T17:27:50Z</dc:date>
    <item>
      <title>Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112517#M23569</link>
      <description>&lt;P&gt;I've got a data source being produced by rsyslog which is in this format:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;Jun 19 10:28:25 hostname appname: {"date":12345678,"foo":"bar"}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;There is always one event per line.  I would like to parse this as JSON, discarding the stuff that syslog added to the beginning of the line (in this case "&lt;CODE&gt;Jun 19 10:28:25 hostname appname:&lt;/CODE&gt;").&lt;/P&gt;

&lt;P&gt;I have tried using &lt;CODE&gt;LINE_BREAKER&lt;/CODE&gt; to consume and discard this line prefix like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LINE_BREAKER=}(\n[^{]+?)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and I've tried using sed:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SEDCMD-stripjsonheader = s/^[^{]+?//g
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Neither of these has worked.  In the log file it seems that the JsonLineBreaker is not using the LINE_BREAKER setting, and the SEDCMD is being applied too late:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;06-19-2014 14:59:19.736 -0400 ERROR JsonLineBreaker - JSON StreamID: 0 having confkey=source::/file|host::app|SyslogJson|2 had parsing error: Unexpected character while looking for value: 'J'
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Is there any way for me to remove this line prefix before parsing?&lt;/P&gt;

&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 20:16:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112517#M23569</guid>
      <dc:creator>kamermans</dc:creator>
      <dc:date>2014-06-19T20:16:30Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112518#M23570</link>
      <description>&lt;P&gt;Try this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SEDCMD-StripHeader = s/^.*(\{.*$)/\1/
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This should strip your prefix and keep everything from (and including) the opening '{' up to the end of the line.&lt;BR /&gt;
Note the use of a capture group for the stuff you want to keep.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 20:40:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112518#M23570</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2014-06-19T20:40:26Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112519#M23571</link>
      <description>&lt;P&gt;Thanks for the quick response!  Unfortunately my regex was correct - it replaces the prefix with an empty string, which is faster than capturing the relevant part and replacing the entire string with that capture group.  In addition, your first .* is "greedy", so it will prefer to go to the last "{" instead of the first one.  Anyway, I have tested my regex using the "unstructured" data type and I can see that it trims properly, it's just that the JSON line parser seems to be parsing the data before the SED replacement occurs or something.  Maybe there is a TRANSFORM required or something?&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 20:49:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112519#M23571</guid>
      <dc:creator>kamermans</dc:creator>
      <dc:date>2014-06-19T20:49:09Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112520#M23572</link>
      <description>&lt;P&gt;I beg to respectfully differ. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;
I believe your RegEx only matches 'J', because you used a lazy match (?). &lt;BR /&gt;
So, if you like your RegEx better, try it without the '?':&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SEDCMD-stripjsonheader = s/^[^{]+//
&lt;/CODE&gt;&lt;/PRE&gt;
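
&lt;P&gt;To illustrate the lazy-versus-greedy point, here is a minimal sketch in Python. It uses Python's re module rather than Splunk's own regex engine, so treat it as an illustration of the quantifier behaviour only, not as a test of SEDCMD itself. The sample line is the one from the original question.&lt;/P&gt;

```python
import re

# Sample line in the format shown in the original question.
line = 'Jun 19 10:28:25 hostname appname: {"date":12345678,"foo":"bar"}'

# Lazy "+?" with nothing following it is satisfied by a single character,
# so only the leading "J" is removed.
lazy = re.sub(r"^[^{]+?", "", line)

# Greedy "+" consumes every character up to (but not including) the first "{".
greedy = re.sub(r"^[^{]+", "", line)

print(lazy)
print(greedy)
```

&lt;P&gt;Because the pattern is anchored with "^", it can match at most once per line in this sketch, so a global flag would make no difference here.&lt;/P&gt;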

&lt;P&gt;Note that I also don't think you need the global flag, so I removed it.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 20:56:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112520#M23572</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2014-06-19T20:56:13Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112521#M23573</link>
      <description>&lt;P&gt;Ah, very good point on the greedy thing, I do indeed need it to be greedy - thanks &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;Unfortunately, I'm getting the same error "had parsing error: Unexpected character while looking for value: 'J'" using your suggestion verbatim, which is strange because the 'J' should be gone in any case.  I restarted splunkd after the change and attempted a new file load with that props.conf.  Notice that although the data preview failed, I still tried to continue just in case the preview is different than the actual parsing.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 21:38:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112521#M23573</guid>
      <dc:creator>kamermans</dc:creator>
      <dc:date>2014-06-19T21:38:30Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112522#M23574</link>
      <description>&lt;P&gt;I have just tested it successfully with this props.conf entry:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[answers_json]
SEDCMD-StripHeader = s/^.*(\{.*$)/\1/
KV_MODE=json
pulldown_type=1
&lt;/CODE&gt;&lt;/PRE&gt;
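
&lt;P&gt;One caveat raised in this thread is that the leading ".*" is greedy, which matters once records contain nested objects. The sketch below is a rough illustration in Python (Python's re module, not Splunk's regex engine), and the nested sample record is hypothetical, since the original question only showed a flat one.&lt;/P&gt;

```python
import re

# Hypothetical nested record (assumption: not taken from the original data).
line = 'Jun 19 10:28:25 hostname appname: {"date":12345678,"meta":{"foo":"bar"}}'

# Greedy ".*" backtracks only as far as the LAST "{", so the outer object is cut off.
greedy_capture = re.sub(r"^.*(\{.*$)", r"\1", line)

# Lazy ".*?" stops at the FIRST "{", keeping the whole JSON object.
lazy_capture = re.sub(r"^.*?(\{.*$)", r"\1", line)

# Negated character class: deletes everything before the first "{", no capture group needed.
negated_class = re.sub(r"^[^{]+", "", line)

print(greedy_capture)
print(lazy_capture)
print(negated_class)
```

&lt;P&gt;For flat records like the one in the question, all three patterns give the same result, which may be why the capture-group version tested fine here.&lt;/P&gt;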

&lt;P&gt;&lt;IMG src="http://answers.splunk.com//storage/Screen_Shot_2014-06-19_at_3.07.21_PM.png" alt="alt text" /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 22:09:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112522#M23574</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2014-06-19T22:09:17Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112523#M23575</link>
      <description>&lt;P&gt;See my answer below. What version of Splunk are you on?&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 22:10:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112523#M23575</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2014-06-19T22:10:27Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112524#M23576</link>
      <description>&lt;P&gt;And it works just as well with your SEDCMD/RegEx&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 22:12:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112524#M23576</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2014-06-19T22:12:59Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112525#M23577</link>
      <description>&lt;P&gt;Wow, thanks for verifying!  I must be doing something noob-ish as I am just getting started with Splunk.  From the GUI I'm going to "Add Data" -&amp;gt; "From files and dirs...", then I choose a file and Preview the data, which brings up a dialog to specify the format, where I choose JSON and drop in the edits to props.conf.  How should I be loading the data into Splunk (using the file monitor method)?&lt;/P&gt;

&lt;P&gt;I'm using Splunk Enterprise 6.1 on my own server(s).&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 23:00:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112525#M23577</guid>
      <dc:creator>kamermans</dc:creator>
      <dc:date>2014-06-19T23:00:36Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112526#M23578</link>
      <description>&lt;P&gt;You are on the right track. There are multiple ways of getting data in and the UI is OK if you want to index files from the same server. If the data comes from a different box, you'll want to use a universal forwarder to watch a file/directory and forward to your indexer. Depends on your architecture.&lt;BR /&gt;
I'd recommend watching this for starters: &lt;A href="http://www.splunk.com/view/education-videos/SP-CAAAGB6"&gt;http://www.splunk.com/view/education-videos/SP-CAAAGB6&lt;/A&gt;&lt;BR /&gt;
and reading through this &lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/Data/WhatSplunkcanmonitor"&gt;http://docs.splunk.com/Documentation/Splunk/latest/Data/WhatSplunkcanmonitor&lt;/A&gt; for more details on the various options.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Jun 2014 23:41:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112526#M23578</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2014-06-19T23:41:42Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112527#M23579</link>
      <description>&lt;P&gt;Btw, your regex still fails in my environment due to the greedy ".*" issue - I have nested records, and without ".*?" it starts the record at the last "{" instead of the first one.  For the sake of posterity, here is the most efficient regex I've found for this problem: "s/^[^{]+//".  Note that I was able to get things working, but it seems INDEXED_EXTRACTIONS=json will not work with a custom SEDCMD or LINE_BREAKER.  I will post this as another question.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 17:04:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112527#M23579</guid>
      <dc:creator>kamermans</dc:creator>
      <dc:date>2020-09-28T17:04:31Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112528#M23580</link>
      <description>&lt;P&gt;Please note that the greediness did need to be removed by using ".*?".  Also, this technique works for &lt;CODE&gt;KV_MODE=json&lt;/CODE&gt; but not &lt;CODE&gt;INDEXED_EXTRACTIONS=json&lt;/CODE&gt;, which I need to use.  I've opened a new question for that: &lt;A href="http://answers.splunk.com/answers/145388/indexed_extractionsjson-with-transform"&gt;http://answers.splunk.com/answers/145388/indexed_extractionsjson-with-transform&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 15 Jul 2014 17:27:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112528#M23580</guid>
      <dc:creator>kamermans</dc:creator>
      <dc:date>2014-07-15T17:27:50Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112529#M23581</link>
      <description>&lt;P&gt;Thank you Stefan and Kamermans - I had a customer running into this same issue today and this Answers post allowed me to avoid a ton of testing.&lt;/P&gt;</description>
      <pubDate>Wed, 18 Mar 2015 16:50:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112529#M23581</guid>
      <dc:creator>jbrodsky_splunk</dc:creator>
      <dc:date>2015-03-18T16:50:20Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112530#M23582</link>
      <description>&lt;P&gt;Just FYI, like kamermans, I found this match to be greedy: it prefers to go to the last '{'.  I have used his suggested "SEDCMD-stripjsonheader = s/^[^{]+//" and it worked better for me.&lt;/P&gt;</description>
      <pubDate>Tue, 17 Jan 2017 19:31:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112530#M23582</guid>
      <dc:creator>suarezry</dc:creator>
      <dc:date>2017-01-17T19:31:34Z</dc:date>
    </item>
    <item>
      <title>Re: Loading syslog-prefixed JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112531#M23583</link>
      <description>&lt;P&gt;Thanks for this discussion.  Helped big time solve my problem with the same issue.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Feb 2018 19:40:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Loading-syslog-prefixed-JSON/m-p/112531#M23583</guid>
      <dc:creator>reswob4</dc:creator>
      <dc:date>2018-02-05T19:40:25Z</dc:date>
    </item>
  </channel>
</rss>

