Getting Data In

JSON event breaking no longer working since forwarding method changed from a universal forwarder to AWS Firehose

gary_richardson
Path Finder

Hello!

I have some json data being generated by a client-side tool:

{
    "name": "open_sockets",
    "hostIdentifier": "ip-172-30-1-242.ec2.internal",
    "calendarTime": "Tue May 24 10:37:31 2016 UTC",
    "unixTime": "1464086251",
    "columns": {
        "family": "2",
        "fd": "6",
        "local_address": "172.30.1.242",
        "local_port": "32886",
        "path": "",
        "pid": "547",
        "protocol": "17",
        "remote_address": "4.53.160.75",
        "remote_port": "123",
        "socket": "52263"
    },
    "action": "added"
}

When this data is dropped into a flat file on the client and then picked up by the Splunk Universal Forwarder, the field extractions using the _json sourcetype work perfectly. I've since reconfigured the tool to push the data into Amazon S3 via Firehose, and the field extractions no longer work using the _json sourcetype.

The data is unchanged. I've examined the raw logs in the S3 management console and they are the same structure as the previously indexed flat file with no additional data or formatting as far as I can tell.

I've tried a variety of regexes in BREAK_ONLY_BEFORE, BREAK_ONLY_BEFORE_DATE and MUST_BREAK_AFTER, with no effect.

I currently have two near-identical clients forwarding this information: one using the Splunk UF and one using AWS Firehose, both with the _json sourcetype. The first works fine; the second does not!

I am editing sourcetypes using the GUI; we are imminently moving to Splunk Cloud, and I am training myself to cope with no shell access!

Thanks

0 Karma
1 Solution

gary_richardson
Path Finder

Solved it, with a little help from Splunk PS:

[osq]
LINE_BREAKER=(){\"name

And that works.

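The () is an empty capture group which consumes nothing. Splunk discards whatever the first capture group in LINE_BREAKER matches, so putting {"name inside the group would remove that string from the events.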
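To illustrate why the empty group matters, here is a rough Python sketch of the breaking rule described above. It is only an approximation of LINE_BREAKER, not Splunk's implementation: `break_events` is a hypothetical helper, the sample data is made up, the brace is escaped for clarity, and Python's `re` stands in for Splunk's regex engine.

```python
import re

def break_events(raw: str, line_breaker: str) -> list[str]:
    """Approximate LINE_BREAKER: break at each match and drop only group 1."""
    pattern = re.compile(line_breaker)
    events, start = [], 0
    for m in pattern.finditer(raw):
        # Everything before capture group 1 closes the current event.
        events.append(raw[start:m.start(1)])
        # Everything after capture group 1 opens the next event.
        start = m.end(1)
    events.append(raw[start:])
    # Drop the empty event produced when the data begins with a match.
    return [e for e in events if e]

# Two made-up JSON objects on a single line, as in the S3 data.
raw = '{"name":"a","action":"added"}{"name":"b","action":"removed"}'

# Empty group: nothing is discarded, so each event keeps its {"name prefix.
with_empty_group = break_events(raw, r'()\{"name')

# Group around {"name: the matched text is discarded from every event.
with_consuming_group = break_events(raw, r'(\{"name)')

print(with_empty_group)
print(with_consuming_group)
```

With the empty group both events come out as complete JSON objects; with the group wrapped around the prefix, each event loses its leading {"name and is no longer valid JSON.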

behlkush
Path Finder

I ran into the same issue, apparently because my production Splunk infrastructure runs 6.4.0 while my lower environment runs 6.5.

On 6.5, this alone worked perfectly:
[mySourcetype]
INDEXED_EXTRACTIONS = json
KV_MODE = none

For 6.4 I had to follow what Gary recommended. Many thanks to him for sharing his experience.
Here is my props.conf. A note for beginners: the indexer is where you want to update props.conf, because event breaking is a parsing step.

[mySourcetype]
INDEXED_EXTRACTIONS = json
KV_MODE = none
LINE_BREAKER = (){\"searchString
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true

0 Karma

jkat54
SplunkTrust

JSON should linemerge, and I know you said you tried the _json sourcetype, but this is a copy of it I'd like you to try instead:

[osq2]
CHARSET=AUTO
INDEXED_EXTRACTIONS=json
KV_MODE=none
SHOULD_LINEMERGE=true
category=Structured
disabled=false
pulldown_type=true

0 Karma

gary_richardson
Path Finder

Didn't work sadly...

I have discovered a difference between the two sources:

  • In the flat file on disk, each JSON object begins on its own line.

  • In the AWS S3 source, all JSON events occur on the same line.

So I need a way to break events from a single line, where each JSON object begins with {"name":

I thought my regex would have done this, but clearly not.
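For anyone wanting to confirm the same difference in their own data, a minimal Python sketch along these lines can count the objects and newlines in a raw sample. The helper name `split_concatenated_json` and both sample strings are made up for illustration; it assumes the payload is plain concatenated JSON objects.

```python
import json

def split_concatenated_json(raw: str) -> list[dict]:
    """Decode back-to-back JSON objects, with or without newlines between them."""
    decoder = json.JSONDecoder()
    objects, pos = [], 0
    while pos < len(raw):
        # Skip any whitespace (including newlines) between objects.
        while pos < len(raw) and raw[pos].isspace():
            pos += 1
        if pos >= len(raw):
            break
        # raw_decode returns the object and the index where it ended.
        obj, pos = decoder.raw_decode(raw, pos)
        objects.append(obj)
    return objects

# Made-up samples: one object per line versus everything on one line.
flat_file = '{"name":"a","action":"added"}\n{"name":"b","action":"added"}\n'
firehose = '{"name":"a","action":"added"}{"name":"b","action":"added"}'

for label, raw in (("flat file", flat_file), ("firehose", firehose)):
    objs = split_concatenated_json(raw)
    print(label, "objects:", len(objs), "newlines:", raw.count("\n"))
```

Both samples hold the same two objects; only the newline count differs, which is the difference that matters for line-based event breaking.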

Thanks.

0 Karma

jkat54
SplunkTrust

Have you tried this regex instead?

'{"name"'

surrounded by single quotes...?
or even '\{"name"'

again surrounded by single quotes but escaping the {

0 Karma

gary_richardson
Path Finder

Thanks, tried both but still not breaking.

Am I right in thinking the SHOULD_LINEMERGE directive could be causing Splunk to assume that the entire block of data is a single event? In that case, shouldn't a matching regex in BREAK_ONLY_BEFORE override that and define the individual events?

0 Karma

jkat54
SplunkTrust

Oh sorry you just hit the nail on the head

Your thinking is correct, but let's try removing BREAK_ONLY_BEFORE, setting SHOULD_LINEMERGE to false, and using our regex as LINE_BREAKER instead.

0 Karma

jkat54
SplunkTrust

SHOULD_LINEMERGE = true and the BREAK_ONLY_BEFORE-style settings are mainly for TCP/UDP inputs. LINE_BREAKER is for when you don't have the standard carriage returns / line feeds. We might still have issues with INDEXED_EXTRACTIONS and may need to use KV_MODE instead... Let's see. Sorry for the bad syntax, I'm replying from a phone.

0 Karma

gary_richardson
Path Finder

Thanks. Attempting to force SHOULD_LINEMERGE=false via the GUI keeps defaulting it back to "true" and adding a BREAK_ONLY_BEFORE directive, which is annoying... I have no console access at present to edit props.conf, so I will do this tomorrow back in the office and let you know.

0 Karma

gary_richardson
Path Finder

So, currently running with the below in system/local/props.conf

[osq]
NO_BINARY_CHECK = true
disabled = false
KV_MODE = none
SHOULD_LINEMERGE = false
LINE_BREAKER = {\"name/g

And still no breaking... I validated the regex using http://www.regextester.com/.
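One possible reason this attempt fails, offered as an assumption rather than a confirmed diagnosis: the trailing /g looks like the global flag from JavaScript-style regex literals, whereas props.conf takes a bare pattern, so /g would be read as two literal characters that never occur in the data. The sketch below uses Python's `re` as a stand-in for Splunk's regex engine, with made-up sample data and the brace escaped for clarity. Separately, the props.conf documentation says LINE_BREAKER must contain a capturing group, which this pattern lacks.

```python
import re

# Two made-up JSON objects on a single line, as in the S3 data.
raw = '{"name":"a","action":"added"}{"name":"b","action":"removed"}'

# Pattern from the stanza above (brace escaped); "/g" is part of the pattern text.
with_flag_suffix = re.findall(r'\{\"name/g', raw)

# The same pattern without the suffix.
without_suffix = re.findall(r'\{\"name', raw)

print(len(with_flag_suffix), len(without_suffix))
```

Online testers usually take the flags in a separate field, which may explain why the regex validated there but did not match once the flag was pasted into the setting value.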

0 Karma

gary_richardson
Path Finder

Another update...

If I copy the data from an event (which contains multiple JSON objects on one line) into a flat file local to my laptop, then try to upload that file manually into Splunk using the _json sourcetype... event breaking works!

0 Karma

gary_richardson
Path Finder

As an update, I have created another sourcetype with the below in the Splunk_TA_aws app:

[osq2]
DATETIME_CONFIG = 
INDEXED_EXTRACTIONS = json
NO_BINARY_CHECK = true
TRUNCATE = 0
category = Structured
pulldown_type = 1
BREAK_ONLY_BEFORE = (\{\"name\")/g
disabled = false

Still not getting event breaking. Suggestions welcomed!

0 Karma