Parsing very long JSON lines

leatherface
Explorer

I am working with log lines of pure JSON (so no need to rex the lines - Splunk is correctly parsing and extracting all the JSON fields). However, some of these lines are extremely long (greater than 5000 characters).

In order for Splunk to parse these long lines I have set TRUNCATE=0 in props.conf and this is working.
However, when I search, Splunk is not parsing the JSON fields at the end of the longer lines, meaning that if I search on these particular fields, the long lines don't appear in the search results.
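
For reference, the props.conf change described above would look roughly like the sketch below. The stanza name is a placeholder for whatever sourcetype the JSON logs use, since the post does not name it:

```
# props.conf (sketch; [my_json_logs] is a hypothetical sourcetype name)
[my_json_logs]
# 0 disables line-length truncation when the event is indexed
TRUNCATE = 0
```

This only controls how much of the line is kept in the event; it says nothing about how much of the event is read when fields are extracted at search time.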

Fields at the start of long lines do get parsed correctly.
Lines less than 5000 characters with the same fields do get parsed and searched correctly, so it's not a problem with the JSON field itself.

Is there some config setting or some command in my search that I can add to parse these lines, regardless of length?

Thanks in advance.

1 Solution

capnjosh
Explorer

I had been hitting the same problem: some events had XML that was longer than 5000 characters, and spath wasn't extracting all the fields I knew were in there.

Here's how to fix it:
Override the spath character limit in $SPLUNK_HOME/etc/system/local/limits.conf.

My exact edit was to add the config section below to /opt/splunk/etc/system/local/limits.conf (since it wasn't there by default in 4.3.3). I pulled this from /opt/splunk/etc/system/default/limits.conf:

[spath]
#number of characters to read from an XML or JSON event when auto extracting
extraction_cutoff = 10000
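
If events run well past 10,000 characters, such as the 15,000-character test line mentioned elsewhere in this thread, the cutoff presumably needs to be at least as large as the longest event. A sketch, where 20000 is an illustrative value and not a tested recommendation:

```
# $SPLUNK_HOME/etc/system/local/limits.conf (sketch)
[spath]
# illustrative value only; pick one that covers your longest event
extraction_cutoff = 20000
```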


hexxamillion
Explorer

What splunk server would this configuration be updated on?

0 Karma

martin_mueller
SplunkTrust

I remember there was an issue surrounding a maximum of 100 auto-extracted fields some time ago, that's why I asked. That wasn't JSON though... I don't really have a solution for you, just asking questions that might point someone in the right direction.

0 Karma

martin_mueller
SplunkTrust

Is each event one line?

Does adding a line break - but keeping it one event - in the middle of a long line change the parsing behaviour?

How many fields are there in not fully parsed events?

0 Karma

leatherface
Explorer

So I did a little testing:
Adding a line break to the event makes no difference.

However, if I move the field I am trying to search on to the start of a long line, it gets parsed and I can search on it.
By creating a log line with only two fields but one of them having a 15,000 character name, I find that the short field at the end of the line is not parsed.
Therefore it would seem the issue is definitely caused by the length of the line, but the total number of fields in the line may also be a factor.
Any suggestions that don't involve me changing the logging itself?

0 Karma

leatherface
Explorer

Each event is on one line.

Short lines have just under 100 fields versus around 150 fields for long lines. Long lines also have much longer field names (130 characters vs 60). If the limit on parsing fields is 100, this would fit with what I'm seeing.
I will test the line break and see what happens.

0 Karma