Getting Data In

JSON file event breaking parsing on universal forwarder

rayar
Contributor

I have a JSON file.
When I upload the file on the search head using the props.conf stanza below, it is indexed properly.

Splunk 7.3.4

[json_test]
CHARSET = UTF-8
DATETIME_CONFIG = CURRENT
SEDCMD-cut_footer = s/\]\,\n\s*\"total\":.*$/g
SEDCMD-cut_header = s/^\{\n\s*\"matches\":\s\[/g
category = Structured
disabled = false
HEADER_FIELD_LINE_NUMBER = 3
SHOULD_LINEMERGE = 0
TRUNCATE = 0
INDEXED_EXTRACTIONS = json
KV_MODE = none

When I send the same data from a UF, it is not broken into events.

Universal Forwarder

props.conf

[json_test]
CHARSET = UTF-8
INDEXED_EXTRACTIONS = json

inputs.conf

[monitor:///tmp/*.json]
disabled = 0
sourcetype = json_test
index = test_hr
crcSalt  = REINDEXMEPLEASE
initCrcLength = 780

Indexer

props.conf

[json_test]
DATETIME_CONFIG = CURRENT
SEDCMD-cut_footer = s/\]\,\n\s*\"total\":.*$/g
SEDCMD-cut_header = s/^\{\n\s*\"matches\":\s\[/g
category = Structured
disabled = false
HEADER_FIELD_LINE_NUMBER = 3
SHOULD_LINEMERGE = 0
TRUNCATE = 0

Search Head

props.conf

[json_test]
KV_MODE = none

FrankVl
Ultra Champion

What if you remove INDEXED_EXTRACTIONS = json from the UF's config (and enable KV_MODE again, or move the indexed extractions to the indexer)? With that setting on the UF, the UF does the JSON extraction itself, without any of the custom line breaking and header stripping. And once indexed extractions have been done, the downstream Splunk Enterprise instance no longer applies its line-breaking settings, if I'm not mistaken.

Indexing JSON files that contain multiple events in one JSON structure is a pain in the proverbial butt anyway. You might also want to look at setting sensible EVENT_BREAKER_ENABLE and EVENT_BREAKER values in props.conf on your UF, to at least make sure events arrive in one piece at the indexers.

Or consider using a heavy forwarder for this, so that indexed extractions and linebreaking and such happen at the same place.

But all in all, I think changing how this data gets logged, or doing some pre-processing on the JSON file to transform it into individual events, might be the best (but not necessarily the easiest) thing to do here.
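
A rough sketch of what such pre-processing could look like, in Python. It assumes the file is a single JSON object of the form {"matches": [ ... ], "total": N}, which is only inferred from the header and footer patterns in the SEDCMD settings above. The function name split_matches and the file paths are made up for illustration.

```python
import json
import sys
from pathlib import Path


def split_matches(src: Path, dst: Path, key: str = "matches") -> int:
    """Write each element of the top-level `key` array as one JSON line.

    Assumes `src` holds one JSON object whose `key` entry is a list of
    events. Returns the number of events written to `dst`.
    """
    with src.open(encoding="utf-8") as fh:
        doc = json.load(fh)

    events = doc[key]
    if not isinstance(events, list):
        raise ValueError(f"expected a list under {key!r}")

    with dst.open("w", encoding="utf-8") as out:
        for event in events:
            # One compact JSON object per line, no wrapper or footer.
            out.write(json.dumps(event, ensure_ascii=False) + "\n")
    return len(events)


if __name__ == "__main__":
    # Hypothetical usage: python split_matches.py in.json out.json
    count = split_matches(Path(sys.argv[1]), Path(sys.argv[2]))
    print(f"wrote {count} events")
```

If the monitor stanza then points at the converted files, every line is already a complete JSON object, so the header and footer SEDCMD settings should no longer be needed and the default line breaking should be enough.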


rayar
Contributor

Hi,
thanks for your input.

I updated the config as below:

UF

CHARSET = UTF-8
KV_MODE = none

Indexer

[json_odelia]
DATETIME_CONFIG = CURRENT
SEDCMD-cut_footer = s/\]\,\n\s*\"total\":.*$/g
SEDCMD-cut_header = s/^\{\n\s*\"matches\":\s\[/g
category = Structured
disabled = false
HEADER_FIELD_LINE_NUMBER = 3
SHOULD_LINEMERGE = 0
TRUNCATE = 0
INDEXED_EXTRACTIONS = json

Still not working.
