
Please help with LINE BREAKING/Truncate issue

Roy_9
Motivator

Hello,

Can anyone please help me with a line breaking and truncation issue I am seeing for nested JSON events coming via HEC to Splunk? The event size is close to 25 million bytes, whereas the TRUNCATE limit is set to only 10000, so the event is getting truncated. I am not allowed to set the truncate limit to 0 due to performance concerns. I want to break this nested event into multiple events, one per Source_System object.

Example of an event:

{
  "sourcetype": "abc_json",
  "index": "test",
  "event": {
    "severity": "INFO",
    "logger": "org.mule.runtime.core.internal.processor.LoggerMessageProcessor",
    "time": "XXX",
    "thread": "[MuleRuntime].xxx.123: [App name].post:\\schedules:application\\json:app.CPU_INTENSIVE @xxxx",
    "message": {
      "correlationId": "XXXX",
      "inputPayload": [
        {"Source_System": "TEST", "Created_By": "ESB", "Created_Date_UTC": "1900-XX-01T02:59:14.783Z", "Last_Updated_By": "ESB", "Last_Updated_Date_UTC": "2020-07-25T03:34:31.91Z"},
        {"Source_System": "TEST2", "Created_By": "ESB", "Created_Date_UTC": "1900-XX-07T02:59:14.783Z", "Last_Updated_By": "ESB", "Last_Updated_Date_UTC": "1900-XX-25T03:34:31.91Z"},
        {"Source_System": "TEST3", "Created_By": "ESB", "Created_Date_UTC": "2019-08-22T23:27:32.123Z", "Last_Updated_By": "ESB", "Last_Updated_Date_UTC": "1900-xx-20T01:11:45.35Z"}
      ]
    }
  }
}

My current props.conf configuration:

ADD_EXTRA_TIME_FIELDS=True
ANNOTATE_PUNCT=true
AUTO_KV_JSON=true
BREAK_ONLY_BEFORE_DATE=null
CHARSET=UTF-8
DEPTH_LIMIT=1000
DETERMINE_TIMESTAMP_DATE_WITH_SYSTEM_TIME=false
LB_CHUNK_BREAKER_TRUNCATE=2000000
LEARN_MODEL=true
LEARN_SOURCETYPE=true
LINE_BREAKER=([,|[]){"Source_System":
LINE_BREAKER_LOOKBEHIND=100
MATCH_LIMIT=100000
MAX_DAYS_AGO=2000
MAX_DAYS_HENCE=2
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_EVENTS=256
MAX_TIMESTAMP_LOOKAHEAD=128
NO_BINARY_CHECK=true
SEGMENTATION=indexing
SEGMENTATION-all=full
SEGMENTATION-inner=inner
SEGMENTATION-outer=outer
SEGMENTATION-raw=none
SEGMENTATION-standard=standard
SHOULD_LINEMERGE=false
TRUNCATE=10000
category=Custom
detect_trailing_nulls=false
disabled=false
maxDist=100
pulldown_type=true
termFrequencyWeightedDist=false

Am I missing something? Any help would be highly appreciated.

Thanks


kamlesh_vaghela
SplunkTrust

@Roy_9 

As you said the event size is close to 25 million bytes, I have a few questions.

  1. Are you collecting this JSON from an API, or is one of your scripts generating it?
  2. Is it possible to parse the JSON in the script where you make the Splunk HEC API call? If so, can you extract the required events and send a limited number of them to HEC in one call, and the rest in another?
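If client-side splitting is an option, here is a rough sketch of that idea (the endpoint URL, token, and helper names are hypothetical; the payload shape follows the sample event above): parse the big event, emit one small HEC event per inputPayload record, and POST them in batches.

```python
import json
import urllib.request

# Hypothetical HEC endpoint and token -- replace with your own.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

def split_payload(big_event):
    """Yield one small HEC event per inputPayload record, keeping context fields."""
    ev = big_event["event"]
    msg = dict(ev["message"])            # copy so we can pop without mutating input
    records = msg.pop("inputPayload", [])
    for record in records:
        yield {
            "sourcetype": big_event.get("sourcetype", "abc_json"),
            "index": big_event.get("index", "test"),
            # Keep severity/logger/etc., and merge each record into "message"
            # alongside the shared context (correlationId, ...).
            "event": {**ev, "message": {**msg, **record}},
        }

def send_in_batches(events, batch_size=50):
    """POST events to HEC in batches; HEC accepts multiple JSON events per request."""
    for i in range(0, len(events), batch_size):
        body = "\n".join(json.dumps(e) for e in events[i:i + batch_size])
        req = urllib.request.Request(
            HEC_URL,
            data=body.encode("utf-8"),
            headers={"Authorization": "Splunk " + HEC_TOKEN},
        )
        urllib.request.urlopen(req)
```

With this approach each indexed event stays small, so no TRUNCATE tuning is needed at all.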

If this reply helps you, an upvote would be appreciated.

Thanks
Kamlesh Vaghela

Roy_9
Motivator

Hi Kamlesh,

These logs are coming from the MuleSoft CloudHub runtime manager via HEC to Splunk Cloud. The user sends several kinds of JSON logs, and only one particular type arrives in this nested JSON format. When I run a search across that source, the search head freezes for a while, so I initially raised the truncate limit to 450000. Now the user is asking that this huge log be broken into smaller events, split at the key Source_System.

I added a line breaker for this in my props file, as shown above, but had no luck parsing the event.

kamlesh_vaghela
SplunkTrust

@Roy_9 

Can you please try this configuration in props.conf? I tried it with your sample data.

[YOUR_SOURCE_TYPE]
SHOULD_LINEMERGE=false
LINE_BREAKER=}(\,\s){
NO_BINARY_CHECK=true
SEDCMD-a=s/{.*"inputPayload":\s\[//g
SEDCMD-b=s/]}}//g
TRUNCATE=0
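For anyone wondering how this stanza behaves, here is a rough simulation in Python of what the line breaker and the two SEDCMDs do, run against a simplified, hypothetical version of the sample event. The lookaround split stands in for Splunk discarding the captured `\,\s` group at each break; this is only an illustration, not how Splunk executes internally.

```python
import json
import re

# Simplified stand-in for the nested event's raw text (hypothetical values).
raw = ('{"severity":"INFO","message":{"correlationId":"X1","inputPayload": '
       '[{"Source_System":"TEST","Created_By":"ESB"}, '
       '{"Source_System":"TEST2","Created_By":"ESB"}, '
       '{"Source_System":"TEST3","Created_By":"ESB"}]}}')

# LINE_BREAKER=}(\,\s){ : Splunk breaks where "}" + comma + whitespace + "{"
# occurs and discards the captured ",\s", so each piece keeps its braces.
pieces = re.split(r'(?<=}),\s(?={)', raw)

events = []
for piece in pieces:
    piece = re.sub(r'{.*"inputPayload":\s\[', '', piece)  # SEDCMD-a: strip wrapper prefix
    piece = re.sub(r']}}', '', piece)                     # SEDCMD-b: strip trailing "]}}"
    events.append(json.loads(piece))                      # each piece is now valid JSON

print([e["Source_System"] for e in events])
# prints ['TEST', 'TEST2', 'TEST3']
```

Note that the `\s` in both the LINE_BREAKER and SEDCMD-a assumes whitespace between the separators, as in the sample data; if the producer sends fully compact JSON (no space after the commas), those patterns would need adjusting.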

Roy_9
Motivator

Thanks much @kamlesh_vaghela.

Roy_9
Motivator

Did anyone come across this kind of issue? Please help me out.
