LINE_BREAKER trouble

renems
Communicator

I'm struggling to get Splunk to break some JSON events properly, because my input has no newlines. Let me show you my input:

{"id":"40CC75B0DA1A8AEE3A5A884D7007D0D9","id_old":null,"favorited":null,"authorinfo":{"rank":1.3,"followercount":1475},"sentiment":"neu","link":"https:\/\/twitter.com\/innijverdal\/status\/735051205836148736","fulltext":"Zojuist is er een tas [gestolen] bij de Primera van een vrouw. De dader heeft vervolgens gepind bij de [Rabobank]. De... https:\/\/t.co\/U1ULiaE2yt","timestamp_link":"1464084838","timestamp_show":"1464084838","subsite":null,"author":"innijverdal","postid":"4a85:2c69:f51b:42cb","dataproviderid":null,"authortype":"user","label":"post","snippet":"Zojuist is er een tas [gestolen] bij de Primera van een vrouw. De dader heeft vervolgens gepind bij de [Rabobank]. De... https:\/\/t.co\/U1ULiaE2yt","numposts":"","pagerank":1,"title":"","sourcetype":"twitter","followercount":1475,"authorid":"1371834230","authorrealname":"Leven in Nijverdal","likescount":0,"authorrank":1.3,"fbid":null,"ytid":null,"replytoid":null,"avatar":"https:\/\/pbs.twimg.com\/profile_images\/435686221109395456\/FCz3PoOo_normal.png","coordinates":null,"media":[],"links":["http:\/\/www.leveninnijverdal.nl\/nieuws\/27205\/tas-[gestolen]-bij-primera-en-dader-pint-bij-[rabobank]"],"message_id":"735051205836148736","found_conversation":false,"postid_orig":"735051205836148736","mentioned":[],"translated_sourcetype":"twitter"},{"id":"579771EFE5829B94F17B3F03E7AB1177","id_old":null,"favorited":null,"authorinfo":{"rank":10.8,"followercount":830},"sentiment":"pos","link":"https:\/\/twitter.com\/Paul_0110\/status\/735036812033396736","fulltext":"Potverdomme [@Rabobank], het programma voor [internetbankieren] hebben jullie toch wel retestrak en klantvriendelijk voor mekaar!","timestamp_link":"1464081406","timestamp_show":"1464081406","subsite":null,"author":"Paul_0110","postid":"97f9:1675:4c33:5489","dataproviderid":null,"authortype":"user","label":"post","snippet":"Potverdomme [@Rabobank], het programma voor [internetbankieren] hebben jullie toch wel retestrak en klantvriendelijk voor 
mekaar!","numposts":"","pagerank":1,"title":"","sourcetype":"twitter","followercount":830,"authorid":"507333420","authorrealname":"Paul Netten \u00a9","likescount":0,"authorrank":10.8,"fbid":null,"ytid":null,"replytoid":null,"avatar":"https:\/\/pbs.twimg.com\/profile_images\/678477714735685632\/_SmvdMWf_normal.jpg","coordinates":null,"media":[],"links":[],"message_id":"735036812033396736","found_conversation":false,"postid_orig":"735036812033396736","mentioned":[{"authortype":"user","authorid":7385462,"authorrealname":"Rabobank","author":"Rabobank"}],"translated_sourcetype":"twitter"}

I'd like a line break at every },{"id"
The old line should end with },
The new line should start with {"id"

Any help would be greatly appreciated.

1 Solution

jkat54
SplunkTrust

I'd use something like this maybe...

[sourceTypeName]
INDEXED_EXTRACTIONS=json
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE = ',{"id":'
SEDCMD-RemoveComma = 's/^\,//g'

Not sure if the SEDCMD will be needed, or if anything beyond INDEXED_EXTRACTIONS is needed at all.
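
For what it's worth, here is a rough Python equivalent of what that sed expression does, just to show the intent. The event string is hypothetical: it assumes the break lands before the comma, so the event starts with one.

```python
import json
import re

# Hypothetical event as it might look if the break lands before the comma.
event = ',{"id":"B2","label":"post"}'

# Rough equivalent of s/^\,//g: delete a comma at the very start of the event.
cleaned = re.sub(r'^,', '', event)

# Without the leading comma the event is valid JSON again.
parsed = json.loads(cleaned)
print(cleaned)
```

If the events never start with a comma, the substitution simply matches nothing and leaves them alone.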

ryanoconnor
Builder

I downvoted this post because I would steer away from BREAK_ONLY_BEFORE for performance reasons. You'll actually get better performance with SHOULD_LINEMERGE = false and a LINE_BREAKER regex.

See a similar question asked here:

https://answers.splunk.com/answers/227121/what-is-the-difference-between-line-breaker-and-br.html

jkat54
SplunkTrust

Downvotes are for when something is going to damage someone's system, something like "hey try running sudo rm -Rf /" or "format c:". Please see this before downvoting: https://answers.splunk.com/answers/244111/proper-etiquette-and-timing-for-voting-here-on-ans.html

ryanoconnor
Builder

Apologies. The only reason I downvoted is that we want to get people in the habit of not using SHOULD_LINEMERGE=true where possible. Setting SHOULD_LINEMERGE to false and using a regex for your LINE_BREAKER is the faster approach.

When SHOULD_LINEMERGE is false, you're essentially skipping a step in the data pipeline (http://wiki.splunk.com/Community:HowIndexingWorks), and according to the Consultant II class, you'll see very significant performance improvements.

jkat54
SplunkTrust

If you remove lines 3, 4, and 5 of the code in my answer and replace them with lines 2 and 3 from Ryan's answer, I think you'll be in a sweet spot for performance and still achieve what you want.

Indexed extractions could be a concern too, because they use more disk on the indexers. KV_MODE = json on the search heads moves the JSON parsing to search time, though, and is less performant at search time in many cases. Indexed extractions, however, are less performant at index time. It's a trade-off, and most people want to guarantee indexing over search, which means Ryan's answer is better for most.

ryanoconnor
Builder

I didn't get around to ensuring the timestamps are correct, which you may want to look into for this data. However, the following props.conf should help you out.

[your_sourcetype_name]
LINE_BREAKER = \}(,)\{"id":
SHOULD_LINEMERGE = false
KV_MODE = json
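
To sanity-check a LINE_BREAKER regex outside Splunk, a small Python sketch like the one below can help. It emulates the documented behavior that the first capturing group is discarded, with the text before it ending one event and the text after it starting the next. The pattern used here is an assumption based on the sample in the question (it anchors on the "id" key so that commas between nested objects such as "mentioned" are left alone), and the shortened sample data and the `split_events` helper are made up for illustration.

```python
import json
import re

# Assumed pattern: break where one object closes and the next top-level
# object opens with its "id" key. Group 1 (the comma) is thrown away.
LINE_BREAKER = re.compile(r'\}(,)\{"id":')


def split_events(raw):
    """Emulate LINE_BREAKER: drop capture group 1 and break around it."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the group
        start = match.end(1)                      # text after the group
    events.append(raw[start:])
    return events


# Shortened, made-up sample in the same shape as the input in the question.
sample = (
    '{"id":"A1","mentioned":[{"author":"x"},{"author":"y"}],"label":"post"},'
    '{"id":"B2","mentioned":[],"label":"post"}'
)

events = split_events(sample)
for event in events:
    # Each event should parse as a standalone JSON object.
    print(json.loads(event)["id"])
```

Each resulting event parses as standalone JSON, and the `},{` inside the "mentioned" array does not trigger a break. This only checks the regex logic; it says nothing about timestamps or how Splunk chunks the incoming stream.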