Getting Data In

LINE_BREAKER trouble

Communicator

I'm struggling to get Splunk to break some JSON events properly, because my input contains no newlines. Here is a sample of the input:

{"id":"40CC75B0DA1A8AEE3A5A884D7007D0D9","id_old":null,"favorited":null,"authorinfo":{"rank":1.3,"followercount":1475},"sentiment":"neu","link":"https:\/\/twitter.com\/innijverdal\/status\/735051205836148736","fulltext":"Zojuist is er een tas [gestolen] bij de Primera van een vrouw. De dader heeft vervolgens gepind bij de [Rabobank]. De... https:\/\/t.co\/U1ULiaE2yt","timestamp_link":"1464084838","timestamp_show":"1464084838","subsite":null,"author":"innijverdal","postid":"4a85:2c69:f51b:42cb","dataproviderid":null,"authortype":"user","label":"post","snippet":"Zojuist is er een tas [gestolen] bij de Primera van een vrouw. De dader heeft vervolgens gepind bij de [Rabobank]. De... https:\/\/t.co\/U1ULiaE2yt","numposts":"","pagerank":1,"title":"","sourcetype":"twitter","followercount":1475,"authorid":"1371834230","authorrealname":"Leven in Nijverdal","likescount":0,"authorrank":1.3,"fbid":null,"ytid":null,"replytoid":null,"avatar":"https:\/\/pbs.twimg.com\/profile_images\/435686221109395456\/FCz3PoOo_normal.png","coordinates":null,"media":[],"links":["http:\/\/www.leveninnijverdal.nl\/nieuws\/27205\/tas-[gestolen]-bij-primera-en-dader-pint-bij-[rabobank]"],"message_id":"735051205836148736","found_conversation":false,"postid_orig":"735051205836148736","mentioned":[],"translated_sourcetype":"twitter"},{"id":"579771EFE5829B94F17B3F03E7AB1177","id_old":null,"favorited":null,"authorinfo":{"rank":10.8,"followercount":830},"sentiment":"pos","link":"https:\/\/twitter.com\/Paul_0110\/status\/735036812033396736","fulltext":"Potverdomme [@Rabobank], het programma voor [internetbankieren] hebben jullie toch wel retestrak en klantvriendelijk voor mekaar!","timestamp_link":"1464081406","timestamp_show":"1464081406","subsite":null,"author":"Paul_0110","postid":"97f9:1675:4c33:5489","dataproviderid":null,"authortype":"user","label":"post","snippet":"Potverdomme [@Rabobank], het programma voor [internetbankieren] hebben jullie toch wel retestrak en klantvriendelijk voor 
mekaar!","numposts":"","pagerank":1,"title":"","sourcetype":"twitter","followercount":830,"authorid":"507333420","authorrealname":"Paul Netten \u00a9","likescount":0,"authorrank":10.8,"fbid":null,"ytid":null,"replytoid":null,"avatar":"https:\/\/pbs.twimg.com\/profile_images\/678477714735685632\/_SmvdMWf_normal.jpg","coordinates":null,"media":[],"links":[],"message_id":"735036812033396736","found_conversation":false,"postid_orig":"735036812033396736","mentioned":[{"authortype":"user","authorid":7385462,"authorrealname":"Rabobank","author":"Rabobank"}],"translated_sourcetype":"twitter"}

I'd like a line break at every },{"id"
The old line should end with },
The new line should start with {"id"

Any help would be greatly appreciated.


Re: LINE_BREAKER trouble

Builder

I didn't get around to checking that timestamps are extracted correctly, which you may want to look into for this data. However, the following props.conf should help you out.

[your_sourcetype_name]
LINE_BREAKER = \}(,)\{"id":
SHOULD_LINEMERGE = False
KV_MODE = json
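
On the timestamp point: the sample events carry an epoch value in timestamp_show, so a sketch along these lines might work. It assumes timestamp_show (rather than timestamp_link) is the field to use and that the values are always 10-digit epoch seconds. Neither assumption is confirmed in this thread, and the stanza is untested.

```
# Hypothetical additions to the same stanza; untested against this data.
[your_sourcetype_name]
# Regex that must match immediately before the timestamp.
TIME_PREFIX = "timestamp_show":"
# %s means Unix epoch seconds.
TIME_FORMAT = %s
# Read only the 10 characters that follow the prefix.
MAX_TIMESTAMP_LOOKAHEAD = 10
```

Comments are kept on their own lines because props.conf does not treat trailing text on a setting line as a comment.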

Re: LINE_BREAKER trouble

SplunkTrust

I'd use something like this maybe...

[sourceTypeName]
INDEXED_EXTRACTIONS=json
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE = ,\{"id":
SEDCMD-RemoveComma = s/^,//

Not sure whether the SEDCMD will be needed, or whether anything beyond INDEXED_EXTRACTIONS is needed at all.

View solution in original post


Re: LINE_BREAKER trouble

Builder

I downvoted this post because I would steer clear of BREAK_ONLY_BEFORE for performance reasons. You'll actually get better performance by setting SHOULD_LINEMERGE = false and using a LINE_BREAKER regex.

See a similar question asked here:

https://answers.splunk.com/answers/227121/what-is-the-difference-between-line-breaker-and-br.html


Re: LINE_BREAKER trouble

SplunkTrust

Downvotes are for when something is going to damage someone's system, something like "hey, try running sudo rm -Rf /" or "format c:". Please read this before downvoting: https://answers.splunk.com/answers/244111/proper-etiquette-and-timing-for-voting-here-on-ans.html


Re: LINE_BREAKER trouble

Builder

Apologies, the only reason I downvoted it is that we want to get people in the habit of not using SHOULD_LINEMERGE = true where possible. You'll see very significant performance improvements if you set SHOULD_LINEMERGE to false and use a regex for your LINE_BREAKER.

When line merging is turned off, you essentially skip a step in the data pipeline (http://wiki.splunk.com/Community:HowIndexingWorks); according to the Consultant II class, this is what yields the performance improvement.


Re: LINE_BREAKER trouble

SplunkTrust

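If you remove lines 3, 4, and 5 of the config in my answer (SHOULD_LINEMERGE, BREAK_ONLY_BEFORE, and SEDCMD) and replace them with lines 2 and 3 from Ryan's answer (LINE_BREAKER and SHOULD_LINEMERGE), I think you'll be in a sweet spot for performance and still achieve what you want.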

Indexed extractions could be a concern too, because they use more disk on the indexers. KV_MODE = json on the search heads, on the other hand, moves the JSON parsing to search time and is less performant there in many cases, whereas indexed extractions are less performant at index time. It's a trade-off, and most people want to guarantee indexing over search, which means Ryan's answer is better for most.
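
For reference, the combined stanza described above might look roughly like this. It is a sketch only: the stanza name is a placeholder, the LINE_BREAKER shown is anchored on the },{"id": boundary the question asks for, and whether line-breaking settings are honored alongside INDEXED_EXTRACTIONS for this input should be verified on a test index first.

```
# Hypothetical combined stanza; untested.
[sourceTypeName]
# Index-time JSON field extraction (costs more disk on the indexers).
INDEXED_EXTRACTIONS = json
# Break between "}" and "{"id":"; the captured comma is discarded.
LINE_BREAKER = \}(,)\{"id":
# Skip the line-merging stage entirely.
SHOULD_LINEMERGE = false
```

If search-time parsing is preferred instead, swapping INDEXED_EXTRACTIONS = json for KV_MODE = json gives the variant from Ryan's answer.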
