JSON event breaking not working as expected

Path Finder

Original log:

[{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 08:26:23.547000+00:00", "context_ip": "xxx", "context_page_referrer": "xxx", "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}, {"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 12:53:32.350000+00:00", "context_ip": "xxx", "context_page_referrer": null, "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}]

Expected logs:

{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 08:26:23.547000+00:00", "context_ip": "xxx", "context_page_referrer": "xxx", "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null} 

{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 12:53:32.350000+00:00", "context_ip": "xxx", "context_page_referrer": null, "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}

My current props.conf is:

[xxx]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
CHARSET=UTF-8
SEDCMD-remove_prefix=s/\[//g
SEDCMD-remove_suffix=s/\]//g
SEDCMD-removeeventcommas=s/}, {"username":/}{"username":/g
# The next setting is the one that is not working:
BREAK_ONLY_BEFORE=\{\"username\"

Output I am getting with the above props.conf:

{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 08:26:23.547000+00:00", "context_ip": "xxx", "context_page_referrer": "xxx", "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 12:53:32.350000+00:00", "context_ip": "xxx", "context_page_referrer": null, "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}

I am testing this by uploading a sample log file through the Web UI and checking the result on the second configuration page of Add Data.

What am I missing?


Esteemed Legend

Use ONLY this (do not add back any of the settings that I dropped):

[xxx]
SHOULD_LINEMERGE=false
LINE_BREAKER = ((?:(?:^|\][\r\n]+)\[)|,\s+)\{"username"
NO_BINARY_CHECK=true
CHARSET=UTF-8
SEDCMD-remove_suffix=s/]//g

Never, EVER use SHOULD_LINEMERGE = true and the BREAK_* junk. I have only ever seen one case where it was necessary.
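
To see what that LINE_BREAKER throws away at each event boundary, here is a rough Python sketch. It assumes Python's re module treats this pattern the same way Splunk's PCRE engine does, and the sample string is made up (shortened records, two arrays on separate lines). It only illustrates the regex, not how Splunk works internally.

```python
import re

# Same pattern as the LINE_BREAKER above. Splunk discards whatever
# capture group 1 matches and starts the next event right after it.
LINE_BREAKER = r'((?:(?:^|\][\r\n]+)\[)|,\s+)\{"username"'

# Made-up sample: two JSON arrays on separate lines, records shortened.
raw = '[{"username": "a"}, {"username": "b"}]\n[{"username": "c"}]'

# Collect the text that would be discarded at each boundary.
delims = [m.group(1) for m in re.finditer(LINE_BREAKER, raw)]
print(delims)
```

In this sample the discarded pieces are the opening [, the ", " between records, and the "]" plus newline plus "[" between two arrays. Only the very last ] survives, which is what the single SEDCMD is for.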

Ultra Champion

I'd recommend using explicit LINE_BREAKER and SHOULD_LINEMERGE=false. That is much more predictable and is also more performant.

Something like this should work for your data:

LINE_BREAKER = ([\r\n]*\[|,\s+)\{"username":
SHOULD_LINEMERGE=false

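Because the text matched by the first capture group is discarded, this also strips the leading [ and the , between records. The only SEDCMD still needed is one that strips the trailing ]. See: https://regex101.com/r/8zGyMS/1

Note: SEDCMD applies after line breaking, so the SEDCMD rewrites in your original config ran too late to affect where events break.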
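
If it helps, the order of operations can be sketched in Python: break first, then apply the sed-style cleanup to each event. This is only an approximation under stated assumptions. break_events is a hypothetical helper (not a Splunk API), Python's re stands in for PCRE, and the records are shortened stand-ins for the real log.

```python
import json
import re

LINE_BREAKER = r'([\r\n]*\[|,\s+)\{"username":'

def break_events(raw, line_breaker):
    """Hypothetical helper: cut raw text at each match, dropping capture group 1."""
    events, start = [], 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[start:m.start(1)])  # text before the discarded delimiter
        start = m.end(1)                      # next event begins right after it
    events.append(raw[start:])
    return [e for e in events if e]           # drop the empty chunk before the first "["

# Shortened stand-in for the original single-line JSON array.
raw = '[{"username": "a", "event": "x"}, {"username": "b", "event": "y"}]'

# Step 1: line breaking. The last event still carries the trailing "]".
broken = break_events(raw, LINE_BREAKER)

# Step 2: the equivalent of a SEDCMD that strips "]", applied per event.
events = [re.sub(r"\]", "", e) for e in broken]

# Each cleaned event is now a standalone JSON object.
parsed = [json.loads(e) for e in events]
print(events)
```

The point of the two steps is that the cleanup never sees the whole array, only the already-separated events, so it cannot be used to create break points.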
