Splunk Search

Break JSON file into separate events, removing the header and footer

mblauw
Path Finder

I've just started using RegEx and I'm looking for a way to extract multiple events from my JSON flight information logs. Here is my props.conf:

[_json_flight_data]
BREAK_ONLY_BEFORE_DATE = false
BREAK_ONLY_BEFORE = ({|[\s+{)
MUST_BREAK_AFTER = (}|}\s+])
SEDCMD-remove_header = s/({\s+.+\s+.+\s+{\s+.+\s+.+\s+.+\s+}\s+.+\s+.+\s+.+\s+.+\s+.+\s+.+\s+.+\s+.+\s+.+\s+[)//g
SEDCMD-remove_trailing_commas = s/},/}/g
SEDCMD-remove_footer = s/(].\s+.+\s+.+\s+.+\s+.+\s+})//g

(I know the style is VERY ugly, haha. But it should still work, right?)

Here's part of my log:

{
   "src": 1,
   "feeds": [
      {
         "id": 1,
         "name": "From Consolidator",
         "polarPlot": false
      }
   ],
   "srcFeed": 1,
   "showSil": true,
   "showFlg": true,
   "showPic": true,
   "flgH": 20,
   "flgW": 85,
   "acList": [
      {
         "Id": 4736016,
         "Rcvr": 1,
         "HasSig": false,
         "Icao": "484410",
         "Bad": false,
         "Reg": "PH-AOB",
         "FSeen": "/Date(1489141837845)/",
         "TSecs": 335,
         "CMsgs": 105,
         "Alt": 0,
         "GAlt": 434,
         "InHg": 30.35433,
         "AltT": 0,
         "Call": "KLM729",
         "Lat": 52.313339,
         "Long": 4.76521,
         "PosTime": 1489141920517,
         "Mlat": false,
         "PosStale": true,
         "Tisb": false,
         "Spd": 2,
         "Trak": 213,
         "TrkH": false,
         "Type": "A332",
         "Mdl": "Airbus A330 203",
         "Man": "Airbus",
         "CNum": "686",
         "From": "EHAM Amsterdam Airport Schiphol, Netherlands",
         "To": "TNCM Princess Juliana, Saint Martin, Sint Maarten",
         "Op": "KLM Royal Dutch Airlines",
         "OpIcao": "KLM",
         "Sqk": "",
         "VsiT": 0,
         "Dst": 0.49,
         "Brng": 14.6,
         "WTC": 3,
         "Species": 1,
         "Engines": "2",
         "EngType": 3,
         "EngMount": 0,
         "Mil": false,
         "Cou": "Netherlands",
         "HasPic": false,
         "Interested": false,
         "FlightsCount": 0,
         "Gnd": true,
         "SpdTyp": 0,
         "CallSus": false,
         "Trt": 2,
         "Year": "2005"
      },

(many more events ...)

      {
         "Id": 4735491,
         "Rcvr": 1,
         "HasSig": false,
         "Icao": "484203",
         "Bad": false,
         "Reg": "",
         "FSeen": "/Date(1489114921456)/",
         "TSecs": 27251,
         "CMsgs": 6334,
         "Alt": 0,
         "GAlt": 434,
         "InHg": 30.35433,
         "AltT": 0,
         "Call": "KV1",
         "Lat": 52.31559,
         "Long": 4.74158,
         "PosTime": 1489142165220,
         "Mlat": false,
         "Tisb": false,
         "Spd": 15,
         "Trak": 177,
         "TrkH": false,
         "Type": "-GND",
         "Mdl": "Ground Vehicle",
         "Man": "",
         "Sqk": "3220",
         "Help": false,
         "VsiT": 0,
         "Dst": 1.65,
         "Brng": 296.1,
         "WTC": 0,
         "Species": 7,
         "EngType": 0,
         "EngMount": 0,
         "Mil": false,
         "Cou": "Netherlands",
         "HasPic": false,
         "Interested": false,
         "FlightsCount": 0,
         "Gnd": true,
         "SpdTyp": 0,
         "CallSus": false,
         "Trt": 2
      }
   ],
   "totalAc": 6822,
   "lastDv": "636247117248607567",
   "shtTrlSec": 65,
   "stm": 1489142172423
}

When I index my log with these settings, the line breaking works correctly, but the header and footer are not removed at all.

Does anybody know what I'm doing wrong?


somesoni2
Revered Legend

Give this a try. (It's missing timestamp-related configuration; add that per your requirements.)

[ <SOURCETYPE NAME> ]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
disabled=false
LINE_BREAKER=([\r\n]+)(?=\s*\{\s*[\r\n]*\s*\"Id\")
SEDCMD-removeheader=s/^(\s*\{\s*[\r\n]*\"src\"(.+[\r\n]*)+)//
SEDCMD-removefooter=s/(\s*\](.+[\r\n]*)+)//
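
To see what these three regexes do, here is a rough Python sketch that mimics them on a trimmed-down copy of the log. It is only an approximation: it assumes the SEDCMD rules run on each event after LINE_BREAKER has split the file, it uses Python's `re` engine rather than Splunk's, and `SAMPLE` and `simulate` are made-up names for illustration. The `\"` escapes from props.conf are written as plain `"` here.

```python
import json
import re

# Trimmed-down stand-in for the log in the question (same header/footer shape).
SAMPLE = """{
   "src": 1,
   "feeds": [
      {
         "id": 1
      }
   ],
   "acList": [
      {
         "Id": 4736016,
         "Call": "KLM729"
      },
      {
         "Id": 4735491,
         "Call": "KV1"
      }
   ],
   "totalAc": 6822
}
"""

# LINE_BREAKER: break on newlines that are followed by '{ "Id"'.
# In Splunk the first capture group marks the text that is thrown away;
# here re.split simply consumes those newlines.
LINE_BREAKER = re.compile(r'[\r\n]+(?=\s*\{\s*[\r\n]*\s*"Id")')

# The two SEDCMD patterns (no /g flag, so one substitution per event).
REMOVE_HEADER = re.compile(r'^(\s*\{\s*[\r\n]*"src"(.+[\r\n]*)+)')
REMOVE_FOOTER = re.compile(r'(\s*\](.+[\r\n]*)+)')


def simulate(raw):
    """Split raw text into events, then strip header and footer per event."""
    events = []
    for chunk in LINE_BREAKER.split(raw):
        chunk = REMOVE_HEADER.sub("", chunk, count=1)
        chunk = REMOVE_FOOTER.sub("", chunk, count=1)
        # Assumption: an event that ends up empty is simply dropped here.
        if chunk.strip():
            events.append(chunk)
    return events


events = simulate(SAMPLE)
for event in events:
    print(repr(event))
```

In this sketch the header becomes an event of its own and is wiped out entirely, and the footer is cut off the last aircraft record. Every record except the last still ends in `},`; if that matters, something like the `SEDCMD-remove_trailing_commas` rule from the question could be kept alongside these settings.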


mblauw
Path Finder

It's working! Thank you so much! Where did you learn to build those SED/RegEx commands?
