Splunk Search

How to extract rex specific data from Splunk log?

AnirbanG
Loves-to-Learn Lots

We have a rule engine that assigns category codes to items. The category codes are assigned per location. We want to extract a statistical data from the log to show how many messages were published for each location.

For example

We want to get a result like below from below message

Location code count
ABC 2
XYZ 1
DEF 1
IJK 2

 

message #1:
 {"Item Id": "1", "locationCategoryCodes": [{"categoryCodes": [{"categoryCode": "CAT_1", "ruleID": ["138563"]}], "locationCode": "ABC"}, {"categoryCodes": [{"categoryCode": "CAT_1", "ruleID": ["138563"]}], "locationCode": "XYZ"}, {"categoryCodes": [{"categoryCode": "CAT_2", "ruleID": ["138561"]}], "locationCode": "DEF"}, {"categoryCodes": [{"categoryCode": "CAT_3", "ruleID": ["138614"]}], "locationCode": "IJK"}], "timestamp": "2023-01-27T00:10:32.367 +0000"}

message #2:
 {"Item Id": "2", "locationCategoryCodes": [{"categoryCodes": [{"categoryCode": "CAT_1", "ruleID": ["138563"]}], "locationCode": "ABC"}, {"categoryCodes": [{"categoryCode": "CAT_3", "ruleID": ["138614"]}], "locationCode": "IJK"}], "timestamp": "2023-01-27T00:10:32.367 +0000"}

Thanks

Anirban

Labels (4)
Tags (1)
0 Karma

yuanliu
SplunkTrust
SplunkTrust

The sample messages you posted are conformant JSON.  I don't understand why you need to use rex to extract - or why you need extraction at all.  If these are _raw, Splunk would have automatically recognized it and give you a field named locationCategoryCodes{}.locationCode.  As I always say, do not treat structured data as text.  If for whatever reason Splunk doesn't give you that field, you can use spath to extract all fields ("nodes" in JSON lingo) in these messages.

Here is emulation using your sample data:

 

| makeresults
| fields - _time
| eval test=split("{\"Item Id\": \"1\", \"locationCategoryCodes\": [{\"categoryCodes\": [{\"categoryCode\": \"CAT_1\", \"ruleID\": [\"138563\"]}], \"locationCode\": \"ABC\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_1\", \"ruleID\": [\"138563\"]}], \"locationCode\": \"XYZ\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_2\", \"ruleID\": [\"138561\"]}], \"locationCode\": \"DEF\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_3\", \"ruleID\": [\"138614\"]}], \"locationCode\": \"IJK\"}], \"timestamp\": \"2023-01-27T00:10:32.367 +0000\"}|||{\"Item Id\": \"2\", \"locationCategoryCodes\": [{\"categoryCodes\": [{\"categoryCode\": \"CAT_1\", \"ruleID\": [\"138563\"]}], \"locationCode\": \"ABC\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_3\", \"ruleID\": [\"138614\"]}], \"locationCode\": \"IJK\"}], \"timestamp\": \"2023-01-27T00:10:32.367 +0000\"}", "|||")
| mvexpand test
| eval isgood = if(json_valid(test), "yes", "no") ``` quick test to see if message is conformant ```
| rename test as _raw
``` data emulation above ```
| spath ``` use this if Splunk doesn't give you locationCategoryCodes{}.locationCode ```
| stats count by locationCategoryCodes{}.locationCode
| rename locationCategoryCodes{}.locationCode as "Location Code"

 

Location Codecount
ABC2
DEF1
IJK2
XYZ1
Tags (1)
0 Karma

jotne
Builder

Here you go:

 

| makeresults 
| eval test="( {\"Item Id\": \"1\", \"locationCategoryCodes\": [{\"categoryCodes\": [{\"categoryCode\": \"CAT_1\", \"ruleID\": [\"138563\"]}], \"locationCode\": \"ABC\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_1\", \"ruleID\": [\"138563\"]}], \"locationCode\": \"XYZ\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_2\", \"ruleID\": [\"138561\"]}], \"locationCode\": \"DEF\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_3\", \"ruleID\": [\"138614\"]}], \"locationCode\": \"IJK\"}], \"timestamp\": \"2023-01-27T00:10:32.367 +0000\"} ¤ {\"Item Id\": \"2\", \"locationCategoryCodes\": [{\"categoryCodes\": [{\"categoryCode\": \"CAT_1\", \"ruleID\": [\"138563\"]}], \"locationCode\": \"ABC\"}, {\"categoryCodes\": [{\"categoryCode\": \"CAT_3\", \"ruleID\": [\"138614\"]}], \"locationCode\": \"IJK\"}], \"timestamp\": \"2023-01-27T00:10:32.367 +0000\"})"
| makemv test delim="¤"
| mvexpand test
| rename test as _raw


| rex max_match=0 "locationCode.: .(?<location>[^\"]+)"
| mvexpand location
| stats count by location

Everything before the two empty lines are just to create sample data to make proof of concept.

 

 

 

0 Karma

richgalloway
SplunkTrust
SplunkTrust

What have you tried so far?  How did that go? 

Is the locationCode field already extracted?  If so, then you can count them with the stats command.

| stats count by locationCode
| rename locationCode to "Location code"
| table "Location code" count
---
If this reply helps you, Karma would be appreciated.
0 Karma

AnirbanG
Loves-to-Learn Lots

Thanks @richgalloway  ..Unfortunately I could not extract locationCode fields dynamically. I tried to write some rex but those did not work.

Need to find a way to locationCodes (e.g.  ABC,XYZ,DEF) from the log 

 

{"Item Id": "1", "locationCategoryCodes": [{"categoryCodes": [{"categoryCode": "CAT_1", "ruleID": ["138563"]}], "locationCode": "ABC"}, {"categoryCodes": [{"categoryCode": "CAT_1", "ruleID": ["138563"]}], "locationCode": "XYZ"}, {"categoryCodes": [{"categoryCode": "CAT_2", "ruleID": ["138561"]}], "locationCode": "DEF"}, {"categoryCodes": [{"categoryCode": "CAT_3", "ruleID": ["138614"]}], "locationCode": "IJK"}], "timestamp": "2023-01-27T00:10:32.367 +0000"}

Tags (1)
0 Karma

richgalloway
SplunkTrust
SplunkTrust

What rex did you try?

Try this

| rex "locationCode\\\": \\\"(?<locationCode>[^\\\"]+)"
---
If this reply helps you, Karma would be appreciated.
0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...