Extraction of Data in JSON Message Inexplicably Not Working

Sivrat
Path Finder

I'm at my wits' end here. Everything seems to indicate that what I'm doing should work, yet it doesn't.

I have Azure firewall logs feeding in through a storage account using the Microsoft Cloud Services app. These come in as standard JSON, which is being extracted fine by Splunk. There is a nested field in the JSON, "properties.msg", that has the actual firewall log message including source/destination information, IPs/ports, whether it was allowed/denied, and what firewall rule was referenced. 

For reference, this thread discusses a very similar problem: https://community.splunk.com/t5/Splunk-Search/Azure-Firewall-Log-Field-Extraction-Help/m-p/411148

The added wrinkle is that I am trying to get the extracted fields to work with CIM data models, not just show up as results from a search. This honestly seemed easy enough, but for some reason none of my field extractions are working.

Here are some facts/things I have tried

  • This is in Splunk Cloud
  • I created a regex to extract all the fields from the properties.msg to named capture groups
  • The regex matches correctly in Regex101
  • The regex extracts all the fields if used in the 'rex' command in search
  • Using the regex inside the Field Extractor tool and checking with preview function shows the fields extracted
  • I've saved the extraction as being shared Globally, Private, and App only (even tried different apps other than search)
  • I've tried saving it as an inline extraction, and as a transform with both _raw and the individual properties.msg field as SOURCE_KEY
  • I'm not seeing any errors or warnings when making any of these changes that would make me think something was wrong
  • None of this seems to work, none of the fields are extracted.
  • I tried a field alias from 'properties.msg' to 'msg', and that worked (but it didn't help me, because I still can't extract the data from within that message).

I honestly don't get how I can see the regex working in the Field Extractor, hit 'Save', see it saved in the configurations, but not extract fields.

EDIT:
Sample _raw log (more in updated link posted above)
{ "category": "AzureFirewallApplicationRule", "time": "2021-05-04T15:41:59.8967610Z", "resourceId": "/SUBSCRIPTIONS/REDACTED/RESOURCEGROUPS/REDACTED/PROVIDERS/MICROSOFT.NETWORK/AZUREFIREWALLS/SOMEFW", "operationName": "AzureFirewallApplicationRuleLog", "properties": {"msg":"HTTPS request from 192.168.0.1:8888 to subdomain.x99.blob.storage.azure.net:443. Action: Allow. Rule Collection: AllowOutbound. Rule: AllowOutbound-AA-AA-A"}}
{ "category": "AzureFirewallApplicationRule", "time": "2021-05-04T15:41:58.6369780Z", "resourceId": "/SUBSCRIPTIONS/REDACTED/RESOURCEGROUPS/REDACTED/PROVIDERS/MICROSOFT.NETWORK/AZUREFIREWALLS/SOMEFW", "operationName": "AzureFirewallApplicationRuleLog", "properties": {"msg":"HTTPS request from 192.168.0.1:8888 to subdomain.x99.blob.storage.azure.net:443. Action: Allow. Rule Collection: AllowOutbound. Rule: AllowOutbound-AA-AA-A"}}
{ "category": "AzureFirewallNetworkRule", "time": "2021-05-07T15:05:59.8277330Z", "resourceId": "/SUBSCRIPTIONS/REDACTED/RESOURCEGROUPS/REDACTED/PROVIDERS/MICROSOFT.NETWORK/AZUREFIREWALLS/SOMEFW", "operationName": "AzureFirewallNetworkRuleLog", "properties": {"msg":"TCP request from 192.168.0.1:8888 to 8.8.8.8:8888. Action: Deny. "}}

Regex
\"(?<protocol>\w+)\s[rR]equest\D+(?<src>[^\:]+)\:(?<src_port>\d+) to (?<dest>[^\:]+)\:((?<dest_port>\d+))?\.\sAction\: (?<action>\w+)\.(?: Rule Collection\: (?<cat>\w+)\. Rule\: (?<rule>[^\"]+))?
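
To check the regex outside Splunk, here is a minimal Python sketch that runs it against abbreviated versions of the sample events above. It only tests the pattern itself and says nothing about why a saved extraction might not apply in Splunk. The names AZFW_RAW and extract_from_raw are hypothetical, the events are shortened copies of the samples, and Python needs (?P<name>...) for named groups where Splunk accepts (?<name>...).

```python
import re

# Python port of the regex from the post. Only the named-group syntax changes:
# (?<name>...) becomes (?P<name>...). Everything else is kept as posted.
AZFW_RAW = re.compile(
    r'\"(?P<protocol>\w+)\s[rR]equest\D+(?P<src>[^\:]+)\:(?P<src_port>\d+)'
    r' to (?P<dest>[^\:]+)\:((?P<dest_port>\d+))?\.\sAction\: (?P<action>\w+)\.'
    r'(?: Rule Collection\: (?P<cat>\w+)\. Rule\: (?P<rule>[^\"]+))?'
)

# Abbreviated copies of the sample _raw events (some JSON keys dropped).
APP_RULE_RAW = (
    '{ "category": "AzureFirewallApplicationRule", "properties": '
    '{"msg":"HTTPS request from 192.168.0.1:8888 to '
    'subdomain.x99.blob.storage.azure.net:443. Action: Allow. '
    'Rule Collection: AllowOutbound. Rule: AllowOutbound-AA-AA-A"}}'
)
NET_RULE_RAW = (
    '{ "category": "AzureFirewallNetworkRule", "properties": '
    '{"msg":"TCP request from 192.168.0.1:8888 to 8.8.8.8:8888. Action: Deny. "}}'
)


def extract_from_raw(raw):
    """Return a dict of named groups, or None if the pattern does not match."""
    match = AZFW_RAW.search(raw)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for event in (APP_RULE_RAW, NET_RULE_RAW):
        print(extract_from_raw(event))
```

For the network rule event, the optional rule-collection group does not participate in the match, so cat and rule come back as None.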

harsmarvania57
SplunkTrust

Hi,

Could you provide some sample data (please redact anything sensitive) and the regex you are using?

Sivrat
Path Finder

I've added some samples and my regex to the original post, and updated the link to point to https://community.splunk.com/t5/Splunk-Search/Azure-Firewall-Log-Field-Extraction-Help/m-p/411148 which also has some additional examples if needed.

harsmarvania57
SplunkTrust

It works for me with a field extraction and a field transformation. The main thing to keep in mind is that your sourcetype must have KV_MODE = json; otherwise the configuration below will not work.

I used your regex but removed the leading \"

Regex:

(?<protocol>\w+)\s[rR]equest\D+(?<src>[^\:]+)\:(?<src_port>\d+) to (?<dest>[^\:]+)\:((?<dest_port>\d+))?\.\sAction\: (?<action>\w+)\.(?: Rule Collection\: (?<cat>\w+)\. Rule\: (?<rule>[^\"]+))?
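
As an illustration only, the Python sketch below applies this quote-less regex to the value of properties.msg after parsing the JSON, which is roughly what running the pattern against that field (rather than _raw) amounts to. It does not model Splunk's search-time processing order. The names AZFW_MSG and extract_from_event are hypothetical, and the event is one of the samples from the original post.

```python
import json
import re

# Same pattern as above without the leading \" and with Python's
# (?P<name>...) named-group syntax.
AZFW_MSG = re.compile(
    r'(?P<protocol>\w+)\s[rR]equest\D+(?P<src>[^\:]+)\:(?P<src_port>\d+)'
    r' to (?P<dest>[^\:]+)\:((?P<dest_port>\d+))?\.\sAction\: (?P<action>\w+)\.'
    r'(?: Rule Collection\: (?P<cat>\w+)\. Rule\: (?P<rule>[^\"]+))?'
)

# Network rule sample event from the original post.
NETWORK_RULE_EVENT = (
    '{ "category": "AzureFirewallNetworkRule", '
    '"time": "2021-05-07T15:05:59.8277330Z", '
    '"resourceId": "/SUBSCRIPTIONS/REDACTED/RESOURCEGROUPS/REDACTED/PROVIDERS/'
    'MICROSOFT.NETWORK/AZUREFIREWALLS/SOMEFW", '
    '"operationName": "AzureFirewallNetworkRuleLog", '
    '"properties": {"msg":"TCP request from 192.168.0.1:8888 to 8.8.8.8:8888. '
    'Action: Deny. "}}'
)


def extract_from_event(raw_event):
    """Parse the JSON event, then match the pattern against properties.msg only."""
    msg = json.loads(raw_event)["properties"]["msg"]
    match = AZFW_MSG.search(msg)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(extract_from_event(NETWORK_RULE_EVENT))
```

Because the pattern now sees only the message text, it no longer needs the surrounding quote as an anchor.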

 

[Screenshots: Field Transformation and Field Extraction settings]

Sivrat
Path Finder

Thanks for your response.

I don't think I can confirm the KV_MODE of the sourcetype easily in Splunk Cloud, but I'll look. Definitely seems like it's doing automatic KV extraction, but that could be misleading.

However, according to this - https://docs.splunk.com/Documentation/SplunkCloud/latest/Knowledge/Searchtimeoperationssequence

The KV_MODE extraction would apply after both the inline and transform-based extractions, wouldn't it? I had tried specifying properties.msg as the SOURCE_KEY before, and when that didn't work I assumed this ordering was the reason, so I tried just using _raw instead (which is why the leading \" was in the regex, to help it anchor), to no avail.

Sivrat
Path Finder

I used the API to set KV_MODE to json and put in the details exactly as specified. This appears to be working in other environments, but it is still not working for me.

I'm seeing the same issue with another, non-JSON source, where a single field extraction shows as extracted in the Field Extractor preview but is not extracted after saving, regardless of sharing status.

harsmarvania57
SplunkTrust

I am confused now. The JSON events whose fields were not being extracted at search time are working now?

Sivrat
Path Finder

I am confused as well, but apparently this is resolved.

The issue I originally described with JSON events was also happening with some other, non-JSON events.

I opened a ticket with support. They, like you, created one of the extractions without issue on the AdHoc Splunk Cloud SH. I was able to create another on the AdHoc SH, but to line up with the data models I had been trying to use the ES Search Head. 

After that, the previous extractions I had created on the ES Search Head seemed to start working. I'm not sure whether something changed, or whether I had just been impatient previously and had not given the extractions enough time to apply.

So things are resolved now, and I don't understand why. But things are working as intended.
