Good Afternoon Splunkers,
Let me start by saying that I hope this is the right sub-forum for this question. I'm working on a dashboard within Splunk to visualize our AWS Web Application Firewall data. The purpose of this dashboard is to show general statistics and information about the requests our AWS WAF Solution is processing. Ultimately, we would like to use this dashboard to debug and tune our WAF solution as we move our WAF into enforcement mode.
One of the many charts / tables I'm trying to put together is a list of the AWS WAF rule-sets, and the sub-rules within them, that have been triggered, grouped by the website our WAF is monitoring.
A concrete example of what I'm looking to create would be:
| Webpage         | WAF Rulegroups Triggered | Sub-Rules Triggered   | Count |
| SomeWebpage.com | AWSManagedCommonRuleSet  |                       |       |
|                 |                          | GenericRFI_BODY       | 5     |
|                 |                          | SomeOtherVuln         | 10    |
|                 |                          | NoUserAgent_HEADER    | 15    |
|                 | AWSAnonymousIpList       |                       |       |
|                 |                          | HostingProviderIPList | 20    |
The biggest issue I'm currently facing is that the AWS WAF data, while delivered as JSON by AWS, is not laid out as flat key/value pairs: it has nested arrays containing multiple types of information. Specifically, the nested array that holds the rule evaluation information for a request lists every rule group that was evaluated, even if nothing in the group matched and no sub-rules fired. Example below:
ruleGroupList: [
  {
    excludedRules: null
    nonTerminatingMatchingRules: [
    ]
    ruleGroupId: AWS#AWSManagedRulesCommonRuleSet
    terminatingRule: null
  }
  {
    excludedRules: null
    nonTerminatingMatchingRules: [
    ]
    ruleGroupId: AWS#AWSManagedRulesSQLiRuleSet
    terminatingRule: null
  }
  {
    excludedRules: null
    nonTerminatingMatchingRules: [
    ]
    ruleGroupId: AWS#AWSManagedRulesLinuxRuleSet
    terminatingRule: null
  }
  {
    excludedRules: null
    nonTerminatingMatchingRules: [
    ]
    ruleGroupId: AWS#AWSManagedRulesAdminProtectionRuleSet
    terminatingRule: null
  }
  {
    excludedRules: null
    nonTerminatingMatchingRules: [
    ]
    ruleGroupId: AWS#AWSManagedRulesKnownBadInputsRuleSet
    terminatingRule: null
  }
  {
    excludedRules: null
    nonTerminatingMatchingRules: [
    ]
    ruleGroupId: AWS#AWSManagedRulesAmazonIpReputationList
    terminatingRule: null
  }
  {
    excludedRules: null
    nonTerminatingMatchingRules: [
    ]
    ruleGroupId: AWS#AWSManagedRulesAnonymousIpList
    terminatingRule: {
      action: BLOCK
      ruleId: HostingProviderIPList
      ruleMatchDetails: null
    }
  }
]
As you can see, only one AWS rule group fired for this request ("AWSManagedRulesAnonymousIpList"), and within that group only the sub-rule "HostingProviderIPList" fired, yet all of the rule groups assigned to the WAF are present in the array. Therefore, if I were to search for something like:
$search | stats count by nonTerminatingMatchingRules{}.ruleId, ruleGroupList{}.terminatingRule.ruleId | stats list(ruleGroupList{}.terminatingRule.ruleId), list(count) by nonTerminatingMatchingRules{}.ruleId
I would get back each rule-set, but every sub-rule that fired would be listed under every rule-set, even when the sub-rule is not part of that rule-set.
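To illustrate the mismatch outside of Splunk, here is a small Python sketch of what I believe is happening. It is not SPL and not how Splunk is implemented. It assumes that grouping by two independent multivalue fields pairs every value of one with every value of the other, and it uses the rule names from the full sample log in this post.

```python
from itertools import product

# Values taken from the full sample log in this post.
# Event-level nonTerminatingMatchingRules{}.ruleId (the rule-sets that matched):
rule_sets = ["AWSCommonRuleSet", "AWSAnonymousIpList"]
# ruleGroupList{}.terminatingRule.ruleId (the sub-rules that matched):
sub_rules = ["GenericRFI_BODY", "HostingProviderIPList"]

# Assumption: the two fields are flattened independently, so nothing records
# which sub-rule belongs to which rule-set and every combination is produced.
naive_pairs = list(product(rule_sets, sub_rules))

for rule_set, sub_rule in naive_pairs:
    print(f"{rule_set:20} {sub_rule}")
```

That yields four pairs for a single request, two of which are wrong (for example AWSCommonRuleSet paired with HostingProviderIPList), which matches the kind of output I'm seeing.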
What commands can I use to transform this data into proper key-value pairs on a per rule-group basis? Based on what I've read, I think I want to use "spath" and "mvexpand"; I'm just not sure of the best path forward.
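To make the target concrete, this is the per-rule-group logic I'm trying to reproduce, sketched in Python rather than SPL. The function names are made up for illustration, the sample event is trimmed down from my logs, and treating the Host header as the "webpage" is my own assumption.

```python
from collections import Counter

def triggered_rules(event):
    """Yield (ruleGroupId, ruleId) only for groups in which a rule matched."""
    for group in event.get("ruleGroupList") or []:
        group_id = group["ruleGroupId"]
        terminating = group.get("terminatingRule")
        if terminating:  # null (None) when nothing in the group terminated
            yield group_id, terminating["ruleId"]
        for rule in group.get("nonTerminatingMatchingRules") or []:
            yield group_id, rule["ruleId"]

def host_header(event):
    """Return the Host header value (assumed to identify the webpage)."""
    for header in event.get("httpRequest", {}).get("headers") or []:
        if header.get("name", "").lower() == "host":
            return header.get("value")
    return None

def count_by_webpage(events):
    """Count (webpage, rule group, sub-rule) combinations across events."""
    counts = Counter()
    for event in events:
        page = host_header(event)
        for group_id, rule_id in triggered_rules(event):
            counts[(page, group_id, rule_id)] += 1
    return counts

# Hypothetical event, trimmed to the fields used above.
sample_event = {
    "httpRequest": {"headers": [{"name": "Host", "value": "mywebpage.com"}]},
    "ruleGroupList": [
        {"ruleGroupId": "AWS#AWSManagedRulesCommonRuleSet",
         "nonTerminatingMatchingRules": [],
         "terminatingRule": {"action": "BLOCK", "ruleId": "GenericRFI_BODY"}},
        {"ruleGroupId": "AWS#AWSManagedRulesSQLiRuleSet",
         "nonTerminatingMatchingRules": [],
         "terminatingRule": None},
        {"ruleGroupId": "AWS#AWSManagedRulesAnonymousIpList",
         "nonTerminatingMatchingRules": [],
         "terminatingRule": {"action": "BLOCK", "ruleId": "HostingProviderIPList"}},
    ],
}

counts = count_by_webpage([sample_event, sample_event])
for (page, group_id, rule_id), n in sorted(counts.items()):
    print(page, group_id, rule_id, n)
```

The key point is that each sub-rule is read from inside its own rule-group object, so the pairing is never lost, and groups where nothing matched are skipped. My guess is that the SPL equivalent expands ruleGroupList{} into one row per group before extracting the IDs, but I haven't been able to confirm that.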
For full transparency, here's an entire WAF log in JSON format, so you can see all of the fields. Here's the guide for understanding these fields as well.
{
  action: ALLOW
  formatVersion: 1
  httpRequest: {
    args:
    clientIp: 8.8.8.8
    country: CA
    headers: [
      {
        name: Authorization
        value: SomeToken
      }
      {
        name: User-Agent
        value: Site24x7
      }
      {
        name: Cache-Control
        value: no-cache
      }
      {
        name: Accept
        value: */*
      }
      {
        name: Connection
        value: Keep-Alive
      }
      {
        name: Accept-Encoding
        value: gzip
      }
      {
        name: Content-Type
        value: application/json; charset=UTF-8
      }
      {
        name: X-Site24x7-Id
        value: Redacted
      }
      {
        name: Content-Length
        value: 1396
      }
      {
        name: Host
        value: mywebpage.com
      }
    ]
    httpMethod: POST
    httpVersion: HTTP/1.1
    requestId: RedactedID
    uri: /big/uri/path
  }
  httpSourceId: Redacted ID
  httpSourceName: ALB
  labels: [
    {
      name: awswaf:managed:aws:anonymous-ip-list:HostingProviderIPList
    }
  ]
  nonTerminatingMatchingRules: [
    {
      action: COUNT
      ruleId: AWSCommonRuleSet
      ruleMatchDetails: [
      ]
    }
    {
      action: COUNT
      ruleId: AWSAnonymousIpList
      ruleMatchDetails: [
      ]
    }
  ]
  rateBasedRuleList: [
  ]
  requestHeadersInserted: null
  responseCodeSent: null
  ruleGroupList: [
    {
      excludedRules: null
      nonTerminatingMatchingRules: [
      ]
      ruleGroupId: AWS#AWSManagedRulesCommonRuleSet
      terminatingRule: {
        action: BLOCK
        ruleId: GenericRFI_BODY
        ruleMatchDetails: null
      }
    }
    {
      excludedRules: null
      nonTerminatingMatchingRules: [
      ]
      ruleGroupId: AWS#AWSManagedRulesSQLiRuleSet
      terminatingRule: null
    }
    {
      excludedRules: null
      nonTerminatingMatchingRules: [
      ]
      ruleGroupId: AWS#AWSManagedRulesLinuxRuleSet
      terminatingRule: null
    }
    {
      excludedRules: null
      nonTerminatingMatchingRules: [
      ]
      ruleGroupId: AWS#AWSManagedRulesAdminProtectionRuleSet
      terminatingRule: null
    }
    {
      excludedRules: null
      nonTerminatingMatchingRules: [
      ]
      ruleGroupId: AWS#AWSManagedRulesKnownBadInputsRuleSet
      terminatingRule: null
    }
    {
      excludedRules: null
      nonTerminatingMatchingRules: [
      ]
      ruleGroupId: AWS#AWSManagedRulesAmazonIpReputationList
      terminatingRule: null
    }
    {
      excludedRules: null
      nonTerminatingMatchingRules: [ [+]
      ]
      ruleGroupId: AWS#AWSManagedRulesAnonymousIpList
      terminatingRule: {
        action: BLOCK
        ruleId: HostingProviderIPList
        ruleMatchDetails: null
      }
    }
  ]
  terminatingRuleId: Default_Action
  terminatingRuleMatchDetails: [
  ]
  terminatingRuleType: REGULAR
  timestamp: 1629751363362
  webaclId: RedactedID
}
Any answers on this? I am stuck too. The sourcetype=aws:waf seems to have mysteriously disappeared at some point over the last couple of years.
Did you ever get this answered? I'm facing the same issue as you.