i have all the below messages in the "response" field.
{"errors": ["Message: Payment failed. Reason: Hi, we attempted to process the transaction but it seems there was an error. Please check your information and try again. If the problem persists please contact your bank."]}
{"errors": ["Unable to retrieve User Profile with sub '2415d' as it does not exist"]}
{"errors": ["Unable to retrieve User Profile with sub 'dfadf' as it does not exist"]}
{"errors": ["Unable to retrieve User Profile with sub 'fdsgad' as it does not exist"]}
{"errors": ["Unallocated LRW seat not found with product id fdafdsaddsfa and start datetime utc 2024-01-06T05:30:00+00:00 and test location id dfafdfa"]}
{"errors": ["Unallocated LRW seat not found with product id sfgdfa and start datetime utc 2024-01-06T05:30:00+00:00 and test location id dsfadfsa"]}
I wanted to display the result with the count as
Message: Payment failed. Reason: Hi, we attempted to process the transaction but it seems there was an error. Please check your information and try again. If the problem persists please contact your bank.
Unable to retrieve User Profile with sub '***' as it does not exist
Unallocated LRW seat not found with product id *** and start datetime utc 2024-01-06T05:30:00+00:00 and test location id ***
If you do not need the values, you can simplify to
| spath input=response path=errors{} output=errors
| mvexpand errors
| rex field=errors mode=sed "s/(\bwith (sub|\w+ id)) (\S+)/\1 */ s/(and \w+ datetime) (\S+ \S+)/\1 */ s/\band test (\w+ id) (\S+)/and test \1 */"
| stats count by errors
Note:
Using sample input you illustrated, this is the output
errors | count |
Message: Payment failed. Reason: Hi, we attempted to process the transaction but it seems there was an error. Please check your information and try again. If the problem persists please contact your bank. | 1 |
Unable to retrieve User Profile with sub * as it does not exist | 3 |
Unallocated LRW seat not found with product id * and start datetime * and test location id * | 2 |
Here is an emulation you can play with and compare with real data
| makeresults
| eval response = split("{\"errors\": [\"Message: Payment failed. Reason: Hi, we attempted to process the transaction but it seems there was an error. Please check your information and try again. If the problem persists please contact your bank.\"]}
{\"errors\": [\"Unable to retrieve User Profile with sub '2415d' as it does not exist\"]}
{\"errors\": [\"Unable to retrieve User Profile with sub 'dfadf' as it does not exist\"]}
{\"errors\": [\"Unable to retrieve User Profile with sub 'fdsgad' as it does not exist\"]}
{\"errors\": [\"Unallocated LRW seat not found with product id fdafdsaddsfa and start datetime utc 2024-01-06T05:30:00+00:00 and test location id dfafdfa\"]}
{\"errors\": [\"Unallocated LRW seat not found with product id sfgdfa and start datetime utc 2024-01-06T05:30:00+00:00 and test location id dsfadfsa\"]}", "
")
| mvexpand response
``` data emulation above ```
You can try something like this.
<base_search>
| eval
error=coalesce(spath(response, "errors{}"), spath(response, "errors"))
| fields - response
``` extract variables from the error messages ```
| rex field=error "(?i)sub\s+\'(?<sub>[^\']+)\'"
| rex field=error "(?i)product\s+id\s+(?<product_id>[^\s]+)"
| rex field=error "(?i)location\s+id\s+(?<location_id>[^\s]+)"
| rex field=error "(?i)datetime\s+(?<start_datetime>\w+\s+\d{4}(?:\-\d{2}){2}T\d{2}(?:\:\d{2}){2}(?:\+|\-)\d{2}\:\d{2})"
``` replace variables in the error messages to get a standardized set of error messages to do counts against ```
| eval
error=replace(replace(replace(replace(error, "(?i)sub\s+\'([^\']+)\'", "sub '***'"), "(?i)product\s+id\s+([^\s]+)", "product id ***"), "(?i)location\s+id\s+([^\s]+)", "location id ***"), "(?i)datetime\s+(\w+\s+\d{4}(?:\-\d{2}){2}T\d{2}(?:\:\d{2}){2}(?:\+|\-)\d{2}\:\d{2})", "datetime ***")
``` stats aggregation to get counts of error messages ```
| stats
count as count,
values(sub) as sub,
values(product_id) as product_id,
values(location_id) as location_id,
values(start_datetime) as start_datetime
by error
Results should look something like this.
You can see the counts next to the standardized error messages. Also went ahead and carried over all the variables that were replaced in error messages for context.
You could also check out the cluster command as this will give you similar results without having to do all the extractions and replacements in inline SPL.
<base_search>
| table _time, response
| eval
error=coalesce(spath(response, "errors{}"), spath(response, "errors"))
| fields - response
| cluster field=error t=0.4 showcount=true countfield=count
Results will look like this.
The error messages aren't redacted but their counts do line up pretty well to the previous example so the clustering appears to work decently.
You can read up more on the cluster command here.
https://docs.splunk.com/Documentation/Splunk/9.1.2/SearchReference/Cluster
Hi @yuvaraj_m91
The Splunk command "spath" enables you to extract information from the structured data formats XML and JSON
Command Ref is given here:
https://docs.splunk.com/Documentation/Splunk/9.1.2/SearchReference/Spath
Pls let us know if you are able to use the spath command.
or you could use direct "rex" command extract field values and do the stats
or where like command also should be good i think.
but, the spath is the simplest option i think. pls let us know if you are ok with spath or not, thanks.