Hi, I'm trying to come up with a query to generate the count of strings in a json field in a log, across all events.
For example, say I have a search that displays say, 100,000 logs, with each log containing some JSON structured string [{"First Name": "Bob", "DOB":"1/1/1900", ..."Vendor":"Walmart"}]
I want to generate a table that lists all the unique Vendor values, and the count of the values. Something like,
Vendor | Count
Walmart 5
Target 3
ToysRUs. 100
etc...
Is something like this possible?
Yes, it's possible. I'll assume your next question will be about *how* it is possible. 😉
First, extract the vendor field from each event. There are a couple of ways to do that, but I like rex.
| rex "Vendor\\\":\\\"(?<Vendor>[^\"]+)"
(Yes, the triple-backslashes are intentional and necessary.)
Then count the vendor names.
| stats count by Vendor
Yes, it's possible. I'll assume your next question will be about *how* it is possible. 😉
First, extract the vendor field from each event. There are a couple of ways to do that, but I like rex.
| rex "Vendor\\\":\\\"(?<Vendor>[^\"]+)"
(Yes, the triple-backslashes are intentional and necessary.)
Then count the vendor names.
| stats count by Vendor
Wow, that's awesome! Thanks for the help! As a followup, is this able to be extended to extract more than just that one field? I'm seeing in other splunk community posts that rex is able to be used for multiple extractions, but I'm not too well aquainted with regular expressions.
For instance, extracting both the First Name and the Vendor as an example, that would produce something like
Name, Vendor | Count
Bob, Walmart 10
Billy, Walmart 13
Blake, Target 13
Billy, Target 11
The multiple backslashes and quotations are throwing me off in terms of how to add additional fields for extraction.
The easiest way to extract multiple fields is to use multiple rex commands. It's often more efficient to do that than to create a complex regex that extracts multiple fields in one go. It's easier to maintain and more flexible (in case field order changes), too.
Would that be piping the original rex command provided into another rex command with the new field I want, prior to piping into the stats count command with a by that comma separates the two fields I want?
Seems to be the case that it works as I assumed. Thanks for your help!