Splunk Search

How to count the number of unique values stored at multivalue field?

alex_firerat
Engager

My events are JSON based and look like this one:

{
  "severity": "DEBUG",
  "message": {
    "list": [
      [
        "email1@some.domain",
        "2020-04-21 14:27:19"
      ],
      [
        "email2@some.domain",
        "2020-04-21 16:48:43"
      ]
    ]
  },
  "Type": "someType"
}

The structure is not very good to perform operations with data at message.list, but I have to work with this one. The emails at
message.list field are unique for each event.

My task is to calculate the number of all unique email addresses for each type ( message.Type field) for all events I got with search.

I was able to calculate the number of emails for each type, but not unique email addresses. This is my search:

someMySearchConditions | spath | rename "message.list{}{}" as rows | rex field=rows max_match=0
 "(?<email>[\w\d\.\-]+\@[\w\d\.]+)" | eval numberOfEmails = mvcount(email) | search numberOfEmails > 0
  | stats sum(numberOfEmails) by message.Type

I tried to convert field multivalue email with help of split function to multiple event with a single value field but without any success (cleaned email addresses are separated by carriage return or something like this).

How I can "join" all emails (or maybe value of message.list field) to be able to get data in format where n is a number of unique email addresses for each type?

count | Type
n | someType1

Thank you.

0 Karma
1 Solution

manjunathmeti
SplunkTrust
SplunkTrust

Expand multi value fields then use stats command with dc(distinct_count) function.

someMySearchConditions 
| spath 
| rex field="message.list{}{}" max_match=0
    "(?<email>[\w\d\.\-]+\@[\w\d\.]+)" 
| mvexpand email 
| stats dc(email) as count by Type

View solution in original post

manjunathmeti
SplunkTrust
SplunkTrust

Expand multi value fields then use stats command with dc(distinct_count) function.

someMySearchConditions 
| spath 
| rex field="message.list{}{}" max_match=0
    "(?<email>[\w\d\.\-]+\@[\w\d\.]+)" 
| mvexpand email 
| stats dc(email) as count by Type

View solution in original post

Did you miss .conf21 Virtual?

Good news! The event's keynotes and many of its breakout sessions are now available online, and still totally FREE!