Splunk Search

How to compare two field values and return a new field with a percent match between the two?

New Member

I would like to compare two field values and return a new field with a percent match between the two.
Current search:

index=dlp severity="1:High" sender!="N/A"
| table _time, sender, recipients, Filename, Count, severity, incident_id, policy
| sort -_time

For example, if part of my search returns

sender: John.Smith@Coolcompany.com
Recipients: JohnSmith546@mail.com

I would like a new field named PercentMatch to return
PercentMatch: 80% (or whatever the actual calculation may be)

The goal is to help determine when users are sending themselves emails to their personal account. Thank you



Have you thought about a workaround using the cluster command?

"The cluster command groups events together based on how similar they are to each other"

Assuming _time is a unique identifier per mail, I could think of something like:

 | makeresults

 | eval sender="John.Smith@Coolcompany.com"
 | eval recipients="JohnSmith546@mail.com"

 | eval combined = sender + "," + recipients
 | makemv delim="," combined
 | stats values(combined) as combined BY _time
 | stats count BY combined, _time

 | cluster labelonly=true t=0.1 match=ngramset field=combined
 | stats values(combined), dc(cluster_label) BY _time

This compares both addresses and gives them the same cluster_label if they are similar. A final dc(cluster_label)=1 means that it might be the same person.
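
To give a feel for what comparing strings by their sets of 3-character substrings can look like, here is a small Python sketch. It is only an illustration of the general idea: the function names are made up for this example, and the Jaccard-style score is an assumption, not necessarily how the cluster command computes similarity internally or how its t threshold is applied.

```python
def trigrams(text):
    """Return the set of 3-character substrings of text, lowercased."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0.

    Assumption: Jaccard is used here purely for illustration.
    """
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


score = trigram_similarity("John.Smith@Coolcompany.com", "JohnSmith546@mail.com")
print(f"trigram similarity: {score:.2f}")
```

Because the whole address is compared, the differing domains and the dot in the sender name pull the score down, which is one reason a low threshold such as t=0.1 may be needed.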

Revered Legend

I don't think something like this (comparing two strings for similarity) is natively available. You might have to create a custom search command to achieve this. Have a look at the following post
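
As a rough idea of the scoring logic such a custom command might wrap, here is a minimal Python sketch using the standard library's difflib. The function names local_part and percent_match are hypothetical, and so is the choice to compare only the part before the @ with dots, digits and other non-letters stripped. Wiring this into Splunk as a search command is not shown.

```python
from difflib import SequenceMatcher


def local_part(address):
    """Return the part before '@', lowercased, keeping letters only.

    Assumption: dots, digits and punctuation are noise for this comparison.
    """
    name = address.split("@", 1)[0].lower()
    return "".join(ch for ch in name if ch.isalpha())


def percent_match(sender, recipient):
    """Similarity of the two local parts as a percentage (0.0 to 100.0)."""
    a, b = local_part(sender), local_part(recipient)
    if not a or not b:
        return 0.0
    return round(100 * SequenceMatcher(None, a, b).ratio(), 1)


print(percent_match("John.Smith@Coolcompany.com", "JohnSmith546@mail.com"))
```

With this normalization both example addresses reduce to "johnsmith", so they score as a full match. Comparing the raw addresses instead would give a lower number, so the normalization step is worth tuning to the naming patterns in your own data.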

