Re: How to compare two field values and return a n...

grannnt · ‎07-28-2017

I would like to compare two field values and return a new field with a percent match between the two.
Current search:

index=dlp severity="1:High" sender!="N/A"
| table _time, sender, recipients, Filename, Count, severity, incident_id, policy
| sort -_time

For example, if part of my search returns

sender: John.Smith@Coolcompany.com
recipients: JohnSmith546@mail.com

I would like a new field named PercentMatch to return
PercentMatch: 80% (or whatever the actual calculation may be)

The goal is to help detect when users send emails to their own personal accounts. Thank you.

HeinzWaescher · ‎07-31-2017

Have you thought about a workaround using the cluster command?

"The cluster command groups events together based on how similar they are to each other"
https://docs.splunk.com/Documentation/SplunkCloud/6.6.0/SearchReference/Cluster

Assuming _time is a unique identifier per email, I could think of something like:

| makeresults

| eval sender="John.Smith@Coolcompany.com"
| eval recipients="JohnSmith546@mail.com"

| eval combined = sender + "," + recipients
| makemv delim="," combined
| stats values(combined) as combined BY _time
| stats count BY combined, _time

| cluster labelonly=true t=0.1 match=ngramset field=combined
| stats values(combined), dc(cluster_label) BY _time

This compares both addresses and gives them the same cluster_label if they are similar. A final dc(cluster_label)=1 means that it might be the same person.
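
To give a feel for what a trigram-based comparison does, here is a rough Python sketch. The Splunk docs describe match=ngramset as comparing sets of trigrams (3-character substrings); the Jaccard score below is only my assumption for illustration, not the formula the cluster command actually uses, and the function names are made up for this example.

```python
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Return the set of 3-character substrings of text, lowercased."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, from 0.0 to 1.0.

    Assumption: Jaccard is used here for illustration only; Splunk's
    internal scoring for match=ngramset may differ.
    """
    grams_a, grams_b = trigrams(a), trigrams(b)
    if not grams_a and not grams_b:
        return 0.0  # nothing to compare (both strings shorter than 3 chars)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


score = trigram_similarity("John.Smith@Coolcompany.com", "JohnSmith546@mail.com")
print(f"trigram similarity: {score:.2f}")
```

The two sample addresses share only a few trigrams (such as "joh" and "smi"), so the score is low but non-zero, which is probably why a low threshold like t=0.1 is needed in the search above.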

somesoni2 · ‎07-28-2017

I don't think comparing two strings for similarity is natively available. You might have to create a custom search command to achieve this. Have a look at the following post:

https://answers.splunk.com/answers/5927/can-splunk-compare-two-strings-and-return-likeness-similarit...
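
If you go the custom command route, the scoring itself could be as small as the sketch below. It uses difflib from the Python standard library and compares only the part before the "@", since the domains will differ for a personal account. The function names are hypothetical, and wiring this into an actual Splunk custom search command is not shown here.

```python
from difflib import SequenceMatcher


def local_part(address: str) -> str:
    """Return the part of an email address before the '@', lowercased."""
    return address.split("@", 1)[0].lower()


def percent_match(sender: str, recipient: str) -> float:
    """Similarity of the two local parts as a percentage (0 to 100).

    Uses difflib's ratio: 2 * matched characters / total characters.
    """
    matcher = SequenceMatcher(None, local_part(sender), local_part(recipient))
    return round(matcher.ratio() * 100, 1)


print(percent_match("John.Smith@Coolcompany.com", "JohnSmith546@mail.com"))
```

difflib's ratio is just one possible definition of "percent match"; an edit-distance measure such as Levenshtein would give different numbers, so pick a threshold by testing against known cases in your own data.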

How to compare two field values and return a new field with a percent match between the two?