How to use Splunk to find slight variations in email message_subjects and file_names?

Contributor

I am hoping to find a way to sift through large volumes of email to find messages with similar subjects or similar attachment names.

Currently I might search by subject or attachment name.

For example,

index=mail sourcetype="mail" 
    [search index=mail sourcetype="mail" message_subject = *<something>*  |stats count by internal_message_id | fields internal_message_id]
    |eval Time=strftime(_time, "%H:%M:%S") | eval Date=strftime(_time, "%A %F") 
    |stats list(message_subject) as subj list(sender) as sender list(recipient) as recp list(file_name) as AttachmentName list(attachment_type) as AttachmentType list(vendor_action) as status values(Time) as Time values(Date) as Date by internal_message_id 

or

index=mail sourcetype="mail" 
    [search index=mail sourcetype="mail" file_name = *<something>*  |stats count by internal_message_id | fields internal_message_id]
    |eval Time=strftime(_time, "%H:%M:%S") | eval Date=strftime(_time, "%A %F") 
    |stats list(message_subject) as subj list(sender) as sender list(recipient) as recp list(file_name) as AttachmentName list(attachment_type) as AttachmentType list(vendor_action) as status values(Time) as Time values(Date) as Date by internal_message_id 

I am looking to find all variations or patterns of similar emails.
For example:
subj = Order-008796, Order-008948, Order-009485, etc.
AttachmentName = Order#00879, Order-008948, Order#009485, etc. (extensions like .doc are already parsed out natively in the log)

What's the best way to find similar patterns? Cluster? Any other ideas?

Thank you

SplunkTrust

There are a few ways to do that, depending on the patterns you want to match. One is to use wildcards in the base search:

index=mail sourcetype="mail" message_subject ="Order-*" | ...

or use the like function with where:

index=mail sourcetype="mail"  | where like(message_subject,"Order-%") | ...

or use the regex command:

index=mail sourcetype="mail" | regex message_subject = "Order-\d{6}" | ...
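
The sample values in the question mix separators (Order-008796, Order#00879) and digit counts, so a single pattern may need to be a little looser than the ones above. A minimal sketch, assuming the field names message_subject and file_name from the question, and assuming that "Order" followed by - or # and five or six digits is the shape of interest:

```spl
index=mail sourcetype="mail"
| where match(message_subject, "Order[-#]\d{5,6}") OR match(file_name, "Order[-#]\d{5,6}")
```

The match function tests the regular expression against any substring of the field, so subjects with extra text around the order number should still be caught. Adjust the character class and digit range to whatever variations actually appear in your data.
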
---
If this reply helps you, an upvote would be appreciated.

Contributor

Thank you Rich. Before I accept your answer, just wanted to get your opinion on using cluster. When would you typically use cluster?

Thank you

SplunkTrust

I haven't used the cluster command, but it could apply in this case. I wonder what you'd get from index=mail sourcetype="mail" | cluster field=message_subject | ...
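
As a starting point for that experiment, here is a hedged sketch of what a cluster search might look like. The threshold t=0.7 is an arbitrary guess (the documented default is 0.8), and showcount=true adds a cluster_count field showing how many events fell into each cluster:

```spl
index=mail sourcetype="mail"
| cluster field=message_subject t=0.7 showcount=true
| table cluster_count message_subject
| sort - cluster_count
```

By default cluster keeps one representative event per cluster, so the table shows one subject per group with its size. Short subjects such as "Order-008796" break into very few terms, so the threshold will likely need tuning before similar subjects group together.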


Contributor

Thanks for the reply. I was thinking of cluster as more of an automatic check, with fewer manual changes to the query.
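
One other idea along those "automatic" lines, sketched here as an assumption rather than a tested solution: mask the digits in each subject so that Order-008796 and Order-008948 collapse to the same pattern, then count by pattern. The field name subj_pattern and the placeholder text are made up for illustration:

```spl
index=mail sourcetype="mail"
| eval subj_pattern=replace(message_subject, "\d+", "<num>")
| stats count values(message_subject) as examples by subj_pattern
| where count > 1
| sort - count
```

The same approach should work on file_name. Note that Order-... and Order#... would still land in separate patterns unless the separator is masked as well.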

I will experiment a bit, and post a new question in a while.

Thank you
