Splunk Search

need help with eval query

weicheng98
Path Finder

Hi, I want to compare email headers and count by dest_port=25. (I'm trying to detect phishing emails by their subject line.)
If the same subject appears twice, I want to return the count for dest_port=25.

source=* dest_port=25
| rex field=src_content max_match=0 "(?P<occurredSubject>Subject: Fw: Order Inquiry)"
| eval count=mvcount(occurredSubject)
| stats sum(count) as totalOccurrence

But it doesn't work. Any help?

0 Karma
1 Solution

FrankVl
Ultra Champion

Assuming you have 1 email message per event:

Extract the subject, as already demonstrated by @auraria1, then count by subject and filter for counts greater than 1.

 source=* dest_port=25
 | rex field=_raw "Subject\:\s(?<subject>.+)"
 | stats count by subject
 | where count>1

If you want to retrieve the entire events for subjects that occur more than once, use eventstats instead of stats:

 source=* dest_port=25
 | rex field=_raw "Subject\:\s(?<subject>.+)"
 | eventstats count by subject
 | where count>1
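
To try both variants without real mail traffic, the search below is a minimal sketch that fabricates three events with makeresults, reusing the two subjects from the sample events in this thread. The helper field n and the synthetic _raw values are assumptions for illustration only.

```spl
| makeresults count=3
| streamstats count as n
| eval _raw=case(n<=2, "Subject: Fw: Order Inquiry", true(), "Subject: Xmas of pleasure for your couple!")
| rex field=_raw "Subject\:\s(?<subject>.+)"
| stats count by subject
| where count>1
```

Only the subject that appears in two of the three events should remain. Replacing stats with eventstats should instead keep those two events themselves, each carrying count=2.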

weicheng98
Path Finder

@FrankVl @auraria1 @nittala_surya, thank you so much for the answers! I really appreciate it! It worked!

0 Karma

sudosplunk
Motivator

Hello @weicheng98,

Is it possible to provide some sample events? I think there might be a mistake in your rex statement.

0 Karma

weicheng98
Path Finder

sample event from src_content:

MAIL FROM:
RCPT TO:
DATA

Date: Mon, 12 Mar 2018 15:47:20
From: Alice
User-Agent: Mozilla/5.0

To:Bob.@here.com
Subject: Fw: Order Inquiry
Content-Type: multipart/mixed;
Dear Alice

blah blah blah

0 Karma

weicheng98
Path Finder

Another sample event

MAIL FROM:<>
RCPT TO:
DATA
Received: from htgz ([131.131.131.131])

Message-ID: 20081229155033.5070401@rllss.com
Date: Mon, 29 Dec 2008 15:50:33 -0500
From: "Alice"
User-Agent: Thunderbird

To: chapman@progress1.com
Subject: Xmas of pleasure for your couple!
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

you have problems with your account

0 Karma

auraria1
Path Finder

Is the subject in its own field? If not, this makes it a bit more difficult.

You can create a subject field using the following:

| rex field=_raw "Subject:\s(?<Subject>.*)Content-Type" | stats count by Subject | sort - count

If so, it'll make searching way easier; you can add this to a field extraction so that Splunk does it for you.

In regards to your other question, are you specifically looking only for emails with Fw: Order Inquiry as the subject, to compare the number of emails coming in? Or all subjects?

0 Karma

weicheng98
Path Finder

Hi, as you can see from my sample events, src_content contains this stream of data, so that's why I have to use regex.

I'm trying to compare all subjects and, for those that appear more than once, return the number of occurrences.

The hard-coded regex is just to check whether I can match that subject in my stream of events.

Is there any way to compare events where the subject appears more than once?

0 Karma

sudosplunk
Motivator

Thanks for the events. Give this a try. The rex here creates a new field called "new_subject".

source=* dest_port=25
| rex field=_raw "Subject\:\s(?<new_subject>.+)"
| eval count=mvcount(new_subject)
| stats sum(count) as totalOccurrence
0 Karma
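
For comparison, running this search over synthetic events suggests what it measures. Without max_match, rex extracts a single value per event, so mvcount(new_subject) is 1 for every event that has a Subject line, and the sum is the total number of such events rather than the number of repeated subjects. A minimal sketch, assuming made-up events built with makeresults and a hypothetical helper field n:

```spl
| makeresults count=3
| streamstats count as n
| eval _raw=case(n<=2, "Subject: Fw: Order Inquiry", true(), "Subject: Xmas of pleasure for your couple!")
| rex field=_raw "Subject\:\s(?<new_subject>.+)"
| eval count=mvcount(new_subject)
| stats sum(count) as totalOccurrence
```

Here totalOccurrence comes out as 3 even though only one subject is repeated, so finding duplicates still requires grouping by the extracted subject.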

auraria1
Path Finder

Try this for your regex:

fw:\sorder\sinquiry

0 Karma

auraria1
Path Finder

Wouldn't it be easier to just use stats with a by clause and then filter with where?

Try the below: it creates a new field called Subject, counts events by subject, and shows only subjects with more than 2 events.

source=* dest_port=25
| rex field=src_content max_match=0 "(?P<Subject>Subject: Fw: Order Inquiry)"
| stats count by Subject
| where count > 2

0 Karma

weicheng98
Path Finder

Hi @auraria1, thank you so much! But how do I improve my query so that my rex isn't a hard-coded match? For example, I want to check whether two events contain the same title in src_content and then return the result.

I really appreciate your help, as some of my previous questions posted online weren't answered.

0 Karma

auraria1
Path Finder

Wait I think I misunderstood the original question, is the issue that the regex isn't matching properly?

Is that why you're having issues with the hardcoded regex?

Can you provide 2-3 example email subjects so I can take a look and see why it isn't working?

0 Karma

weicheng98
Path Finder

I'd also like to point out that, as you said, it creates a new field called Subject. Although the number of occurrences is correct, why is it that when I change the regex, it returns the regex text instead of the subject found in src_content?

For example, if I just put:
| rex field=src_content max_match=0 "(?P<Subject>Subject: Fw: )"

and src_content in the Splunk stream contains "Subject: Fw: Order Inquiry",
it returns "Fw:" as the subject instead of the matched result in src_content. Why is that so?

0 Karma
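
On the question above: a named capture group stores only the text matched inside its own parentheses, so a group made entirely of literal text can only ever return that literal text. To capture the rest of the line, the group needs a pattern such as .+ after the literal prefix. A minimal sketch on one synthetic event; the field names literal_only and whole_subject are hypothetical and chosen only for this illustration:

```spl
| makeresults
| eval src_content="Subject: Fw: Order Inquiry"
| rex field=src_content "(?<literal_only>Subject: Fw: )"
| rex field=src_content "Subject:\s(?<whole_subject>.+)"
| table literal_only whole_subject
```

literal_only holds just the hard-coded text, while whole_subject holds "Fw: Order Inquiry", which is the value that makes sense to group on with stats count by.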