Mail headers and received timestamps

Influencer

I'm running a mail delivery test from outside our network to watch for long delays on delivery. Splunk is all set to index the raw mbox files, and it's using the final timestamp as the index time. Currently I'm searching with the following to find messages delayed by more than 5 minutes:

sourcetype="mbox" | rex field=_raw "Date: (?<hdate>.*)\n" | convert timeformat="%a, %d %b %Y %T %z" mktime(hdate) as hdate2 | eval ddelay=_time-hdate2

So it compares the "Date:" header with the index timestamp (matching "From add@ress ").

In the interest of learning new Splunk tricks, would it be possible to compare each "Received from:" header within a message and set a field to show the mail host where the delay was greatest?

Motivator

I hope this points you in the right direction:

sourcetype="mbox" | rex field=_raw ";[\s\n\r]+(?<rec_date>[\s,:a-zA-Z0-9]*)" max_match=20 | mvexpand rec_date | convert timeformat="%a, %d %b %Y %H:%M:%S " mktime(rec_date) as rec_date2 | delta rec_date2 as delta_rec | delta _time as different_event

I am assuming that you have your mail headers indexed as multiline events, with something like this in your props.conf:

[mbox]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE = From:

What each piece of the search does:

  • max_match lets rex capture more than one match into the field (up to the number you specify), which produces a multivalue field
  • mvexpand splits the multivalue field into separate events, one per value
  • the first delta calculates the difference between consecutive received times. For the first received timestamp of each mail event it compares against the previous message, which gives a large, meaningless value. This is a problem, but read on
  • the second delta helps filter out the beginning of each event. Rows expanded from the same mail share one _time, so the value is 0 within a message and non-zero every time a new mail event starts, which lets you filter out the problematic delta values

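So by adding:

| where abs(delta_rec) > [your threshold] AND different_event=0

you will get the results you want. (The abs() is there because Received headers are listed newest first, so the differences come out negative.)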

(The regex that extracts rec_date is not perfect; for one thing, it stops before the timezone offset. You'll have to tweak it a bit.)
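If it helps to see the per-hop logic outside Splunk, here is a minimal Python sketch of the same idea: read every Received header, take the timestamp after the last ";", and compare consecutive hops to find where the biggest gap is. It is an illustration only, not a replacement for the search above. The function names (hop_delays, slowest_hop), the sample message, and the example.* host names are all made up for the demo, and it assumes each Received header contains a "by <host>" clause and a well-formed date.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical sample message (headers are newest first, as in real mail).
SAMPLE = """\
Received: from mx2.example.net (mx2.example.net [192.0.2.2])
\tby mail.example.org with ESMTP id 3;
\tMon, 05 Jan 2015 10:07:30 +0000
Received: from mx1.example.net (mx1.example.net [192.0.2.1])
\tby mx2.example.net with ESMTP id 2;
\tMon, 05 Jan 2015 10:07:20 +0000
Received: from client.example.com (client.example.com [192.0.2.9])
\tby mx1.example.net with ESMTP id 1;
\tMon, 05 Jan 2015 10:00:05 +0000
Date: Mon, 05 Jan 2015 10:00:00 +0000
From: sender@example.com
Subject: delivery test

body
"""


def hop_delays(raw_message):
    """Return (from_host, to_host, seconds) for each pair of consecutive hops."""
    msg = message_from_string(raw_message)
    hops = []
    for header in msg.get_all("Received", []):
        # The timestamp is whatever follows the last ';' in the header.
        info, sep, stamp = header.rpartition(";")
        if not sep:
            continue  # no timestamp part, skip this header
        match = re.search(r"\bby\s+(\S+)", info)
        host = match.group(1) if match else "unknown"
        # parsedate_to_datetime keeps the timezone offset.
        hops.append((host, parsedate_to_datetime(stamp.strip())))

    hops.reverse()  # oldest hop first, so the differences are positive
    return [
        (prev_host, host, (when - prev_when).total_seconds())
        for (prev_host, prev_when), (host, when) in zip(hops, hops[1:])
    ]


def slowest_hop(raw_message):
    """Return the hop with the largest delay, or None if there are < 2 hops."""
    delays = hop_delays(raw_message)
    return max(delays, key=lambda d: d[2]) if delays else None


print(slowest_hop(SAMPLE))
# → ('mx1.example.net', 'mx2.example.net', 435.0)
```

The delay is reported as a pair of hosts rather than a single host, because the timestamps alone only tell you the gap between two hops, not which side caused it.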

Motivator

Good luck. I changed the formatting; it looks better now.

Influencer

Thanks, I'll try this out. (I'm assuming the forum software removed the angle brackets in your rex statement that were likely enclosing rec_date.)
