Splunk Search

How to correlate 2 sets of logs together in one transaction?

Path Finder

I have 3 mail servers like so, 2 postfix servers and the last one not important Exchange, like so:
Postfix1 -> Postfix2 -> Exchange

Both Postfix server logs are in splunk. I see that when postfix1 hands email to postfix2 it correlates mail with a queued_queue_id

How can I correlate the 2 sets of logs together in one transaction?

Sep  5 11:04:27 nagaviper postfix/smtpd[21218]: B570F448E65: client=unknown[192.168.1.65]
Sep  5 11:04:27 nagaviper postfix/cleanup[21216]: B570F448E65: message-id=<20140905150427.B570F448E65@nagaviper.123.com>
Sep  5 11:04:27 nagaviper postfix/qmgr[20283]: B570F448E65: from=<autonotify@test.com>, size=1771, nrcpt=1 (queue active)
Sep  5 11:04:27 nagaviper postfix/smtp[20729]: B570F448E65: to=<user@123.com>, relay=localhost[127.0.0.1]:10025, delay=0.18, delays=0.04/0/0.05/0.09, dsn=2.0.0, status=sent (250 OK: data received)
Sep  5 11:04:27 nagaviper postfix/qmgr[20283]: B570F448E65: removed

Sep  5 11:04:27 habanero postfix/smtpd[46622]: 8966F601B4: client=businessclass.comcast.net[11.22.33.44]
Sep  5 11:04:27 habanero postfix/cleanup[52613]: 8966F601B4: warning: header Subject: Speed Alert!!! from businessclass.comcast.net[11.22.33.44]; from=<autonotify@test.com> to=<user@123.com> proto=ESMTP helo=<businessclass.comcast.net>
Sep  5 11:04:27 habanero postfix/cleanup[52613]: 8966F601B4: message-id=<>
Sep  5 11:04:27 habanero postfix/qmgr[56439]: 8966F601B4: from=<autonotify@test.com>, size=1418, nrcpt=1 (queue active)
Sep  5 11:04:27 habanero postfix/smtp[52618]: 8966F601B4: to=<user@123.com>, relay=nagaviper.123.com[192.168.1.67]:25, delay=0.32, delays=0.27/0/0/0.04, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as B570F448E65)
Sep  5 11:04:27 habanero postfix/qmgr[56439]: 8966F601B4: removed

I initially had:

| transaction queue_id keepevicted=true maxspan=60s| transaction message_id keepevicted=true maxspan=120s 

Until I found not all mail servers give a message_id and when it is blank as it was in this example it failed to correlate the entry.

Thank you,
Brian

1 Solution

Motivator

In the events of the habanero instance there is a line with the queue id of the nagaviper instance:

Sep  5 11:04:27 habanero postfix/smtp[52618]: 8966F601B4: to=<user@123.com>, relay=nagaviper.123.com[192.168.1.67]:25, delay=0.32, delays=0.27/0/0/0.04, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as B570F448E65)

You could try the following:

| transaction queue_id keepevicted=true maxspan=60s| rex "queued\sas\s(?<matching_id>[^\)]+)" | eval matching_id=coalesce(matching_id,queue_id) | transaction matching_id keepevicted=true
  • The rex command will extract the id in the habanero events and will fail in the nagaviper events
  • The coalesce command will fill matching_id with the first value it will find. Which is matching_id for the habanero events and queue_id for the nagaviper events
  • Then you can use the final transaction cmd

View solution in original post

Motivator

In the events of the habanero instance there is a line with the queue id of the nagaviper instance:

Sep  5 11:04:27 habanero postfix/smtp[52618]: 8966F601B4: to=<user@123.com>, relay=nagaviper.123.com[192.168.1.67]:25, delay=0.32, delays=0.27/0/0/0.04, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as B570F448E65)

You could try the following:

| transaction queue_id keepevicted=true maxspan=60s| rex "queued\sas\s(?<matching_id>[^\)]+)" | eval matching_id=coalesce(matching_id,queue_id) | transaction matching_id keepevicted=true
  • The rex command will extract the id in the habanero events and will fail in the nagaviper events
  • The coalesce command will fill matching_id with the first value it will find. Which is matching_id for the habanero events and queue_id for the nagaviper events
  • Then you can use the final transaction cmd

View solution in original post

Path Finder

coalesce is the way to go. If you have the proper sourcetype, you may have queued_as field, so "| eval matching_id=coalesce(queued_as,queueid) |" will do and you can drop the previous "rex". Be aware that you may get the same queue_id value on both servers within the maxspan timeframe, so consider differentiating your transactions adding the host field too: "| transaction queue_id, host ..."

Splunk can tame the "Postfix logs" beast! 🙂

Motivator

You're welcome

0 Karma

Path Finder

Thank you!

It worked and thank you for the awesome explanation!

State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!